其他分享
首页 > 其他分享> > Flink Table Api 之表函数使用

Flink Table Api 之表函数使用

作者:互联网

表函数(Table Functions)

比如,通过自定义表函数,对读取数据中的某些字段可以根据业务需求做一些特殊的处理 下面看一个具体的代码案例
package com.congge.table.api.udf;


import com.congge.source.SensorReading;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;


public class TestTableFunction {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        String path = "E:\\code-self\\flink_study\\src\\main\\resources\\sensor.txt";

        // 1. 读取数据
        DataStreamSource<String> inputStream = env.readTextFile(path);

        // 2. 转换成POJO
        DataStream<SensorReading> dataStream = inputStream.map(line -> {
            String[] fields = line.split(",");
            return new SensorReading(fields[0], new Long(fields[1]), new Double(fields[2]));
        });

        // 3. 将流转换成表
        Table sensorTable = tableEnv.fromDataStream(dataStream, "id, timestamp as ts, temperature as temp");

        // 4. 自定义表函数,实现将id拆分,并输出(word, length)
        // table API
        Split split = new Split("_");

        // 需要在环境中注册UDF
        tableEnv.registerFunction("split", split);
        Table resultTable = sensorTable
                .joinLateral("split(id) as (word, length)")
                .select("id, ts, word, length");

        // SQL写法
        tableEnv.createTemporaryView("sensor", sensorTable);
        Table resultSqlTable = tableEnv.sqlQuery("select id, ts, word, length " +
                " from sensor, lateral table(split(id)) as splitid(word, length)");

        // 打印输出
        tableEnv.toAppendStream(resultTable, Row.class).print("result");
        tableEnv.toAppendStream(resultSqlTable, Row.class).print("sql");

        env.execute();
    }

    // 实现自定义TableFunction
    public static class Split extends TableFunction<Tuple2<String, Integer>>{
        // 定义属性,分隔符
        private String separator = ",";

        public Split(String separator) {
            this.separator = separator;
        }

        // 必须实现一个eval方法,没有返回值
        public void eval( String str ){
            for( String s: str.split(separator) ){
                collect(new Tuple2<>(s, s.length()));
            }
        }
    }

}

本例的需求是,通过 Flink Table Api读取原始文件数据,然后通过自定义表函数,将读取到的id字段数据通过类似侧写的方式进行输出,统计id字段长度,运行上面的代码,观察控制台打印效果;

 

标签:flink,Flink,tableEnv,之表,Api,org,apache,import,id
来源: https://blog.csdn.net/congge_study/article/details/123608099