Flink Custom Data Sources

A source is where the data is read from. Flink provides the following ways to create a source (a short usage sketch of a few of these methods follows the list):

File-based:

  • readTextFile(path) - Reads text files, i.e. files that respect the TextInputFormat specification, line-by-line and returns them as Strings.
  • readFile(fileInputFormat, path) - Reads (once) files as dictated by the specified file input format.
  • readFile(fileInputFormat, path, watchType, interval, pathFilter, typeInfo) - This is the method called internally by the two previous ones. It reads files in the path based on the given fileInputFormat; depending on the watchType, the path is either monitored periodically (every interval ms) for new data or processed once.

Socket-based:

  • socketTextStream - Reads from a socket. Elements can be separated by a delimiter.

Collection-based:

  • fromCollection(Collection) - Creates a data stream from the Java java.util.Collection. All elements in the collection must be of the same type.
  • fromCollection(Iterator, Class) - Creates a data stream from an iterator. The class specifies the data type of the elements returned by the iterator.
  • fromElements(T ...) - Creates a data stream from the given sequence of objects. All objects must be of the same type.
  • fromParallelCollection(SplittableIterator, Class) - Creates a data stream from an iterator, in parallel. The class specifies the data type of the elements returned by the iterator.
  • generateSequence(from, to) - Generates the sequence of numbers in the given interval, in parallel.

Custom:

  • addSource - Attach a new source function. For example, to read from Apache Kafka you can use addSource(new FlinkKafkaConsumer08<>(...)). See connectors for more details.

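Before the custom-source example below, here is a minimal sketch of the collection-based methods from the list above. The class name CollectionSourcesTest is a placeholder, not from the original post; the file- and socket-based methods are left out because they need an existing file or a running server.

    package com.river.streaming;

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    import java.util.Arrays;

    public class CollectionSourcesTest {
        public static void main(String[] args) throws Exception {
            final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // fromElements: builds a stream from the given objects; all must be of the same type.
            env.fromElements("a", "b", "c").print();

            // fromCollection: builds a stream from a java.util.Collection.
            env.fromCollection(Arrays.asList(1, 2, 3)).print();

            // generateSequence: emits the numbers 1..10, in parallel.
            env.generateSequence(1, 10).print();

            env.execute("CollectionSourcesTest");
        }
    }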
Custom data source:

    package com.river.streaming;
    
    import org.apache.commons.lang3.RandomUtils;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.SourceFunction;
    
    import java.util.Random;
    
    /**
     * @author river
     * @date 2019/4/24 12:59
     **/
    public class DataSourcesTest {
        public static void main(String[] args) throws Exception {
            final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Attach the custom source and print every record it emits.
            env.addSource(new MySource())
                    .print();
            env.execute("DataSourcesTest");
        }
    
        /**
         * A non-parallel custom source that emits a (name, random int) tuple once per second
         * until cancel() flips the running flag.
         */
        public static class MySource implements SourceFunction<Tuple2<String, Integer>> {
            private static final long serialVersionUID = 1L;
            private volatile boolean isRunning = true;
    
            public MySource() {
                this.isRunning = true;
            }
    
            @Override
            public void run(SourceContext<Tuple2<String, Integer>> sourceContext) throws Exception {
                while (isRunning) {
                    Thread.sleep(1000);
                    if (new Random().nextBoolean()) {
                        sourceContext.collect(Tuple2.of("frank", RandomUtils.nextInt(0, 100)));
                        continue;
                    }
                    sourceContext.collect(Tuple2.of("lucy", RandomUtils.nextInt(0, 100)));
                }
            }
    
            @Override
            public void cancel() {
                isRunning = false;
            }
        }
    }
    
    

Execution result (the n> prefix is the index of the parallel subtask of the print sink):

    1> (lucy,77)
    2> (lucy,95)
    3> (lucy,6)
    4> (lucy,40)
    5> (frank,15)
    6> (lucy,11)
    7> (lucy,12)
    ....
    ....
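A side note on the example above: a plain SourceFunction always runs with parallelism 1. If the source itself needs to run in parallel, the same pattern can be written against ParallelSourceFunction. The sketch below is only an illustration under that assumption; MyParallelSource is a hypothetical name, not part of the original post.

    package com.river.streaming;

    import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;

    /**
     * Hypothetical parallel variant: each parallel subtask runs its own copy of run(),
     * so the overall output interleaves the counters of all subtasks.
     */
    public class MyParallelSource implements ParallelSourceFunction<Long> {
        private static final long serialVersionUID = 1L;
        private volatile boolean isRunning = true;

        @Override
        public void run(SourceContext<Long> sourceContext) throws Exception {
            long counter = 0L;
            while (isRunning) {
                Thread.sleep(1000);
                sourceContext.collect(counter++);
            }
        }

        @Override
        public void cancel() {
            isRunning = false;
        }
    }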
Reference: https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/datastream_api.html#data-sources
