
65. Overview of Flink’s DataStream Connectors

2024-07-12


1) Overview
1. Predefined Sources and Sinks

Predefined data sources support reading from files, directories, and sockets, as well as from collections and iterators.

Predefined data sinks support writing to files, standard output (stdout), standard error (stderr), and sockets.
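
A minimal sketch of both in Java; the file path and socket ports are placeholders for illustration:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PredefinedSourcesAndSinks {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Predefined sources: a collection, a text file, and a socket
        DataStream<String> fromCollection = env.fromElements("a", "b", "c");
        DataStream<String> fromFile = env.readTextFile("file:///tmp/input.txt"); // placeholder path
        DataStream<String> fromSocket = env.socketTextStream("localhost", 9999); // placeholder socket

        // Predefined sinks: stdout, stderr, and a socket
        fromCollection.print();    // writes each record to stdout
        fromFile.printToErr();     // writes each record to stderr
        fromSocket.writeToSocket("localhost", 9998, new SimpleStringSchema());

        env.execute("predefined-sources-and-sinks");
    }
}
```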

2. Bundled Connectors

Connectors let Flink interact with a variety of third-party systems; the following are currently supported (a Kafka source sketch follows the list).

Apache Kafka (source/sink)
Apache Cassandra (source/sink)
Amazon DynamoDB (sink)
Amazon Kinesis Data Streams (source/sink)
Amazon Kinesis Data Firehose (sink)
DataGen (source)
Elasticsearch (sink)
Opensearch (sink)
FileSystem (sink)
RabbitMQ (source/sink)
Google PubSub (source/sink)
Hybrid Source (source)
Apache Pulsar (source)
JDBC (sink)
MongoDB (source/sink)
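
As an example of a bundled connector, here is a minimal sketch of the Kafka source using the KafkaSource builder API from the flink-connector-kafka module (Flink 1.14+); the broker address, topic, and group id are placeholders:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Build a Kafka source that reads string values from a (placeholder) topic
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")   // placeholder broker
                .setTopics("input-topic")                // placeholder topic
                .setGroupId("flink-demo")                // placeholder group id
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> stream =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source");
        stream.print();

        env.execute("kafka-source-example");
    }
}
```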
3. Connectors in Apache Bahir

Flink has additional connectors released via Apache Bahir, including the following (a Redis sink sketch follows the list):

Apache ActiveMQ (source/sink)
Apache Flume (sink)
Redis (sink)
Akka (sink)
Netty (source)
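
As a Bahir example, here is a sketch of the Redis sink from flink-connector-redis; the host, port, and the key/value mapping are assumptions for illustration:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.redis.RedisSink;
import org.apache.flink.streaming.connectors.redis.common.config.FlinkJedisPoolConfig;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommand;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommandDescription;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisMapper;

public class RedisSinkExample {

    // Maps each (key, value) tuple to a Redis SET command
    static class MyRedisMapper implements RedisMapper<Tuple2<String, String>> {
        @Override
        public RedisCommandDescription getCommandDescription() {
            return new RedisCommandDescription(RedisCommand.SET);
        }
        @Override
        public String getKeyFromData(Tuple2<String, String> data) {
            return data.f0;
        }
        @Override
        public String getValueFromData(Tuple2<String, String> data) {
            return data.f1;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Tuple2<String, String>> stream =
                env.fromElements(Tuple2.of("k1", "v1"), Tuple2.of("k2", "v2"));

        // Placeholder Redis connection settings
        FlinkJedisPoolConfig conf =
                new FlinkJedisPoolConfig.Builder().setHost("localhost").setPort(6379).build();

        stream.addSink(new RedisSink<>(conf, new MyRedisMapper()));
        env.execute("redis-sink-example");
    }
}
```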
4. Other ways to connect to Flink
a) Asynchronous I/O

Using connectors is not the only way to get data into or out of Flink.

You can query an external database or web service to obtain an initial data stream, and then process that stream with Map or FlatMap. For this kind of external lookup, Flink provides an asynchronous I/O API that makes the process simpler, more efficient, and more stable.
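
A minimal sketch of the asynchronous I/O API; the lookup itself is a hypothetical stand-in for a real database or web-service client:

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncEnrichmentExample {

    // Issues one non-blocking lookup per record and emits "key=value"
    static class AsyncLookup extends RichAsyncFunction<String, String> {
        @Override
        public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
            // Hypothetical async call; a real job would query a database or web service here
            CompletableFuture
                    .supplyAsync(() -> "value-for-" + key)
                    .thenAccept(v -> resultFuture.complete(Collections.singleton(key + "=" + v)));
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> keys = env.fromElements("k1", "k2", "k3");

        // Up to 100 in-flight requests, each timing out after 1 second
        DataStream<String> enriched = AsyncDataStream.unorderedWait(
                keys, new AsyncLookup(), 1000, TimeUnit.MILLISECONDS, 100);

        enriched.print();
        env.execute("async-io-example");
    }
}
```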