2024-07-12
Broker: A node in a Kafka cluster. A Kafka cluster consists of multiple Brokers that work together to handle the storage, transmission, and consumption of messages. Each Broker manages one or more partitions.
Topic: The producer sends messages to a specified Topic, and consumers subscribe to the Topic to receive those messages. The Topic itself is only a logical grouping; the data is physically stored in the Topic's partitions.
Partition: It is a subset of Topic and is the basic unit for storing and processing messages in Kafka. Each Topic can be divided into multiple Partitions, and each Partition is an ordered, immutable sequence of messages.
Replica: A copy of a partition. A partition can have multiple replicas.
Leader Broker: When a partition has multiple replicas, the broker that handles all read and write requests for that partition.
Follower Broker: When a partition has multiple replicas, a broker that synchronizes the partition's data from the Leader.
The producer sends messages (records) to Kafka, and the consumer reads them by offset, which works much like an array index.
Each partition has its own log, and Kafka uses log files to save the data to disk.
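To make the offset idea concrete, here is a minimal sketch that models one partition as an append-only list. The class `PartitionLog` and its methods are hypothetical names chosen for illustration; a real broker stores records in files on disk, not in a Python list.

```python
class PartitionLog:
    """Toy in-memory stand-in for the log of a single partition."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Append a record and return the offset it was assigned."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = PartitionLog()
first = log.append("order-created")
second = log.append("order-paid")
third = log.append("order-shipped")

# A consumer that has already processed offset 0 resumes from offset 1.
remaining = log.read(offset=1)
```

Records are only ever added at the end, so an offset keeps pointing at the same record for as long as that record exists.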
The producer connects to the Kafka cluster through a bootstrap broker. This step establishes the initial connection and retrieves the cluster metadata.
Once the producer obtains this metadata, it knows who the leader broker is for each partition and can send the message directly to the correct leader broker.
A producer must specify a Topic when sending a message, but specifying a partition is optional.
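The following sketch shows one way a partition could be picked when the producer does not name one. It is an illustration under assumptions, not the logic of any real Kafka client: the helper `make_partition_chooser` is hypothetical, `zlib.crc32` is a stand-in for the client's own hash function, and keyless records are spread round-robin purely for simplicity.

```python
import itertools
import zlib


def make_partition_chooser(num_partitions):
    """Return a function that picks a partition for each record (hypothetical helper)."""
    round_robin = itertools.cycle(range(num_partitions))

    def choose(key=None, partition=None):
        if partition is not None:
            # The producer named a partition explicitly: use it as-is.
            if not 0 <= partition < num_partitions:
                raise ValueError("partition out of range")
            return partition
        if key is not None:
            # Stand-in hash: the same key always maps to the same partition.
            return zlib.crc32(key.encode("utf-8")) % num_partitions
        # No key and no partition: rotate through the partitions.
        return next(round_robin)

    return choose


choose = make_partition_chooser(num_partitions=3)
explicit = choose(partition=2)
by_key_a = choose(key="user-42")
by_key_b = choose(key="user-42")
no_key = [choose() for _ in range(4)]
```

Hashing the key means that all records with the same key land in the same partition, so they keep their relative order.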
When a producer sends a message to a Broker, the Broker's first step is to write the message to the partition's log on disk, which ensures the persistence and reliability of the message.
Consumers in Kafka usually belong to a consumer group, and each consumer group has a unique group ID. Consumer groups are used to achieve load balancing and parallel consumption of messages.
When multiple consumers belong to the same group, Kafka distributes the Topic's partitions among the consumers in the group. Each partition is consumed by only one consumer in the group, which is how load balancing is achieved.
A single consumer subscribes to a Topic:
Multiple consumers belong to the same group:
Multiple consumers belong to different groups:
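The three cases above can be sketched with a small assignment function. This is a simplified round-robin model for illustration only; the function name `assign_partitions` and the consumer and group names are hypothetical, and real Kafka assignment strategies are configurable and may distribute partitions differently.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of ONE group, round-robin style."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# Case 1: a single consumer receives every partition of the Topic.
single = assign_partitions(partitions, ["c1"])

# Case 2: consumers in the same group split the partitions between them.
same_group = assign_partitions(partitions, ["c1", "c2"])

# Case 3: each group is assigned independently, so every group sees all partitions.
groups = {
    "group-a": assign_partitions(partitions, ["a1"]),
    "group-b": assign_partitions(partitions, ["b1", "b2", "b3"]),
}
```

Within one group a partition has exactly one owner, while across groups the same partition is read by one consumer in each group.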
When partitions are added to a Topic, Kafka creates the new partitions in the cluster and assigns them to different Brokers to achieve balanced data storage and high availability. Kafka does not automatically redistribute or rebalance data from existing partitions to the new ones. New partitions are empty when they are created, and data is written to them only when producers send subsequent messages. The consumer group detects the change in the number of partitions and triggers a rebalance.
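A rough sketch of this behavior, with each partition modeled as a plain list: existing data stays where it is, new partitions start empty, and the group's assignment is recomputed. The helpers `add_partitions` and `assign` are hypothetical, and the round-robin assignment is a simplification.

```python
def add_partitions(logs, new_total):
    """Grow a topic to new_total partitions; existing logs are left untouched."""
    if new_total < len(logs):
        raise ValueError("this sketch only supports growing the partition count")
    return logs + [[] for _ in range(new_total - len(logs))]


def assign(num_partitions, consumers):
    """Simplified round-robin assignment of partition numbers within one group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return assignment


topic = [["m0", "m1"], ["m2"]]            # two partitions that already hold data
before = assign(len(topic), ["c1", "c2"])

topic = add_partitions(topic, 4)           # new partitions 2 and 3 start empty
after = assign(len(topic), ["c1", "c2"])  # the rebalance recomputes ownership

topic[3].append("m3")                      # only newly produced messages land there
```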
Kafka allows each partition to have multiple replicas, which are stored on different brokers. One replica is called the Leader, which is responsible for processing all read and write requests, and the other replicas are Followers, which are responsible for synchronizing the Leader's data.
Among the replicas of a partition, only one serves reads and writes at any given time: the Leader replica. The other replicas are Follower replicas and serve as backups.
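The leader and follower roles can be illustrated with a toy model in which all reads and writes go to the leader and followers copy whatever they are missing. The classes `Replica` and `ReplicatedPartition` are hypothetical, and choosing the first broker as leader is an assumption made for the example, not how Kafka elects a leader.

```python
class Replica:
    """One copy of a partition, hosted on a single broker."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []


class ReplicatedPartition:
    def __init__(self, broker_ids):
        self.replicas = [Replica(broker_id) for broker_id in broker_ids]
        # Assumption for the sketch: the first replica acts as the leader.
        self.leader = self.replicas[0]

    @property
    def followers(self):
        return [r for r in self.replicas if r is not self.leader]

    def produce(self, value):
        """Writes are handled by the leader only; returns the record's offset."""
        self.leader.log.append(value)
        return len(self.leader.log) - 1

    def consume(self, offset):
        """Reads are also served by the leader."""
        return self.leader.log[offset]

    def sync_followers(self):
        """Each follower copies the records it has not yet replicated."""
        for follower in self.followers:
            missing = self.leader.log[len(follower.log):]
            follower.log.extend(missing)


partition = ReplicatedPartition(broker_ids=[1, 2, 3])
partition.produce("a")
partition.produce("b")

sizes_before = [len(f.log) for f in partition.followers]
partition.sync_followers()
sizes_after = [len(f.log) for f in partition.followers]
```

After `sync_followers()` runs, every follower holds the same records as the leader, which is what lets a follower act as a backup.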