
Kafka Producer

2024-07-12


Producer

Producers are responsible for creating messages and then delivering them to Kafka.

Load Balancing

  1. Round-robin strategy.
  2. Random strategy.
  3. Key-hash strategy.
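The three strategies can be sketched in a few lines of Python. This is only an illustration and does not use the Kafka client: the function names and the partition count are made up, and CRC32 stands in for whatever hash the real client applies to the key.

```python
import itertools
import random
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count for the illustration


def round_robin_partitioner(num_partitions):
    """Return a function that cycles through partitions 0..n-1 in order."""
    counter = itertools.count()
    return lambda: next(counter) % num_partitions


def random_partition(num_partitions, rng=random):
    """Pick any partition with equal probability."""
    return rng.randrange(num_partitions)


def key_hash_partition(key, num_partitions):
    """Map a key to a fixed partition (CRC32 is a stand-in, not Kafka's hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


next_partition = round_robin_partitioner(NUM_PARTITIONS)
round_robin_sequence = [next_partition() for _ in range(5)]  # 0, 1, 2, 0, 1
```

Round-robin and random both spread load evenly over time; only the key-hash strategy keeps related messages together in one partition.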

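Kafka's default partitioning strategy: if a key is specified, the partition is chosen by hashing the key, so messages with the same key always go to the same partition (and messages within a single partition are ordered); if no key is specified, a round-robin strategy is used. Note that clients from Kafka 2.4 onward replace plain round-robin for keyless messages with a "sticky" variant that fills a batch for one partition before moving to the next.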
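As a rough sketch of that decision, the toy class below hashes the key when one is present and rotates through the partitions otherwise. The class name is hypothetical and CRC32 is an assumed stand-in hash; the real client's hash function and its handling of keyless messages differ by version.

```python
import itertools
import zlib


class DefaultLikePartitioner:
    """Toy partitioner: hash the key if present, otherwise rotate."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()

    def partition(self, key=None):
        if key is not None:
            # Same key -> same partition, so per-key order is preserved.
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        # No key: spread records across partitions in turn.
        return next(self._counter) % self.num_partitions


partitioner = DefaultLikePartitioner(num_partitions=4)
keyed = [partitioner.partition("user-1") for _ in range(3)]
unkeyed = [partitioner.partition() for _ in range(4)]
```

Here every entry of keyed is the same partition number, while unkeyed visits each of the four partitions once.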

Send a message

acks specifies how many partition replicas must receive the message before the producer considers the message written successfully.

None of the three acks settings described below is inherently better than the others. Choose one according to the workload, trading off performance against reliability.

There are three ways to send messages, distinguished here by their acks setting:

Send and Forget

acks = 0.
After sending a message, the producer does not wait for any response from the broker. In most cases the message is still delivered successfully, because Kafka is highly available. However, since the producer never sees a broker response, it generally cannot detect a failed write, so its automatic retry mechanism has little to work with: if an error or timeout occurs, the message is lost and the application receives no error or exception.
All else being equal, setting acks to 0 gives the highest throughput.
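The silent loss can be illustrated with a toy model. Nothing here is the Kafka API: ToyBroker, its up flag, and send_acks_0 are hypothetical names, and swallowing the error is just a simple way to model a producer that never looks at the outcome.

```python
class ToyBroker:
    """Hypothetical in-memory broker; `up` simulates availability."""

    def __init__(self):
        self.log = []
        self.up = True

    def append(self, record):
        if not self.up:
            raise ConnectionError("broker unavailable")
        self.log.append(record)


def send_acks_0(broker, record):
    """Fire and forget: hand the record off and never look at the outcome."""
    try:
        broker.append(record)
    except ConnectionError:
        pass  # with no acknowledgement, the failure stays invisible


broker = ToyBroker()
send_acks_0(broker, "m1")
broker.up = False           # outage while m2 is in flight
send_acks_0(broker, "m2")   # no exception reaches the caller
broker.up = True
send_acks_0(broker, "m3")
```

After these three calls the broker's log holds only m1 and m3, and the calling code has no way to tell that m2 is missing.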

Asynchronous Send

acks = 1. This was the default value before Kafka 3.0; from 3.0 onward the default is acks = all.
After the producer sends a message, it receives a success response from the server as soon as the partition's leader replica has written the message.

  • If the message cannot be written to the leader replica (for example, the leader has crashed and a new leader is still being elected), the producer receives an error response. To avoid losing the message, the producer can resend it.
  • If the message is written to the leader replica and a success response is returned to the producer, but the leader crashes before any follower replica has fetched the message, the message is still lost, because the newly elected leader does not have it.

Setting acks to 1 is a compromise between message reliability and throughput.

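The second failure case can be replayed with a toy model of one leader and one follower. The Replica class and both functions are hypothetical and only mimic the order of events; they are not how Kafka implements replication.

```python
class Replica:
    """Hypothetical replica holding an in-memory log."""

    def __init__(self, name):
        self.name = name
        self.log = []


def send_acks_1(leader, record):
    """Leader appends and acknowledges without waiting for followers."""
    leader.log.append(record)
    return "ack"


def replicate(leader, follower):
    """Follower fetch: copy the records the follower does not yet have."""
    follower.log.extend(leader.log[len(follower.log):])


leader, follower = Replica("broker-1"), Replica("broker-2")

send_acks_1(leader, "m1")
replicate(leader, follower)            # m1 reaches the follower
response = send_acks_1(leader, "m2")   # m2 is acknowledged...
# ...but the leader crashes before the follower fetches m2.
new_leader = follower
```

The producer holds an "ack" for m2, yet the new leader's log contains only m1.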
Synchronous Send

acks = -1 or acks = all.
After sending a message, the producer must wait until every replica in the ISR (the set of in-sync replicas) has written the message before it receives a success response from the server.
All else being equal, setting acks to -1 (all) gives the strongest reliability. However, this does not by itself guarantee that the message is safe: if the ISR has shrunk to just the leader replica, the behavior degenerates to that of acks = 1.
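The degenerate case, and the usual guard against it, can be sketched as follows. The function and exception below are hypothetical stand-ins. The min_insync_replicas argument is modelled on Kafka's min.insync.replicas setting, which makes the broker reject acks = all writes when the ISR is smaller than the configured minimum. That setting is not mentioned in the text above and is brought in here as an assumption about how one would normally close the gap.

```python
class NotEnoughReplicas(Exception):
    """Raised when the ISR is smaller than min_insync_replicas."""


def send_acks_all(isr_logs, record, min_insync_replicas=1):
    """Append to every in-sync replica, then acknowledge.

    isr_logs: one log (a list) per replica currently in the ISR;
    index 0 is the leader.
    """
    if len(isr_logs) < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {len(isr_logs)} < min.insync.replicas {min_insync_replicas}"
        )
    for log in isr_logs:
        log.append(record)
    return len(isr_logs)  # number of copies behind the acknowledgement


leader_log, follower_log = [], []

copies_healthy = send_acks_all([leader_log, follower_log], "m1")
# The follower falls behind and drops out of the ISR.
copies_shrunk = send_acks_all([leader_log], "m2")  # acked with one copy only

try:
    send_acks_all([leader_log], "m3", min_insync_replicas=2)
    rejected = False
except NotEnoughReplicas:
    rejected = True  # the write fails loudly instead of being under-replicated
```

With the minimum left at 1, m2 is acknowledged while only the leader holds it. With a minimum of 2, the same situation produces an error the producer can react to.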