Preface

Translated and reorganized from:
How to choose the number of topics/partitions in a Kafka cluster

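How to determine how many partitions a topic needs

The partition count is generally driven by data throughput, measured here in MB/s. For now, set aside the throughput limit of a single partition on the Kafka broker side and consider only the throughput at the two ends: the producer and the consumer.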

Producer

Producer throughput depends on the following settings:

batching size
compression codec
acks
replication factor
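
As a rough illustration of where these knobs live, the sketch below lists the three producer-side settings under their standard Kafka producer property names. The values are placeholders chosen for the example, not recommendations, and the `replication_factor` variable is a hypothetical name: replication factor is chosen per topic rather than set on the producer.

```python
# Producer-side settings, keyed by standard Kafka producer property names.
# All values are illustrative placeholders, not tuning advice.
producer_settings = {
    "batch.size": 64 * 1024,    # upper bound, in bytes, of a record batch
    "compression.type": "lz4",  # one of: none, gzip, snappy, lz4, zstd
    "acks": "all",              # "0", "1", or "all"
}

# Replication factor is a property of the topic, not of the producer.
# The variable name is hypothetical and used only for this example.
replication_factor = 3

for name, value in producer_settings.items():
    print(f"{name}={value}")
print(f"replication factor (topic-level): {replication_factor}")
```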

As a rule of thumb, a single producer can write roughly 10 MB/s to one partition.

Consumer

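Consumer throughput depends heavily on the application logic, so whoever implements the consumer's business logic needs to measure what the consumer can actually sustain.

Determining the number of partitions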

Given:

p : producer throughput on a single partition, in MB/s

c : consumer throughput on a single partition, in MB/s

t : target overall throughput, in MB/s

Result:

NumOfPartition = max(t/p, t/c)
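
The formula above can be sketched as a small helper. The function name `num_partitions` is made up for this example, and rounding up with `math.ceil` is an added assumption, made because a topic cannot have a fractional number of partitions.

```python
import math


def num_partitions(t: float, p: float, c: float) -> int:
    """Return max(t/p, t/c), rounded up to a whole number of partitions.

    t: target overall throughput in MB/s
    p: producer throughput on a single partition in MB/s
    c: consumer throughput on a single partition in MB/s
    """
    if t <= 0 or p <= 0 or c <= 0:
        raise ValueError("t, p and c must all be positive")
    return max(math.ceil(t / p), math.ceil(t / c))


# Target 100 MB/s, producer 10 MB/s, consumer 20 MB/s per partition:
# the producer side is the bottleneck, so it sets the count.
print(num_partitions(t=100, p=10, c=20))  # 10
```

Whichever side is slower per partition determines the result, which is why the consumer's real throughput has to be measured rather than guessed.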

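Adding partitions dynamically

Partitions can be added dynamically, but the partition count should still be estimated as accurately as possible when a workload is first onboarded, because not every use case tolerates adding partitions later. For keyed messages, the producer can be configured to route each message to a partition by the hash of its key, which guarantees that messages with the same key are consumed in order. Adding partitions changes that key-to-partition mapping, so messages with the same key can end up in different partitions and be consumed out of order. For such workloads, a safe way to expand is to stop all producers, let the consumers finish consuming all existing data, add the partitions, and only then resume producer writes.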
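
The ordering problem comes from the key-to-partition mapping depending on the partition count. The sketch below illustrates this with hash-modulo routing. It uses `zlib.crc32` as a stand-in hash purely for illustration (Kafka's default partitioner uses murmur2), and the helper name `partition_for`, the sample keys, and the partition counts 8 and 12 are assumptions made for the example.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Route a key to a partition by hash modulo the partition count.

    crc32 is only a stand-in here; it is not the hash Kafka uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"user-{i}" for i in range(1000)]

# Keys whose partition differs after growing the topic from 8 to 12 partitions.
moved = [k for k in keys if partition_for(k, 8) != partition_for(k, 12)]

print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

Any key in `moved` would have its older messages in one partition and its newer messages in another, so a consumer could no longer rely on seeing them in order.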

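Problems caused by too many partitions

More open file handles
Longer recovery time after a broker failure
Higher latency

The number of partitions on each broker should not exceed 100 * (num of brokers in cluster) * (replication-factor). For a cluster with 10 brokers and replication-factor=2, that means no more than 2000 partitions per broker.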
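
The rule of thumb above is simple arithmetic, sketched here as a helper. The function name `max_partitions_per_broker` and the `base` parameter are hypothetical names introduced for this example; the constant 100 comes from the text.

```python
def max_partitions_per_broker(num_brokers: int, replication_factor: int,
                              base: int = 100) -> int:
    """Upper bound on partitions per broker: base * brokers * replication factor."""
    if num_brokers <= 0 or replication_factor <= 0:
        raise ValueError("num_brokers and replication_factor must be positive")
    return base * num_brokers * replication_factor


# The 10-broker, replication-factor=2 cluster from the text.
print(max_partitions_per_broker(num_brokers=10, replication_factor=2))  # 2000
```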

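Conclusion

Choosing an appropriate partition count for a topic matters: with too few, the producer or the consumer hits a read/write bottleneck; with too many, the problems listed above start to appear.

References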

How to choose the number of topics/partitions in a Kafka cluster
