Follow-up to #1236: this records the experiment results and corrects the algorithm that was described incorrectly in #1236.
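When BuiltInPartitioner chooses a partition, it computes the probabilities from the maximum queue length. I misremembered this earlier and described it as using the total queue length, so the earlier example is re-explained here. The Built-In Partitioner uses the length of each partition's batch queue in the accumulator to decide how likely that partition is to be chosen.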
This is the partition layout we configured:
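Here is an example. Suppose B1 and B2 are both congested: every partition on B1 and B2 has 100 messages queued in the accumulator, while the partitions on B3 have nothing queued.
Next, compute the probability of each partition being chosen: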
B1 has 16 partitions, each with 100 messages queued in the accumulator
B2 has 16 partitions, each with 100 messages queued in the accumulator
B3 has 4 partitions, with nothing queued in the accumulator
Maximum queue length: 100
The weight for choosing a partition is (maximum queue length + 1) minus that partition's own queue length.
So each partition on B1 has a weight of (101 - 100) = 1
Each partition on B2 has a weight of (101 - 100) = 1
Each partition on B3 has a weight of (101 - 0) = 101
Total weight: 1*16 + 1*16 + 101*4 = 436
So the probability of choosing a partition on B1 is 16/436
The probability of choosing a partition on B2 is 16/436
The probability of choosing a partition on B3 is 404/436
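The arithmetic above can be reproduced with a short script. This is only a sketch of the weighting rule as described in this PR, not Kafka's actual code; the function name partition_weights and the layout list are hypothetical and exist only for this illustration.

```python
from fractions import Fraction


def partition_weights(queue_sizes):
    """Weight of a partition = (max queue length + 1) - its own queue length."""
    max_plus_1 = max(queue_sizes) + 1
    return [max_plus_1 - size for size in queue_sizes]


# Hypothetical layout from the example above:
# (broker, number of partitions, queued messages per partition)
layout = [("B1", 16, 100), ("B2", 16, 100), ("B3", 4, 0)]

# Expand the layout into one entry per partition.
brokers = []
queue_sizes = []
for broker, count, queued in layout:
    brokers.extend([broker] * count)
    queue_sizes.extend([queued] * count)

weights = partition_weights(queue_sizes)
total = sum(weights)

# Sum the per-partition probabilities for each broker.
broker_probability = {}
for broker, weight in zip(brokers, weights):
    broker_probability[broker] = (
        broker_probability.get(broker, Fraction(0)) + Fraction(weight, total)
    )

print("total weight:", total)
for broker, probability in broker_probability.items():
    print(broker, probability, round(float(probability), 4))
```

Changing the queued counts in layout shows how quickly the probabilities shift: because the weights depend on the gap to the longest queue, a partition with an empty queue dominates as soon as any other queue grows long.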
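For BuiltInPartitioner, the distribution of topic partitions across brokers does affect the selection probability, but it is not the only factor. To answer the earlier question: repeated experiments show that B3 is not always the one with lower throughput (in this run, B2 had the lower throughput). The current guess is that this comes from a "dynamic equilibrium" that forms among the BuiltInPartitioner instances.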