
add multiple caches for accelerating available container count calculation #667

Merged
merged 69 commits into confluentinc:master on Feb 22, 2024

Conversation

@sangreal
Contributor

sangreal commented Nov 19, 2023

Description
This PR reopens the previously approved PR #644.

Explanation

  1. Introduce a cache for the count of ready-to-be-retried containers (containers that could be selected for retry). The actually available count = all available container count - (count of retry containers still pending their delay). A minimal sketch of this arithmetic follows the list below.
  2. The cache updates happen when containers are selected to run the user function (before / after selection).
  3. The set of available work containers changes only after container selection, therefore it is accurate to update the cache at this point.
  4. No extra threading is needed in this solution compared to the previous one.
  5. The improvement is substantial: the performance of lessKeysThanThreads has improved from 01:24 to 00:38.
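
To make the arithmetic concrete, here is a minimal sketch of the counting idea. The counter and method names are illustrative (the PR itself tracks the retry side with its own retryItemCnt / expiredRetryContainerCnt counters); this is not the actual ProcessingShard / ShardManager code.

    import java.util.concurrent.atomic.AtomicLong;

    // Illustration only - the real bookkeeping lives in ProcessingShard / ShardManager.
    class AvailableCountSketch {
        // containers in the shard that are not in flight and have not succeeded yet
        private final AtomicLong availableWorkContainerCnt = new AtomicLong();
        // retry containers whose retry delay has not elapsed yet
        private final AtomicLong pendingRetryContainerCnt = new AtomicLong();

        void onWorkAdded()              { availableWorkContainerCnt.incrementAndGet(); }
        void onWorkTakenForProcessing() { availableWorkContainerCnt.decrementAndGet(); }
        void onWorkFailedAndQueuedForRetry() {
            availableWorkContainerCnt.incrementAndGet();
            pendingRetryContainerCnt.incrementAndGet();
        }
        void onRetryDelayElapsed()      { pendingRetryContainerCnt.decrementAndGet(); }

        // actually available = all available containers - retry containers still waiting on their delay
        long getCountOfWorkAwaitingSelection() {
            return availableWorkContainerCnt.get() - pendingRetryContainerCnt.get();
        }
    }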

prev flow

[diagram: cache_improvement_prev.drawio]

current flow

[diagram: cache_improvement_now.drawio]

Possible Question

Q : Previously the expired retries were calculated on the fly, which looks more accurate than the new flow?
A : They will eventually be the same, since the controlLoop will wait for at most the latest retry time and then update the caches. In the previous flow, the available work container count was also only updated after the latest retry time.

Checklist

  • Documentation (if applicable)
  • Changelog

@rkolesnev
Contributor

rkolesnev commented Feb 9, 2024

There is a bit missing for tracking the counts in ProcessingShard.removeStaleWorkContainersFromShard() - I think it would need to remove the stale container from the retryQueue, then check and decrement expiredRetryContainerCnt and the main availableWorkContainerCnt.

@rkolesnev
Contributor

Hmm, the more I get my head around this, the more I think we should solve it with a two-fold approach.
Basically we have 3 conditions that make up that count - a container meets:
isNotInFlight() && !isUserFunctionSucceeded() && isDelayPassed()
We can directly track isNotInFlight and !isUserFunctionSucceeded through the counter - as there are specific events that trigger their change - taken as work, succeeded, failed, etc.
But the last one - isDelayPassed() - is time based, so we cannot really track its state from events - as there are no events - we really should just be checking it by observing during the call to getNumberOfWorkQueuedInShardsAwaitingSelection().

So what I am proposing is to keep the availableWorkContainerCnt and use it to track addition / removal of work containers from the shard, tracking isNotInFlight and !isUserFunctionSucceeded() - which it basically does now in this PR - but at the same time remove expiredRetryContainerCnt and retryItemCnt and still scan the retryQueue - at least up to the first container whose delay has not yet expired (as it's ordered by retry delay). A sketch of this shape follows below.
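
For illustration only, that two-fold shape could look roughly like this - a counter for the event-driven conditions plus a bounded scan of the sorted retry queue for the time-based one. RetryEntry is a hypothetical stand-in for WorkContainer, and this is not the actual ShardManager code:

    import java.time.Instant;
    import java.util.Comparator;
    import java.util.concurrent.ConcurrentSkipListSet;
    import java.util.concurrent.atomic.AtomicLong;

    class TwoFoldCountSketch {
        // hypothetical stand-in for WorkContainer - only the bit this sketch needs
        record RetryEntry(long id, Instant nextRetryDueAt) {}

        private static final Comparator<RetryEntry> BY_DUE_TIME =
                Comparator.comparing(RetryEntry::nextRetryDueAt).thenComparingLong(RetryEntry::id);

        // tracks isNotInFlight() && !isUserFunctionSucceeded() via add / take / fail events
        private final AtomicLong availableWorkContainerCnt = new AtomicLong();
        // ordered by retry due time (with an id tiebreaker), like the real retryQueue
        private final ConcurrentSkipListSet<RetryEntry> retryQueue = new ConcurrentSkipListSet<>(BY_DUE_TIME);

        long getNumberOfWorkQueuedInShardsAwaitingSelection(Instant now) {
            long dueForRetry = 0;
            for (RetryEntry entry : retryQueue) {
                if (entry.nextRetryDueAt().isAfter(now)) {
                    break; // ordered by due time: everything after this entry is not due either
                }
                dueForRetry++;
            }
            // entries still waiting on their retry delay are not selectable yet
            long notYetDue = retryQueue.size() - dueForRetry;
            return availableWorkContainerCnt.get() - notYetDue;
        }
    }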

@sangreal
Contributor Author

@rkolesnev Thanks a lot for your review. I think your solution is also a good approach, but it still has to scan through the retryQueue.

I totally understand your concern, but please take a look at the flow diagram above and the explanations below.
The updates in getNumberOfWorkQueuedInShardsAwaitingSelection are covered by the controlLoop, because it will only wait for at most the latest retry time (= first retryable time) and will then update all the caches.
In the previous flow, the available work container count was also only updated after the latest retry time.

https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/internal/AbstractParallelEoSStreamProcessor.java#L771

Let me give you an example.
Say it is now 00:00, there are two retry items due to be retried at 00:02 and 00:05, and there are no pending containers.

For the old flow:

  1. The count is checked during every poll in BrokerPollSystem, but it stays at 0 until 00:02, when it becomes 1; after that item is processed it goes back to 0. At 00:05 the second item is picked up and the count becomes 1 again.

For the new flow:

  1. During every poll in BrokerPollSystem, the count does not change, since the cache is unchanged.
  2. In the controlLoop, Duration timeToBlockFor = shouldTryCommitNow ? Duration.ZERO : getTimeToBlockFor(); therefore timeToBlockFor will be <= min(all retry container waiting times). So at 00:02 the controlLoop runs and updates the caches (there is no blocking call in this loop), and the count (expiredRetryContainerCnt -> 1) is updated to 1. The same applies to the second item: at 00:05 the controlLoop runs and updates the count to 1.

Conclusion

  1. As you can see, the update timing is the same between the previous and the new flow. A rough sketch of how the controlLoop bounds its wait follows below.
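
For illustration, here is a rough sketch of the timing argument - the control loop blocks for no longer than the time until the earliest retry becomes due, so the cache refresh happens no later than when the old flow would have reported the change. The method and names are illustrative, not the real getTimeToBlockFor():

    import java.time.Duration;
    import java.time.Instant;
    import java.util.SortedSet;
    import java.util.TreeSet;

    class BlockTimeSketch {
        static Duration timeToBlockFor(SortedSet<Instant> retryDueTimes, Instant now, Duration defaultBlock) {
            if (retryDueTimes.isEmpty()) {
                return defaultBlock; // nothing scheduled for retry - block for the normal duration
            }
            Duration untilEarliestRetry = Duration.between(now, retryDueTimes.first());
            if (untilEarliestRetry.isNegative()) {
                return Duration.ZERO; // a retry is already due - do not block at all
            }
            // never block past the earliest retry due time
            return untilEarliestRetry.compareTo(defaultBlock) < 0 ? untilEarliestRetry : defaultBlock;
        }

        public static void main(String[] args) {
            // the 00:00 / 00:02 / 00:05 example from above: the loop blocks at most until 00:02
            Instant now = Instant.parse("2024-01-01T00:00:00Z");
            SortedSet<Instant> dueTimes = new TreeSet<>();
            dueTimes.add(now.plusSeconds(120)); // retry due at 00:02
            dueTimes.add(now.plusSeconds(300)); // retry due at 00:05
            System.out.println(timeToBlockFor(dueTimes, now, Duration.ofSeconds(300))); // prints PT2M
        }
    }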

@rkolesnev
Contributor

rkolesnev commented Feb 12, 2024

@sangreal - I tested the calculations using integration tests and they match my explanation below; I can provide the test if needed. I modified an existing test in place to have a large retry queue that is drained slowly, and observed the getNumberOfWorkQueuedInShardsAwaitingSelection() count staying at 0 during processing of the retryQueue.

The problem with updating the available work that is in the retry queue and has its delay elapsed is that expiredRetryContainerCnt is only ever increased in getWorkIfAvailable() - and that only scans up to the number of work items requested (workToGetDelta):

         while (workTaken.size() < workToGetDelta && iterator.hasNext()) {
            var workContainer = iterator.next().getValue();

            if (pm.couldBeTakenAsWork(workContainer)) {
                if (workContainer.isAvailableToTakeAsWork()) {
                    log.trace("Taking {} as work", workContainer);

                    // only increase the ExpiredRetryContainerCnt when this is retry due since already added in the new container creation
                    if (workContainer.isDelayExistsExpired()) {
                        expiredRetryContainerCnt.incrementAndGet();
                    }
                    workContainer.onQueueingForExecution();
                    workTaken.add(workContainer);
                } else {
                ...

For example, if we have 1000 items in the retry queue, all with a delay of 5 seconds - once the delay has elapsed, all 1000 are available for work - but we would only update the available work count as we take items into processing; say with 16 processing threads, we would take only 16, and the rest would still be in the retry queue but not counted as available work.
On top of that, as we take them as work, they get marked as inFlight and are excluded from the available work count straight away.
So effectively they will never show up as available work.
The implication is that BrokerPoller will keep polling for more work instead of applying back-pressure and pausing consumers - which may lead to OOM / really deep queues etc. in extreme / edge cases.

It all goes back to the fact that the retry delay is time based - the only way to really know which / how many containers in the retry queue have their retry delay elapsed and are available as work is to scan the retry queue.

It does not have to be a full scan - as the retry queue is sorted by nextRetryDueAt, it is enough to scan up to the first container whose delay has not yet elapsed to get the count of available / not-available work in the retry queue.

In general I don't think scanning the retry queue will have big performance implications - if the retry queue is small, the scan is fast, and if the retry queue is large, then we are in a bad state anyway and probably not that concerned about the overhead introduced by scanning it, as processing is already slowed down by having a lot of messages to retry.

@sangreal
Contributor Author

@rkolesnev Thanks for the detailed explanation. I get your points.
I missed the workToGetDelta in getWorkIfAvailable, which will indeed lead to some containers' state updates being missed.
Let me work on updates based on your suggestions.

@sangreal
Contributor Author

@rkolesnev I have updated the PR according to your suggestions. Meanwhile, I have kept retryItemCnt, since this count is accurate and helps accelerate the expired-items calculation, while expiredRetryContainerCnt is removed.
Please review again.

@sangreal
Contributor Author

@rkolesnev Thanks for your detailed review. I have made fixes according to your comments, except for one. Please review again.

@sangreal
Contributor Author

@rkolesnev

// 2. the container has been selected and it is inflight but we already slashed them from availableWorkContainerCnt, so should be counted in
This is not correct - we do not count inflight containers as available for selection.
The intent of the getCountOfWorkAwaitingSelection is to get a number for all work that is ready to be processed - to determine if Consumer should poll for more messages - or there is already enough queued for processing - so that excludes work that is already being processed / inflight.

Regarding this, getCountOfWorkAwaitingSelection does indeed return the number of all work that is ready to be processed, and it excludes inflight messages in the updated logic. This is reflected in the unit tests.
I think this PR's goals are:
(1) ensure getCountOfWorkAwaitingSelection returns the correct number
(2) improve performance
(3) keep the code as concise and clean as possible.
But simply copying the old logic might not be the final goal. Looking forward to hearing your ideas on this one.

@rkolesnev
Contributor

rkolesnev commented Feb 14, 2024

@rkolesnev

// 2. the container has been selected and it is inflight but we already slashed them from availableWorkContainerCnt, so should be counted in
This is not correct - we do not count inflight containers as available for selection.
The intent of the getCountOfWorkAwaitingSelection is to get a number for all work that is ready to be processed - to determine if Consumer should poll for more messages - or there is already enough queued for processing - so that excludes work that is already being processed / inflight.

Regarding this, getCountOfWorkAwaitingSelection does indeed return the number of all work that is ready to be processed, and it excludes inflight messages in the updated logic. This is reflected in the unit tests. I think this PR's goals are (1) ensure getCountOfWorkAwaitingSelection returns the correct number (2) improve performance (3) keep the code as concise and clean as possible. But simply copying the old logic might not be the final goal. Looking forward to hearing your ideas on this one.

Sure - but we have to keep the logic uniform - we cannot exclude inflight work that was never retried but include inflight work that is being retried - that would just give a weird number / behaviour that differs based on whether the messages were retried or not.
The check would be done only on WorkContainers that are in the retry queue and are ready to be retried, to determine whether they are in fact still in the queue (thus available to be taken as work) or already in flight / being processed (thus not ready to be taken as work).

Let me have another look at the code - I am wondering whether availableWorkContainerCnt is already being decremented when we take work inflight - regardless of whether it is from the retry queue or not - so maybe we are already accounting for them that way...

@rkolesnev
Contributor

Let me have another look at the code - I am wondering whether availableWorkContainerCnt is already being decremented when we take work inflight - regardless of whether it is from the retry queue or not - so maybe we are already accounting for them that way...

Ok - yeah - that is already taken care of by decrementing availableWorkContainerCnt in the ProcessingShard.getWorkIfAvailable(...) call - when we take work into processing. So you are right - we do not need to exclude them again when counting items in retry queue.

@rkolesnev
Contributor

OK - so I am happy enough with it - thank you very much for going back and forth on this PR with me.
The only outstanding bit left is to take care of the retryQueue in ShardManager when removing stale work.

@sangreal
Contributor Author

OK - so I am happy enough with it - thank you very much for going back and forth on this PR with me. The only outstanding bit left is to take care of the retryQueue in ShardManager when removing stale work.

Thanks for taking the time to review! Let me adjust my previous PR's code related to stale-container handling and get back to you.

@sangreal
Contributor Author

@rkolesnev please check the updates on stale-container removal for the retryQueue - thanks a lot for your review again.

@rkolesnev rkolesnev self-requested a review February 22, 2024 13:07
@rkolesnev
Contributor

Hi @sangreal - the PR is ready to be merged - can you please sign the Contributor License Agreement (CLA)?
You can access it through license/cla status - Details.

@sangreal
Contributor Author

sangreal commented Feb 22, 2024

Hi @sangreal - the PR is ready to be merged - can you please sign the Contributor License Agreement (CLA)? You can access it through license/cla status - Details.

@rkolesnev I find it quite weird, since I already signed it last year when I first contributed. And when I try to check, I cannot sign again because it shows I have already signed. Please let me know if this is a blocker; if it is, I will revoke this one and try to sign again.

@rkolesnev rkolesnev merged commit 20f8b27 into confluentinc:master Feb 22, 2024
1 of 3 checks passed