
Proposal: parallelise processing of messages received by memberlist #564

Closed
pracucci opened this issue Dec 2, 2021 · 5 comments · Fixed by #1912

Comments

@pracucci
Collaborator

pracucci commented Dec 2, 2021

Problem
We're running large Cortex clusters (1000+ nodes) with the ring backed by memberlist, and we observe many logs like the following:

caller=memberlist_logger.go:74 level=warn msg="handler queue full, dropping message (8) from=REDACTED:7946"

The processing of received (user) messages is serialized and handled by KV.NotifyMsg(). The next received message is picked up from the queue only once NotifyMsg() returns for the current one.

Because of this, the approach so far has been to optimize NotifyMsg() as much as possible. This is definitely a good approach, but even after all the optimizations we've done so far we still can't keep up with processing all received messages. We can (and should) continue to optimize it, but the serialized processing will always impose a scalability limit because we can't use more than 1 CPU core to process messages.
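
For reference, received user messages reach the KV store through memberlist's Delegate interface, roughly as below (paraphrased from hashicorp/memberlist; see the library source for the authoritative definition). All user messages are delivered through NotifyMsg, and the handler queue fills up when NotifyMsg can't keep up.

```go
// Relevant part of memberlist's Delegate interface (paraphrased).
type Delegate interface {
	NodeMeta(limit int) []byte
	// NotifyMsg is called for every received user message; while it runs,
	// no other message from the handler queue is processed.
	NotifyMsg([]byte)
	GetBroadcasts(overhead, limit int) [][]byte
	LocalState(join bool) []byte
	MergeRemoteState(buf []byte, join bool)
}
```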

Proposal
To my understanding, we don't need to guarantee ordering when processing received messages in NotifyMsg(), because messages can be received in any order due to how propagation in memberlist works (even right now, with the large volume of dropped messages, we're effectively not processing messages in order anyway).

I propose to introduce parallelization in KV.NotifyMsg(): the idea is to put received messages into a queue (yes, another bounded queue) and have a pool of workers process them. With careful use of locks (held for the shortest time possible) we should be able to significantly increase the throughput of received message processing.
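
A minimal sketch of what this could look like (hypothetical names and structure, not the actual dskit code):

```go
package memberlistkv

// kv sketches the proposed design: NotifyMsg only enqueues into a bounded
// channel, and a fixed pool of workers drains it.
type kv struct {
	incoming chan []byte
	// ... existing KV state, locks, codecs, etc. ...
}

func newKV(queueSize, workers int) *kv {
	k := &kv{incoming: make(chan []byte, queueSize)}
	for i := 0; i < workers; i++ {
		go k.worker()
	}
	return k
}

// NotifyMsg is called by memberlist for every received user message.
func (k *kv) NotifyMsg(msg []byte) {
	// Copy first: memberlist may reuse the buffer once this call returns.
	m := make([]byte, len(msg))
	copy(m, msg)

	select {
	case k.incoming <- m:
	default:
		// Queue full: drop the message (and ideally bump a metric), so a slow
		// worker pool can never block memberlist's packet handler.
	}
}

func (k *kv) worker() {
	for msg := range k.incoming {
		k.mergeReceivedMessage(msg) // should hold the KV lock as briefly as possible
	}
}

func (k *kv) mergeReceivedMessage(msg []byte) {
	// Decode the key/value update and merge it into the local state (omitted here).
	_ = msg
}
```

The queue size and worker count would be configuration options; dropping on a full queue keeps the behaviour bounded, just as memberlist's own handler queue does today, but the bottleneck becomes the worker pool throughput rather than a single goroutine.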

@ortuman
Contributor

ortuman commented Jan 4, 2022

I've opened a draft PR @ dskit#110 to enhance message processing in KV memberlist by enabling parallelization. This solution makes use of a queue (buffered channel) in which messages are enqueued for later processing by a pool of workers. Additionally, we now must copy the message passed to NotifyMsg() before handing it off, as it can be altered after the call returns.

PTAL

@pstibrany
Member

There was discussion about this proposal on internal Slack (https://raintank-corp.slack.com/archives/C02L86H64TW/p1638435972297100), and the consensus was to go with one goroutine per KV key (and a separate buffered channel for each key). Updating a single key already requires a lock, so having multiple concurrent goroutines working on the same key would provide little benefit.
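
For illustration, a rough sketch of this per-key variant (hypothetical names, not the dskit implementation): one goroutine and one bounded channel per KV key, so updates to different keys are processed concurrently while updates to the same key stay serialized.

```go
package memberlistkv

import "sync"

type kv struct {
	mtx       sync.Mutex
	perKey    map[string]chan []byte // one bounded queue per KV key
	queueSize int
}

func newKV(queueSize int) *kv {
	return &kv{perKey: map[string]chan []byte{}, queueSize: queueSize}
}

// NotifyMsg is called by memberlist for every received user message.
func (k *kv) NotifyMsg(msg []byte) {
	// As above, copy the message before retaining it beyond this call.
	m := make([]byte, len(msg))
	copy(m, msg)

	key := keyOf(m) // decoding of the key from the message is omitted here

	k.mtx.Lock()
	ch, ok := k.perKey[key]
	if !ok {
		ch = make(chan []byte, k.queueSize)
		k.perKey[key] = ch
		go k.processKey(key, ch) // one long-lived goroutine per key
	}
	k.mtx.Unlock()

	select {
	case ch <- m:
	default:
		// This key's queue is full: drop the update.
	}
}

func (k *kv) processKey(key string, ch chan []byte) {
	for msg := range ch {
		k.mergeReceivedMessage(key, msg)
	}
}

func keyOf(msg []byte) string { _ = msg; return "" } // placeholder for decoding

func (k *kv) mergeReceivedMessage(key string, msg []byte) {
	// Merge the update for this key into the local state (omitted here).
	_, _ = key, msg
}
```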

@ortuman
Contributor

ortuman commented Jan 7, 2022

Thanks @pstibrany for pointing out this discussion to provide all the necessary context. I agree that parallelizing operations on the same key shouldn't make any difference, given that the Merge method forces serialization anyway. As internally concluded, going for the per-key goroutine + buffer solution seems more reasonable. Let me rework the draft PR and come back to you when done.

@logston

logston commented Apr 19, 2022

Hi All, any progress on this issue? We are running a 9 node cluster, so far from the 1000+ nodes you mention above. Still, we see these messages when we run queries that ask for several months of metrics. The VMs on which the pods are running are quite large (n2d-highmem-32), so we are usually left with 50% CPU and 80% memory headroom. Is the KV.NotifyMsg() logic in the same critical path as read queries? Or could queries affect the performance of the KV logic?

@pracucci
Collaborator Author

We are running a 9 node cluster, so far from the 1000+ nodes you mention above. Still, we see these messages when we run queries that ask for several months of metrics.

It's unexpected for such a small cluster. Could you share a snippet of such logs? What's the frequency at which they occur?

Is the KV.NotifyMsg() logic in the same critical path as read queries?

No it's not, but see below...

Or could queries affect the performance of the KV logic?

If CPU cores are exhausted (e.g. machine is running CPU cores at 100%) then all threads are affected, including the ring changes propagation logic.
