[8] "Flow Control" How to handle full load? #709

Open
dmth opened this issue Sep 23, 2016 · 11 comments

@dmth
Contributor

dmth commented Sep 23, 2016

On one of our machines we've tested how the system behaves under a large amount of data. We managed to get the system swapping, although the machine has a rather large amount of RAM (24 GB). That happened when approximately 4 million events were running through the pipe. A lot of events were stuck at the deduplicator.

How to deal with such situations?

One approach is to stop collection of new data when high load / critical load is detected.

Keywords: Performance, Stability

@aaronkaplan
Member

On 23 Sep 2016, at 15:17, Dustin Demuth notifications@github.com wrote:

On one of our machines we've tested how the system behaves under a large amount of data. We managed to get the system swapping, although the machine has a rather large amount of RAM (24 GB). That happened when approximately 4 million events were running through the pipe. A lot of events were stuck at the deduplicator.

How to deal with such situations?

Possible solutions:

  1. Improve processing speed and reduce bottlenecks in the whole pipeline. This allows for a smooth flow and less congestion.
  2. Implement "back-pressure": bots further downstream in a pipeline can tell upstream bots that they need to slow down.

It's on the list for 2.0

@sebix
Member

sebix commented Sep 23, 2016

One approach is to stop collection of new data when high load / critical load is detected.

Stopping collection is not enough. The parsers' activity is also critical, as it blows up the amount of data by at least a factor of 2 or 3. The experts' impact is only a factor of 1-1.5. (Only estimates, I haven't measured anything.)

  2. Implement "back-pressure": bots further downstream in a pipeline can tell upstream bots that they need to slow down.

The input pipelines (e.g. for parsers and experts) could stop serving new data if the load is too high. That's actually not so hard to implement and could also be done in 1.x.

@aaronkaplan
Member

On 23 Sep 2016, at 19:12, Sebastian notifications@github.com wrote:

One approach is to stop collection of new data when high load / critical load is detected.

Stopping collection is not enough. The parsers' activity is also critical, as it blows up the amount of data by at least a factor of 2 or 3. The experts' impact is only a factor of 1-1.5. (Only estimates, I haven't measured anything.)

  2. Implement "back-pressure": bots further downstream in a pipeline can tell upstream bots that they need to slow down.

The input pipelines (e.g. for parsers and experts) could stop serving new data if the load is too high. That's actually not so hard to implement and could also be done in 1.x.

Yes, maybe. But not in 1.0, please ;)



@sebix sebix added this to the v1.1 Feature release milestone Sep 24, 2016
@dmth
Contributor Author

dmth commented Sep 26, 2016

Stopping collection is not enough. The parsers' activity is also critical, as it blows up the amount of data by at least a factor of 2 or 3. The experts' impact is only a factor of 1-1.5. (Only estimates, I haven't measured anything.)

Thank you for this very valuable piece of information. So the parsers should also slow down and only forward data when a sufficient amount of space is available. Somehow the parsers need to buffer the data...

implement "back-pressure": bots in a pipeline further downstream can tell upstream that they need to slow down.

Should bots be capable of talking to each other? Or should this be done by a daemon which watches over all bots?

@sebix
Member

sebix commented Sep 26, 2016

Should bots be capable of talking to each other? Or should this be done by a daemon which watches over all bots?

A daemon could control the flow by throttling, either via the receive_message function or directly in Redis: we would have a transparent queue between the previous bot's output and the next bot's input queue, and messages are pushed from one to the other by the daemon.
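
A minimal sketch of what such a daemon could look like, assuming the queues are plain Redis lists; the queue names and the limit below are made-up placeholders, not the real pipeline configuration:

    import time

    import redis

    # All names and numbers below are placeholders for illustration.
    SOURCE_QUEUE = "parser-output-buffer"   # the transparent buffer queue
    DEST_QUEUE = "deduplicator-queue"       # the next bot's input queue
    MEMORY_LIMIT = 4 * 1024 ** 3            # bytes of used_memory we tolerate

    r = redis.Redis()

    while True:
        used = r.info("memory")["used_memory"]
        if used < MEMORY_LIMIT:
            # Move one message from the buffer to the bot's input queue;
            # blocks for up to 1 second if the buffer is empty.
            r.brpoplpush(SOURCE_QUEUE, DEST_QUEUE, timeout=1)
        else:
            # Redis is too full: stop feeding the downstream bot for a while.
            time.sleep(5)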

@aaronkaplan
Member

On 26.09.2016, at 19:08, Sebastian notifications@github.com wrote:

Should bots be capable of talking to each other? Or should this be done by a daemon which watches over all bots?

My gut feeling is that you are now coming up with a solution to a problem which has already been solved multiple times.

In general, there are existing solutions out there for this.
One of the beautiful aspects of IntelMQ is that in the original architecture we did NOT try to re-invent the wheel and implement everything ourselves, but rather built on top of good, solid frameworks such as Redis.

For flow control in message queuing systems I am sure there are other such options available.

ActiveMQ seems to have it.
Flow control in TCP is considered a solved issue.

What I am trying to say is: let's take a step back, gather some user requirements for phase 2 and then select good frameworks.
And if in the end the answer is that we stay with Redis and implement it ourselves, then that's also good. But at least we will have thought about our requirements first.

My 2 cents.




@sebix
Member

sebix commented Sep 26, 2016

Yes, I agree @aaronkaplan

But to be more realistic: we can't switch to another message queue in the coming months after the release, but we need such a feature - not necessarily very mature, but working - to put our setups into production.

@aaronkaplan
Member

On 26 Sep 2016, at 21:53, Sebastian notifications@github.com wrote:

Yes, I agree @aaronkaplan

But to be more realistic: we can't switch to another message queue in the coming months after the release,

Sure, that's why I am collecting input for 2.0.

but we need such a feature - not necessarily very mature, but working - to put our setups into production.

There are a couple of intermediate steps I can think of:

  1. Visualise the processing rate (success rate / error rate) of each bot on the monitor page. This clearly shows where the issue lies.
  2. s/rate_limit/polling_interval/ in the bots and then implement a real rate_limit (say: produce at most x rows per second).
  3. Improve the processing speed of the bottleneck bots at the end of the pipeline (experts etc.) so that they are always faster than the parsers:
    a) deduplicator
    b) move away from network-lookup-based bots such as Team Cymru to offline-only lookups (such as pyasn)

In my experience operating our instance, this is the currently working solution.

For a general "let's implement backpressure" approach I recommend we first look at who has already implemented it in which framework.
We are still flexible to choose whatever pipeline / MQ framework we want.

@sebix
Member

sebix commented Sep 26, 2016

As this issue is a very distinct problem statement, let's discuss short- and intermediate-term solutions here, and discuss long-term solutions (i.e. other message queuing systems) later/elsewhere?

  1. Visualise the processing rate (success rate / error rate) of each bot on the monitor page. This clearly shows where the issue lies.

#361

  2. s/rate_limit/polling_interval/ in the bots and then implement a real rate_limit (say: produce at most x rows per second).

#464

  3. Improve the processing speed of the bottleneck bots at the end of the pipeline (experts etc.) so that they are always faster than the parsers:

#613 #373 #353

Also related to this issue: #253 #186 #164

@aaronkaplan
Member

On 26 Sep 2016, at 22:31, Sebastian notifications@github.com wrote:

As this issue is a very distinct problem statement, let's discuss short- and intermediate-term solutions here, and discuss long-term solutions (i.e. other message queuing systems) later/elsewhere?

So there you go :)
That's pretty much what I said in private mails as well: let's plan and work on a 2.0 release document off the issue tracker, shall we?
And yes, of course, intermediate steps are useful.




@aaronkaplan aaronkaplan changed the title "Flow Control" How to handle full load? [8] "Flow Control" How to handle full load? Oct 5, 2016
@bernhard-herzog
Contributor

I don't think we need something complicated to start with. The main goal
should be to avoid having too much data in Redis at the same time. I
think we should be able to achieve something useful in practice without
much infrastructure or even coordination between bots, at least for bots
that repeatedly fetch largish reports and can cope with pausing for a
while if the system load gets too high.

Plan to Limit the Memory Consumption of the Redis DB

Goal: Keep the size of the Redis database under some chosen limit.
Let's call this the memory_limit.

Now, choose another limit, the collection_limit, less than
memory_limit such that:

  • Redis stays below memory_limit if each collector bot adds a report
    when Redis is below the collection_limit and its output queue is
    empty
  • When all bots are running and there's a good number of events in the
    system, Redis is still below the collection_limit.

Based on that collection_limit we should be able to implement a simple
mechanism that keeps Redis below the memory_limit in most cases. How
this may work in detail is explained below (and is, I think, pretty
obvious from the conditions on collection_limit), but it basically boils
down to what was already suggested in some form in earlier comments: try
not to add new data to the system when it's overloaded.

Collector Bot Changes

The INFO command of the Redis database gives us two useful
memory-related values, used_memory and used_memory_rss. The former is
the number of allocated bytes and the latter the number of bytes in the
memory pages currently used by the Redis process. Both could be useful
for our purpose, but for simplicity we should use used_memory for now.
The Python client library we use makes it easy to get at this value
because the info method of the redis client object helpfully returns a
dictionary.
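
For illustration, reading that value with the redis Python library could look like this:

    import redis

    r = redis.Redis()           # connection details as configured for the pipeline
    info = r.info("memory")     # the INFO memory section as a dictionary
    used_memory = info["used_memory"]          # allocated bytes
    used_memory_rss = info["used_memory_rss"]  # bytes in pages used by the process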

The collector bot is modified so that it only fetches a new report if
two conditions are met:

  1. the output queue is empty
  2. used_memory < collection_limit

If the conditions are not met, the bot pauses for a while before trying
again. The duration of the pause should be random to reduce the
likelihood of bots fetching reports at the same time.

Note that the bots do not need to coordinate and there's no need for
(extra) buffering since the collection_limit is chosen such that there
should be enough space left if the conditions are met.
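
A rough sketch of that check in a collector's main loop; the limit, the
queue name, and the pause bounds are example values only, and the actual
fetching is left as a placeholder:

    import random
    import time

    import redis

    COLLECTION_LIMIT = 8 * 1024 ** 3   # example value only, see the estimate below

    r = redis.Redis()


    def may_collect(output_queue: str) -> bool:
        # Condition 1: the parser has drained our output queue.
        if r.llen(output_queue) > 0:
            return False
        # Condition 2: Redis is below the collection limit.
        return r.info("memory")["used_memory"] < COLLECTION_LIMIT


    def fetch_report(output_queue: str) -> None:
        """Placeholder for the bot's real collection and enqueueing logic."""


    def collector_loop(output_queue: str) -> None:
        while True:
            if may_collect(output_queue):
                fetch_report(output_queue)
            else:
                # Randomized pause so that collectors do not all retry at once.
                time.sleep(random.uniform(30, 120))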

Estimating the Memory Requirements

To choose the collection_limit we need to get an idea about the memory
requirements. In particular, we probably need:

  • The size of the reports collected by the collector bots.
  • How much additional memory will be allocated by Redis to process
    these reports.
  • Non-event data stored by bots.
    In particular, the deduplicator bot stores a hash of every event it
    sees in the Redis database for 24 hours. The requirements of the
    deduplicator should be roughly proportional to the number of events
    the system processes in a day.
  • The memory requirements for a single event in the system.
    This is not just the size of the JSON serialization, but also e.g.
    the overhead needed by Redis.
  • The number of events the system should be able to handle
    simultaneously.

Some rough notes on the relationships between these numbers:

In the worst case, all collector bots fetch new reports at the same time
and add them to the database, so the difference between memory_limit
and collection_limit should be greater than the sum of all the report
sizes + the additional memory needed for processing.

The product of the last two numbers, event size and number of events,
puts a lower bound on a reasonable value for collection_limit.

The actual lower bound is much less than this, obviously, because the
system would still work even if the collectors only fetched new reports
when all queues are empty. The real lower bound for collection_limit can
perhaps be determined by starting all bots, looking at used_memory, and
adding some safety margin to that, taking other memory requirements,
such as that of the deduplicator, into account.

There are also some correlations between the numbers. E.g. the total
number of events in the system depends on how many feeds are processed
at the same time and how big the reports are.

Measuring memory requirements

In order to get a feel for some of these numbers, particularly the
memory requirement for a single event and the deduplicator overhead,
I've run a little experiment:

  • Stop all bots
  • Prepare input for file-input bot: Put file into the right location
  • For each bot in the network of bots, starting with the file-input
    bot, followed by the parser bot, and so on in the order in which the
    events flow through the system:
    • start the bot
    • wait until it has finished processing its input and written all
      output
    • stop the bot
    • measure Redis memory consumption with

      redis-cli INFO memory | grep used_memory

Results (allocated bytes after the bot has finished):

The input file was a shadowserver Open-Portmapper report, 39562549
bytes, 138126 lines.

Bot                 input queue length   used_memory
(initial)           -                    819304
file-input          -                    67928520
parser              4¹                   244435192
deduplicator        138125               266266840
modify-expert       138125               266266856
contact-bot         138125               291150040
postgresql-output   138125               22906168

¹ The file-input bot uses chunking with a chunk size of 10000000 bytes,
hence the 4 items in the parser's input queue.

Conclusions:

  • The 38M of the input file led to an increase in allocated memory of
    64M (≅ 1.7 * 38M)
  • The deduplicator increases allocated memory by ~ 160 bytes / event
    ((266266840 - 244435192) / 138125 ≅ 158.06)
  • Immediately after parsing, each event consumes roughly 1800 bytes
    ((244435192 - 819304) / 138125 ≅ 1764)
  • Immediately before the events leave the system with the
    postgresql-output bot, each event consumes almost 2K bytes, not
    counting the 160 bytes added to the DB by the deduplicator
    ((291150040 - 819304) / 138125 - 158 ≅ 1944)
  • The difference in used_memory between the initial value and the final
    value after the postgresql-output bot has finished is practically
    entirely due to the deduplicator bot
    ((22906168 - 819304) / 138125 ≅ 160)
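
As a back-of-the-envelope illustration only (not a recommendation):
plugging the roughly 2K bytes per event and 160 bytes of deduplicator
state measured above into the ~4 million in-flight events from the
original report gives a ballpark lower bound for collection_limit:

    # Illustration only; inputs are the rough figures measured above and the
    # ~4 million in-flight events mentioned at the top of this issue.
    bytes_per_event = 2000          # ~2K bytes per event before the output bot
    dedup_per_event = 160           # deduplicator hash kept for 24 hours
    events_in_flight = 4_000_000

    lower_bound = events_in_flight * (bytes_per_event + dedup_per_event)
    print(lower_bound / 1024 ** 3)  # ~8.0 GiB as a rough lower bound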

Caveats:

  • This is a very simple setup with few expert bots, in particular, few
    experts that add information or add new cached information to the
    Redis DB.
  • It's only one feed. Other feeds may lead to larger events output by
    the parser bot.

Problems

  • The approach will not work for collector bots that have to keep up
    with a steady stream of events.
  • Even bots that can pause may have some upper bound on the pause
    duration. E.g. if a report has to be downloaded at least once per
    day, the bot should not pause for longer than that, so it may
    occasionally have to fetch a report even if the conditions are not
    really met. So perhaps there should be a configuration parameter
    similar to rate_limit that puts an upper bound on the time between
    two fetch attempts; see the sketch after this list.
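
A possible shape for such an override, extending the collector sketch
above (the parameter name and default are made up):

    import time

    MAX_FETCH_INTERVAL = 24 * 3600   # hypothetical parameter: fetch at least once a day
    last_fetch = 0.0                 # timestamp of the last successful fetch


    def must_fetch_anyway() -> bool:
        # Ignore the memory check if we have not fetched for too long, so that
        # e.g. a daily report is never skipped entirely.
        return time.time() - last_fetch > MAX_FETCH_INTERVAL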

However, even if not all collector bots can use the solution outlined
here, it should still improve the situation by reducing the likelihood
of too high memory usage.

@ghost ghost modified the milestones: 1.1.0, 2.0.0 Jun 28, 2018
@ghost ghost modified the milestones: 2.0.0, 2.1.0 Apr 10, 2019
@ghost ghost modified the milestones: 2.1.0, 2.2.0 Oct 25, 2019
@ghost ghost removed this from the 2.2.0 milestone Jun 17, 2020