Tuning servers for high concurrency workloads
Non-blocking Mojolicious applications have the potential to scale to thousands of concurrent connections.
However, most operating systems are not configured to handle that sort of work out-of-the-box. This page details some of the tweaks your server will likely need before it can successfully handle very high concurrent loads. Note that if any part of your application blocks, much of this advice is premature optimization or even irrelevant.
These tuning parameters are not a substitute for a sane load balancing scheme: they will help get the most out of a single server, but any single machine can and will fall over eventually. One benefit of Mojolicious/hypnotoad is that the multi-process model encourages an application architecture that can be easily scaled across servers (horizontally), and any serious application deployment should be doing exactly that.
An open connection on a *NIX server is represented by a file descriptor. Most operating systems impose both a system-wide limit and a per-process limit on the maximum number of simultaneously open file descriptors, and the per-process default is often quite low (1024 on Linux/Ubuntu 12.04). As the number of connections grows, a server can quickly hit either limit and begin refusing connections.
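Before changing anything, it is worth checking where you currently stand. On Linux, for example:
$ cat /proc/sys/fs/file-max   # system-wide limit on open file descriptors
$ cat /proc/sys/fs/file-nr    # descriptors currently allocated / free / maximum
$ ulimit -n                   # per-process limit for the current shell and its children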
Mojo can use libev as an event loop provider if it is available. This is the strongest option for applications that expect large numbers of concurrent connections, as the other Mojo::Reactor provider is a pure-Perl interface to the poll system call. libev itself supports several I/O event notification backends (poll and select, as well as OS-specific ones like epoll and kqueue), and the LIBEV_FLAGS environment variable allows you to specify which backend libev should prefer.
Known values (from gevent documentation):
LIBEV_FLAGS=1 # select backend
LIBEV_FLAGS=2 # poll backend
LIBEV_FLAGS=4 # epoll backend -- preferred choice for linux servers
LIBEV_FLAGS=8 # kqueue backend -- preferred choice for BSD (including OSX) servers
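Note that Mojo::Reactor::EV is only used when the EV module (the Perl bindings that bundle libev) is installed. A quick way to install it and confirm which reactor your setup will pick, assuming you use cpanminus:
$ cpanm EV
$ perl -MMojo::IOLoop -E 'say ref(Mojo::IOLoop->singleton->reactor)'   # Mojo::Reactor::EV if EV is available, Mojo::Reactor::Poll otherwise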
In addition to system limits on open file descriptors, hypnotoad has a clients config setting, which defaults to 1000 per worker process. Internally, hypnotoad (via Mojo::Server::Prefork and Mojo::Server::Daemon) uses Mojo::IOLoop->max_connections to specify the maximum number of connections a single worker can handle. While the default setting is reasonably generous, if your architecture depends on a small number of powerful servers, this may become an issue.
Hypnotoad is a preforking server: a supervisor process spawns a set of worker processes, and then delegates work to them as connections arrive. The number of workers that hypnotoad spawns is configurable, and should be adapted to the number of CPUs on your server (hypnotoad docs) for best results.
These two settings are strongly interconnected: a big change in one value without adjusting the other is likely to lead to surprises sooner or later.
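As a sketch, both knobs can be set together in the hypnotoad section of your application configuration (assuming your app loads Mojolicious::Plugin::Config; the filename and values below are illustrative, not recommendations):
# myapp.conf -- read by Mojolicious::Plugin::Config; hypnotoad uses the 'hypnotoad' key
{
  hypnotoad => {
    listen  => ['http://*:8080'],
    workers => 8,      # roughly match the number of CPU cores
    clients => 1000,   # maximum concurrent connections per worker (the default)
  },
};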
To raise the system-wide limit, edit the sysctl configuration:
$ sudo vi /etc/sysctl.conf
Look for the fs.file-max line and raise it to something you're comfortable with. The exact value will depend on server resources and expected workload: set it too high and processes (like your Mojo app) might get killed by the OOM killer, but it is generally safe to use something fairly large (on the order of tens of thousands).
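For example, to allow 65535 open file descriptors system-wide, the line would read:
fs.file-max = 65535
Then apply and verify the change without rebooting:
$ sudo sysctl -p        # reload /etc/sysctl.conf
$ sysctl fs.file-max    # confirm the new value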
To raise the per-process limit, edit the security limits configuration:
$ sudo vi /etc/security/limits.conf
and add:
* soft nofile 65535
* hard nofile 65535
where * can also be a specific user (such as www-data).
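These limits are typically applied by pam_limits at login, so start a new session before relying on them, and verify from the fresh shell:
$ ulimit -n    # should now report 65535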
Then start your application, telling libev to prefer epoll:
$ cd <your app home directory>
$ LIBEV_FLAGS=4 hypnotoad script/<your app> # tell libev to use epoll
Deployments vary, but the important part is setting the LIBEV_FLAGS environment variable to 4 (for epoll).
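If you want to confirm which backend libev actually selected, the EV module can report it (EV::backend returns the numeric flag of the backend in use, so 4 corresponds to epoll on a Linux machine):
$ LIBEV_FLAGS=4 perl -MEV -E 'say EV::backend'   # prints 4 when epoll is in use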
There are a great many articles on Linux network tuning that go into greater detail.
For starters:
- http://www.nateware.com/linux-network-tuning-for-2013.html#.U2PcKa1dVR4
- http://urbanairship.com/blog/2010/09/29/linux-kernel-tuning-for-c500k/
Here's an example of running a one-off Mojolicious app, on an OS X/BSD machine (note the kern.* sysctls and the kqueue backend), with appropriate settings for 10k concurrent websocket connections.
$ sudo sysctl -w kern.maxfiles=40960 kern.maxfilesperproc=20480 # raise the kernel fd limits
$ ulimit -n 20480 # tell bash to allow this shell/its children up to 20480 open fds
$ LIBEV_FLAGS=8 perl c10k.pl # tell libev to use the kqueue backend
Source: sri's c10k websocket example. This gist also demonstrates raising Mojo::IOLoop's max_connections to accommodate a high benchmarking load.
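The gist itself is not reproduced here, but a minimal sketch in the same spirit (a hypothetical Mojolicious::Lite websocket echo server, not sri's original code) might look like this:
#!/usr/bin/env perl
# c10k.pl -- hypothetical sketch of a websocket echo app with a raised connection limit
use Mojolicious::Lite;
use Mojo::IOLoop;

# Mojo::IOLoop defaults to 1000 concurrent connections; raise it to match the
# ulimit above (depending on your Mojolicious version, the built-in daemon may
# apply its own client limit as well, so adjust to taste)
Mojo::IOLoop->singleton->max_connections(20480);

websocket '/echo' => sub {
  my $c = shift;
  $c->inactivity_timeout(300);
  $c->on(message => sub {
    my ($c, $msg) = @_;
    $c->send($msg);    # echo each message back to the client
  });
};

app->start;
Run it single-process with the built-in daemon command, for example LIBEV_FLAGS=8 perl c10k.pl daemon -l 'http://*:3000'.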