Administering TaskBotJS
So my day job is not application development. I build everything from Ruby web frameworks to video mixer control applications, but what pays the bills is devops consulting. That day job isn't the only reason I've tried to make TaskBotJS as stump-dumb simple for an administrator to handle as I could, but...it's a big part of it.
tl;dr: whatever's the newest that AWS ElastiCache supports, I support
As of this writing (`0.1.0-wip`), TaskBotJS works with almost any moderately recent version of Redis. The set of commands used is small, we currently aren't using any Lua scripting, and so on--a large part of TaskBotJS was developed and built against Redis 2.8.20, released on 4 June 2015, and it has since been bumped to 3.2.10, released in July 2017. I would expect, though I have not exhaustively tested (an integration test matrix is planned for the 1.0 release, or not long after it), that TaskBotJS will play nicely with any future Redis version so long as backwards compatibility is maintained--and antirez, he of the Redis wizardry, seems very into maintaining compatibility.
One guarantee I can make is that, until I declare otherwise, TaskBotJS will always run on the most recent version of AWS ElastiCache (3.2.10 at time of writing). It's much harder to determine what versions of Redis the managed Redis-alikes (or even fully managed Redis services--sometimes it's hard to tell!) such as Google Cloud Memorystore or Azure Redis Cache actually run, but I consider those to be first-class citizens as well. TaskBotJS will consistently run on those platforms, too.
tl;dr: use systemd or runit or supervisord or whatever, but use something
TaskBotJS tries to be well-behaved with regard to running as a service. It doesn't daemonize itself, expecting instead to be run under a process handler. As a CentOS user by habit, I use systemd, but it's not exactly rocket science and whatever you want to use should be fine.
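If systemd is your process handler too, a unit along these lines is all it takes. Treat this as a sketch rather than a blessed configuration: the unit name, user, install path, and `ExecStart` command are placeholders that depend entirely on how you package and launch your service.

```ini
[Unit]
Description=TaskBotJS worker service
After=network.target redis.service

[Service]
Type=simple
User=taskbot
WorkingDirectory=/opt/myapp
# Placeholder: however your application actually starts its TaskBotJS service.
ExecStart=/usr/bin/node /opt/myapp/dist/worker.js
Restart=on-failure
# systemd sends SIGTERM on stop; give in-flight jobs a moment to be put back.
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target
```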
tl;dr: JSON logs in production unless you tell Bunyan otherwise
TaskBotJS standardizes on the awesome Bunyan logging library. (If you use Winston or something else, no worries--there exist converters to sync everything up.) A big part of why I like Bunyan, and why IMO you should too, is that it goes hard for structured logging: every log entry comes out as a single JSON object. I apologize if you're already familiar with the Good News and understand why it's most cromulent that we do our logs with Bunyan, but for everyone else, the big win is that it's really easy to add context to your log entries. And that context makes debugging a whole lot easier on you later. Consider this line:
{"name":"consumer","hostname":"bigboss","pid":31800,"component":"ArgJob","jobId":"crash-anarchical-531252","level":30,"arg":25,"msg":"I have an arg: 25","time":"2018-05-03T06:02:13.416Z","v":0}
Here we've got the hostname/pid of the running service to answer the age-old question "where the heck is this coming from?", the `component` field against which your friendly aggregation system can filter down to `ArgJob`, and our job ID inline--everything that's useful to you. And, because it's Bunyan, custom fields are trivial to add, like the `arg` field attached to this entry. Bunyan also supports child loggers, which make it easy to pass a logger into a method and retain the context that belongs to the current flow of execution.
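As a sketch of how Bunyan's child loggers and custom fields produce an entry shaped like the one above--the logger name, component, and job ID here just mirror that example and aren't taken from TaskBotJS's own code:

```typescript
import * as bunyan from "bunyan";

// Root logger for the service; Bunyan writes each entry as one JSON object.
const log = bunyan.createLogger({ name: "consumer" });

// A child logger stamps its extra context onto every entry it emits.
const jobLog = log.child({ component: "ArgJob", jobId: "crash-anarchical-531252" });

// Custom fields ride along in the first argument; the message string follows.
jobLog.info({ arg: 25 }, "I have an arg: 25");
```

By default Bunyan writes that JSON to stdout, which is exactly what your log aggregation system wants to slurp up.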
TaskBotJS catches SIGTERM and SIGINT and treats them both effectively the same. The server process waits for the intake and worker loops to stop (thus halting the fetching and queueing of new jobs), then puts back any jobs that are currently in progress or are waiting to be launched.
TaskBotJS Enterprise includes a graceful shutdown option.
tl;dr: the defaults are usually okay, but check your metrics
By default, a newly created `Config` object in TaskBotJS has a `concurrency` value of 20. This means that TaskBotJS will attempt to keep a maximum of about 20 jobs in flight at any given time. (This is not an iron-clad limit; occasionally you may see 21 jobs with a concurrency of 20, and that's intentional.) That default makes the most sense for heavily IO-bound jobs or ones that offload a lot of work to C++ extensions--after all, we're still in a NodeJS process, and not blocking the main thread of execution is kind of important. However, there certainly exist programmatically complex, high-CPU jobs headed for your workers; in those situations, ramping concurrency down to something lower, such as 1 to 1.5 jobs per execution thread, will reduce thrashing and improve throughput.
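The exact bootstrapping is up to your application, so treat the following as a sketch: the import path and the way the `Config` object is constructed and handed to the service are assumptions, but it shows the shape of dialing `concurrency` down for CPU-heavy workloads.

```typescript
import { Config } from "taskbotjs"; // placeholder import path, for illustration only

// Assumed construction: concurrency defaults to 20, which suits IO-bound jobs.
const config = new Config();

// CPU-heavy workload on a 4-thread box: keep roughly 1 to 1.5 jobs per execution thread.
config.concurrency = 6;
```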
If you're using TaskBotJS Pro, per-worker metrics can help you tease out how your configuration is handling your workload.