litemetric guide
Litemetric, a part of Litestack, is a simple, generic, low-overhead telemetry tool that collects runtime data about Ruby and Rails applications.
Like the other Litestack components, Litemetric is built on top of SQLite. It uses the embedded database engine to store and query telemetry data. As a result, Litemetric needs very little maintenance: there is no service to set up, maintain, or monitor aside from the application that integrates Litemetric.
Litemetric takes a deliberately simple approach: it tries to provide APIs that are easy to use and cover most of the common cases for event acquisition and measurement. It does not attempt to be an elaborate performance monitoring system. Still, it can be sufficient for many applications, and zero administrative overhead is a plus!
Litestack components (e.g. Litejob, Litecache, Litecable) can optionally use Litemetric to report on usage and performance.
- Capture single/multishot events
- Measure single/multishot events
- Snapshot information capturing
- In memory aggregation
- Background aggregator
- Background garbage collector
- Thread safety
- Async/Fiber Scheduler integration
- Graceful shutdown
- Fork resilience
- Polyphony integration
- Web interface
To collect metrics for a class, include the Litemetric::Measurable module in it. You can then set a unique identifier for the class by overriding the #metrics_identifier method.
Events can then be captured and measured wherever required in the object's methods.
# note that we only need to require litestack
# you could still do require 'litestack/litemetric'
require 'litestack'

class ImportantClass
  include Litemetric::Measurable

  # override the default identifier
  def metrics_identifier
    self.class.name
  end

  # the captured action will only be counted
  # the database will have a count of times the event was captured
  def simple
    # do something
    capture("simple")
  end

  # the measured action will also capture the runtime of the action
  # the database will have a count of times the event was measured and the total time measured
  def complex
    measure("complex") do
      # do something
    end
  end
end
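To make the difference between capturing and measuring concrete, here is a minimal in-memory sketch of the bookkeeping described above: a count per event, plus a running total of elapsed time for measured events. SketchMeasurable, SketchWorker, STORE and LOCK are hypothetical names invented for this illustration; this is not Litemetric's implementation, and nothing here is written to SQLite.

```ruby
# Hypothetical stand-in for Litemetric::Measurable (illustration only).
module SketchMeasurable
  # In-memory store: { [identifier, event] => { count:, value: } }
  STORE = Hash.new { |hash, key| hash[key] = { count: 0, value: 0.0 } }
  LOCK = Mutex.new

  def metrics_identifier
    self.class.name
  end

  # Count one occurrence of the event, optionally adding a value to its total.
  def capture(event, value = 0.0)
    LOCK.synchronize do
      entry = STORE[[metrics_identifier, event]]
      entry[:count] += 1
      entry[:value] += value
    end
  end

  # Run the block, then record one occurrence plus the elapsed seconds.
  def measure(event)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    capture(event, elapsed)
    result
  end
end

class SketchWorker
  include SketchMeasurable

  def simple
    capture("simple")
  end

  def complex
    measure("complex") { sleep 0.01 }
  end
end

worker = SketchWorker.new
3.times { worker.simple }
worker.complex

# "simple" has only a count; "complex" has a count and a total runtime.
puts SketchMeasurable::STORE.inspect
```

A monotonic clock is used for the timing so that wall-clock adjustments cannot produce negative durations.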
Sometimes an action needs to be reported under more than one event name, for example when you need to report the job insertion rate for each named queue and for all queues at once. Litemetric provides a simple way to achieve this:
# capture multiple events in one shot
def enqueue(queue_name, job)
  # do the action
  capture(["enqueue-all", "enqueue-#{queue_name}"])
end

# also with measurement
def perform(queue_name, job)
  measure(["perform-all", "perform-#{queue_name}"]) do
    # do the action
  end
end
The above results in two entries being captured/measured, one for the specific queue and one that aggregates over all queues.
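The fan-out can be pictured with a small in-memory counter that accepts either one event name or an array of names. SketchCounter and SKETCH_QUEUE_COUNTER are hypothetical names used only for this sketch; it shows the counting behaviour described above, not how Litemetric stores the data.

```ruby
# Hypothetical in-memory counter (illustration only, not Litemetric's code).
class SketchCounter
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # Accepts a single event name or an array of names;
  # every name receives one count per call.
  def capture(events)
    Array(events).each { |event| @counts[event] += 1 }
  end
end

SKETCH_QUEUE_COUNTER = SketchCounter.new

# Three enqueues: two on the "default" queue, one on "urgent".
%w[default urgent default].each do |queue_name|
  SKETCH_QUEUE_COUNTER.capture(["enqueue-all", "enqueue-#{queue_name}"])
end

puts SKETCH_QUEUE_COUNTER.counts.inspect
```

The "enqueue-all" entry ends up with the sum of the per-queue entries, which is what makes the combined rate available without a separate aggregation step.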
Litemetric looks for a litemetric.yml file in its working directory. The syntax and defaults for the file are as follows:
path: '/queue.db' # where the database file resides
queues:
- [default, 1] # default queue with the lowest priority
- [urgent, 10, spawn] # this is not a default, a higher priority queue which will run every job in its own thread or fiber
workers: 5 # how many threads/fibers to spawn for queue processing
retries: 5 # how many times to retry a failed job before giving up
retry_delay: 60 # seconds
retry_delay_multiplier: 10 # 60 -> 600 -> 6000 and so on
dead_job_retention: 864000 # 10 days to keep completely failed jobs in the _dead queue
gc_sleep_interval: 7200 # 2 hours of sleep between checking for dead jobs that are ready to be buried forever
logger: STDOUT # possible values are STDOUT, STDERR, NULL or a file location
The db path should preferably be outside of your application folder, to prevent the database from being overwritten accidentally during deployment.
Litestack components do not report to Litemetric by default. To turn reporting on for a component, set the metrics option in that component's configuration file:
metrics: true # default is false
Litemetric currently lacks a UI (one is being worked on). Until it is available, all collected metrics can be read directly from the Litemetric database (metrics.db by default). The data is stored in two tables: an events table, which holds the data for the last 24 hours aggregated by the hour, and an events_summary table, which holds the data for the last year aggregated by the day. Their schema is as follows (simplified):
CREATE TABLE events(id, name, count, value, created_at, PRIMARY KEY(id, name, created_at)) WITHOUT ROWID;
CREATE TABLE events_summary(id, name, count, value, created_at, PRIMARY KEY(id, name, created_at)) WITHOUT ROWID;
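As a rough illustration of what "aggregated by the hour" means for rows keyed by (id, name, created_at), the sketch below folds raw events into one row per hour. It assumes that created_at is a Unix timestamp rounded down to the start of the hour, which the simplified schema does not confirm; sketch_hour_bucket, sketch_aggregate and the sample data are hypothetical.

```ruby
SECONDS_PER_HOUR = 3600

# Round a Unix timestamp down to the start of its hour (assumed bucketing).
def sketch_hour_bucket(unix_time)
  unix_time - (unix_time % SECONDS_PER_HOUR)
end

# Fold raw events into rows keyed like the primary key (id, name, created_at).
def sketch_aggregate(raw_events)
  rows = Hash.new { |hash, key| hash[key] = { count: 0, value: 0.0 } }
  raw_events.each do |event|
    key = [event[:id], event[:name], sketch_hour_bucket(event[:time])]
    rows[key][:count] += 1
    rows[key][:value] += event[:value]
  end
  rows
end

# Made-up sample data: two events in one hour, one event in the next.
SKETCH_ROWS = sketch_aggregate([
  { id: "ImportantClass", name: "complex", time: 7200,  value: 0.5 },
  { id: "ImportantClass", name: "complex", time: 9000,  value: 0.25 },
  { id: "ImportantClass", name: "complex", time: 10800, value: 1.0 }
])

SKETCH_ROWS.each { |key, row| puts "#{key.inspect} => #{row.inspect}" }
```

Under the same assumption, a daily summary would work the same way with a bucket of 86400 seconds.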
The Litemetric class offers some methods that can be used to report on the events:
# Litemetric is a singleton, there can only be one instance in a process
metrics = Litemetric.instance
# gets a list of all the topics that published events in the last 30 days
metrics.topics
# get a list of all the events captured for a specific topic in the last 30 days
# each event will have a count of all its occurrences and a sum of the values of these events
metrics.events(topic)
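Because each event carries both a count and a sum of values, an average per occurrence (for example, the mean runtime of a measured action) can be derived from the two. The sketch below works on plain hashes; the exact shape returned by metrics.events is not documented here, so the :name, :count and :value keys and the sample numbers are assumptions made for illustration.

```ruby
# Hypothetical rows; the real return shape of metrics.events may differ.
SKETCH_EVENTS = [
  { name: "complex", count: 4,  value: 2.0 },
  { name: "simple",  count: 10, value: 0.0 }
].freeze

# Mean value per occurrence, guarding against division by zero.
def sketch_average(event)
  return 0.0 if event[:count].zero?

  event[:value] / event[:count]
end

SKETCH_AVERAGES = SKETCH_EVENTS.to_h { |event| [event[:name], sketch_average(event)] }

puts SKETCH_AVERAGES.inspect
```

Events that were only captured, never measured, have a value sum of zero, so their average is zero as well.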