|
| 1 | +# Metrictank startup |
| 2 | + |
| 3 | +The full startup procedure has many details; here we cover the main steps that affect: |
| 4 | + |
| 5 | +* performance/resource usage characteristics |
| 6 | +* cluster status |
| 7 | +* API availability |
| 8 | +* diagnostics |
| 9 | + |
| 10 | + |
| 11 | +| Phase | Description | Effect on CPU / RAM | |
| 12 | +| ----------------------- | -------------------------------------------------------------------------------------------------- | ----------------------------------- | |
| 13 | +| load config | load/validate config | no | |
| 14 | +| setup diagnostics | set up logging, profiling, proftrigger | no | |
| 15 | +| log startup | logs "Metrictank starting" message | no | |
| 16 | +| start sending stats | starts connecting and writing to graphite endpoint | no | |
| 17 | +| create Store | create keyspace, tables, write queues, etc | minor RAM increase ~ queue size | |
| 18 | +| create Input(s) | open connections (kafka) or listening sockets (carbon, prometheus) | no | |
| 19 | +| start cluster | starts gossip, joins cluster | no | |
| 20 | +| create Index | creates instance and starts write queues | minor RAM increase ~ queue size | |
| 21 | +| start API server | opens listening socket and starts handling requests in not-ready mode | no | |
| 22 | +| init Index | creates session, keyspace, tables, write queues, etc and loads in-memory index from persisted data | reasonable RAM and CPU increase | |
| 23 | +| create cluster notifier | optional: connects to Kafka, starts backfilling persistence messages and waits until done or timeout | if backfilling: above-normal CPU, normal RAM usage | |
| 24 | +| start input plugin(s)   | starts backfill (kafka) or listening (carbon, prometheus) and maintains priority based on input lag | if backfilling: above-normal CPU and RAM usage | |
| 25 | +| mark ready state | immediately (primary) or after warmup period (secondary) (combined with priority for clustering) | no | |
| 26 | + |
| 27 | +We recommend provisioning a cluster such that it can backfill a 7-hour backlog in half an hour or less. This means: |
| 28 | +* The CPU increase during kafka backfilling is very significant: typically a 14x CPU increase compared to normal usage. |
| 29 | +* The RAM usage during the input data backfilling is typically about 1.5x to 2x normal. |
| 30 | + |
| 31 | +Backfilling will go as fast as it can until it reaches a bottleneck (kafka brokers, cpu constraints, etc), so your numbers may vary. |
| 32 | + |
| 33 | +This is true for v0.11.0, but may need revising later. |
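The 14x figure above follows directly from the provisioning target: replaying 7 hours of data in 0.5 hours means ingesting at 14 times the real-time rate. The sketch below only illustrates that arithmetic. The function names are hypothetical and not part of Metrictank, and the calculation deliberately ignores new data arriving while the backfill runs.

```python
def backfill_speedup(backlog_hours: float, target_hours: float) -> float:
    """Return how many times faster than real time ingestion must run
    to replay `backlog_hours` of data within `target_hours`.

    Hypothetical helper for illustration only; it ignores data that
    keeps arriving during the backfill.
    """
    if backlog_hours < 0 or target_hours <= 0:
        raise ValueError("backlog must be >= 0 and target must be > 0")
    return backlog_hours / target_hours


def backfill_duration_hours(backlog_hours: float, speedup: float) -> float:
    """Return how long a backlog takes to replay at a given speedup."""
    if backlog_hours < 0 or speedup <= 0:
        raise ValueError("backlog must be >= 0 and speedup must be > 0")
    return backlog_hours / speedup


if __name__ == "__main__":
    # The recommendation from the text: a 7-hour backlog in half an hour.
    print(backfill_speedup(7, 0.5))        # 14.0
    # A cluster that only sustains 7x real time needs a full hour.
    print(backfill_duration_hours(7, 7))   # 1.0
```

If a bottleneck (kafka brokers, CPU) caps the achievable speedup below 14x, the second function gives a rough estimate of how much longer the backfill would take.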