diff --git a/draft/core/storage-viz.txt b/draft/core/storage-viz.txt
new file mode 100644
index 00000000000..0f085d6d188
--- /dev/null
+++ b/draft/core/storage-viz.txt
@@ -0,0 +1,271 @@
+===========
+Storage-viz
+===========
+
+"Storage-viz" is a suite of tools that can be used to diagnose issues or
+assess proposed changes related to storage allocation strategy and index
+balancing heuristics.
+
+The ``storageDetails`` command will aggregate statistics related to the
+storage layout (when invoked with ``analyze: "diskStorage"``) or the percentage
+of pages currently in RAM (when invoked with ``analyze: "pagesInRAM"``) for the
+specified collection, extent or part of extent.
+
+The ``indexStats`` command provides detailed and aggregate information and
+statistics for the underlying btree of a particular index.
+Stats are aggregated for the entire tree, per-depth and, if requested through
+the ``expandNodes`` option, per-subtree.
+
+Both commands take a global READ_LOCK and will page in all the extents or btree
+buckets encountered: this will have adverse effects on server performance.
+The commands should never be run on a primary and will cause a secondary to
+fall behind on replication. ``storageDetails`` when run with
+``analyze: "pagesInRAM"`` is the exception as it typically returns rapidly and
+may only page in extent headers.
+
+.. default-domain:: mongodb
+
+.. dbcommand:: storageDetails
+
+   The command can be slow, particularly on larger data sets.
+
+   .. code-block:: javascript
+
+      { storageDetails: "collection_name",
+        analyze: "diskStorage" | "pagesInRAM" }
+
+   This command will aggregate statistics related to the storage layout
+   (when invoked with ``analyze: "diskStorage"``) or the percentage of pages
+   currently in RAM (when invoked with ``analyze: "pagesInRAM"``) for the
+   specified collection.
+
+   You may also specify one of the following options:
+
+   - ``extent: 4`` (0-based) only processes the 5th extent of the collection
+
+   - ``range: [start, end]`` only processes the range between ``start`` bytes
+     and ``end`` bytes from the start of the extent. Requires an ``extent`` to
+     be specified.
+
+   - ``granularity: 1 << 20`` splits the extents into 1 MB slices and
+     reports statistics aggregated per-slice.
+
+   - ``numberOfSlices: 100`` splits the extent(s) into 100 slices and
+     reports statistics aggregated per-slice.
+
+     ``granularity`` and ``numberOfSlices`` are mutually exclusive.
+
+   - ``characteristicField: "dotted.path"`` specifies a field in the
+     documents of the collection to be inspected and averaged to give
+     a hint on what kind of documents belong to an extent or slice.
+     Defaults to ``"_id"``. ObjectIDs, any number and Dates are
+     supported. If the field has the wrong type in some documents,
+     those values are silently ignored.
+
+   - ``processDeletedRecords: false`` disables the analysis of deleted
+     records, which can be slow as it requires iterating over all
+     the deletedList buckets for each extent. Defaults to ``true``.
+
+   - ``showRecords: true`` outputs basic information for each document
+     and deletedRecord encountered. It should only be enabled for small
+     ranges on single extents. Produces large output which can exceed
+     the maximum bson object size.
+
+   The typical output, when ``analyze: 'diskStorage'``, has the form:
+
+   .. code-block:: javascript
+
+      { extentHeaderBytes: ,
+        recordHeaderBytes: ,
+        range: [startOfs, endOfs], // extent-relative
+        numEntries: ,
+        bsonBytes: ,
+        recBytes: ,
+        onDiskBytes: ,
+        (opt) characteristicCount:
+        (opt) characteristicAvg:
+        outOfOrderRecs:
+        (opt) freeRecsPerBucket: [ ... ],
+
+   The nth element in the ``freeRecsPerBucket`` array is the count of deleted records in the
+   nth bucket of the deletedList.
+
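The option rules listed above (``granularity`` and ``numberOfSlices`` being mutually exclusive, ``range`` requiring an ``extent``) can be checked on the client before the command is sent. The following is only a sketch: ``buildStorageDetailsCommand`` is a hypothetical helper that is not part of MongoDB, and the collection name ``events`` is made up for illustration.

```javascript
// Hypothetical helper (not part of MongoDB): builds a storageDetails command
// document and enforces the option rules described in this section.
function buildStorageDetailsCommand(collectionName, analyze, options = {}) {
  if (analyze !== "diskStorage" && analyze !== "pagesInRAM") {
    throw new Error('analyze must be "diskStorage" or "pagesInRAM"');
  }
  // granularity and numberOfSlices are mutually exclusive.
  if (options.granularity !== undefined && options.numberOfSlices !== undefined) {
    throw new Error("granularity and numberOfSlices are mutually exclusive");
  }
  // range is relative to the start of an extent, so an extent must be given.
  if (options.range !== undefined && options.extent === undefined) {
    throw new Error("range requires extent");
  }
  // Keep the command name as the first key of the document.
  return Object.assign({ storageDetails: collectionName, analyze: analyze }, options);
}

// Example: first 1 << 20 bytes of the 5th extent (0-based index 4), in 100 slices.
const cmd = buildStorageDetailsCommand("events", "diskStorage", {
  extent: 4,
  range: [0, 1 << 20],
  numberOfSlices: 100,
});
console.log(JSON.stringify(cmd));
```

In the :program:`mongo` shell the resulting document could then be passed to ``db.runCommand(cmd)``, keeping in mind the performance caveats described in this document.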
+   ``characteristicCount`` and ``characteristicAvg`` are only present if some documents contain
+   the field specified as ``characteristicField`` and it has a viable type (any number, ObjectID
+   or Date).
+
+   The list of slices follows, with similar information aggregated per-slice:
+
+   .. code-block:: javascript
+
+      slices: [
+        { numEntries: ,
+          ...
+          freeRecsPerBucket: [ ... ]
+        },
+        ...
+      ]
+
+   If ``showRecords: true`` was set, two additional fields are added to the outer document:
+
+   .. code-block:: javascript
+
+      records: [
+        { ofs: ,
+          recBytes: ,
+          bsonBytes: ,
+          (optional) characteristic:
+        },
+        ... (one element per record)
+      ],
+      (optional) deletedRecords: [
+        { ofs: ,
+          recBytes:
+        },
+        ... (one element per deleted record)
+      ]
+
+   The typical output, when ``analyze: 'pagesInRAM'``, has the form:
+
+   .. code-block:: javascript
+
+      { pageBytes: ,
+        onDiskBytes: ,
+        inMem: ,
+        (opt) slices: [ ... ] (only present if either granularity or numberOfSlices is not
+                               zero and there is more than one slice for this extent)
+        (opt) sliceBytes:
+      }
+
+   The :program:`mongo` shell also provides wrappers:
+
+   .. code-block:: javascript
+
+      db.collection.diskStorageStats();
+      db.collection.pagesInRAM();
+
+      db.collection.getDiskStorageStats();
+      db.collection.getPagesInRAM();
+
+   ``diskStorageStats`` analyzes storage for the collection
+   (equivalent to invoking the command with ``{analyze: "diskStorage"}``).
+
+   ``pagesInRAM`` reports the percentage of pages in RAM for the collection
+   (equivalent to invoking the command with ``{analyze: "pagesInRAM"}``).
+
+   ``db.collection.getDiskStorageStats`` and ``db.collection.getPagesInRAM``
+   take the same parameters as ``diskStorageStats`` and ``pagesInRAM``,
+   respectively, and provide a human-readable representation of the output.
+
+
+   .. warning:: This command is resource intensive and may have an
+      impact on the performance of your MongoDB instance.
+      It also requires
+      the entire collection or extent to be loaded in RAM and it may
+      end up evicting some of the pages from other collections or extents.
+
+   .. read-lock
+
+.. dbcommand:: indexStats
+
+   The command can be slow, particularly on large indexes.
+
+   .. code-block:: javascript
+
+      { indexStats: "collection_name",
+        index: "index_name" }
+
+   This command provides detailed and aggregate information and statistics for the underlying
+   btree for the index ``index_name`` in the collection ``collection_name``.
+   Stats are aggregated for the entire tree, per-depth and, if requested through the ``expandNodes``
+   option, per-subtree.
+
+   You can specify ``expandNodes: [0, 3]`` to expand the root (node 0 at depth 0) and the 4th child
+   of root (node 3 at depth 1). The first element of the array should always be 0; otherwise no
+   node will be expanded (there is only the root at depth 0). This will provide basic information about
+   the expanded nodes and statistics for the subtrees rooted at the nodes themselves.
+
+   The typical output has the form:
+
+   .. code-block:: javascript
+
+      { name: ,
+        version: ,
+        keyPattern: ,
+        storageNs: ,
+        bucketBodyBytes: ,
+        depth:
+        overall: { (statistics for the entire tree)
+            numBuckets:
+            keyCount: { (stats about the number of keys in a bucket)
+                count: ,
+                mean:
+                (optional) stddev:
+                (optional) min:
+                (optional) max:
+                (optional) quantiles: {
+                    0.01: <1st percentile>, 0.02: ..., 0.09: ..., 0.25: <1st quartile>,
+                    0.5: <median>, 0.75: <3rd quartile>, 0.91: ..., 0.98: ..., 0.99: ...
+                }
+                (optional fields are only present if there are enough samples to compute sensible
+                 estimates)
+            }
+            usedKeyCount:
+                (same structure as keyCount)
+            bsonRatio:
+                (same structure as keyCount)
+            keyNodeRatio:
+                (same structure as keyCount)
+            fillRatio:
+                (same structure as keyCount)
+        },
+        perLevel: [ (statistics aggregated per depth)
+            (one element with the same structure as 'overall' for each btree level,
+             the first refers to the root)
+        ]
+      }
+
+   If ``expandNodes: [array]`` was specified in the parameters, an additional field named
+   ``expandedNodes`` is included in the output. It contains two nested arrays, such that the
+   n-th element of the outer array contains stats for nodes at depth n (root is included) and
+   the i-th element (0-based) of the inner array at depth n contains stats for the subtree
+   rooted at the i-th child of the expanded node at depth (n - 1).
+   Each element of the inner array has the same structure as ``overall`` in the description above:
+   it includes the aggregate stats for all the nodes in the subtree excluding the current
+   bucket.
+   It also contains an additional field ``nodeInfo`` representing information for the current
+   node:
+
+   .. code-block:: javascript
+
+      { childNum:
+        keyCount:
+        usedKeyCount:
+        diskLoc: { (bson representation of the disk location for this bucket)
+            file:
+            offset:
+        }
+        depth:
+        fillRatio:
+        firstKey:
+        lastKey:
+      }
+
+   The :program:`mongo` shell also provides wrappers:
+
+   .. code-block:: javascript
+
+      db.collection.indexStats({index: "index_name"});
+      db.collection.getIndexStats({index: "index_name"});
+
+   ``db.collection.indexStats({index: "index_name"})`` is equivalent to running the command
+   with ``{indexStats: "collection", index: "index_name"}``.
+
+   ``db.collection.getIndexStats`` takes the same parameters as ``indexStats`` and provides
+   a human-readable summary of the output in the shell.
+
+   .. warning:: This command is resource intensive and may have an
+      impact on the performance of your MongoDB instance. It also requires
+      the entire collection or extent to be loaded in RAM and it may
+      end up evicting some of the pages from other collections or extents.
+
+   .. read-lock
+
+
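To illustrate how the ``indexStats`` parameters and output described above fit together, the sketch below builds a command document (checking that ``expandNodes`` starts at the root) and extracts the mean fill ratio per btree level from a result. Both helper functions are hypothetical and not part of MongoDB, and the sample document is made-up data that only mimics the ``perLevel`` structure shown earlier.

```javascript
// Hypothetical helper: builds an indexStats command document.
function buildIndexStatsCommand(collectionName, indexName, expandNodes) {
  const cmd = { indexStats: collectionName, index: indexName };
  if (expandNodes !== undefined) {
    // The first element must be 0, since the root is the only node at depth 0.
    if (!Array.isArray(expandNodes) || expandNodes[0] !== 0) {
      throw new Error("expandNodes must be an array whose first element is 0");
    }
    cmd.expandNodes = expandNodes;
  }
  return cmd;
}

// Hypothetical helper: one summary row per btree level, root first.
function fillRatioByDepth(stats) {
  return stats.perLevel.map((level, depth) => ({
    depth: depth,
    numBuckets: level.numBuckets,
    meanFillRatio: level.fillRatio.mean,
  }));
}

// Made-up sample shaped like the perLevel array of an indexStats result.
const sample = {
  perLevel: [
    { numBuckets: 1, fillRatio: { count: 1, mean: 0.4 } },
    { numBuckets: 12, fillRatio: { count: 12, mean: 0.75 } },
  ],
};

console.log(JSON.stringify(buildIndexStatsCommand("events", "ts_1", [0, 3])));
console.log(JSON.stringify(fillRatioByDepth(sample)));
```

A per-level view like this can make it easier to spot a poorly filled level when assessing index balancing, which is the kind of diagnosis the suite is meant to support.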