From 38e4734e69a41eacd958755a3929d7885b610388 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Mon, 5 Nov 2012 20:39:17 -0500 Subject: [PATCH 1/4] DOCS-660 DOCS-659 migrate index wiki pages --- source/administration/indexes.txt | 141 ++++++-- source/core/indexes.txt | 518 +++++++++++++++++++++--------- 2 files changed, 480 insertions(+), 179 deletions(-) diff --git a/source/administration/indexes.txt b/source/administration/indexes.txt index 2a6574b8dc4..a94540d40ce 100644 --- a/source/administration/indexes.txt +++ b/source/administration/indexes.txt @@ -4,9 +4,6 @@ Indexing Operations .. default-domain:: mongodb -Synopsis --------- - This document provides operational guidelines and procedures for indexing data in MongoDB collections. For the fundamentals of MongoDB indexing, see the :doc:`/core/indexes` document. For strategies and @@ -15,22 +12,57 @@ practical approaches, see the :doc:`/applications/indexes` document. Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the documents in a collection. -Operations ----------- +.. index:: index; create +.. _index-create-index: Create an Index -~~~~~~~~~~~~~~~ +--------------- To create an index, use :method:`db.collection.ensureIndex()` or a similar :api:`method from your driver <>`. For example -the following creates [#ensure]_ an index on the ``phone-number`` field +the following creates an index on the ``phone-number`` field of the ``people`` collection: .. code-block:: javascript db.people.ensureIndex( { "phone-number": 1 } ) -To create a :ref:`compound index `, use an +As the name suggests, :method:`ensureIndex() +` only creates an index if an index of the +same specification does not already exist. + +Once a collection is indexed on a key, queries match the specified key +are fast. If you query on a key for which there is no index, MongoDB +must go through each document checking the key's value. + +.. 
example:: + + If you create an index on the ``x`` field in the ``things`` + collection but not on the ``y`` field, then this query is fast: + + .. code-block:: javascript + + db.things.find( { x: 2 } ) + + while this query is slow: + + .. code-block:: javascript + + db.things.find( { y: 2 } ) + +If your collection is large, build the index in the background, as +described in :ref:`index-creation-background`. If you build in the +background on a live replica set, see also +:ref:`index-build-on-replica-sets`. + +.. index:: index; create +.. index:: index; compound +.. _index-create-compound-index: + +Create a Compound Index +----------------------- + +To create a :ref:`compound index ` use an operation that resembles the following prototype: .. code-block:: javascript @@ -51,12 +83,16 @@ resulting index. .. include:: /includes/note-build-indexes-on-replica-sets.rst -.. [#ensure] As the name suggests, :method:`ensureIndex() ` - only creates an index if an index of the same specification does - not already exist. +.. index:: index; options +.. _index-special-creation-options: + +If your collection is large, build the index in the background, as +described in :ref:`index-creation-background`. If you build in the +background on a live replica set, see also +:ref:`index-build-on-replica-sets`. Special Creation Options -~~~~~~~~~~~~~~~~~~~~~~~~ +------------------------ .. note:: @@ -66,8 +102,11 @@ Special Creation Options .. todo::: insert link here to the geospatial index documents when they're published. +.. index:: index; sparse +.. _index-sparse-index: + Sparse Indexes -`````````````` +~~~~~~~~~~~~~~ To create a :ref:`sparse index ` on a field, use an operation that resembles the following prototype: @@ -92,8 +131,11 @@ without the ``twitter_name`` field. index. See the :ref:`sparse index ` section for more information. +.. index:: index; unique +.. 
_index-unique-index: + Unique Indexes -`````````````` +~~~~~~~~~~~~~~ To create a :ref:`unique indexes `, consider the following prototype: @@ -136,8 +178,11 @@ You can also enforce a unique constraint on :ref:`compound indexes These indexes enforce uniqueness for the *combination* of index keys and *not* for either key individually. -Background -`````````` +.. index:: index; create in background +.. _index-create-in-background: + +Create in Background +~~~~~~~~~~~~~~~~~~~~ To create an index in the background you can specify :ref:`background construction `. Consider the following @@ -151,8 +196,12 @@ Consider the section on :ref:`background index construction ` for more information about these indexes and their implications. +.. index:: index; drop duplicates +.. index:: index; duplicates +.. _index-drop-duplicates: + Drop Duplicates -``````````````` +~~~~~~~~~~~~~~~ To force the creation of a :ref:`unique index ` index on a collection with duplicate values in the field you are @@ -176,8 +225,11 @@ See the full documentation of :ref:`duplicate dropping Refer to the :method:`ensureIndex() ` documentation for additional index creation options. +.. index:: index; list indexes +.. _index-list-indexes-for-collection: + List a Collection's Indexes -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +--------------------------- To list a collection's indexes, use the :method:`db.collection.getIndexes()` method or a similar @@ -189,8 +241,23 @@ For example, to view all indexes on the the ``people`` collection: db.people.getIndexes() -Remove an Index -~~~~~~~~~~~~~~~ +.. index:: index; list indexes +.. _index-list-indexes-for-database: + +List all Indexes for a Database +------------------------------- + +To see all indexes for a database, issue the following: + +.. code-block:: javascript + + db.system.indexes.find() + +.. index:: index; remove +.. 
_index-remove-index: + +Remove Indexes +-------------- To remove an index, use the :method:`db.collection.dropIndex()` method, as in the following example: @@ -217,8 +284,11 @@ These shell helpers provide wrappers around the library ` may have a different or additional interface for these operations. -Rebuilding -~~~~~~~~~~ +.. index:: index; rebuild +.. _index-rebuild-index: + +Rebuild Indexes +--------------- If you need to rebuild indexes for a collection you can use the :method:`db.collection.reIndex()` method. This will drop all indexes, @@ -257,13 +327,13 @@ may have a different or additional interface for this operation. .. include:: /includes/note-build-indexes-on-replica-sets.rst +.. index:: index; replica set +.. index:: replica set; index +.. _index-build-on-replica-sets: .. _index-building-replica-sets: -Building Indexes on Replica Sets --------------------------------- - -Consideration -~~~~~~~~~~~~~ +Build Indexes on Replica Sets +----------------------------- :ref:`Background index creation operations ` become *foreground* indexing operations @@ -281,9 +351,6 @@ primary finishes building the index. To minimize the impact of building an index on your replica set, use the following procedure to build indexes on secondaries: -Procedure -~~~~~~~~~ - .. note:: If you need to build an index in a :term:`sharded cluster`, repeat @@ -307,7 +374,6 @@ Procedure #. Run :method:`rs.stepDown()` on the :term:`primary` member of the set, and then repeat this procedure on the former primary. - .. warning:: Ensure that your :term:`oplog` is large enough to permit the @@ -327,10 +393,12 @@ Procedure clients will not contact the member while you are building the index. +.. index:: index; measure use +.. _index-measure-index-use: .. 
_indexes-measuring-use: -Measuring Index Use -------------------- +Measure Index Use +----------------- Query performance is a good general indicator of index use; however, for more precise insight into index use, MongoDB provides the @@ -370,8 +438,11 @@ following tools: :dbcommand:`serverStatus` for insight into database-wise index utilization. -Monitoring and Controlling Index Building ------------------------------------------ +.. index:: index; monitor index building +.. _index-monitor-index-building: + +Monitor and Control Index Building +---------------------------------- .. todo:: insert links to the values in the inprog array following the completion of DOCS-162 diff --git a/source/core/indexes.txt b/source/core/indexes.txt index 9d252c746e3..c9080812b6a 100644 --- a/source/core/indexes.txt +++ b/source/core/indexes.txt @@ -10,6 +10,9 @@ procedures, see the :doc:`/administration/indexes` document. For strategies and practical approaches, see the :doc:`/applications/indexes` document. +.. index:: index; overview +.. _index-overview-synopsis: + Synopsis -------- @@ -17,14 +20,24 @@ An index is a data structure that allows you to quickly locate documents based on the values stored in certain specified fields. Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB supports indexes on any field or sub-field contained in -documents within a MongoDB collection. MongoDB indexes have the -following core features: +documents within a MongoDB collection. + +MongoDB indexes have the following core features: - MongoDB defines indexes on a per-:term:`collection` level. -- Indexes often dramatically increase the performance of queries; - however, each index creates a slight overhead for every write +- You can create indexes on a single field or on multiple fields using + a :ref:`compound index `. + +- Indexes enhance query performance, often dramatically. + However, each index also creates a slight overhead for every write operation. 
+ It is important to think about the kinds of queries your application + will need so that you can define relevant indexes. + +- All MongoDB indexes use the ``B-tree`` data structure. The MongoDB + :term:`query optimizer` uses this structure to quickly return query + results. - Every query, including update operations, use one and only one index. The query optimizer selects the index empirically by @@ -32,30 +45,60 @@ following core features: with the best response time for each query type. You can override the query optimizer using the :method:`cursor.hint()` method. -- You can create indexes on a single field or on multiple fields using - a :ref:`compound index `. - -- When the index covers queries, the database returns results more - quickly than queries that have to scan many individual documents. An - index "covers" a query if the keys of the index stores all the data +- When an index "covers" a query, the database returns results more + quickly than for queries that have to scan many individual documents. An + index "covers" a query if the index keys store all the data that the query must return. See :ref:`indexes-covered-queries` for more information. -- Using queries with good index coverage will reduce the number of full +- Using queries with good index coverage reduces the number of full documents that MongoDB needs to store in memory, thus maximizing database performance and throughput. +- When you update a document, if the document fits in its previous + allocation area, only those indexes whose keys have changed are + updated. This improves performance. Note that if the document has + grown and must move, all index keys must then update. + +.. index:: index; secondary index +.. index:: secondary index .. index:: index types .. _index-types: +.. _index-types-secondary: Index Types ----------- -All indexes in MongoDB are "B-Tree" indexes. In the :program:`mongo` -shell, the helper :method:`ensureIndex() ` -provides a method for creating indexes. 
This section provides an -overview of the types of indexes available in MongoDB as well as an -introduction to their use. +This section describes types of indexes available in MongoDB. + +For all collections, MongoDB creates the default :ref:`_id index +`. This is the ``primary`` index. All other indexes +in MongoDB are :term:`secondary indexes `. You can +create secondary indexes on any field within any document or +sub-document. You can also create secondary indexes on multiple fields. +This section describes both the :ref:`_id index ` +and the various secondary indexes. + +In general, you should create indexes that support your primary, common, +and user-facing queries. Doing so lets MongoDB scan the fewest +documents possible. + +In the :program:`mongo` shell, you can create an index by calling the +:method:`ensureIndex() ` method. +Arguments to :method:`ensureIndex() ` +resemble the following: + +.. code-block:: javascript + + { "field": 1 } + { "field0.field1": 1 } + { "field0": 1, "field1": 1 } + +For each field in the index you will specify either ``1`` for an +ascending order or ``-1`` for a descending order, which represents the +order of the keys in the index. For indexes with more than one key (i.e. +:ref:`compound indexes `) the sequence of fields is +important. .. index:: _id index .. index:: _id @@ -63,8 +106,8 @@ introduction to their use. .. index:: index types; primary key .. _index-type-primary: -_id -~~~ +_id Index +~~~~~~~~~ The ``_id`` index is a :ref:`unique index ` [#unique-index-report]_ on the ``_id`` field, and MongoDB creates this @@ -96,73 +139,86 @@ is a 12-byte unique identifiers suitable for use as the value of an See the :ref:`release notes <2.2-id-indexes-capped-collections>` for more information. -.. todo:: fix the above when a full capped-collection page exists. +.. todo fix the above when a full capped-collection page exists. -.. _index-types-secondary: +.. index:: index; embedded documents +.. 
_index-embedded-documents: -Secondary Indexes -~~~~~~~~~~~~~~~~~ +Indexes on Embedded Documents +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -All indexes in MongoDB are -:term:`secondary indexes `. You can create indexes on -any field within any document or sub-document. Additionally, you can -create compound indexes with multiple fields, so that a single query -can match multiple components using the index without needing to scan -(as many) actual documents. +Indexed fields may be of any type, including embedded documents. -In general, you should have secondary indexes that support all of your -primary, common, and user-facing queries and require MongoDB to scan -the fewest number of documents possible. +.. example:: -To create a secondary index, use the -:method:`ensureIndex() ` -method. The argument to -:method:`ensureIndex() ` -will resemble the following in the MongoDB shell: + Given the following document in the ``factories`` collection: -.. code-block:: javascript + .. code-block:: javascript - { "field": 1 } - { "field0.field1": 1 } - { "field0": 1, "field1": 1 } + { "_id": ObjectId(...), metro: { city: "New York", state: "NY" } } ) -For each field in the index you will specify either ``1`` for an -ascending order or ``-1`` for a descending order, which represents the -order of the keys in the index. For indexes with more than one key -(i.e. "compound indexes,") the sequence of fields is important. + You can create an index on the ``metro`` key. The following queries would + then use that index, and both would return the above document: -Embedded Fields -``````````````` + .. code-block:: javascript -You can create indexes on fields that exist in sub-documents within -your collection. Consider the collection ``people`` that holds -documents that resemble the following example document: + db.factories.find( { metro: { city: "New York", state: "NY" } } ); -.. 
code-block:: javascript + db.factories.find( { metro: { $gte : { city: "New York" } } } ); + + The second query returns the document because ``{ city: "New York" + }`` is less than ``{ city: "New York", state: "NY" }`` The order of + comparison is in ascending key order in the order the keys occur in + the :term:`BSON` document. + + The following query, however, would not match the document as the + order of the fields is significant: + + .. code-block:: javascript + + db.factories.find( { metro: { state: "NY" , city: "New York" } } ) + +.. index:: index; embedded fields +.. _index-embedded-fields: + +Indexes on Embedded Fields +~~~~~~~~~~~~~~~~~~~~~~~~~~ - {"_id": ObjectId(...) - "name": "John Doe" - "address": { +You can create indexes on fields that exist in sub-documents within your +collection. This differs from :ref:`index-embedded-documents` in that it +doesn't index the document but instead "reaches in" and indexes a +specific field. Introspecting sub-documents in this way is commonly +called "dot notation." + +.. example:: + + Consider the collection ``people`` that holds documents that resemble + the following example document: + + .. code-block:: javascript + + {"_id": ObjectId(...) + "name": "John Doe" + "address": { "street": "Main" "zipcode": 53511 "state": "WI" - } - } + } + } -You can create an index on the ``address.zipcode`` field, using the -following specification: + You can create an index on the ``address.zipcode`` field, using the + following specification: -.. code-block:: javascript + .. code-block:: javascript - db.people.ensureIndex( { "address.zipcode": 1 } ) - -Introspecting sub-documents in this way is commonly called "dot -notation." + db.people.ensureIndex( { "address.zipcode": 1 } ) +.. index:: index; compound +.. index:: compound index .. 
_index-type-compound: Compound Indexes -```````````````` +~~~~~~~~~~~~~~~~ MongoDB supports "compound indexes," where a single index structure holds references to multiple fields within a collection's @@ -175,6 +231,7 @@ that resemble the following example document: "_id": ObjectId(...) "item": "Banana" "category": ["food", "produce", "grocery"] + "location": "4th Street Store" "stock": 4 "type": cases "arrival": Date(...) @@ -186,29 +243,40 @@ specify a single compound index to support both of these queries: .. code-block:: javascript - db.products.ensureIndex( { "item": 1, "stock": 1 } ) + db.products.ensureIndex( { "item": 1, "location": 1, "stock": 1 } ) -MongoDB will be able to use this index to support queries that select -the ``item`` field as well as those queries that select the ``item`` -field **and** the ``stock`` field. However, this index will not -be useful for queries that select *only* the ``stock`` field. +Compound indexes support queries on any beginning subset of fields in the index. +For example, MongoDB can use the above index to support queries that select +the ``item`` field and to support queries that select the ``item`` +field **and** the ``location`` field. The index, however, would +not support queries that select the following: -.. note:: +- only the ``location`` field +- only the ``stock`` field +- only the ``location`` and ``stock`` fields +- only the ``item`` and ``stock`` fields - The order of fields in a compound index is very important. In the - previous example, the index will contain references to documents - sorted by the values of the ``item`` field, and within each item, - sorted by values of the ``stock`` field. +When creating an index, the number associated with a key specifies the +direction of the index. The options are ``1`` (ascending) and ``-1`` +(descending). Direction doesn't matter for single key indexes or for +random access retrieval but is important if you are doing sorts or range +queries on compound indexes. 
+The order of fields in a compound index is very important. In the +previous example, the index will contain references to documents sorted +first by the values of the ``item`` field, then, within each item, by the values +of the ``location`` field, and finally by the values of the ``stock`` field. + +.. index:: index; sort order .. _index-ascending-and-descending: -Ascending and Descending -```````````````````````` +Indexes with Ascending and Descending Keys +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Indexes store references to fields in either ascending or descending order. For single-field indexes, the order of keys doesn't matter, because MongoDB can traverse the index in either direction. However, -for compound indexes, if you need to order results against two fields, +for :ref:`compound indexes `, if you need to order results against two fields, sometimes you need the index fields running in opposite order relative to each other. @@ -235,58 +303,177 @@ the following command: db.events.ensureIndex( { "username" : 1, "timestamp" : -1 } ) +.. index:: index; multikey .. _index-type-multi-key: .. _index-type-multikey: -Multikey -```````` +Multikey Indexes +~~~~~~~~~~~~~~~~ + +If you index a field that contains an array, MongoDB indexes each value +in the field separately. This is called a "multikey" index. A multikey +index records separate entries for every value in the array field. A +multikey index can index only one array field. MongoDB does not let you +create an index that includes two or more array fields. + +.. example:: + + Given the following document: -If you index a field that contains an array, you will create a -multikey index, which adds entries to the index for *every* item in -the array. Consider a ``feedback`` collection with documents in the -following form: + .. 
code-block:: javascript + + { "_id" : ObjectId("..."), + "name" : "Warm Weather", + "author" : "Steve", + "tags" : [ "weather", "hot", "record", "april" ] } + + Then an index on the ``tags`` field would be a multikey index and + would include these separate entries: + + .. code-block:: none + + tags: "weather" + tags: "hot" + tags: "record" + tags: "april" + + All could be used to return the document. + +You can use multikey indexes to index fields within objects embedded in +arrays. + +.. example:: + + Consider a ``feedback`` collection with documents in the following form: + + .. code-block:: javascript + + { + "_id": ObjectId(...) + "title": "Grocery Quality" + "comments": [ + { author_id: ObjectId(...) + date: Date(...) + text: "Please expand the cheddar selection." }, + { author_id: ObjectId(...) + date: Date(...) + text: "Please expand the mustard selection." }, + { author_id: ObjectId(...) + date: Date(...) + text: "Please expand the olive selection." } + ] + } + + An index on the ``comments.text`` field would be a multikey index and + would add items to the index for all of the sub-documents in the + array. + + Running the following query to locate the document: + + .. code-block:: javascript + + db.feedback.find( { "comments.text": "Please expand the olive selection." } ) + + returns this document: + + .. code-block:: javascript + + { author_id: ObjectId(...) + date: Date(...) + text: "Please expand the olive selection." } + +.. include:: /includes/note-build-indexes-on-replica-sets.rst + +When using a :ref:`compound index `, at most one of +indexed values in any document can be an array. So if we have an index +on ``{a: 1, b: 1}``, the following documents are both fine: .. code-block:: javascript - { - "_id": ObjectId(...) - "title": "Grocery Quality" - "comments": [ - { author_id: ObjectId(..) - date: Date(...) - text: "Please expand the cheddar selection." }, - { author_id: ObjectId(..) - date: Date(...) - text: "Please expand the mustard selection." 
}, - { author_id: ObjectId(..) - date: Date(...) - text: "Please expand the olive selection." } - ] - } + {a: [1, 2], b: 1} -An index on the ``comments.text`` field would be a multikey index, and will -add items to the index for all of the sub-documents in the array. As a -result you will be able to run the following query, using only the -index to locate the document: + {a: 1, b: [1, 2]} + +This document, however, will fail to be inserted and will generate an error message that +you "cannot index parallel arrays": .. code-block:: javascript - db.feedback.find( { "comments.text": "Please expand the olive selection." } ) + {a: [1, 2], b: [1, 2]} -.. include:: /includes/note-build-indexes-on-replica-sets.rst +The problem with indexing parallel arrays is that each value in the +Cartesian product of the compound keys would have to be indexed, which +can get out of hand very quickly. -.. warning:: +.. todo add a sentence and link: + For procedures for querying on values in an array, see LINK???. + +Querying on Multiple Values in a Given Array +```````````````````````````````````````````` + +You can query for a match on multiple items in an array by using the +:operator:`$all` operator. + +.. example:: + + Given the following documents in the ``weather`` collection: + + .. code-block:: javascript + + { _id : "April", tags : [ "hot", "record", "sunny" ] } + { _id : "May", tags : [ "cool", "record", "rainy" ] } + { _id : "June", tags : [ "hot", "normal" , "humid" ] } + + The following query returns the ``_id : "June"`` document but not the + ``_id : "April"`` document: + + .. code-block:: javascript - MongoDB will refuse to insert documents into a compound index where - more than one field is an array (i.e. ``{a: [1, 2], b: [1, 2]}``); - however, MongoDB permits documents in collections with compound - indexes where only one field per compound index is an array - (i.e. ``{a: [1, 2], b: 1}`` and ``{a: 1, b: [1, 2]}``.) 
+ db.weather.find( { tags: { $all: ["hot", "humid" ] } } ) +Exact Array Matching +```````````````````` + +You *cannot* perform exact array matching with a multikey index. A +multikey index records each array value separately, so querying on +an exact array match will not use the multikey index but will instead +scan every document in the collection. + +Using Multikeys to Simulate a Large Number of Indexes +````````````````````````````````````````````````````` + +One way to work with data that has a high degree of options for +query-ability is to use the multikey indexing feature where the keys are +objects. For example, the following document allows you to add an +unlimited number of attribute types: + +.. code-block:: javascript + + { _id : ObjectId(...), + attribs : [ + { color: "red" }, + { shape: "rectangle" }, + { color: "blue" }, + { avail: true } + ] + } + +You could, for example, perform the following disparate queries, both +using the multikey index: + +.. code-block:: javascript + + db.mycollection.find( { attribs: { color: "blue" } } ) + db.mycollection.find( { attribs: { avail: false } } ) + +This approach is helpful for simple attribute lookups. The approach +is not necessarily helpful for sorting or certain other query types. + +.. index:: index; unique .. _index-type-unique: -Unique Index -~~~~~~~~~~~~ +Unique Indexes +~~~~~~~~~~~~~~ A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. To create a unique index @@ -297,6 +484,8 @@ following operation in the :program:`mongo` shell: db.addresses.ensureIndex( { "user_id": 1 }, { unique: true } ) +By default, the ``unique`` option is set to ``false``. + If you use the unique constraint on a :ref:`compound index ` then MongoDB will enforce uniqueness on the *combination* of values, rather than the individual value for any or all @@ -312,11 +501,14 @@ from the unique index. .. index:: index; sparse .. 
_index-type-sparse: -Sparse Index -~~~~~~~~~~~~ +Sparse Indexes +~~~~~~~~~~~~~~ + +Sparse indexes only contain entries for documents that have the indexed +field. Any document that is missing the field is not indexed. The index +is "sparse" because it omits documents that lack the indexed field. -Sparse indexes only contain entries for documents that have the -indexed field. By contrast, non-sparse indexes contain all documents +By contrast, non-sparse indexes contain all documents in a collection, and store null values for documents that do not contain the indexed field. Create a sparse index on the ``xmpp_id`` field, of the ``members`` collection, using the following operation in @@ -326,6 +518,8 @@ the :program:`mongo` shell: db.addresses.ensureIndex( { "xmpp_id": 1 }, { sparse: true } ) +By default, the ``sparse`` option is set to ``false``. + .. warning:: Using these indexes will sometimes result in incomplete results @@ -345,40 +539,44 @@ the :program:`mongo` shell: .. _`block-level`: http://en.wikipedia.org/wiki/Index_%28database%29#Sparse_index> +.. index:: index; options .. _index-creation-operations: .. _index-operations: Index Creation Options ---------------------- -Most parameters [#index-parameters]_ to the :method:`ensureIndex() -` operation affect the kind of index that -MongoDB creates. Two options, :ref:`background construction -` and :ref:`duplicate dropping -`, affect how MongoDB builds the -indexes. +You specify index creation options in the second argument to +:method:`ensureIndex() `. -.. [#index-parameters] Other functionality accessible by way of - parameters include :ref:`sparse `, :ref:`unique - `, and :ref:`TTL `. +The options :ref:`sparse `, :ref:`unique +`, and :ref:`TTL ` affect the kind +of index that MongoDB creates. Two options, :ref:`background +construction ` and :ref:`duplicate dropping +`, affect how MongoDB builds the +indexes. Those two are discussed in this section. +.. index:: index; background creation .. 
_index-creation-background: Background Construction ~~~~~~~~~~~~~~~~~~~~~~~ By default, creating an index is a blocking operation. Building an -index on a large collection of data, the operation can take a long +index on a large collection of data can take a long time to complete. To resolve this issue, the background option can allow you to continue to use your :program:`mongod` instance during -the index build. Create an index in the background of the ``zipcode`` -field of the ``people`` collection using a command that resembles the -following: +the index build. + +For example, to create an index in the background of the ``zipcode`` +field of the ``people`` collection you would issue the following: .. code-block:: javascript db.people.ensureIndex( { zipcode: 1}, {background: true} ) +By default, the ``background`` option is set to ``false``. + You can combine the background option with other options, as in the following: @@ -394,7 +592,7 @@ construction: .. versionchanged:: 2.2 Before 2.2, a single :program:`mongod` instance could only build - one index at a time. + one index at a time. - The indexing operation runs in the background so that other database operations can run while creating the index. However, the @@ -408,15 +606,23 @@ construction: larger than the available RAM, then the incremental process can take *much* longer than the foreground build. +- In some cases, not having an index at all can impact performance + almost as much as building an index can. If this is the case, we + recommend your application code check for the index at startup using + the chosen :method:`getIndexes() ` + :api:`method for your driver <>` and terminate if the index cannot be + found. You can explicitly invoke a separate indexing script when safe + to do so. + .. 
admonition:: Building Indexes on Secondaries Background index operations on a :term:`replica set` - :term:`primary`, become foreground indexing operations on secondary + :term:`primary` become foreground indexing operations on secondary members of the set. All indexing operations on secondaries block replication. To build large indexes on secondaries the best approach is to - restart one secondary at a time in "standalone" mode and build the + restart one secondary at a time in :term:`standalone` mode and build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other members of the set, and then build the index on the next secondary. When all the @@ -430,10 +636,10 @@ construction: See :ref:`index-building-replica-sets` for more information on this process. - Indexes on secondary members in "recovering" mode are always built - in the foreground to allow them to catch up as soon as possible. + Indexes on secondary members in "recovering" mode are always built in + the foreground to allow them to catch up as soon as possible. - .. todo:: create tutorials for the replica set reindexing + .. todo create tutorials for the replica set reindexing http://www.mongodb.org/display/DOCS/Building+indexes+with+replica+sets .. note:: @@ -444,10 +650,12 @@ construction: Queries will not use these indexes until the index build is complete. +.. index:: index; duplicates +.. index:: index; drop duplicates .. _index-creation-duplicate-dropping: -Duplicate Dropping -~~~~~~~~~~~~~~~~~~ +Drop Duplicates +~~~~~~~~~~~~~~~ MongoDB cannot create a :ref:`unique index ` on a field that has duplicate values. To force the creation of a unique @@ -479,12 +687,16 @@ field of the ``accounts`` collection, use a command in the following form: Specifying ``{ dropDups: true }`` will delete data from your database. Use with extreme caution. +By default, the ``dropDups`` option is set to ``false``. + .. _index-features: .. 
_index-feature: Index Features -------------- +.. index:: index; TTL index +.. index:: TTL index .. _index-feature-ttl: TTL Indexes @@ -498,7 +710,7 @@ persist in a database for a limited amount of time. These indexes have the following limitations: -- Compound indexes are *not* supported. +- :ref:`Compound indexes ` are *not* supported. - The indexed field **must** be a date :term:`type `. @@ -514,6 +726,8 @@ indexes to fulfill arbitrary queries. .. see:: :doc:`/tutorial/expire-data` +.. index:: index; geospatial +.. index:: geospatial index .. _index-feature-geospatial: Geospatial Indexes @@ -548,13 +762,17 @@ See the :operator:`$near`, and the database command :dbcommand:`geoNear` for more information on accessing geospatial data. -.. todo:: insert link to special /core/geospatial.txt documentation +.. todo insert link to special /core/geospatial.txt documentation on this topic. once that document exists. +.. index:: index; geohaystack index +.. index:: geohaystack index +.. _index-geohaystack-index: + Geohaystack Indexes ~~~~~~~~~~~~~~~~~~~ -.. todo:: update links in the following session as needed: +.. todo update links in the following session as needed: In addition to conventional :ref:`geospatial indexes `, MongoDB also provides a bucket-based @@ -575,17 +793,29 @@ indexes are not suited for finding the closest documents to a particular location, when the closest documents are far away compared to bucket size. -Index Limitations ------------------ +.. index:: index; limitations +.. _index-limitations: + +Index Behaviors and Limitations +------------------------------- -Be aware of the following current limitations of MongoDB's indexes: +Be aware of the following behaviors and limitations: + +- MongoDB indexes are case-sensitive. - A collection may have no more than :ref:`64 indexes `. - Index keys can be no larger than :ref:`1024 bytes `. + This includes the field value or values, the field name or names, and + the :term:`namespace`. 
Documents whose fields have values greater than + this size cannot be indexed. + + To query for documents that were too large to index, you can use a + command similar to the following: + + .. code-block:: javascript - This includes the field value or values, the field name or names, - and the :term:`namespace`. + db.myCollection.find({: }).hint({$natural: 1}) - The name of an index, including the :term:`namespace` must be shorter than :ref:`128 characters `. From 23e4b6ea4bb6a4c8a4156c568665950b498474c0 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Tue, 6 Nov 2012 22:03:10 -0500 Subject: [PATCH 2/4] DOCS-660 DOCS-659 review edits --- source/administration/indexes.txt | 10 +-- source/core/indexes.txt | 119 ++++++++++++++---------------- 2 files changed, 59 insertions(+), 70 deletions(-) diff --git a/source/administration/indexes.txt b/source/administration/indexes.txt index a94540d40ce..96ad39bfb1e 100644 --- a/source/administration/indexes.txt +++ b/source/administration/indexes.txt @@ -31,9 +31,9 @@ As the name suggests, :method:`ensureIndex() ` only creates an index if an index of the same specification does not already exist. -Once a collection is indexed on a key, queries match the specified key -are fast. If you query on a key for which there is no index, MongoDB -must go through each document checking the key's value. +All indexes support and optimize the performance for queries that select +on this field. For queries that cannot use an index, MongoDB must scan +all documents in a collection for documents that match the query. .. example:: @@ -228,8 +228,8 @@ documentation for additional index creation options. .. index:: index; list indexes .. 
_index-list-indexes-for-collection: -List a Collection's Indexes ---------------------------- +List all Indexes on a Collection +-------------------------------- To list a collection's indexes, use the :method:`db.collection.getIndexes()` method or a similar diff --git a/source/core/indexes.txt b/source/core/indexes.txt index c9080812b6a..215bfcb2a3e 100644 --- a/source/core/indexes.txt +++ b/source/core/indexes.txt @@ -35,7 +35,7 @@ MongoDB indexes have the following core features: It is important to think about the kinds of queries your application will need so that you can define relevant indexes. -- All MongoDB indexes use the ``B-tree`` data structure. The MongoDB +- All MongoDB indexes use the B-tree data structure. The MongoDB :term:`query optimizer` uses this structure to quickly return query results. @@ -45,20 +45,21 @@ MongoDB indexes have the following core features: with the best response time for each query type. You can override the query optimizer using the :method:`cursor.hint()` method. -- When an index "covers" a query, the database returns results more - quickly than for queries that have to scan many individual documents. An - index "covers" a query if the index keys store all the data - that the query must return. See :ref:`indexes-covered-queries` for - more information. +- An index "covers" a query if the index keys store all the data that + the query must return. When an index "covers" a query, the database + returns results more quickly than for queries that have to scan many + individual documents. See :ref:`indexes-covered-queries` for more + information. - Using queries with good index coverage reduces the number of full documents that MongoDB needs to store in memory, thus maximizing database performance and throughput. -- When you update a document, if the document fits in its previous - allocation area, only those indexes whose keys have changed are - updated. This improves performance. 
Note that if the document has - grown and must move, all index keys must then update. +- If an update does not change the size of a document or cause the + document to outgrow its allocated area, then MongoDB updates a given + index only if the indexed fields have changed. This improves + performance. Note that if the document has grown and must move, all + index keys must then update. .. index:: index; secondary index .. index:: secondary index @@ -70,9 +71,8 @@ Index Types ----------- This section describes types of indexes available in MongoDB. - -For all collections, MongoDB creates default the :ref:`_id index -`. This is the ``primary`` index. All other indexes +For all collections, MongoDB creates the default :ref:`_id index +`. This is the primary index. All other indexes in MongoDB are :term:`secondary indexes `. You can create secondary indexes on any field within any document or sub-document. You can also create secondary indexes multiple fields. @@ -139,15 +139,15 @@ is a 12-byte unique identifiers suitable for use as the value of an See the :ref:`release notes <2.2-id-indexes-capped-collections>` for more information. -.. todo fix the above when a full capped-collection page exists. +.. todo:: fix the above when a full capped-collection page exists. -.. index:: index; embedded documents -.. _index-embedded-documents: +.. index:: index; subdocuments +.. _index-subdocuments: -Indexes on Embedded Documents -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Indexes on Subdocuments +~~~~~~~~~~~~~~~~~~~~~~~ -Indexed fields may be of any type, including embedded documents. +Indexed fields may be of any type, including subdocuments. .. example:: @@ -171,13 +171,6 @@ Indexed fields may be of any type, including embedded documents. comparison is in ascending key order in the order the keys occur in the :term:`BSON` document. - The following query, however, would not match the document as the - order of the fields is significant: - - .. 
code-block:: javascript - - db.factories.find( { metro: { state: "NY" , city: "New York" } } ) - .. index:: index; embedded fields .. _index-embedded-fields: @@ -185,26 +178,24 @@ Indexes on Embedded Fields ~~~~~~~~~~~~~~~~~~~~~~~~~~ You can create indexes on fields that exist in sub-documents within your -collection. This differs from :ref:`index-embedded-documents` in that it +collection. This differs from :ref:`index-subdocuments` in that it doesn't index the document but instead "reaches in" and indexes a specific field. Introspecting sub-documents in this way is commonly called "dot notation." -.. example:: +Consider the collection ``people`` that holds documents that resemble +the following example document: - Consider the collection ``people`` that holds documents that resemble - the following example document: - - .. code-block:: javascript +.. code-block:: javascript - {"_id": ObjectId(...) - "name": "John Doe" - "address": { - "street": "Main" - "zipcode": 53511 - "state": "WI" - } - } + {"_id": ObjectId(...) + "name": "John Doe" + "address": { + "street": "Main" + "zipcode": 53511 + "state": "WI" + } + } You can create an index on the ``address.zipcode`` field, using the following specification: @@ -310,11 +301,10 @@ the following command: Multikey Indexes ~~~~~~~~~~~~~~~~ -If you index a field that contains an array, MongoDB indexes each value -in the field separately. This is called a "multikey" index. A multikey -index records separate entries for every value in array field. A -multikey index can index only one array field. MongoDB does not let you -create an index that includes two or more array fields. +If you index a field that contains an array, MongoDB indexes each +value in the array separately, in a "multikey index." While you can +create multikey :ref:`compound indexes `, you +cannot include two array fields in a compound index. .. 
example:: @@ -405,7 +395,7 @@ The problem with indexing parallel arrays is that each value in the cartesian product of the compound keys would have to be indexed, which can get out of hand very quickly. -.. todo add a sentence and link: +.. todo:: add a sentence and link: For procedures for querying on values in an array, see LINK???. Querying on Multiple Values in a Given Array @@ -439,12 +429,12 @@ multikey index records each array value separately, so querying on an exact array match will not use the multikey index but will instead query through every document in the collection. -Using Multikeys to Simulate a Large Number of Indexes -````````````````````````````````````````````````````` +Using a Multikey Index for Attribute Lookups +```````````````````````````````````````````` -One way to work with data that has a high degree of options for -query-ability is to use the multikey indexing feature where the keys are -objects. For example, the following document allows you to add an +For simple attribute lookups, an effective indexing strategy can be to +index a field that contains an array of documents. For example, the +``attrib`` field in the following document allows you to add an unlimited number of attributes types: .. code-block:: javascript @@ -458,16 +448,15 @@ unlimited number of attributes types: ] } -You could, for example, perform the following disparate queries, both -using the multikey index: +The following queries would *both* use the multikey index: .. code-block:: javascript db.mycollection.find( { attribs: { color: "blue" } } ) db.mycollection.find( { attribs: { avail: false } } ) -This approach is helpful for for simple attribute lookups. The approach -is not necessary helpful for sorting or certain other query types. +Use this kind of indexing strategy for simple attribute lookups rather +than sorted query results or range queries. .. index:: index; unique .. 
_index-type-unique: @@ -606,13 +595,13 @@ construction: larger than the available RAM, then the incremental process can take *much* longer than the foreground build. -- In some cases, not having an index at all can impact performance - almost as much as building an index can. If this is the case, we - recommend your application code check for the index at startup using - the chosen :method:`getIndexes() ` - :api:`method for your driver <>` and terminate if the index cannot be - found. You can explicitly invoke a separate indexing script when safe - to do so. +- In some cases, not having an index can impact performance in that the + index must be built when the application starts. If this is a concern, + we recommend your application code check for the index at startup + using the :method:`getIndexes() ` method + or the :api:`equivalent method for your driver <>` and terminate if + the index cannot be found. You can explicitly invoke a separate + indexing script when safe to do so. .. admonition:: Building Indexes on Secondaries @@ -639,7 +628,7 @@ construction: Indexes on secondary members in "recovering" mode are always built in the foreground to allow them to catch up as soon as possible. - .. todo create tutorials for the replica set reindexing + .. todo:: create tutorials for the replica set reindexing http://www.mongodb.org/display/DOCS/Building+indexes+with+replica+sets .. note:: @@ -762,7 +751,7 @@ See the :operator:`$near`, and the database command :dbcommand:`geoNear` for more information on accessing geospatial data. -.. todo insert link to special /core/geospatial.txt documentation +.. todo:: insert link to special /core/geospatial.txt documentation on this topic. once that document exists. .. index:: index; geohaystack index @@ -772,7 +761,7 @@ data. Geohaystack Indexes ~~~~~~~~~~~~~~~~~~~ -.. todo update links in the following session as needed: +.. 
todo:: update links in the following session as needed: In addition to conventional :ref:`geospatial indexes `, MongoDB also provides a bucket-based From ba46e8d450b924c2acd646cb355f2c3a6f324c3a Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Tue, 13 Nov 2012 14:50:35 -0500 Subject: [PATCH 3/4] DOCS-660 DOCS-659 review edits --- source/administration/indexes.txt | 10 ++++---- source/core/indexes.txt | 39 +++++++++++-------------------- 2 files changed, 18 insertions(+), 31 deletions(-) diff --git a/source/administration/indexes.txt b/source/administration/indexes.txt index 96ad39bfb1e..cd114acd218 100644 --- a/source/administration/indexes.txt +++ b/source/administration/indexes.txt @@ -50,7 +50,7 @@ all documents in a collection for documents that match the query. db.things.find( { y: 2 } ) -If your collection is large, build the database in the background, as +If your collection is large, build the index in the background, as described in :ref:`index-creation-background`. If you build in the background on a live replica set, see also :ref:`index-build-on-replica-sets`. @@ -86,7 +86,7 @@ resulting index. .. index:: index; options .. _index-special-creation-options: -If your collection is large, build the database in the background, as +If your collection is large, build the index in the background, as described in :ref:`index-creation-background`. If you build in the background on a live replica set, see also :ref:`index-build-on-replica-sets`. @@ -152,7 +152,7 @@ records for the same legal entity: db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } ) -The :ref:`_id index ` is a unique index. In some +The :ref:`_id index ` is a unique index. In some situations you may consider using ``_id`` field itself for this kind of data rather than using a unique index on another field. @@ -277,7 +277,7 @@ the operation: Where the value of ``nIndexesWas`` reflects the number of indexes *before* removing this index. 
You can also use the :method:`db.collection.dropIndexes()` to remove *all* indexes, except -for the :ref:`_id index ` from a collection. +for the :ref:`_id index ` from a collection. These shell helpers provide wrappers around the :dbcommand:`dropIndexes` :term:`database command`. Your :doc:`client @@ -292,7 +292,7 @@ Rebuild Indexes If you need to rebuild indexes for a collection you can use the :method:`db.collection.reIndex()` method. This will drop all indexes, -including the :ref:`_id index `, and then rebuild +including the :ref:`_id index `, and then rebuild all indexes. The operation takes the following form: .. code-block:: javascript diff --git a/source/core/indexes.txt b/source/core/indexes.txt index 215bfcb2a3e..33dacf0b9e7 100644 --- a/source/core/indexes.txt +++ b/source/core/indexes.txt @@ -61,23 +61,18 @@ MongoDB indexes have the following core features: performance. Note that if the document has grown and must move, all index keys must then update. -.. index:: index; secondary index -.. index:: secondary index .. index:: index types .. _index-types: -.. _index-types-secondary: Index Types ----------- This section describes types of indexes available in MongoDB. For all collections, MongoDB creates the default :ref:`_id index -`. This is the primary index. All other indexes -in MongoDB are :term:`secondary indexes `. You can -create secondary indexes on any field within any document or -sub-document. You can also create secondary indexes multiple fields. -This section describes both the :ref:`_id index ` -and the various secondary indexes. +`. +You can +create additional indexes on any field within any document or +sub-document. You can also create indexes on multiple fields. In general, you should create indexes that support your primary, common, and user-facing queries. Doing so requires MongoDB to scan the fewest @@ -91,8 +86,8 @@ resemble the following: .. 
code-block:: javascript { "field": 1 } - { "field0.field1": 1 } - { "field0": 1, "field1": 1 } + { "product.quantity": 1 } + { "product": 1, "quantity": 1 } For each field in the index you will specify either ``1`` for an ascending order or ``-1`` for a descending order, which represents the @@ -104,7 +99,7 @@ important. .. index:: _id .. index:: index; _id .. index:: index types; primary key -.. _index-type-primary: +.. _index-type-id: _id Index ~~~~~~~~~ @@ -250,7 +245,7 @@ not support queries that select the following: When creating an index, the number associated with a key specifies the direction of the index. The options are ``1`` (ascending) and ``-1`` (descending). Direction doesn't matter for single key indexes or for -random access retrieval but is important if you are doing sorts or range +random access retrieval but is important if you are doing sort queries on compound indexes. The order of fields in a compound index is very important. In the @@ -283,7 +278,7 @@ following prototype: .. code-block:: javascript - db.products.ensureIndex( { "field0": 1, "field1": -1 } ) + db.products.ensureIndex( { "fieldA": 1, "fieldB": -1 } ) Consider a collection of event data that includes both usernames and a timestamp. If you want to return a list of events sorted by username @@ -392,8 +387,8 @@ you "cannot index parallel arrays": {a: [1, 2], b: [1, 2]} The problem with indexing parallel arrays is that each value in the -cartesian product of the compound keys would have to be indexed, which -can get out of hand very quickly. +cartesian product of the compound keys would have to be indexed, +which could quickly result in very large indexes that cannot contain all matching documents. .. todo:: add a sentence and link: For procedures for querying on values in an array, see LINK???. 
@@ -421,14 +416,6 @@ You can query for a match on multiple items in an array by using the db.weather.find( { tags: { $all: ["hot", "humid" ] } } ) -Exact Array Matching -```````````````````` - -You can *cannot* perform exact array matching with a multikey index. A -multikey index records each array value separately, so querying on -an exact array match will not use the multikey index but will instead -query through every document in the collection. - Using a Multikey Index for Attribute Lookups ```````````````````````````````````````````` @@ -709,8 +696,8 @@ These indexes have the following limitations: .. include:: /includes/note-ttl-collection-background-timing.rst -In all other respects, TTL indexes are normal :ref:`secondary indexes -`, and if appropriate, MongoDB can use these +In all other respects, TTL indexes are normal indexes, +and if appropriate, MongoDB can use these indexes to fulfill arbitrary queries. .. see:: :doc:`/tutorial/expire-data` From 81a0e241d1deb55e6faf63bf721269a09aa06017 Mon Sep 17 00:00:00 2001 From: Bob Grabar Date: Tue, 13 Nov 2012 16:56:07 -0500 Subject: [PATCH 4/4] DOCS-660 DOCS-659 re-added array matching --- source/core/indexes.txt | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/source/core/indexes.txt b/source/core/indexes.txt index 33dacf0b9e7..a4d89ba3bfb 100644 --- a/source/core/indexes.txt +++ b/source/core/indexes.txt @@ -393,6 +393,12 @@ which could quickly result in very large indexes that cannot contain all matchin .. todo:: add a sentence and link: For procedures for querying on values in an array, see LINK???. +Exact Array Matching +```````````````````` + +An index can be used to look up an exact array match. It does so by +looking up the first element within the array in the index. + Querying on Multiple Values in a Given Array ````````````````````````````````````````````