diff --git a/source/administration/production-notes.txt b/source/administration/production-notes.txt
index 49d45bd3fc8..a10199633f6 100644
--- a/source/administration/production-notes.txt
+++ b/source/administration/production-notes.txt
@@ -252,3 +252,225 @@
 bwm-ng command-line tool for monitoring network use. If you suspect a
 network-based bottleneck, you may use ``bwm-ng`` to begin your
 diagnostic process.
+
+.. _gotchas:
+
+Production Checklist
+--------------------
+
+64-bit Builds for Production
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Always use 64-bit builds for production. MongoDB uses memory-mapped
+files. See the :ref:`32-bit limitations ` for more information.
+
+32-bit builds exist to support development machines and other
+miscellaneous purposes such as replica set arbiters.
+
+BSON Document Size Limit
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+There is a :limit:`BSON Document Size` limit -- at the time of this
+writing, 16MB per document. If you have large objects, use :doc:`GridFS
+` instead.
+
+Set Appropriate Write Concern for Write Operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+See :ref:`write concern ` for more information.
+
+.. MongoDB requires an explicit request for acknowledgement of the
+   results of a write. This is the getLastError command. If you don't
+   call it at all, that is likely bad -- unless you are doing so with
+   intent. The intent is to provide a way to do batch operations
+   without continuous client/server turnarounds on every write of a
+   batch of, say, a million. The drivers support calling it
+   automatically if you indicate a "write concern" for your connection
+   to the database.
+
+.. For example, if you try to insert a document above the BSON size
+   limit indicated in the preceding section, getLastError / write
+   concern
+   (http://docs.mongodb.org/manual/applications/replication/#replica-set-write-concern)
+   would return an error -- *if* you ask for it.
+
+.. While this can be very useful when used appropriately, this is an
+   untraditional pattern and can be confusing at first.
+
+.. See also the getLastError/WriteConcern parameters -- particularly
+   the 'j' and 'w' parameters.
+
+Dynamic Schema
+~~~~~~~~~~~~~~
+
+Data in MongoDB has a *flexible schema*. :term:`Collections
+` do not enforce :term:`document` structure. This makes iterative
+development and polymorphism much easier. However, it is not unusual
+for collections to have highly homogeneous document structures within
+them. See :doc:`/core/data-modeling` for more information.
+
+Some operational considerations include:
+
+- the exact set of collections to be used
+
+- the indexes to be used, which are created explicitly except for the
+  ``_id`` index
+
+- shard key declarations, which are explicit and quite important, as
+  shard keys are hard to change later
+
+One simple rule of thumb is to not import data from a relational
+database unmodified: you will generally want to "roll up" certain data
+into richer documents that use some embedding of nested documents and
+arrays (and/or arrays of subdocuments).
+
+Updates by Default Affect Only **one** Document
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Set the ``multi`` parameter to ``true`` to :method:`update
+` multiple documents that meet the query criteria. The
+:program:`mongo` shell syntax is:
+
+.. code-block:: javascript
+
+   db.my_collection_name.update(my_query, my_update_expression, bool_upsert, bool_multi)
+
+Set ``bool_multi`` to ``true`` when updating many documents. Otherwise
+only the first matching document is updated!
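+
+For example, the following sketch sets a flag on *every* matching
+document rather than only the first. The collection and field names
+(``inventory``, ``status``, ``reviewed``) are hypothetical:
+
+.. code-block:: javascript
+
+   db.inventory.update(
+      { status : "A" },                // my_query: the match criteria
+      { $set : { reviewed : true } },  // my_update_expression
+      false,                           // bool_upsert: do not insert if nothing matches
+      true                             // bool_multi: update all matching documents
+   )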
+
+Case Sensitive Strings
+~~~~~~~~~~~~~~~~~~~~~~
+
+MongoDB strings are case sensitive, so a search for ``"joe"`` will not
+find ``"Joe"``.
+
+Consider:
+
+- storing data in a normalized case format, or
+
+- using regular expressions ending with ``/i``
+
+- and/or using :doc:`$toLower ` or :doc:`$toUpper ` in the
+  :doc:`aggregation framework `
+
+Type Sensitive Fields
+~~~~~~~~~~~~~~~~~~~~~
+
+MongoDB data -- which is JSON-style, specifically :meta-driver:`BSON `
+format -- has several data types.
+
+Consider the following document, which has a field ``x`` with the
+*string* value ``"123"``:
+
+.. code-block:: javascript
+
+   { x : "123" }
+
+Then the following query, which looks for the *number* value ``123``,
+will **not** return that document:
+
+.. code-block:: javascript
+
+   db.mycollection.find( { x : 123 } )
+
+Locking
+~~~~~~~
+
+Older versions of MongoDB used a "global lock"; use MongoDB v2.2+ for
+better results. See the :doc:`Concurrency ` page for more information.
+
+Packages
+~~~~~~~~
+
+Be sure you have the latest stable release if you are using a package
+manager. You can see what is current on the Downloads page, even if you
+then choose to install via a package manager.
+
+Use Odd Number of Replica Set Members
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+:doc:`Replica sets ` perform consensus elections. Use either an odd
+number of members (e.g., three) or an arbiter to get up to an odd
+number of votes.
+
+Don't Disable Journaling
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+See :doc:`Journaling ` for more information.
+
+Keep Replica Set Members Up-to-Date
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This is important as MongoDB replica sets support automatic failover.
+Thus you want your secondaries to be up-to-date. You have a few options
+here:
+
+1. Monitor and alert on any lag; this can be done via various means.
+   MMS shows a graph of replica set lag.
+
+#. Use :ref:`getLastError ` with ``w:'majority'``; you will get a
+   timeout or no return if a majority of the set is lagging. This is
+   another way to guard against lag and get some reporting back of its
+   occurrence.
+
+#. Or, if you want to fail over manually, set your secondaries to
+   ``priority:0`` in their configuration so that manual action is
+   required for a failover (see the sketch below). This is practical
+   for a small cluster; for a large cluster you will want automation.
+
+Additionally, see information on :ref:`replica set rollbacks `.
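+
+As a sketch of the third option, the following :program:`mongo` shell
+commands, run against the primary, demote one secondary so that it is
+never elected automatically. The member index ``2`` is hypothetical;
+choose the member you intend to demote:
+
+.. code-block:: javascript
+
+   // Fetch the current replica set configuration, set one member's
+   // priority to 0, and apply the new configuration.
+   cfg = rs.conf()
+   cfg.members[2].priority = 0
+   rs.reconfig(cfg)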
+
+Additional Deployment Checklist
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- Pick your shard keys carefully! They cannot be changed except
+  manually, by making a new collection.
+
+- You cannot shard an existing collection over 256GB. This limit will
+  eventually be removed, but it exists in the system today. As a
+  workaround, you could create a new sharded collection and copy over
+  the data via some script -- although at this size that will likely
+  take a while.
+
+- Unique indexes are not enforced across shards, except for the shard
+  key itself.
+
+- Consider :doc:`pre-splitting ` a sharded collection before a massive
+  bulk import. Usually this isn't necessary, but for a bulk import of
+  size it is helpful.
+
+- Use :doc:`security/auth ` mode if you need it. It is not on by
+  default; rather, a trusted environment is assumed.
+
+- You do not have :doc:`fully generalized transactions `. Create rich
+  documents, read the preceding link, and consider the use case --
+  often there is a good fit.
+
+- Disable NUMA -- we find that works better. On a NUMA box,
+  :program:`mongod` usually prints an informational warning at startup.
+
+- Avoid excessive prefetch/readahead on the filesystem. Check your
+  prefetch settings. Note that on Linux the parameter is in *sectors*,
+  not bytes; 32KB (a setting of 64 sectors) is pretty reasonable.
+
+- Check :doc:`ulimits ` settings.
+
+- Use SSDs if available and economical. Spinning disks can work well,
+  but SSDs and MongoDB are a nice combination. See
+  :ref:`production-nfs` for more information.
+
+- Maintain reasonable pool sizes on clients. If you have 100 client
+  machines and each has a pool size of 1,000 connections, that is a
+  worst case of 100,000 connections to MongoDB -- too many for a single
+  :program:`mongod` or :program:`mongos` (though possibly OK for a
+  cluster with many :program:`mongos` instances). Try to maintain a
+  reasonable number: 1,000 is fine; 10,000 is OK; 100,000 is too many.
+
+.. `rsmith`: http://rsmith.co/2012/11/05/mongodb-gotchas-and-how-to-avoid-them
\ No newline at end of file
diff --git a/source/tutorial/model-tree-structures-with-materialized-paths.txt b/source/tutorial/model-tree-structures-with-materialized-paths.txt
index c3cd4d20670..5621c6ba4fb 100644
--- a/source/tutorial/model-tree-structures-with-materialized-paths.txt
+++ b/source/tutorial/model-tree-structures-with-materialized-paths.txt
@@ -55,17 +55,24 @@ delimiter:
 
    db.categories.find( { path: /,Programming,/ } )
 
-- You can also look for descendants of ``Books`` where Books must be at the topmost level
-of the hierarchy as follows:
+- You can also retrieve the descendants of ``Books`` where ``Books`` is
+  at the topmost level of the hierarchy:
 
 .. code-block:: javascript
 
    db.categories.find( { path: /^,Books,/ } )
 
-- Creating an index on field ``path`` would be somewhat useful. For the ^,Books, example
-it helps a lot. For the other case, where the node to be found might be in the middle of the indexed
-string, the entire index would be inspected; this might be somewhat helpful as the index might be
-significantly smaller than the entire collection.
+- An index on the field ``path`` may improve performance, depending on
+  the query:
+
+  - For the ``/^,Books,/`` example, the index on ``path`` improves the
+    query performance significantly.
+
+  - For the ``/,Programming,/`` example, or similar queries where the
+    node might be in the middle of the indexed string, the query
+    inspects the entire index. In this case, the index *may* provide
+    some performance improvement *if* the index is significantly
+    smaller than the entire collection.
 
 .. code-block:: javascript
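+
+   // A sketch, assuming the ``categories`` collection from the examples
+   // above: index the ``path`` field. The prefix-anchored query
+   // /^,Books,/ can then use the index; the unanchored /,Programming,/
+   // query still inspects the entire index.
+   db.categories.ensureIndex( { path: 1 } )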