wiki Gotchas page migration and fix build error #607

Merged
merged 1 commit into from
Feb 1, 2013
222 changes: 222 additions & 0 deletions source/administration/production-notes.txt
@@ -252,3 +252,225 @@ bwm-ng
command-line tool for monitoring network use. If you suspect a
network-based bottleneck, you may use ``bwm-ng`` to begin your
diagnostic process.

.. _gotchas:

Production Checklist
--------------------

64-bit Builds for Production
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Always use 64-bit builds in production. MongoDB uses memory-mapped
files, so the limited address space of a 32-bit process restricts how
much data a 32-bit build can store. See the :ref:`32-bit limitations
<faq-32-bit-limitations>` for more information.

32-bit builds exist to support development machines and other
miscellaneous uses, such as replica set arbiters.

BSON Document Size Limit
~~~~~~~~~~~~~~~~~~~~~~~~

MongoDB enforces a :limit:`BSON Document Size` limit, which at the time
of this writing is 16MB per document. To store larger objects, use
:doc:`GridFS </applications/gridfs/>` instead.

Set Appropriate Write Concern for Write Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See :ref:`write concern <write-concern>` for more information.

.. MongoDB requires an explicit request for acknowledgement of the
   results of a write. This is the getLastError command. If you don't call
   it at all, that is likely bad -- unless you are doing so with intent.
   The intent is to provide a way to do batch operations without
   continuous client/server turnarounds on every write of a batch of say,
   a million. The drivers support automatically calling this if you
   indicate a "write concern" for your connection to the database.

.. For example, if you try to insert a document above the BSON size limit
   indicated in the above section, [getLastError|DOCS:getLastError
   Command]/[writeconcern|http://docs.mongodb.org/manual/applications/replication/#replica-set-write-concern]
   would return an error -- *if* you ask for those.

.. While this can be very useful when used appropriately, it is
   acknowledged this can be confusing at first as this is an untraditional
   pattern.

.. See also the getLastError/WriteConcern parameters -- particularly 'j'
   and 'w' parameters.

Dynamic Schema
~~~~~~~~~~~~~~

Data in MongoDB has a *flexible schema*. :term:`Collections
<collection>` do not enforce :term:`document` structure. This makes
iterative development and polymorphism much easier. In practice,
however, the documents in a collection often share a highly homogeneous
structure. See :doc:`/core/data-modeling` for more information.

Some operational considerations include:

- the exact set of collections to be used

- the indexes to be used, which are created explicitly except for the
  ``_id`` index

- shard key declarations, which are explicit and important to get
  right, because shard keys are hard to change later

A simple rule of thumb is to avoid importing data verbatim from a
relational database. You will generally want to "roll up" related data
into richer documents that embed nested documents and arrays, including
arrays of subdocuments.
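
As a rough illustration of "rolling up" relational rows, the following
sketch embeds the rows of a hypothetical ``orders`` table into the
matching customer documents. The table layout, the field names, and the
``rollUpOrders`` helper are assumptions made for this example; they are
not part of MongoDB:

```javascript
// Embed each customer's order rows into the customer document as an
// array of subdocuments. All table and field names are hypothetical.
function rollUpOrders(customerRows, orderRows) {
  return customerRows.map(function (customer) {
    var orders = orderRows
      .filter(function (row) { return row.customer_id === customer.id; })
      .map(function (row) {
        // Drop the foreign key: the enclosing document now implies it.
        return { sku: row.sku, qty: row.qty };
      });
    return { _id: customer.id, name: customer.name, orders: orders };
  });
}

var customerRows = [{ id: 1, name: "Ada" }, { id: 2, name: "Lin" }];
var orderRows = [
  { customer_id: 1, sku: "A-100", qty: 2 },
  { customer_id: 1, sku: "B-200", qty: 1 },
  { customer_id: 2, sku: "A-100", qty: 5 }
];
var documents = rollUpOrders(customerRows, orderRows);
```

Each element of ``documents`` could then be inserted into a single
collection, so that reading a customer together with its orders takes
one query rather than a join.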

Updates by Default Affect Only **one** Document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, :method:`update <db.collection.update>` modifies only the
first document that matches the query criteria. To update every
matching document, set the ``multi`` parameter to ``true``. The
:program:`mongo` shell syntax is:

.. code-block:: javascript

   db.my_collection_name.update(my_query, my_update_expression, bool_upsert, bool_multi)

Set ``bool_multi`` to ``true`` when updating many documents. Otherwise,
only the first matching document is updated.
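
For instance, a multi-document update could be wrapped as in the sketch
below. The ``users`` collection, its fields, and the
``deactivateTrialUsers`` helper are hypothetical; ``db`` is assumed to
be a database handle such as the one the :program:`mongo` shell
provides:

```javascript
// Mark every trial user as inactive.
// The positional arguments follow the shell syntax:
// update(query, update, upsert, multi).
function deactivateTrialUsers(db) {
  var query = { status: "trial" };
  var change = { $set: { active: false } };
  var upsert = false; // do not insert a document when nothing matches
  var multi = true;   // update all matching documents, not just the first
  return db.users.update(query, change, upsert, multi);
}
```

In the shell, ``deactivateTrialUsers(db)`` would run the update against
the current database.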

Case Sensitive Strings
~~~~~~~~~~~~~~~~~~~~~~

String comparisons in MongoDB are case sensitive, so a query for
``"joe"`` does not match ``"Joe"``.

Consider:

- storing data in a normalized case format,

- using regular expressions with the case-insensitive ``i`` flag, as in
  ``/joe/i``, or

- using :doc:`$toLower </reference/aggregation/toLower/>` or
  :doc:`$toUpper </reference/aggregation/toUpper/>` in the
  :doc:`aggregation framework </applications/aggregation/>`
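
A minimal sketch of the regular-expression option, assuming a
hypothetical ``name`` field. The ``caseInsensitiveExact`` helper is not
a MongoDB function; it only builds a query document that can be passed
to ``find()``:

```javascript
// Build a query that matches `value` exactly, ignoring case.
// Regex metacharacters in the input are escaped so they match literally.
function caseInsensitiveExact(field, value) {
  var escaped = value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var query = {};
  query[field] = new RegExp("^" + escaped + "$", "i");
  return query;
}

var nameQuery = caseInsensitiveExact("name", "joe");
// In the mongo shell: db.users.find(nameQuery)
```

For frequently queried fields, storing a normalized (for example,
lowercased) copy of the value is usually the better choice, because
case-insensitive regular expressions typically cannot make efficient
use of an index.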

Type Sensitive Fields
~~~~~~~~~~~~~~~~~~~~~

MongoDB stores data in :meta-driver:`BSON </legacy/bson/>`, a
JSON-style format with several distinct data types. Queries match on
type as well as value.

Consider the following document, which has a field ``x`` with the
*string* value ``"123"``:

.. code-block:: javascript

   { x : "123" }

The following query, which looks for the *number* value ``123``, does
**not** return that document:

.. code-block:: javascript

   db.mycollection.find( { x : 123 } )
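
If a field may hold either representation, one option is to query for
both types explicitly with the ``$in`` operator. The sketch below
assumes a field that stores integers either as numbers or as strings;
``eitherType`` is a hypothetical helper, not a MongoDB function:

```javascript
// Build a query that matches a value stored either as a number or as
// a string.
function eitherType(field, value) {
  var query = {};
  query[field] = { $in: [Number(value), String(value)] };
  return query;
}

var typeQuery = eitherType("x", "123");
// typeQuery is { x: { $in: [123, "123"] } }
// In the mongo shell: db.mycollection.find(typeQuery)
```

Converting the stored values to a single type is usually cleaner than
querying for both.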

Locking
~~~~~~~

Versions of MongoDB before 2.2 use a single "global lock" for each
:program:`mongod` instance. Use MongoDB 2.2 or later for better
concurrency. See the :doc:`Concurrency </faq/concurrency/>` page for
more information.

Packages
~~~~~~~~

If you install MongoDB with a package manager, make sure that the
package provides the latest stable release. The Downloads page lists
the current release, so check it even if you then install through a
package manager.

Use an Odd Number of Replica Set Members
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:doc:`Replica sets </replication/>` elect a primary by consensus, and
an election requires a majority of votes. Use either an odd number of
members (e.g., three) or an arbiter to bring the total up to an odd
number of votes.
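
The reasoning behind the odd-number rule is simple arithmetic: an
election needs a strict majority of votes, so an even-sized set
tolerates no more failures than the next smaller odd-sized set. A small
sketch, with hypothetical helper names:

```javascript
// Votes needed to elect a primary: a strict majority of all votes.
function majority(votes) {
  return Math.floor(votes / 2) + 1;
}

// Number of voting members that can be unavailable while the remaining
// members can still elect a primary.
function faultTolerance(votes) {
  return votes - majority(votes);
}

[2, 3, 4, 5].forEach(function (votes) {
  console.log(votes + " votes: majority " + majority(votes) +
              ", tolerates " + faultTolerance(votes) + " failure(s)");
});
```

Three and four voting members both tolerate one failure, so a fourth
voting member adds cost without adding availability.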

Don't Disable Journaling
~~~~~~~~~~~~~~~~~~~~~~~~

See :doc:`Journaling </administration/journaling/>` for more information.

Keep Replica Set Members Up-to-Date
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

MongoDB replica sets support automatic failover, so a secondary may
become primary at any time. Keep your secondaries up-to-date. You have
a few options:

1. Monitor replication lag and alert on it. You can do this in several
   ways; for example, MMS shows a graph of replica set lag.

#. Use :ref:`getLastError <replica-set-write-concern>` with
   ``w:'majority'``. If a majority of the set is lagging, the call
   times out or does not return. This guards against lag and also
   reports when it occurs.

#. If you prefer to fail over manually, set your secondaries to
   ``priority:0`` in their configuration. A failover then requires
   manual action. This is practical for a small cluster; for a large
   cluster you will want automation.

See also the information on :ref:`replica set rollbacks
<replica-set-rollback>`.
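
The ``getLastError`` option might look like the following sketch. It
assumes that ``db`` is a database handle such as the one in the
:program:`mongo` shell, passes the ``wtimeout`` option so that the call
returns instead of blocking indefinitely, and treats a ``wtimeout``
field in the reply as the sign of a lagging majority. The helper name
and the five-second default are assumptions made for illustration:

```javascript
// Ask the server to confirm that the last write on this connection has
// replicated to a majority of the replica set within `timeoutMs`.
function lastWriteReachedMajority(db, timeoutMs) {
  var reply = db.runCommand({
    getLastError: 1,
    w: "majority",
    wtimeout: timeoutMs || 5000 // milliseconds; assumed default
  });
  // Assumption: a timed-out reply carries `wtimeout: true`, and a
  // non-null `err` reports any other failure.
  return !reply.wtimeout && reply.err == null;
}
```

A monitoring script could call this helper after a test write and raise
an alert when it returns ``false``.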

Additional Deployment Checklist
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Pick your shard keys carefully. You cannot change a shard key except
  manually, by creating a new collection.

- You cannot shard an existing collection larger than 256 gigabytes.
  This limit will eventually be removed but applies today. As a
  workaround, you can create a new sharded collection and copy the data
  over with a script, although that will likely take a while at this
  size.

- Unique indexes are not enforced across shards except for the shard
  key itself.

- Consider :doc:`pre-splitting </administration/sharded-clusters>` a
  sharded collection before a massive bulk import. Pre-splitting is
  usually unnecessary, but it helps with very large imports.

- Use :doc:`security/auth </administration/security/>` mode if you
  need it. It is off by default, because MongoDB assumes a trusted
  environment.

- MongoDB does not provide :doc:`fully generalized transactions
  </tutorial/isolate-sequence-of-operations/>`. Create rich documents,
  read the linked page, and consider your use case; there is often a
  good fit.

- Disable NUMA; MongoDB generally works better without it. On a NUMA
  machine, :program:`mongod` usually prints an informational warning
  about this at startup.

- Avoid excessive prefetch/readahead on the filesystem. Check your
  prefetch settings. On Linux, the parameter is in *sectors*, not
  bytes. A setting of 64 sectors (32 kilobytes) is reasonable.

- Check your :doc:`ulimits </administration/ulimit/>` settings.

- Use SSDs if available and economical. Spinning disks can work well,
  but SSDs and MongoDB are a good combination. See
  :ref:`production-nfs` for more information.

- Maintain reasonable connection pool sizes on clients. If you have 100
  client machines and each has a pool size of 1000 connections, the
  worst case is 100,000 connections to MongoDB. That is too many for a
  single :program:`mongod` or :program:`mongos`, although it might be
  acceptable across a cluster with many :program:`mongos` instances.
  As a rough guide, 1K connections is fine, 10K is acceptable, and
  100K is too many.
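
The arithmetic behind the readahead and connection pool items in this
checklist can be verified with a few lines of code. The 512-byte sector
size is an assumption that matches the 64-sector, 32-kilobyte figure
given for readahead; the function names are hypothetical:

```javascript
var SECTOR_BYTES = 512; // assumed sector size

// Convert a readahead setting in sectors to kilobytes.
function readaheadKilobytes(sectors) {
  return (sectors * SECTOR_BYTES) / 1024;
}

// Worst-case number of connections if every client fills its pool.
function worstCaseConnections(clientMachines, poolSize) {
  return clientMachines * poolSize;
}

console.log(readaheadKilobytes(64));          // 32
console.log(worstCaseConnections(100, 1000)); // 100000
```

Running the same calculation with your own client count and pool size
shows quickly whether a deployment stays within a reasonable number of
connections.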

.. `rsmith`: http://rsmith.co/2012/11/05/mongodb-gotchas-and-how-to-avoid-them
19 changes: 13 additions & 6 deletions source/tutorial/model-tree-structures-with-materialized-paths.txt
@@ -55,17 +55,24 @@ delimiter:

db.categories.find( { path: /,Programming,/ } )

- You can also look for descendants of ``Books`` where Books must be at the topmost level
of the hierarchy as follows:
- You can also retrieve the descendants of ``Books`` where the
``Books`` is also at the topmost level of the hierarchy:

.. code-block:: javascript

db.categories.find( { path: /^,Books,/ } )

- Creating an index on field ``path`` would be somewhat useful. For the ^,Books, example
it helps a lot. For the other case, where the node to be found might be in the middle of the indexed
string, the entire index would be inspected; this might be somewhat helpful as the index might be
significantly smaller than the entire collection.
- An index on the field ``path`` may improve performance, depending on
the query:

- For the ``/^,Books,/`` example, the index on the ``path`` improves
the query performance significantly.

- For the ``/,Programming,/`` example, or similar queries where the
node might be in the middle of the indexed string, the query
inspects the entire index. In this case, the index *may* provide
some performance improvement *if* the index is significantly
smaller than the entire collection.

.. code-block:: javascript
