diff --git a/source/administration/backups.txt b/source/administration/backups.txt index 2c94e11d31a..a063988016d 100644 --- a/source/administration/backups.txt +++ b/source/administration/backups.txt @@ -1,578 +1,55 @@ -================================= -Backup and Restoration Strategies -================================= +================================== +Backing Up the MongoDB Environment +================================== .. default-domain:: mongodb -This document provides an inventory of database backup strategies for -use with MongoDB. This document contains the following sections: +You should have an automated backup system for your MongoDB environment. +The backup system must capture data in a consistent and usable state and +in a way that can be reliably reproduced. The backup system must be +automated and run regularly. And the system must be regularly tested in +practical situations to ensure that your backups are functional. If you +cannot effectively restore your database from the backup, then your +backups are useless. You also should automate tests of backup +restoration. -- :ref:`backup-overview` and :ref:`backup-considerations` describe the - approaches for backing up your MongoDB environment. +This page describes the following: -- :ref:`block-level-backup` and :ref:`database-dumps` describe specific - strategies. +- :ref:`backup-considerations` -- :ref:`backups-with-sharding-and-replication` describes considerations - specific to :term:`replica sets ` and :term:`sharded - clusters `. - -.. _backup-overview: - -Backup Overview ---------------- - -Production systems should always have some consideration and strategy -for taking and restoring backups. The goal of a backup strategy is to -produce a full and consistent copy of the data that you can use to -bring up a new or replacement database instance. However, in some -cases, taking backups is difficult or impossible, given large data -volumes, distributed architectures, and data transmission speeds. - -Nevertheless, with MongoDB, there are two major approaches to backups: -[#replication-as-backups]_ - -- Using system-level tools, like disk image snapshots. - - See :ref:`block-level-backup`. - -- Using various capacities present in the :program:`mongodump` tool. - - See :ref:`database-dumps`. - -The methods described in this document operate by copying the data file -on the disk level. If your system does not provide functionality for -this kind of backup, see the section on :ref:`database-dumps`. - -Ensuring that the state captured by the backup is consistent and usable -is the primary challenge of producing backups of database systems. -Backups that you cannot produce reliably, or restore from feasibly are -worthless. - -As you develop your backup system, take into consideration the specific -features of your deployment, your use patterns, and your architecture. - -Because every environment is unique it's important to regularly test the -backups that you capture to ensure that your backup system is -practically, and not just theoretically, functional. - -.. [#replication-as-backups] In many situations increasing the amount - of replication provides useful assurances with data sets that are - difficult to restore in a timely manner. +- :ref:`backup-approaches` .. _backup-considerations: -Production Considerations for Backup Strategies ------------------------------------------------ +Backup Considerations +--------------------- -When evaluating a backup strategy for your MongoDB deployment consider +As you develop a backup strategy for your MongoDB deployment consider the following factors: - Geography. Ensure that you move some backups away from the your - primary database infrastructure. It's important to be able to - restore your database if you lose access to a system or site. + primary database infrastructure. - System errors. Ensure that your backups can survive situations where hardware failures or disk errors impact the integrity or availability of your backups. -- Production constraints. Backup operations themselves sometimes - require substantial system resources. It's important to consider the - time of the backup schedule relative to peak usage and maintenance - windows. +- Production constraints. Backup operations themselves sometimes require + substantial system resources. It is important to consider the time of + the backup schedule relative to peak usage and maintenance windows. -- System capabilities. Some of the block-level - snapshot tools requires special support on the operating-system or - infrastructure level. +- System capabilities. Some of the block-level snapshot tools requires + special support on the operating-system or infrastructure level. - Database configuration. :term:`Replication` and :term:`sharding - ` can affect the process, and impact of the backup implementation. + ` can affect the process and impact of the backup + implementation. See :ref:`sharded-cluster-backups` and + :ref:`replica-set-backups`. - Actual requirements. You may be able to save time, effort, and space by including only crucial data in the most frequent backups and backing up less crucial data less frequently. -With this information in hand you can begin to develop a backup plan -for your database. Remember that all backup plans must be: - -- Tested. If you cannot effectively restore your database from the - backup, then your backups are useless. Test backup restoration - regularly in practical situations to ensure that your backup system - provides value. - -- Automated. Database backups need to run regularly and - automatically. Also automate tests of backup restoration. - -.. _block-level-backup: - -Using Block Level Backup Methods --------------------------------- - -This section provides an overview of using disk/block level -snapshots (i.e. :term:`LVM` or storage appliance) to backup a MongoDB -instance. These tools make a quick block-level backup of the device -that holds MongoDB's data files. These methods complete quickly, work -reliably, and typically provide the easiest backup systems method to -implement. - -Snapshots work by creating pointers between the live data and a -special snapshot volume. These pointers are theoretically equivalent -to "hard links." As the working data diverges from the snapshot, -the snapshot process uses a copy-on-write strategy. As a result the snapshot -only stores modified data. - -After making the snapshot, you mount the snapshot image on your -file system and copy data from the snapshot. The resulting backup -contains a full copy of all data. - -Snapshots have the following limitations: - -- The database must be in a consistent or recoverable state when the - snapshot takes place. This means that all writes accepted by the - database need to be fully written to disk: either to the - :term:`journal` or to data files. - - If all writes are not on disk when the backup occurs, the backup - will not reflect these changes. If writes are *in progress* when the - backup occurs, the data files will reflect an inconsistent - state. With :term:`journaling ` all data-file states - resulting from in-progress writes are recoverable; without - journaling you must flush all pending writes to disk before - running the backup operation and must ensure that no writes occur during - the entire backup procedure. - - If you do use journaling, the journal must reside on the same volume - as the data. - -- Snapshots create an image of an entire disk image. Unless you need - to back up your entire system, consider isolating your MongoDB data - files, journal (if applicable), and configuration on one logical - disk that doesn't contain any other data. - - Alternately, store all MongoDB data files on a dedicated device - so that you can make backups without duplicating extraneous data. - -- Ensure that you copy data from snapshots and onto other systems to - ensure that data is safe from site failures. - -.. _backup-with-journaling: - -Backup With Journaling -~~~~~~~~~~~~~~~~~~~~~~ - -If your system has snapshot capability and your :program:`mongod` instance -has journaling enabled, then you can use any kind of file system or -volume/block level snapshot tool to create backups. - -.. warning:: - - .. versionchanged:: 1.9.2 - - Journaling is only enabled by default on 64-bit builds of - MongoDB. - - To enable journaling on all other builds, specify - :setting:`journal` = ``true`` in the configuration or use the - :option:`--journal ` run-time option for - :program:`mongod`. - -Many service providers provide a block-level backup service based on -disk image snapshots. If you manage your own infrastructure on a -Linux-based system, configure your system with :term:`LVM` to provide -your disk packages and provide snapshot capability. You can also use -LVM-based setups *within* a cloud/virtualized environment. - -.. note:: - - Running :term:`LVM` provides additional flexibility and enables the - possibility of using snapshots to back up MongoDB. - -If you use Amazon's EBS service in a software RAID 10 configuration, use -:term:`LVM` to capture a consistent disk image. Also, see the special -considerations described in :ref:`backup-amazon-software-raid`. - -The following sections provide an overview of a simple backup process -using :term:`LVM` on a Linux system. While the tools, commands, and paths may -be (slightly) different on your system the following steps provide a -high level overview of the backup operation. - -.. _lvm-backup-operation: - -Create Snapshot -``````````````` - -To create a snapshot with :term:`LVM`, issue a command, as root, in the -following format: - -.. code-block:: sh - - lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb - -This command creates an :term:`LVM` snapshot (with the ``--snapshot`` option) -named ``mdb-snap01`` of the ``mongodb`` volume in the ``vg0`` -volume group. - -This example creates a snapshot named ``mdb-snap01`` located at -``/dev/vg0/mdb-snap01``. The location and paths to your systems volume -groups and devices may vary slightly depending on your operating -system's :term:`LVM` configuration. - -The snapshot has a cap of at 100 megabytes, because of the parameter -``--size 100M``. This size does not reflect the total amount of the -data on the disk, but rather the quantity of differences between the -current state of ``/dev/vg0/mongodb`` and the creation of the snapshot -(i.e. ``/dev/vg0/mdb-snap01``.) - -.. warning:: - - Ensure that you create snapshots with enough space to account for - data growth, particularly for the period of time that it takes to copy - data out of the system or to a temporary image. - - If your snapshot runs out of space, the snapshot image - becomes unusable. Discard this logical volume and create another. - -The snapshot will exist when the command returns. You can restore -directly from the snapshot at any time or by creating a new logical -volume and restoring from this snapshot to the alternate image. - -While snapshots are great for creating high quality backups very -quickly, they are not ideal as a format for storing backup -data. Snapshots typically depend and reside on the same storage -infrastructure as the original disk images. Therefore, it's crucial -that you archive these snapshots and store them elsewhere. - -Archive Snapshots -````````````````` - -After creating a snapshot, mount the snapshot and move the data to -separate storage. Your system might try to compress the backup images as -you move the offline. Consider the following procedure to fully -archive the data from the snapshot: - -.. code-block:: sh - - umount /dev/vg0/mdb-snap01 - dd if=/dev/vg0/mdb-snap01 | tar -czf mdb-snap01.tar.gz - - -The above command sequence: - -- Ensures that the ``/dev/vg0/mdb-snap01`` device is not mounted. - -- Does a block level copy of the entire snapshot image using the ``dd`` - command, and compresses the result in a gzipped tar archive in the - current working directory. - - .. warning:: - - This command will create a large ``tar.gz`` file in your current - working directory. Make sure that you run this command in a file - system that has enough free space. - -.. _backup-restore-snapshot: - -Restore Snapshot -```````````````` - -To restore a backup created with the above method, use the following -procedure: - -.. code-block:: sh - - lvcreate --size 1G --name mdb-new vg0 - tar -xzf mdb-snap01.tar.gz | dd of=/dev/vg0/mdb-new - mount /dev/vg0/mdb-new /srv/mongodb - -The above sequence: - -- Creates a new logical volume named ``mdb-new``, in the ``/dev/vg0`` - volume group. The path to the new device will be ``/dev/vg0/mdb-new``. - - .. warning:: - - This volume will have a maximum size of 1 gigabyte. The original - file system must have had a total size of 1 gigabyte or smaller, or - else the restoration will fail. - - Change ``1G`` to your desired volume size. - -- Uncompresses and unarchives the ``mdb-snap01.tar.gz`` into the - ``mdb-new`` disk image. - -- Mounts the ``mdb-new`` disk image to the ``/srv/mongodb`` directory. - Modify the mount point to correspond to your MongoDB data file - location, or other location as needed. - -.. _backup-restore-from-snapshot: - -Restore Directly from a Snapshot -```````````````````````````````` - -To restore a backup without writing to a compressed ``tar`` archive, use -the following sequence: - -.. code-block:: sh - - umount /dev/vg0/mdb-snap01 - lvcreate --size 1G --name mdb-new vg0 - dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new - mount /dev/vg0/mdb-new /srv/mongodb - -Remote Backup Storage -````````````````````` - -You can implement off-system backups using the :ref:`combined process -` and SSH. - -This sequence is identical to procedures explained above, except that it -archives and compresses the backup on a remote system using SSH. - -Consider the following procedure: - -.. code-block:: sh - - umount /dev/vg0/mdb-snap01 - dd if=/dev/vg0/mdb-snap01 | ssh username@example.com tar -czf /opt/backup/mdb-snap01.tar.gz - lvcreate --size 1G --name mdb-new vg0 - ssh username@example.com tar -xzf /opt/backup/mdb-snap01.tar.gz | dd of=/dev/vg0/mdb-new - mount /dev/vg0/mdb-new /srv/mongodb - -.. _backup-without-journaling: - -Backup Without Journaling -~~~~~~~~~~~~~~~~~~~~~~~~~ - -If your :program:`mongod` instance does not run with journaling -enabled, or if your journal is on a separate volume, obtaining a -functional backup of a consistent state is more complicated. -As described in this section, you must flush all -writes to disk and lock the database to prevent writes during the -backup process. If you have a :term:`replica set` configuration, -then for your backup use a -:term:`secondary` which is not receiving reads (i.e. :term:`hidden -member`). - -1. To flush writes to disk and to "lock" the database (to prevent - further writes), issue the :method:`db.fsyncLock()` method in the - :program:`mongo` shell: - - .. code-block:: javascript - - db.fsyncLock(); - -#. Perform the backup operation described in :ref:`lvm-backup-operation`. - -#. To unlock the database after the snapshot has completed, use the - following command in the :program:`mongo` shell: - - .. code-block:: javascript - - db.fsyncUnlock(); - - .. note:: - - .. versionchanged:: 2.0 - MongoDB 2.0 added :method:`db.fsyncLock()` and - :method:`db.fsyncUnlock()` helpers to the :program:`mongo` - shell. Prior to this version, use the :dbcommand:`fsync` - command with the ``lock`` option, as follows: - - .. code-block:: javascript - - db.runCommand( { fsync: 1, lock: true } ); - db.runCommand( { fsync: 1, lock: false } ); - - .. include:: /includes/note-disable-profiling-fsynclock.rst - - .. include:: /includes/warning-fsync-lock-mongodump.rst - -.. _backup-amazon-software-raid: - -Amazon EBS in Software RAID 10 Configuration -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If your deployment depends on Amazon's Elastic Block Storage (EBS) -with RAID configured *within* your instance, it is impossible to get a -consistent state across all disks using the platform's snapshot -tool. As a result you may: - -- Flush all writes to disk and create a write lock to ensure - consistent state during the backup process. - - If you choose this option see :ref:`backup-without-journaling`. - -- Configure LVM to run and hold your MongoDB data files on top of the - RAID within your system. - - If you choose this option, perform the LVM backup operation described - in :ref:`lvm-backup-operation`. - -.. _database-dumps: - -Using Binary Database Dumps for Backups ---------------------------------------- - -This section describes the process for writing the entire contents of -your MongoDB instance to a file in a binary format. If -disk-level snapshots are not available, this approach -provides the best option for full system database backups. - -.. seealso:: - - The :doc:`/reference/mongodump` and :doc:`/reference/mongorestore` - documents contain complete documentation of these tools. If you - have questions about these tools not covered here, please refer to - these documents. - - If your system has disk level snapshot capabilities, consider the - backup methods described in :ref:`block-level-backup`. - -Database Dump with ``mongodump`` -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The :program:`mongodump` utility can perform a live backup of data or -can work against an inactive set of database files. -The :program:`mongodump` utility can create a dump for an entire -server/database/collection (or part of a collection using of query), -even when the database is running and active. If you run -:program:`mongodump` without any arguments, the command connects to -the local database instance (e.g. ``127.0.0.1`` or ``localhost``) and -creates a database backup named ``dump/`` in the current directory. - -.. include:: /includes/note-mongodump-compatibility-2.2.rst - -To limit the amount of data included in the database dump, you can -specify :option:`--database ` and -:option:`--collection ` as options to the -:program:`mongodump` command. For example: - -.. code-block:: sh - - mongodump --collection collection --db test - -This command creates a dump of the collection named ``collection`` -from the database ``test`` in a :file:`dump/` subdirectory of the current -working directory. - -Use the :option:`--oplog ` option with -:program:`mongodump` to collect the :term:`oplog` entries to build a -point-in-time snapshot of a database within a replica set. With :option:`--oplog -`, :program:`mongodump` copies all the data from -the source database as well as all of the :term:`oplog` entries from -the beginning of the backup procedure to until the backup procedure -completes. This backup procedure, in conjunction with -:option:`mongorestore --oplogReplay `, -allows you to restore a backup that reflects a consistent and specific -moment in time. - -If your MongoDB instance is not running, you can use the -:option:`--dbpath ` option to specify the -location to your MongoDB instance's database files. :program:`mongodump` -reads from the data files directly with this operation. This -locks the data directory to prevent conflicting writes. The -:program:`mongod` process must *not* be running or attached to these -data files when you run :program:`mongodump` in this -configuration. Consider the following example: - -.. code-block:: sh - - mongodump --dbpath /srv/mongodb - -Additionally, the :option:`--host ` and -:option:`--port ` options allow you to specify a -non-local host to connect to capture the dump. Consider the following -example: - -.. code-block:: sh - - mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/mongodump-2011-10-24 - -On any :program:`mongodump` command you may, as above, specify username -and password credentials to specify database authentication. - -.. _backup-restore-dump: - -Restore Database from Binary Dump with ``mongorestore`` -``````````````````````````````````````````````````````` - -The :program:`mongorestore` utility restores a binary backup created by -:program:`mongodump`. Consider the following example command: - -.. code-block:: sh - - mongorestore dump-2011-10-25/ - -Here, :program:`mongorestore` imports the database backup located in -the :file:`dump-2011-10-25` directory to the :program:`mongod` instance -running on the localhost interface. By default, :program:`mongorestore` -looks for a database dump in the :file:`dump/` directory and restores -that. If you wish to restore to a non-default host, the -:option:`--host ` and :option:`--port ` -options allow you to specify a non-local host to connect to capture -the dump. Consider the following example: - -.. code-block:: sh - - mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mongodump-2011-10-24 - -On any :program:`mongorestore` command you may specify -username and password credentials, as above. - -If you created your database dump using the :option:`--oplog -` option to ensure a point-in-time snapshot, call -:program:`mongorestore` with the -:option:`--oplogReplay ` -option, as in the following example: - -.. code-block:: sh - - mongorestore --oplogReplay - -You may also consider using the :option:`mongorestore --objcheck` -option to check the integrity of objects while inserting them into the -database, or you may consider the :option:`mongorestore --drop` option to drop each -collection from the database before restoring from -backups. :program:`mongorestore` also includes the ability to a filter -to all input before inserting it into the new database. Consider the -following example: - -.. code-block:: sh - - mongorestore --filter '{"field": 1}' - -Here, :program:`mongorestore` only adds documents to the database from -the dump located in the :file:`dump/` folder *if* the documents have a -field name ``field`` that holds a value of ``1``. Enclose the -filter in single quotes (e.g. ``'``) to prevent the filter from -interacting with your shell environment. - -.. code-block:: sh - - mongorestore --dbpath /srv/mongodb --journal - -Here, :program:`mongorestore` restores the database dump located in -:file:`dump/` folder into the data files located at :file:`/srv/mongodb`. -Additionally, -the :option:`--journal ` option ensures that -:program:`mongorestore` records all operation in the durability -:term:`journal`. The journal prevents data file corruption if anything -(e.g. power failure, disk failure, etc.) interrupts the restore -operation. - -.. seealso:: :doc:`/reference/mongodump` and - :doc:`/reference/mongorestore`. - -.. _backups-with-sharding-and-replication: - -Sharded Cluster and Replica Set Backups ---------------------------------------- - -The underlying architecture of :term:`sharded clusters ` and :term:`replica sets ` presents several -challenges for creating backups. This section describes how to make -quality backups in environments with these configurations and how to -perform restorations. - .. _sharded-cluster-backups: Sharded Cluster Backup Considerations @@ -580,12 +57,12 @@ Sharded Cluster Backup Considerations .. include:: /includes/note-shard-cluster-backup.rst -Sharded clusters complicate backup operations, as distributed -systems. True point-in-time backups are only possible when stopping all write -activity from the application. To create a precise moment-in-time -snapshot of a cluster, stop all application write activity to the -database, capture a backup, and only allow write operations to the -database after the backup is complete. +:term:`Sharded clusters ` complicate backup operations, +as distributed systems. True point-in-time backups are only possible +when stopping all write activity from the application. To create a +precise moment-in-time snapshot of a cluster, stop all application write +activity to the database, capture a backup, and allow only write +operations to the database after the backup is complete. However, you can capture a backup of a cluster that **approximates** a point-in-time backup by capturing a backup from a secondary member of @@ -594,7 +71,7 @@ same moment. If you decide to use an approximate-point-in-time backup method, ensure that your application can operate using a copy of the data that does not reflect a single moment in time. -The following documents describe all sharded cluster related backup +The following documents describe sharded cluster related backup procedures: - :doc:`/tutorial/backup-small-sharded-cluster-with-mongodump` @@ -610,20 +87,47 @@ Replica Set Backup Considerations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In most cases, backing up data stored in a :term:`replica set` is -similar to backing up data stored in a single instance. It's possible to -lock a single :term:`secondary` or :term:`slave` database and then -on create a backup from that instance. When you unlock the database, the secondary or -slave will catch up with the :term:`primary` or :term:`master`. You may also -chose to deploy a dedicated :term:`hidden member` for backup purposes. - -If you have a :term:`sharded cluster` where each :term:`shard` is itself a replica -set, you can use this method to create a backup of the entire cluster -without disrupting the operation of the node. In these situations you -should still turn off the balancer when you create backups. - -For any cluster, using a non-primary/non-master node to create backups is -particularly advantageous in that the backup operation does not -affect the performance of the primary or master. Replication -itself provides some measure of redundancy. Nevertheless, keeping -point-in time backups of your cluster to provide for disaster recovery -and as an additional layer of protection is crucial. +similar to backing up data stored in a single instance. It is possible +to lock a single :term:`secondary` database and then create a backup +from that instance. When you unlock the database, the secondary will +catch up with the :term:`primary`. You may also choose to deploy a +dedicated :term:`hidden member` for backup purposes. + +If you have a :term:`sharded cluster` where each :term:`shard` is itself +a replica set, you can use this method to create a backup of the entire +cluster without disrupting the operation of the node. In these +situations you should still turn off the balancer when you create +backups. + +For any cluster, using a non-primary node to create backups is +particularly advantageous in that the backup operation does not affect +the performance of the primary. Replication itself provides some measure +of redundancy. Nevertheless, keeping point-in time backups of your +cluster to provide for disaster recovery and as an additional layer of +protection is crucial. + +.. _backup-approaches: + +Approaches to Backing Up the MongoDB Environment +------------------------------------------------ + +The best option for backing up your MongoDB environment is to use +block-level backups, which provide duplicate images of your device. +Block-level backups, also called snapshots, complete quickly, work +reliably, and typically provide the easiest backup systems to implement. + +If block-level backup tools are not available, MongoDB provides the +:program:`mongodump` tool, which performs binary-level backups. The +:program:`mongodump` tool writes the entire contents of your MongoDB +instance to a file in a binary format. If snapshots are not available, +this approach provides the best option for full system database backups. + +The following topics provide details and procedures on the two approaches: + +- :doc:`/tutorial/backup-databases-with-filesystem-snapshots`. + +- :doc:`/tutorial/backup-databases-with-binary-database-dumps`. + +In some cases, taking backups is difficult or impossible because of +large data volumes, distributed architectures, and data transmission +speeds. In these situations, increase the amount of replication. diff --git a/source/administration/production-notes.txt b/source/administration/production-notes.txt index 132fa249d6e..49d45bd3fc8 100644 --- a/source/administration/production-notes.txt +++ b/source/administration/production-notes.txt @@ -13,8 +13,8 @@ especially in production. Backups ------- -To make backups of your MongoDB database, please refer to the -:ref:`backups section `. +To make backups of your MongoDB database, please refer to +:doc:`/administration/backups`. Networking ---------- diff --git a/source/administration/replication-architectures.txt b/source/administration/replication-architectures.txt index 6994c0c1c86..0035fa7ef5b 100644 --- a/source/administration/replication-architectures.txt +++ b/source/administration/replication-architectures.txt @@ -163,7 +163,7 @@ dedicated :ref:`hidden member ` for the purpose of creating backups. If this member runs with journaling enabled, you can safely use standard -:ref:`block level backup methods ` to create a +:doc:`block level backup methods ` to create a backup of this member. Otherwise, if your underlying system does not support snapshots, you can connect :program:`mongodump` to create a backup directly from the secondary member. In these cases, use the diff --git a/source/reference/command/fsync.txt b/source/reference/command/fsync.txt index f07a492ddfa..f472b74e7e3 100644 --- a/source/reference/command/fsync.txt +++ b/source/reference/command/fsync.txt @@ -90,8 +90,7 @@ fsync :dbcommand:`fsync` lock is only possible on individual shards of a sharded cluster, not on the entire sharded cluster. To backup an - entire sharded cluster, please read :ref:`considerations for - backing up sharded clusters `. + entire sharded cluster, please read :ref:`sharded-cluster-backups`. If your :program:`mongod` has :term:`journaling ` enabled, consider using :ref:`another method diff --git a/source/reference/mongodump.txt b/source/reference/mongodump.txt index 3a3bac95319..3571e90a674 100644 --- a/source/reference/mongodump.txt +++ b/source/reference/mongodump.txt @@ -212,7 +212,7 @@ from :term:`secondary` members of the set. Usage ----- -See the ":ref:`backup guide section on database dumps `" +See the :doc:`/tutorial/backup-databases-with-binary-database-dumps` for a larger overview of :program:`mongodump` usage. Also see the ":doc:`mongorestore`" document for an overview of the :program:`mongorestore`, which provides the related inverse diff --git a/source/reference/mongorestore.txt b/source/reference/mongorestore.txt index 8cc8099da2e..36d4360084a 100644 --- a/source/reference/mongorestore.txt +++ b/source/reference/mongorestore.txt @@ -230,8 +230,8 @@ Options Usage ----- -See the ":ref:`backup guide section on database dumps -`" for a larger overview of :program:`mongorestore` +See :doc:`/tutorial/backup-databases-with-binary-database-dumps` +for a larger overview of :program:`mongorestore` usage. Also see the ":doc:`mongodump`" document for an overview of the :program:`mongodump`, which provides the related inverse functionality. diff --git a/source/tutorial/backup-databases-with-binary-database-dumps.txt b/source/tutorial/backup-databases-with-binary-database-dumps.txt new file mode 100644 index 00000000000..66078a6d7bf --- /dev/null +++ b/source/tutorial/backup-databases-with-binary-database-dumps.txt @@ -0,0 +1,160 @@ +============================================ +Backup Databases Using Binary Database Dumps +============================================ + +.. default-domain:: mongodb + +This section describes the process for writing the entire contents of +your MongoDB instance to a file in a binary format. If disk-level +snapshots are not available, this approach provides the best option for +full system database backups. If your system has disk level snapshot +capabilities, consider the backup methods described in +:doc:`/tutorial/backup-databases-with-filesystem-snapshots`. + +This page describes the following: + +- :ref:`backup-mongodump` + +- :ref:`backup-restore-dump` + +.. seealso:: + + - :doc:`/reference/mongodump` + - :doc:`/reference/mongorestore` + +.. _backup-mongodump: + +Backup a Database with ``mongodump`` +------------------------------------- + +The :program:`mongodump` utility can perform a live backup of data or +can work against an inactive set of database files. +The :program:`mongodump` utility can create a dump for an entire +server/database/collection (or part of a collection using of query), +even when the database is running and active. If you run +:program:`mongodump` without any arguments, the command connects to +the local database instance (e.g. ``127.0.0.1`` or ``localhost``) and +creates a database backup named ``dump/`` in the current directory. + +.. include:: /includes/note-mongodump-compatibility-2.2.rst + +To limit the amount of data included in the database dump, you can +specify :option:`--database ` and +:option:`--collection ` as options to the +:program:`mongodump` command. For example: + +.. code-block:: sh + + mongodump --collection collection --db test + +This command creates a dump of the collection named ``collection`` +from the database ``test`` in a :file:`dump/` subdirectory of the current +working directory. + +Use the :option:`--oplog ` option with +:program:`mongodump` to collect the :term:`oplog` entries to build a +point-in-time snapshot of a database within a replica set. With :option:`--oplog +`, :program:`mongodump` copies all the data from +the source database as well as all of the :term:`oplog` entries from +the beginning of the backup procedure to until the backup procedure +completes. This backup procedure, in conjunction with +:option:`mongorestore --oplogReplay `, +allows you to restore a backup that reflects a consistent and specific +moment in time. + +If your MongoDB instance is not running, you can use the +:option:`--dbpath ` option to specify the +location to your MongoDB instance's database files. :program:`mongodump` +reads from the data files directly with this operation. This +locks the data directory to prevent conflicting writes. The +:program:`mongod` process must *not* be running or attached to these +data files when you run :program:`mongodump` in this +configuration. Consider the following example: + +.. code-block:: sh + + mongodump --dbpath /srv/mongodb + +Additionally, the :option:`--host ` and +:option:`--port ` options allow you to specify a +non-local host to connect to capture the dump. Consider the following +example: + +.. code-block:: sh + + mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/mongodump-2011-10-24 + +On any :program:`mongodump` command you may, as above, specify username +and password credentials to specify database authentication. + +.. _backup-restore-dump: + +Restore a Database with ``mongorestore`` +---------------------------------------- + +The :program:`mongorestore` utility restores a binary backup created by +:program:`mongodump`. Consider the following example command: + +.. code-block:: sh + + mongorestore dump-2011-10-25/ + +Here, :program:`mongorestore` imports the database backup located in +the :file:`dump-2011-10-25` directory to the :program:`mongod` instance +running on the localhost interface. By default, :program:`mongorestore` +looks for a database dump in the :file:`dump/` directory and restores +that. If you wish to restore to a non-default host, the +:option:`--host ` and :option:`--port ` +options allow you to specify a non-local host to connect to capture +the dump. Consider the following example: + +.. code-block:: sh + + mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mongodump-2011-10-24 + +On any :program:`mongorestore` command you may specify +username and password credentials, as above. + +If you created your database dump using the :option:`--oplog +` option to ensure a point-in-time snapshot, call +:program:`mongorestore` with the +:option:`--oplogReplay ` +option, as in the following example: + +.. code-block:: sh + + mongorestore --oplogReplay + +You may also consider using the :option:`mongorestore --objcheck` +option to check the integrity of objects while inserting them into the +database, or you may consider the :option:`mongorestore --drop` option to drop each +collection from the database before restoring from +backups. :program:`mongorestore` also includes the ability to a filter +to all input before inserting it into the new database. Consider the +following example: + +.. code-block:: sh + + mongorestore --filter '{"field": 1}' + +Here, :program:`mongorestore` only adds documents to the database from +the dump located in the :file:`dump/` folder *if* the documents have a +field name ``field`` that holds a value of ``1``. Enclose the +filter in single quotes (e.g. ``'``) to prevent the filter from +interacting with your shell environment. + +.. code-block:: sh + + mongorestore --dbpath /srv/mongodb --journal + +Here, :program:`mongorestore` restores the database dump located in +:file:`dump/` folder into the data files located at :file:`/srv/mongodb`. +Additionally, +the :option:`--journal ` option ensures that +:program:`mongorestore` records all operation in the durability +:term:`journal`. The journal prevents data file corruption if anything +(e.g. power failure, disk failure, etc.) interrupts the restore +operation. + +.. seealso:: :doc:`/reference/mongodump` and + :doc:`/reference/mongorestore`. diff --git a/source/tutorial/backup-databases-with-filesystem-snapshots.txt b/source/tutorial/backup-databases-with-filesystem-snapshots.txt new file mode 100644 index 00000000000..44698f5ba2c --- /dev/null +++ b/source/tutorial/backup-databases-with-filesystem-snapshots.txt @@ -0,0 +1,303 @@ +================================================= +Backup Databases Using Block Level Backup Methods +================================================= + +.. default-domain:: mongodb + +This section describes how to backup and restore a MongoDB instance +using system-level tools (such as :term:`LVM` or storage appliance) that +create block-level snapshots of the device +that holds MongoDB's data files. These methods complete quickly, work +reliably, and typically provide the easiest backup systems method to +implement. + +This page describes the following: + +- :ref:`snapshots-overview` + +- :ref:`lvm-backup-and-restore` + +- :ref:`backup-without-journaling` + +.. _snapshots-overview: + +Snapshots Overview +------------------ + +Snapshots work by creating pointers between the live data and a +special snapshot volume. These pointers are theoretically equivalent +to "hard links." As the working data diverges from the snapshot, +the snapshot process uses a copy-on-write strategy. As a result the snapshot +only stores modified data. + +After making the snapshot, you mount the snapshot image on your +file system and copy data from the snapshot. The resulting backup +contains a full copy of all data. + +Snapshots have the following limitations: + +- The database must be in a consistent or recoverable state when the + snapshot takes place. This means that all writes accepted by the + database need to be fully written to disk: either to the + :term:`journal` or to data files. + + If all writes are not on disk when the backup occurs, the backup + will not reflect these changes. If writes are *in progress* when the + backup occurs, the data files will reflect an inconsistent + state. With :term:`journaling ` all data-file states + resulting from in-progress writes are recoverable; without + journaling you must flush all pending writes to disk before + running the backup operation and must ensure that no writes occur during + the entire backup procedure. + + If you do use journaling, the journal must reside on the same volume + as the data. + +- Snapshots create an image of an entire disk image. Unless you need + to back up your entire system, consider isolating your MongoDB data + files, journal (if applicable), and configuration on one logical + disk that doesn't contain any other data. + + Alternately, store all MongoDB data files on a dedicated device + so that you can make backups without duplicating extraneous data. + +- Ensure that you copy data from snapshots and onto other systems to + ensure that data is safe from site failures. + +.. _backup-with-journaling: + +Snapshots With Journaling +~~~~~~~~~~~~~~~~~~~~~~~~~ + +If your :program:`mongod` instance has journaling enabled, then you can +use any kind of file system or volume/block level snapshot tool to +create backups. + +If you manage your own infrastructure on a Linux-based system, configure +your system with :term:`LVM` to provide your disk packages and provide +snapshot capability. You can also use LVM-based setups *within* a +cloud/virtualized environment. + +.. note:: + + Running :term:`LVM` provides additional flexibility and enables the + possibility of using snapshots to back up MongoDB. + +Snapshots with Amazon EBS in a RAID 10 Configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If your deployment depends on Amazon’s Elastic Block Storage (EBS) with +RAID configured within your instance, it is impossible to get a +consistent state across all disks using the platform’s snapshot tool. As +an alternative, you can do one of the following: + +- Flush all writes to disk and create a write lock to ensure + consistent state during the backup process. + + If you choose this option see :ref:`backup-without-journaling`. + +- Configure :term:`LVM` to run and hold your MongoDB data files on top of the + RAID within your system. + + If you choose this option, perform the LVM backup operation described + in :ref:`lvm-backup-operation`. + +.. _lvm-backup-and-restore: + +Backup and Restore Using LVM on a Linux System +---------------------------------------------- + +This section provides an overview of a simple backup process +using :term:`LVM` on a Linux system. While the tools, commands, and paths may +be (slightly) different on your system the following steps provide a +high level overview of the backup operation. + +.. _lvm-backup-operation: + +Create a Snapshot +~~~~~~~~~~~~~~~~~ + +To create a snapshot with :term:`LVM`, issue a command as root in the +following format: + +.. code-block:: sh + + lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb + +This command creates an :term:`LVM` snapshot (with the ``--snapshot`` option) +named ``mdb-snap01`` of the ``mongodb`` volume in the ``vg0`` +volume group. + +This example creates a snapshot named ``mdb-snap01`` located at +``/dev/vg0/mdb-snap01``. The location and paths to your systems volume +groups and devices may vary slightly depending on your operating +system's :term:`LVM` configuration. + +The snapshot has a cap of at 100 megabytes, because of the parameter +``--size 100M``. This size does not reflect the total amount of the +data on the disk, but rather the quantity of differences between the +current state of ``/dev/vg0/mongodb`` and the creation of the snapshot +(i.e. ``/dev/vg0/mdb-snap01``.) + +.. warning:: + + Ensure that you create snapshots with enough space to account for + data growth, particularly for the period of time that it takes to copy + data out of the system or to a temporary image. + + If your snapshot runs out of space, the snapshot image + becomes unusable. Discard this logical volume and create another. + +The snapshot will exist when the command returns. You can restore +directly from the snapshot at any time or by creating a new logical +volume and restoring from this snapshot to the alternate image. + +While snapshots are great for creating high quality backups very +quickly, they are not ideal as a format for storing backup +data. Snapshots typically depend and reside on the same storage +infrastructure as the original disk images. Therefore, it's crucial +that you archive these snapshots and store them elsewhere. + +Archive a Snapshot +~~~~~~~~~~~~~~~~~~ + +After creating a snapshot, mount the snapshot and move the data to +separate storage. Your system might try to compress the backup images as +you move the offline. The following procedure fully +archives the data from the snapshot: + +.. code-block:: sh + + umount /dev/vg0/mdb-snap01 + dd if=/dev/vg0/mdb-snap01 | tar -czf mdb-snap01.tar.gz - + +The above command sequence does the following: + +- Ensures that the ``/dev/vg0/mdb-snap01`` device is not mounted. + +- Performs a block level copy of the entire snapshot image using the ``dd`` + command and compresses the result in a gzipped tar archive in the + current working directory. + + .. warning:: + + This command will create a large ``tar.gz`` file in your current + working directory. Make sure that you run this command in a file + system that has enough free space. + +.. _backup-restore-snapshot: + +Restore a Snapshot +~~~~~~~~~~~~~~~~~~ + +To restore a snapshot created with the above method, issue the following +sequence of commands: + +.. code-block:: sh + + lvcreate --size 1G --name mdb-new vg0 + tar -xzf mdb-snap01.tar.gz | dd of=/dev/vg0/mdb-new + mount /dev/vg0/mdb-new /srv/mongodb + +The above sequence does the following: + +- Creates a new logical volume named ``mdb-new``, in the ``/dev/vg0`` + volume group. The path to the new device will be ``/dev/vg0/mdb-new``. + + .. warning:: + + This volume will have a maximum size of 1 gigabyte. The original + file system must have had a total size of 1 gigabyte or smaller, or + else the restoration will fail. + + Change ``1G`` to your desired volume size. + +- Uncompresses and unarchives the ``mdb-snap01.tar.gz`` into the + ``mdb-new`` disk image. + +- Mounts the ``mdb-new`` disk image to the ``/srv/mongodb`` directory. + Modify the mount point to correspond to your MongoDB data file + location, or other location as needed. + +.. _backup-restore-from-snapshot: + +Restore Directly from a Snapshot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To restore a backup without writing to a compressed ``tar`` archive, use +the following sequence of commands: + +.. code-block:: sh + + umount /dev/vg0/mdb-snap01 + lvcreate --size 1G --name mdb-new vg0 + dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new + mount /dev/vg0/mdb-new /srv/mongodb + +Remote Backup Storage +~~~~~~~~~~~~~~~~~~~~~ + +You can implement off-system backups using the :ref:`combined process +` and SSH. + +This sequence is identical to procedures explained above, except that it +archives and compresses the backup on a remote system using SSH. + +Consider the following procedure: + +.. code-block:: sh + + umount /dev/vg0/mdb-snap01 + dd if=/dev/vg0/mdb-snap01 | ssh username@example.com tar -czf /opt/backup/mdb-snap01.tar.gz + lvcreate --size 1G --name mdb-new vg0 + ssh username@example.com tar -xzf /opt/backup/mdb-snap01.tar.gz | dd of=/dev/vg0/mdb-new + mount /dev/vg0/mdb-new /srv/mongodb + +.. _backup-without-journaling: + +Create Backups on Instances that do not Run Journaling +------------------------------------------------------ + +If your :program:`mongod` instance does not run with journaling +enabled, or if your journal is on a separate volume, obtaining a +functional backup of a consistent state is more complicated. +As described in this section, you must flush all +writes to disk and lock the database to prevent writes during the +backup process. If you have a :term:`replica set` configuration, +then for your backup use a +:term:`secondary` which is not receiving reads (i.e. :term:`hidden +member`). + +1. To flush writes to disk and to "lock" the database (to prevent + further writes), issue the :method:`db.fsyncLock()` method in the + :program:`mongo` shell: + + .. code-block:: javascript + + db.fsyncLock(); + +#. Perform the backup operation described in :ref:`lvm-backup-operation`. + +#. To unlock the database after the snapshot has completed, use the + following command in the :program:`mongo` shell: + + .. code-block:: javascript + + db.fsyncUnlock(); + + .. note:: + + .. versionchanged:: 2.0 + MongoDB 2.0 added :method:`db.fsyncLock()` and + :method:`db.fsyncUnlock()` helpers to the :program:`mongo` + shell. Prior to this version, use the :dbcommand:`fsync` + command with the ``lock`` option, as follows: + + .. code-block:: javascript + + db.runCommand( { fsync: 1, lock: true } ); + db.runCommand( { fsync: 1, lock: false } ); + + .. include:: /includes/note-disable-profiling-fsynclock.rst + + .. include:: /includes/warning-fsync-lock-mongodump.rst diff --git a/source/tutorial/backup-sharded-cluster-with-filesystem-snapshots.txt b/source/tutorial/backup-sharded-cluster-with-filesystem-snapshots.txt index cbf7775da60..f5c5cdde497 100644 --- a/source/tutorial/backup-sharded-cluster-with-filesystem-snapshots.txt +++ b/source/tutorial/backup-sharded-cluster-with-filesystem-snapshots.txt @@ -78,7 +78,7 @@ shard. #. Back up the replica set members of the shards that you locked. You may back up the shards in parallel. For each shard, create a snapshot. Use the procedures in - :ref:`block-level-backup`. + :doc:`/tutorial/backup-databases-with-filesystem-snapshots`. #. Unlock all locked replica set members of each shard using the :method:`db.fsyncUnlock()` method in the :program:`mongo` shell.