Usage with MongoDB
Mongo Connector can replicate from one MongoDB replica set or sharded cluster to another using the Mongo DocManager. The most basic usage looks like this:
mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager
Old usage (before the 2.0 release):
mongo-connector -m localhost:27017 -t localhost:37017 -d <your-doc-manager-folder>/mongo_doc_manager.py
This assumes you are running one replica set or sharded cluster on port 27017 of the local machine (the source) and another on port 37017 (the target).
- Even though replication is mongo-to-mongo, Mongo Connector still needs to insert the _ts and ns fields in order to handle rollbacks and provide renaming features. Note: as of version 1.3, the _ts and ns fields do not appear in replicated documents and are instead stored in the __mongo_connector database. This is true as of commit b10b94f3ec3d1bc104d807ac7b8e61aabaa120d8.
- Mongo Connector is "upsert only." This means that when a document is updated, the original document on the target is overwritten with the latest version of that document from the source cluster. This is not the normal behavior of MongoDB replication, and it can result in short-lived discrepancies between the source and target MongoDB clusters.
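The "upsert only" behavior described above can be illustrated with a small, self-contained sketch. It uses plain Python dicts in place of real collections and is not Mongo Connector's actual code; the names source, target, and replicate_upsert are hypothetical and exist only for this illustration.

```python
import copy


def replicate_upsert(source, target, doc_id):
    """Overwrite (or insert) the target copy with the latest source version."""
    # The whole document is copied; individual update operations are not replayed.
    target[doc_id] = copy.deepcopy(source[doc_id])


# Hypothetical stand-ins for a source and a target collection, keyed by _id.
source = {1: {"_id": 1, "name": "a", "visits": 1}}
target = {}

replicate_upsert(source, target, 1)   # first sight of the document: acts as an insert

# Two updates happen on the source before the next replication pass.
source[1]["visits"] += 1
source[1]["name"] = "b"

stale = target[1] != source[1]        # the target briefly lags behind the source
replicate_upsert(source, target, 1)   # a single overwrite catches up with both updates
in_sync = target[1] == source[1]
```

Between the two source updates and the second call, the target holds an older version of the document, which is the kind of short-lived discrepancy mentioned above.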
MongoDB comes with several other tools that can be helpful in certain situations where Mongo Connector may also apply. These tools include:
For backup purposes, these tools work fine and are probably a lot faster than Mongo Connector. Furthermore, MongoDB Inc. officially supports their use (mongo-connector is not "officially supported"), and they may have fewer bugs. It's even possible to back up or move data from one MongoDB cluster to another without downtime using filesystem snapshots and mongooplog. However, there are certain situations where Mongo Connector really excels. Some of these are:
- Needing to replicate to a system other than MongoDB
- Needing to back up or move data from a MongoDB cluster without downtime when filesystem snapshots aren't an option
- Targeting specific namespaces for live replication
- Replicating to multiple targets with one tool
- Migrating databases or collections to have different names without downtime
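To make the namespace-related items above concrete, here is a minimal sketch of the idea of selecting some namespaces and writing them under different names. It is only an illustration with plain Python data, not Mongo Connector's implementation; map_namespace, the mapping, and the namespace names are all hypothetical.

```python
def map_namespace(ns, mapping):
    """Return the target namespace for ns, or None if ns is not replicated."""
    return mapping.get(ns)


# Hypothetical mapping: only these source namespaces are replicated,
# and each one is written under a different name on the target.
mapping = {
    "shop.orders": "archive.orders_v2",
    "shop.users": "archive.users",
}

# Hypothetical stream of operations, each tagged with its source namespace.
ops = [
    {"ns": "shop.orders", "doc": {"_id": 1}},
    {"ns": "shop.logs", "doc": {"_id": 2}},   # not in the mapping, so it is skipped
]

# Keep only operations on selected namespaces and route them to the renamed target.
routed = [
    (map_namespace(op["ns"], mapping), op["doc"])
    for op in ops
    if map_namespace(op["ns"], mapping) is not None
]
```

Here routed contains only the shop.orders operation, now addressed to archive.orders_v2.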
The take-away: Consider your options before committing to a solution for just moving data around.