
Configuration Options

Luke Lovett edited this page Oct 7, 2015 · 20 revisions

This page details all the options that can be specified in Mongo Connector's configuration file. You can also look at an example. To tell mongo-connector which configuration file to use, pass the -c option (it is also listed in the output of --help).

Mongo Connector uses JSON as the format for its configuration file. This page uses MongoDB "dot-notation" for the configuration option names themselves. For example, the name authentication.password means:

{"authentication": {"password": XXX}}
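To make the dot-notation concrete, here is a minimal sketch that expands a dotted option name into the nested object it stands for. The helper name expand_dotted is hypothetical and is not part of Mongo Connector; it handles object keys only, not array indexes such as docManagers.0.docManager.

```python
def expand_dotted(name, value):
    """Expand a dotted option name into the nested dict it stands for."""
    nested = value
    # Wrap the value from the innermost key outward.
    for key in reversed(name.split(".")):
        nested = {key: nested}
    return nested


print(expand_dotted("authentication.password", "XXX"))
# {'authentication': {'password': 'XXX'}}
```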

Comment Syntax

Although JSON itself doesn't provide a syntax for comments, Mongo Connector allows its JSON configuration file to have comments. A comment is any key in an object that is prefixed by two underscores (__). For example:

{
    "__comment": "this is a comment"
}
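The comment rule can be sketched as a small filter over the parsed JSON. This is an illustration of the rule as described above, not Mongo Connector's actual implementation; the function name strip_comments is hypothetical.

```python
import json


def strip_comments(node):
    """Recursively drop every object key that starts with two underscores."""
    if isinstance(node, dict):
        return {
            key: strip_comments(value)
            for key, value in node.items()
            if not key.startswith("__")
        }
    if isinstance(node, list):
        return [strip_comments(item) for item in node]
    return node


raw = '{"__comment": "this is a comment", "mainAddress": "localhost:27017"}'
config = strip_comments(json.loads(raw))
print(config)  # {'mainAddress': 'localhost:27017'}
```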

Global Configuration Options

mainAddress

Command-line equivalent: -m, --main

Default: localhost:27017

The address of the replica set or sharded cluster from which to replicate. This may be any MongoDB connection string.

oplogFile

Command-line equivalent: -o, --oplog-ts

Default: oplog.timestamp

The path to the oplog progress file.

noDump

Command-line equivalent: --no-dump

Default: false

Do not dump collections from MongoDB to the remote system prior to tailing the MongoDB oplog.

batchSize

Command-line equivalent: --batch-size

Default: -1

Number of records processed from the oplog before updating the timestamp file.

verbosity

Command-line equivalent: -v, --verbose

Default: 0

The verbosity of Mongo Connector. Note that the command-line option only turns on/off debug-level logging. In the config file, verbosity may be set according to the following table:

Verbosity | Log Level
--------- | ---------
0 | ERROR
1 | WARNING
2 | INFO
3 | DEBUG
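In Python terms, the table above corresponds to the standard logging levels. The following sketch shows one way to express that mapping; the names VERBOSITY_TO_LEVEL and level_for are hypothetical and only illustrate the table.

```python
import logging

# Verbosity values from the table, mapped to the standard library's levels.
VERBOSITY_TO_LEVEL = {
    0: logging.ERROR,
    1: logging.WARNING,
    2: logging.INFO,
    3: logging.DEBUG,
}


def level_for(verbosity):
    """Return the logging level for a config-file verbosity value."""
    return VERBOSITY_TO_LEVEL[verbosity]


print(logging.getLevelName(level_for(2)))  # INFO
```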

continueOnError

Command-line equivalent: --continue-on-error

Default: false

Whether to continue tailing the oplog after an error occurs while dumping a collection. This doesn't affect the connector's behavior while it is already tailing the oplog.

fields

Command-line equivalent: -i, --fields

Default: all fields

Comma-separated list of fields to read from MongoDB documents. This option can be used to select just a few fields out of every document. Note that the _id field, and the ns and _ts fields for Solr, will always be included.
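As a rough illustration of what field selection means for a single document, the sketch below keeps only the requested fields plus _id. It is a simplified model under the assumption that only top-level field names are given; the extra ns and _ts fields used for Solr are not modeled, and select_fields is a hypothetical name.

```python
ALWAYS_INCLUDED = {"_id"}  # the _id field is kept regardless of the fields list


def select_fields(document, fields):
    """Keep only the listed top-level fields, plus the always-included ones."""
    keep = set(fields) | ALWAYS_INCLUDED
    return {key: value for key, value in document.items() if key in keep}


doc = {"_id": 1, "name": "Ada", "email": "ada@example.com", "age": 36}
# A command-line value such as "name,age" splits into a list of field names.
print(select_fields(doc, "name,age".split(",")))
# {'_id': 1, 'name': 'Ada', 'age': 36}
```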

timezoneAware

Command-line equivalent: --tz-aware

Default: false

Whether Dates read from MongoDB should be timezone-aware.

Configure Logging

logging.type

Command-line equivalents: --logfile, -s, --enable-syslog

Default: file

Where to direct Mongo Connector logs. This may be one of "file", "syslog", or "stream".

logging.filename

Command-line equivalent: --logfile

Default: mongo-connector.log

The path to Mongo Connector's log file. This option only applies if logging.type is "file".

logging.rotationWhen

Command-line equivalent: --logfile-when

Default: midnight

The type of period defining when Mongo Connector should rotate its log file. This must be one of:

  • S (second)
  • M (minute)
  • H (hour)
  • D (day)
  • W0 - W6 (days of the week, numbered 0 - 6)
  • midnight

For more details, see the Python documentation for TimedRotatingFileHandler.

This option only applies if logging.type is "file".

logging.rotationInterval

Command-line equivalent: --logfile-interval

Default: 1

How frequently the log file should be rotated. Specifically, how many units of logging.rotationWhen should occur before rotation. This option cannot be used if logging.rotationWhen is any of W0 - W6.

For more details, see the Python documentation for TimedRotatingFileHandler.

This option only applies if logging.type is "file".

logging.rotationBackups

Command-line equivalent: --logfile-backups

Default: 7

How many rotated log files to keep around.

This option only applies if logging.type is "file".
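The three rotation options line up with the when, interval, and backupCount parameters of Python's TimedRotatingFileHandler. The sketch below shows that correspondence using the defaults listed on this page; the function make_file_handler is hypothetical and is not how Mongo Connector necessarily wires up its logging.

```python
from logging.handlers import TimedRotatingFileHandler


def make_file_handler(logging_cfg):
    """Build a rotating file handler from the logging.* options."""
    return TimedRotatingFileHandler(
        logging_cfg.get("filename", "mongo-connector.log"),
        when=logging_cfg.get("rotationWhen", "midnight"),
        interval=logging_cfg.get("rotationInterval", 1),
        backupCount=logging_cfg.get("rotationBackups", 7),
        delay=True,  # do not open the file until the first record is logged
    )


# Rotate every 6 hours and keep the default 7 backups.
handler = make_file_handler({"rotationWhen": "H", "rotationInterval": 6})
print(handler.when, handler.interval, handler.backupCount)
handler.close()
```

The handler stores the interval in seconds, so six units of H are reported as 21600.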

logging.host

Command-line equivalent: --syslog-host

Default: localhost:512

Address of the syslog. This can include a host and port like "localhost:512" or, on Unix/Linux, be a Unix domain socket such as "/dev/log".

This option only applies if logging.type is "syslog".
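The two accepted forms of logging.host can be told apart with a few lines of parsing. This sketch assumes that a value starting with "/" is a socket path and that anything else is host:port; the function name parse_syslog_address is hypothetical. The return value has the shape that Python's logging.handlers.SysLogHandler accepts for its address argument.

```python
def parse_syslog_address(host):
    """Return a socket path (str) or a (hostname, port) tuple."""
    if host.startswith("/"):
        return host  # Unix domain socket, for example /dev/log
    name, separator, port = host.rpartition(":")
    if not separator:
        raise ValueError("expected host:port or a socket path, got %r" % host)
    return (name, int(port))


print(parse_syslog_address("localhost:512"))  # ('localhost', 512)
print(parse_syslog_address("/dev/log"))       # /dev/log
```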

logging.facility

Command-line equivalent: --syslog-facility

Default: user

The syslog facility to use.

This option only applies if logging.type is "syslog".

Configure Authentication

authentication.adminUsername

Command-line equivalent: -a, --admin-username

Default: (no default)

The username that Mongo Connector should use to log into MongoDB.

authentication.password

Command-line equivalent: -p, --password

Default: (no default)

The password for authentication.adminUsername. This option cannot be used with authentication.passwordFile.

authentication.passwordFile

Command-line equivalent: -f, --password-file

Default: (no default)

A path to a file that contains the password for authentication.adminUsername. This option cannot be used with authentication.password.
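Because authentication.password and authentication.passwordFile are mutually exclusive, a loader has to check for both before reading either. The following is a minimal sketch of that check, not Mongo Connector's own code; resolve_password is a hypothetical name, and stripping surrounding whitespace from the file contents is an assumption.

```python
def resolve_password(auth):
    """Return the password from the authentication options, or None."""
    password = auth.get("password")
    password_file = auth.get("passwordFile")
    if password is not None and password_file is not None:
        raise ValueError("password and passwordFile cannot be used together")
    if password_file is not None:
        with open(password_file) as handle:
            # Assumption: a trailing newline in the file is not part of the password.
            return handle.read().strip()
    return password


print(resolve_password({"adminUsername": "admin", "password": "s3cret"}))  # s3cret
```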

Configure SSL

ssl.sslCertfile

Command-line equivalent: --ssl-certfile

Default: (no default)

A path to the SSL certificate that Mongo Connector should use to identify the local connection to MongoDB.

ssl.sslKeyfile

Command-line equivalent: --ssl-keyfile

Default: (no default)

A path to the private key for ssl.sslCertfile. This option isn't necessary if ssl.sslCertfile already has the private key included.

ssl.sslCertificatePolicy

Command-line equivalent: --ssl-certificate-policy

Default: ignored

Policy for validating SSL certificates provided from the other end of the connection (i.e., to MongoDB). Must be one of:

  • required - Require and validate the remote certificate.
  • optional - Validate the remote certificate only if one is provided.
  • ignored - Remote SSL certificates are ignored completely.
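The three policies correspond naturally to the certificate requirement constants in Python's ssl module. The mapping below is a sketch of that correspondence; POLICY_TO_CERT_REQS and cert_reqs_for are hypothetical names used only for illustration.

```python
import ssl

# Policy names from this page, mapped to the standard library's constants.
POLICY_TO_CERT_REQS = {
    "required": ssl.CERT_REQUIRED,
    "optional": ssl.CERT_OPTIONAL,
    "ignored": ssl.CERT_NONE,
}


def cert_reqs_for(policy):
    """Translate an ssl.sslCertificatePolicy value, rejecting unknown names."""
    try:
        return POLICY_TO_CERT_REQS[policy]
    except KeyError:
        raise ValueError("unknown certificate policy: %r" % policy)
```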

Configure Namespaces

namespaces.include

Command-line equivalent: -n, --namespace-set

Default: all namespaces

List of collections to read from MongoDB. Collection names should be given as database_name.collection_name. By default, Mongo Connector will replicate all namespaces except for system and GridFS collections.

Usage Examples: -n test.test,alpha.foo on the command line or ["test.test", "alpha.foo"] in a config file.

namespaces.mapping

Command-line equivalent: -g, --dest-namespace-set

Default: no mapping

Comma-separated list of new names to use for each collection. Each namespace provided in namespaces.include will be renamed respectively at the destination according to this list. This option may only be used with namespaces.include, and both options must include the same number of names. By default, no renaming will occur.

Note that when replicating to Elasticsearch, the MongoDB database name, which will become the Elasticsearch index name, is always made lowercase.
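The positional pairing between namespaces.include and namespaces.mapping can be sketched as follows. The function build_namespace_map and the target names are hypothetical; the sketch only illustrates the rule that both lists must have the same length and are matched in order.

```python
def build_namespace_map(include, mapping):
    """Pair each source namespace with its destination name, in order."""
    if len(include) != len(mapping):
        raise ValueError("include and mapping must list the same number of names")
    return dict(zip(include, mapping))


renames = build_namespace_map(
    ["test.test", "alpha.foo"],
    ["archive.test", "beta.foo"],  # hypothetical destination names
)
print(renames)
# {'test.test': 'archive.test', 'alpha.foo': 'beta.foo'}
```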

namespaces.gridfs

Command-line equivalent: --gridfs-set

Default: empty

Comma-separated list of GridFS root collections. For example, if GridFS metadata is stored in the test.fs.files collection, and chunks are stored in the test.fs.chunks collection, pass test.fs as the namespace.
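The relationship between a GridFS root and its two collections is simple string composition, as this small sketch shows. The helper name gridfs_collections is hypothetical.

```python
def gridfs_collections(root):
    """Return the (files, chunks) collection names for a GridFS root."""
    return (root + ".files", root + ".chunks")


print(gridfs_collections("test.fs"))  # ('test.fs.files', 'test.fs.chunks')
```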

Configure DocManagers

Mongo Connector may use more than one DocManager at a time to support replicating to more than one location simultaneously. An array of DocManagers should be provided, even if that array only contains one DocManager configuration. Here we use <index> in the configuration key name to mean "at any index within the array". For example, docManagers.0.docManager means:

{"docManagers": [{"docManager": XXX}]}

docManagers.<index>.docManager

Command-line equivalent: -d, --doc-manager

Default: doc_manager_simulator

Module name of the DocManager to use. Included in Mongo Connector are mongo_doc_manager, solr_doc_manager, elastic_doc_manager, and doc_manager_simulator. To write your own DocManager, see Writing Your Own DocManager.

docManagers.<index>.targetURL

Command-line equivalent: -t, --target-url

Default: (no default)

URL to pass to the DocManager. For example, this should point to the base REST endpoint for a Solr core, or should be a MongoDB connection string, or the base REST endpoint for Elasticsearch.

docManagers.<index>.uniqueKey

Command-line equivalent: -u, --unique-key

Default: _id

What to call the _id field in a MongoDB document. This is useful for certain systems that call their primary key something else (e.g., Solr uses id instead).
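Conceptually, the option renames the _id field on its way to the target system. Here is a minimal sketch of that renaming, assuming flat documents; apply_unique_key is a hypothetical helper, not a Mongo Connector function.

```python
def apply_unique_key(document, unique_key="_id"):
    """Return a copy of the document with _id stored under unique_key."""
    renamed = dict(document)  # leave the caller's document untouched
    renamed[unique_key] = renamed.pop("_id")
    return renamed


print(apply_unique_key({"_id": 42, "title": "hello"}, unique_key="id"))
# {'title': 'hello', 'id': 42}
```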

docManagers.<index>.autoCommitInterval

Command-line equivalent: --auto-commit-interval

Default: no auto commit

Interval, in seconds, at which the DocManager forces the end system to flush changes. Not every target system supports this option.

docManagers.<index>.bulkSize

Command-line equivalent: (none)

Default: 1000

The number of documents that are sent in a single batch to the remote system.
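Batching by bulkSize amounts to splitting a stream of documents into fixed-size groups, with a possibly shorter final group. The generator below is a generic sketch of that idea rather than the code any particular DocManager uses; the name batches is hypothetical.

```python
from itertools import islice


def batches(documents, bulk_size=1000):
    """Yield lists of at most bulk_size documents from any iterable."""
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, bulk_size))
        if not batch:
            return
        yield batch


print(list(batches(range(5), bulk_size=2)))  # [[0, 1], [2, 3], [4]]
```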

docManagers.<index>.args

Command-line equivalent: (none)

Default: (no default)

Any arbitrary keyword arguments to pass to the constructor of the DocManager. What arguments can be passed should be documented by the author of the DocManager.
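To show how the options on this page fit together, the sketch below assembles a small configuration and serializes it as JSON. All values are illustrative: the Elasticsearch address localhost:9200 is an assumption, and the other values simply repeat defaults or examples from this page. The resulting text could be saved to a file and passed to mongo-connector with the -c option.

```python
import json

config = {
    "__comment": "Illustrative values only; adjust for your deployment.",
    "mainAddress": "localhost:27017",
    "oplogFile": "oplog.timestamp",
    "verbosity": 2,
    "logging": {"type": "file", "filename": "mongo-connector.log"},
    "namespaces": {"include": ["test.test", "alpha.foo"]},
    # docManagers is always an array, even with a single DocManager.
    "docManagers": [
        {
            "docManager": "elastic_doc_manager",
            "targetURL": "localhost:9200",  # assumed Elasticsearch address
            "bulkSize": 1000,
        }
    ],
}

config_json = json.dumps(config, indent=4)
print(config_json)
```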
