elasticsearch-tools

A collection of elasticsearch command line tools for doing things like bulk importing/exporting and exporting/importing mappings.

It was created because some of the existing import/export tools ran too slowly on my machine. Using the new bulk API seemed to speed things up dramatically. The other tools I used also weren't exporting the _parent and _routing fields.

Changes in this fork

This fork fixes:

  • support for ES 7, which reports total hit counts differently (previously failed with "Error: total required")
  • npm audit warnings as of the last commit
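
For context on the first fix: Elasticsearch 7 reports `hits.total` as an object with `value` and `relation` fields, where earlier major versions returned a plain number. The sketch below shows one way to accept both shapes. `totalHits` is a hypothetical helper for illustration and is not the code used in this fork; the error message simply mirrors the one quoted above.

```javascript
// Hypothetical helper: read the total hit count from a search response,
// whether hits.total is a number (before ES 7) or an object (ES 7 and later).
function totalHits(response) {
  const total = response && response.hits ? response.hits.total : undefined;
  if (typeof total === 'number') {
    return total;
  }
  if (total && typeof total.value === 'number') {
    return total.value;
  }
  // Neither shape matched, so there is no usable total.
  throw new Error('total required');
}
```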

Installation

npm install https://github.com/peterhoneder/elasticsearch-tools

After installing, you will have access to the following command line tools:

Exporting

  • es-export-bulk
  • es-export-mappings
  • es-export-settings
  • es-export-aliases

Performance

For exports that don't need a specific sort order, add this option for maximum performance:

--sort _doc:asc
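
The `--sort` value is a comma-separated list of `<field>:<direction>` pairs. As a rough illustration of that format, and not the tool's own parser, a hypothetical `parseSort` helper could turn the string into the array form that an Elasticsearch search body accepts. Defaulting a missing direction to `asc` is an assumption made for this sketch.

```javascript
// Hypothetical helper: "a:desc,b:asc" -> [{ a: "desc" }, { b: "asc" }]
function parseSort(value) {
  return value
    .split(',')
    .map((pair) => pair.trim())
    .filter((pair) => pair.length > 0)
    .map((pair) => {
      // Assumption: a pair without a direction sorts ascending.
      const [field, direction = 'asc'] = pair.split(':');
      return { [field]: direction };
    });
}

console.log(JSON.stringify(parseSort('_doc:asc')));
```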

Importing

  • es-import-bulk
  • es-import-mappings
  • es-import-settings
  • es-import-aliases

Usage: es-export-bulk

Options

es-export-bulk --help

Usage: es-export-bulk [options]

  Options:

    -h, --help                           output usage information
    -v, --version                        output the version number
    -u, --url <url>                      comma-separated elasticsearch urls to connect to
    -f, --file <file>                    the file to write data to
    -m, --max <number>                   the maximum number of items to export. different than the scroll size
    --transformMeta <js>                 a javascript function that returns an object that is the transformed meta object
    --transformSource <js>               a javascript function that returns an object that is the transformed source object
    --transformMetaInit <js>             a javascript function that returns an init object that contains helpers for the transform function
    --transformSourceInit <js>           a javascript function that returns an init object that contains helpers for the transform function
    --index <index>                      ES OPTION: a comma-separated list of index names to search; use _all or empty string to perform the operation on all indices
    --type <type>                        ES OPTION: a comma-separated list of document types to search; leave empty to perform the operation on all types
    --body <body>                        ES OPTION: the body to send along with this request.
    --analyzer <analyzer>                ES OPTION: The analyzer to use for the query string
    --analyzeWildcard <analyzeWildcard>  ES OPTION: specify whether wildcard and prefix queries should be analyzed (default: false)
    --fields <fields>                    ES OPTION: a comma-separated list of fields to return as part of a hit (default: "*")
    --from <from>                        ES OPTION: starting offset (default: 0)
    --q <q>                              ES OPTION: query in the Lucene query string syntax
    --routing <routing>                  ES OPTION: a comma-separated list of specific routing values
    --scroll <scroll>                    ES OPTION: specify how long a consistent view of the index should be maintained for scrolled search (default: 1m)
    --size <size>                        ES OPTION: number of hits to return during each scan
    --sort <sort>                        ES OPTION: a comma-separated list of <field>:<direction> pairs
    --timeout <timeout>                  ES OPTION: explicit operation timeout
    --apiVersion <apiVersion>            ES CLIENT OPTION: the major version of the Elasticsearch nodes you will be connecting to (default: 2.3)
    --maxRetries <maxRetries>            ES CLIENT OPTION: how many times should the client try to connect to other nodes before returning a ConnectionFault error (default: 3)
    --requestTimeout <requestTimeout>    ES CLIENT OPTION: milliseconds before an HTTP request will be aborted and retried. This can also be set per request (default: 30000)
    --deadTimeout <deadTimeout>          ES CLIENT OPTION: milliseconds that a dead connection will wait before attempting to revive itself (default: 60000)
    --pingTimeout <pingTimeout>          ES CLIENT OPTION: milliseconds that a ping request can take before timing out (default: 3000)
    --maxSockets <maxSockets>            ES CLIENT OPTION: maximum number of concurrent requests that can be made to any node (default: 10)
    --minSockets <minSockets>            ES CLIENT OPTION: minimum number of sockets to keep connected to a node (default: 10)
    --selector <selector>                ES CLIENT OPTION: select a connection from the ConnectionPool using roundRobin (default) or random
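
Note that `--max` and `--size` control different things: `--size` is the number of hits fetched per scroll request, while `--max` caps the total number of exported items. A minimal simulation of that relationship follows, with in-memory arrays standing in for scroll pages. The `takeUpTo` name and the page data are made up for illustration.

```javascript
// Hypothetical helper: walk scroll pages in order and stop once `max` items are collected.
function takeUpTo(pages, max) {
  const items = [];
  for (const page of pages) {
    for (const hit of page) {
      if (items.length >= max) {
        return items;
      }
      items.push(hit);
    }
  }
  return items;
}

// Three scroll pages of size 2, capped at 5 exported items in total.
const exported = takeUpTo([[1, 2], [3, 4], [5, 6]], 5);
console.log(exported.length); // prints 5
```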

Examples

export 1 hour of data from local db

es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --body '
{"query":{"range":{"timestamp":{"gte":"2014-08-13T11:00:00.000Z","lte":"2014-08-13T12:00:00.000Z"}}}}
'
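
Quoting JSON by hand in a shell is error-prone. One option is to build the `--body` string in JavaScript and serialize it with `JSON.stringify`. The `rangeBody` helper below is hypothetical; the `timestamp` field name and the dates come from the example above.

```javascript
// Hypothetical helper: build a range query body as a JSON string.
function rangeBody(field, gte, lte) {
  return JSON.stringify({
    query: { range: { [field]: { gte, lte } } },
  });
}

const body = rangeBody(
  'timestamp',
  '2014-08-13T11:00:00.000Z',
  '2014-08-13T12:00:00.000Z'
);
console.log(body);
```

The printed string can then be passed as the value of `--body`.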

export "myIndex" from local db

es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --index myIndex

add a key/value to all exported documents

es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --transformSource 'data.foo = "neat"'
# the return statement is optional
es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --transformSource 'data.foo = "neat";return data;'

delete the key "foo" from all exported documents

es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --transformSource 'delete data.foo'
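
In the examples above, the `--transformSource` value reads like the body of a JavaScript function that receives the document as `data`. How the tool evaluates it internally is not shown here; the sketch below is one plausible way to model it with `new Function`, treating a missing `return` as "use the mutated `data`". `applyTransform` is a made-up name.

```javascript
// Hypothetical model of a transform: compile the string as a function body
// with a single `data` argument, then fall back to `data` if nothing is returned.
function applyTransform(body, data) {
  const fn = new Function('data', body);
  const result = fn(data);
  return result === undefined ? data : result;
}

const added = applyTransform('data.foo = "neat"', { id: 1 });
const addedWithReturn = applyTransform('data.foo = "neat";return data;', { id: 1 });
const removed = applyTransform('delete data.foo', { id: 1, foo: 'old' });

console.log(JSON.stringify(added));
console.log(JSON.stringify(addedWithReturn));
console.log(JSON.stringify(removed));
```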

don't include _parent in meta data

es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --transformMeta 'delete data.index._parent'

change the index name that we export

es-export-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/data.json --transformMeta 'data.index._index = "newIndex"'
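
The two `--transformMeta` examples above imply that `data` is the bulk action object, with the document metadata nested under `data.index`. Under that assumption, their effect on a sample meta object can be sketched as follows. The sample values are invented, and `new Function` is used here only to mimic the function-body convention of the CLI examples.

```javascript
// Sample meta object: the shape is inferred from the examples, the values are invented.
const meta = {
  index: { _index: 'myIndex', _type: 'doc', _id: '1', _parent: '42' },
};

// Same convention as the CLI examples: the single argument is named `data`.
const dropParent = new Function('data', 'delete data.index._parent');
const renameIndex = new Function('data', 'data.index._index = "newIndex"');

dropParent(meta);
renameIndex(meta);
console.log(JSON.stringify(meta));
```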

Usage: es-export-mappings

Options

es-export-mappings --help

Usage: es-export-mappings [options]

  Options:

    -h, --help                               output usage information
    -v, --version                            output the version number
    -u, --url <url>                          the elasticsearch url to connect to
    -f, --file <file>                        the file to write data to
    --index <index>                          ES OPTION: String, String[], Boolean — A comma-separated list of index names
    --type <type>                            ES OPTION: String, String[], Boolean — A comma-separated list of document types
    --ignoreUnavailable <ignoreUnavailable>  ES OPTION: Boolean — Whether specified concrete indices should be ignored when unavailable (missing or closed)
    --allowNoIndices <allowNoIndices>        ES OPTION: Boolean — Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)
    --expandWildcards <expandWildcards>      ES OPTION: String — Whether to expand wildcard expression to concrete indices that are open, closed or both.
    --local <local>                          ES OPTION: Boolean — Return local information, do not retrieve the state from master node (default: false)

Examples

export mappings from local db

es-export-mappings --url http://localhost:9200 --file ~/backups/elasticsearch/prod/prod.mappings.json

Usage: es-export-settings

Options

es-export-settings --help

Usage: es-export-settings [options]

  Options:

    -h, --help                               output usage information
    -v, --version                            output the version number
    -u, --url <url>                          the elasticsearch url to connect to
    -f, --file <file>                        the file to write data to
    --index <index>                          ES OPTION: String, String[], Boolean — A comma-separated list of index names
    --ignoreUnavailable <ignoreUnavailable>  ES OPTION: Boolean — Whether specified concrete indices should be ignored when unavailable (missing or closed)
    --allowNoIndices <allowNoIndices>        ES OPTION: Boolean — Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)
    --expandWildcards <expandWildcards>      ES OPTION: String — Whether to expand wildcard expression to concrete indices that are open, closed or both.
    --local <local>                          ES OPTION: Boolean — Return local information, do not retrieve the state from master node (default: false)
    --name <name>                            ES OPTION: String, String[], Boolean — The name of the settings that should be included

Examples

export settings from local db

es-export-settings --url http://localhost:9200 --file ~/backups/elasticsearch/prod/prod.settings.json

Usage: es-export-aliases

Options

es-export-aliases --help

Usage: es-export-aliases [options]

  Options:

    -h, --help         output usage information
    -v, --version      output the version number
    -u, --url <url>    the elasticsearch url to connect to
    -f, --file <file>  the file to write data to
    --index <index>    ES OPTION: String, String[], Boolean — A comma-separated list of index names
    --local <local>    ES OPTION: Boolean — Return local information, do not retrieve the state from master node (default: false)
    --name <name>      ES OPTION: String, String[], Boolean — The name of the settings that should be included

Examples

export aliases from local db

es-export-aliases --url http://localhost:9200 --file ~/backups/elasticsearch/prod/prod.aliases.json

Usage: es-import-bulk

Options

es-import-bulk --help

Usage: es-import-bulk [options]

  Options:

    -v, --version          output the version number
    -u, --url <url>        the elasticsearch url to connect to
    -f, --file <file>      the file to read data from
    -m, --max <items>      the max number of lines to process per batch (default: 20,000) (default: 20000)
    --requestTimeout <ms>  ES CLIENT OPTION: milliseconds before an HTTP request will be aborted and retried. This can also be set per request (default: 30000) (default: 30000)
    -h, --help             output usage information

Examples

import data to local db from file

es-import-bulk --url http://localhost:9200 --file ~/backups/elasticsearch/prod/rafflev1.json
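
`--max` sets how many lines of the input file are processed per batch. Assuming the file follows the Elasticsearch bulk format, where each document takes two lines (an action line followed by a source line), a batch should hold an even number of lines so that no pair is split. A sketch of that batching, with `toBatches` as a hypothetical helper and not the tool's implementation:

```javascript
// Hypothetical helper: group bulk-format lines into request bodies of at most `max` lines.
function toBatches(lines, max) {
  // Round down to an even size so an action line stays with its source line.
  const size = Math.max(2, max - (max % 2));
  const batches = [];
  for (let i = 0; i < lines.length; i += size) {
    // The bulk API expects newline-delimited JSON ending with a newline.
    batches.push(lines.slice(i, i + size).join('\n') + '\n');
  }
  return batches;
}

const lines = [
  '{"index":{"_index":"myIndex","_id":"1"}}',
  '{"foo":"a"}',
  '{"index":{"_index":"myIndex","_id":"2"}}',
  '{"foo":"b"}',
  '{"index":{"_index":"myIndex","_id":"3"}}',
  '{"foo":"c"}',
];
const batches = toBatches(lines, 5);
console.log(batches.length); // prints 2
```

With `max` set to 5, the batch size is rounded down to 4 lines, so the six lines are sent as one batch of two documents and one batch of one document.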

Usage: es-import-mappings

Options

es-import-mappings --help

Usage: es-import-mappings [options]

  Options:

    -v, --version                            output the version number
    -u, --url <url>                          the elasticsearch url to connect to
    -f, --file <file>                        the file to read data from
    --ignoreConflicts <ignoreConflicts>      ES OPTION: Boolean — Specify whether to ignore conflicts while updating the mapping (default: false)
    --timeout <timeout>                      ES OPTION: Date, Number — Explicit operation timeout
    --masterTimeout <masterTimeout>          ES OPTION: Date, Number — Specify timeout for connection to master
    --ignoreUnavailable <ignoreUnavailable>  ES OPTION: Boolean — Whether specified concrete indices should be ignored when unavailable (missing or closed)
    --allowNoIndices <allowNoIndices>        ES OPTION: Boolean — Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)
    --expandWildcards <expandWildcards>      ES OPTION: String — Whether to expand wildcard expression to concrete indices that are open, closed or both.
    -h, --help                               output usage information

Examples

import mappings to local db

es-import-mappings --url http://localhost:9200 --file ~/backups/elasticsearch/prod/prod.mappings.json

Usage: es-import-settings

Options

es-import-settings --help

Usage: es-import-settings [options]

  Options:

    -v, --version                            output the version number
    -u, --url <url>                          the elasticsearch url to connect to
    -f, --file <file>                        the file to read data from
    --masterTimeout <masterTimeout>          ES OPTION: Date, Number — Specify timeout for connection to master
    --ignoreUnavailable <ignoreUnavailable>  ES OPTION: Boolean — Whether specified concrete indices should be ignored when unavailable (missing or closed)
    --allowNoIndices <allowNoIndices>        ES OPTION: Boolean — Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)
    --expandWildcards <expandWildcards>      ES OPTION: String — Whether to expand wildcard expression to concrete indices that are open, closed or both.
    -h, --help                               output usage information

Examples

import settings to local db

es-import-settings --url http://localhost:9200 --file ~/backups/elasticsearch/prod/prod.settings.json

Usage: es-import-aliases

Options

es-import-aliases --help

Usage: es-import-aliases [options]

  Options:

    -v, --version                    output the version number
    -u, --url <url>                  the elasticsearch url to connect to
    -f, --file <file>                the file to read data from
    --timeout <timeout>              ES OPTION: Date, Number — Explicit operation timeout
    --masterTimeout <masterTimeout>  ES OPTION: Date, Number — Specify timeout for connection to master
    -h, --help                       output usage information

Examples

import aliases to local db

es-import-aliases --url http://localhost:9200 --file ~/backups/elasticsearch/prod/prod.aliases.json

Other Elasticsearch Tools

Imports / Exports

Running tests

Unit tests can be run via:

npm run test

The integration tests hit an Elasticsearch server at localhost:20202. To start the server, install Docker, then run:

docker-compose up

Once the server is running, you can run the integration tests via:

npm run test:integration

License

Copyright (c) 2014 skratchdot. Licensed under the MIT license.
