Reindexing from ES6 to ES7

@sysadmin1139 released this 20 Apr 00:33
ee3ba2c

This release updates the reindexing framework to use the Reindex API available in Elasticsearch 5.x and higher. This replaces the previous brute-force method: the cluster now does all of the work of fetching and writing documents. The framework uses the Tasks API to track each reindex, and waits for it to finish before moving on to the next index to chew on.
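
The flow above can be sketched roughly as follows. This is a minimal illustration in Python, not the framework's own code. The helper names (`build_reindex_body`, `wait_for_task`) and the injected `get_task` callable are hypothetical. It assumes the reindex was started with `POST _reindex?wait_for_completion=false`, which returns a task ID, and that `get_task` wraps `GET _tasks/<task_id>`, whose response carries a `completed` flag.

```python
import time
from typing import Any, Callable, Dict


def build_reindex_body(source_index: str, dest_index: str) -> Dict[str, Any]:
    """Request body for POST _reindex: copy every document from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def wait_for_task(
    get_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the Tasks API until the task reports completed, then return its status.

    get_task is a hypothetical callable standing in for GET _tasks/<task_id>;
    injecting it keeps this sketch independent of any particular HTTP client.
    """
    while True:
        status = get_task(task_id)
        if status.get("completed"):
            return status
        sleep(poll_seconds)
```

Only after `wait_for_task` returns would a caller start the next index, which keeps a single reindex running on the cluster at a time.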

Some other key changes:

  • Added the ability to tune the shard-width based on a target shard-size. Earlier Elasticsearch versions defaulted to five primary shards per index out of the box; Elasticsearch 7 changed that default to one. Reindexing is a great time to change how wide your shards are, and this lets you do that.
  • Added support for keyword mappings. The keyword type, introduced in Elasticsearch 5.0, indexes a string as a single term without running an analyzer on it. Use it for fields where you only do full-string matches.
  • Added support for index-routing attributes. If you use attribute-based routing to control which nodes hold an index's shards, reindexed indexes now get your routing attributes copied over.
  • Added support for copying the mapping field-limit. Elasticsearch enforces a ceiling on the number of fields allowed in an index (the index.mapping.total_fields.limit setting), so the framework now copies that limit as part of reindexing.
  • Merge mappings down to _doc due to the mapping-type removal work. Elasticsearch 7 removes support for multiple mapping types, so the migration is easier if you merge your mapping types into _doc, which this framework does for you. Elasticsearch 7 also removes support for the _default_ mapping, which the reindexing framework cleans up for you.
  • Revised the README to address changed performance. The README now has more guidance to help you figure out how your cluster will react to reindexing.
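
To make the shard-width, settings-copy, and mapping-merge items above concrete, here is a rough Python sketch. It is not the framework's implementation, and all function names are hypothetical. It assumes source settings arrive in flat form (as from `GET <index>/_settings?flat_settings=true`) and that source mappings are a dict keyed by mapping-type name. The merge is shallow and only looks at top-level fields.

```python
from typing import Any, Dict


def shard_count_for(index_bytes: int, target_shard_bytes: int) -> int:
    """Number of primary shards so each shard is at most target_shard_bytes."""
    if target_shard_bytes <= 0:
        raise ValueError("target_shard_bytes must be positive")
    # Integer ceiling division; always keep at least one shard.
    return max(1, (index_bytes + target_shard_bytes - 1) // target_shard_bytes)


def carry_over_settings(flat_source: Dict[str, Any], shards: int) -> Dict[str, Any]:
    """Keep allocation-routing attributes and the field limit; set the new shard count."""
    kept = {
        key: value
        for key, value in flat_source.items()
        if key.startswith("index.routing.allocation.")
        or key == "index.mapping.total_fields.limit"
    }
    kept["index.number_of_shards"] = shards
    return kept


def merge_to_doc(mappings: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge the properties of every mapping type into one _doc type.

    The _default_ type is dropped. Two types defining the same field
    differently is treated as an error rather than resolved silently.
    """
    merged: Dict[str, Any] = {}
    for type_name, body in mappings.items():
        if type_name == "_default_":
            continue
        for field, definition in body.get("properties", {}).items():
            if field in merged and merged[field] != definition:
                raise ValueError(f"conflicting definitions for field {field!r}")
            merged[field] = definition
    return {"_doc": {"properties": merged}}
```

For example, `shard_count_for(100 * 1024**3, 30 * 1024**3)` suggests 4 shards for a 100 GiB index with a 30 GiB target shard size.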