[[indexing-performance]]
=== Indexing Performance Tips

If you are in an indexing-heavy environment,((("indexing", "performance tips")))((("post-deployment", "indexing performance tips"))) such as indexing infrastructure
logs, you may be willing to sacrifice some search performance for faster indexing
rates. In these scenarios, searches tend to be relatively rare and performed
by people internal to your organization. They are willing to wait several
seconds for a search, as opposed to a consumer facing a search that must
return in milliseconds.

Because of this unique position, certain trade-offs can be made
that will increase your indexing performance.

.These Tips Apply Only to Elasticsearch 1.3+
****
This book is written for the most recent versions of Elasticsearch, although much
of the content works on older versions.

The tips presented in this section, however, are _explicitly_ for version 1.3+. There
have been multiple performance improvements and bugs fixed that directly impact
indexing. In fact, some of these recommendations will _reduce_ performance on
older versions because of the presence of bugs or performance defects.
****

==== Test Performance Scientifically

Performance testing is always difficult, so try to be as scientific as possible
in your approach.((("performance testing")))((("indexing", "performance tips", "performance testing"))) Randomly fiddling with knobs and turning on ingestion is not
a good way to tune performance. If there are too many _causes_, it is impossible
to determine which one had the best _effect_. A reasonable approach to testing is as follows:

1. Test performance on a single node, with a single shard and no replicas.
2. Record performance under 100% default settings so that you have a baseline to
measure against.
3. Make sure performance tests run for a long time (30+ minutes) so you can
evaluate long-term performance, not short-term spikes or latencies. Some events
(such as segment merging, and GCs) won't happen right away, so the performance
profile can change over time.
4. Begin making single changes to the baseline defaults. Test these rigorously,
and if performance improvement is acceptable, keep the setting and move on to the
next one.
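
As a starting point for step 1, you might create the test index explicitly with a
single shard and no replicas; the index name `test_index` below is only a placeholder:

[source,js]
----
PUT /test_index
{
    "settings" : {
        "number_of_shards" :   1,
        "number_of_replicas" : 0
    }
}
----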

==== Using and Sizing Bulk Requests

This should be fairly obvious, but use bulk indexing requests for optimal performance.((("indexing", "performance tips", "bulk requests, using and sizing")))((("bulk API", "using and sizing bulk requests")))
Bulk sizing is dependent on your data, analysis, and cluster configuration, but
a good starting point is 5–15 MB per bulk. Note that this is physical size.
Document count is not a good metric for bulk size. For example, if you are
indexing 1,000 documents per bulk, keep the following in mind:

- 1,000 documents at 1 KB each is 1 MB.
- 1,000 documents at 100 KB each is 100 MB.

Those are drastically different bulk sizes. Bulks need to be loaded into memory
at the coordinating node, so it is the physical size of the bulk that is more
important than the document count.

Start with a bulk size around 5–15 MB and slowly increase it until you do not
see performance gains anymore. Then start increasing the concurrency of your
bulk ingestion (multiple threads, and so forth).
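
For reference, a bulk request body is a series of newline-delimited action/metadata
and document pairs; the index and type names below are placeholders. When tuning
toward the 5–15 MB target, measure the serialized size of this body rather than
counting documents:

[source,js]
----
POST /_bulk
{ "index" : { "_index" : "logs", "_type" : "event" } }
{ "message" : "first log line" }
{ "index" : { "_index" : "logs", "_type" : "event" } }
{ "message" : "second log line" }
----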

Monitor your nodes with Marvel and/or tools such as `iostat`, `top`, and `ps` to see
when resources start to bottleneck. If you start to receive `EsRejectedExecutionException`,
your cluster can no longer keep up: at least one resource has reached capacity. Either reduce concurrency, provide more of the limited resource (such as switching from spinning disks to SSDs), or add more nodes.
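
One convenient way to watch for rejections is the `_cat` thread-pool endpoint, which
lists active, queued, and rejected counts per thread pool:

[source,js]
----
GET /_cat/thread_pool?v
----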

[NOTE]
====
When ingesting data, make sure bulk requests are round-robined across all your
data nodes. Do not send all requests to a single node, since that single node
will need to store all the bulks in memory while processing.
====

==== Storage

Disks are usually the bottleneck of any modern server. Elasticsearch heavily uses disks, and the more throughput your disks can handle, the more stable your nodes will be. Here are some tips for optimizing disk I/O:

- Use SSDs. As mentioned elsewhere, ((("storage")))((("indexing", "performance tips", "storage")))they are superior to spinning media.
- Use RAID 0. Striped RAID will increase disk I/O, at the obvious expense of
potential failure if a drive dies. Don't use mirrored or parity RAID, since
replicas provide that functionality.
- Alternatively, use multiple drives and allow Elasticsearch to stripe data across
them via multiple `path.data` directories (see the sketch after this list).
- Do not use remote-mounted storage, such as NFS or SMB/CIFS. The latency introduced
here is antithetical to performance.
- If you are on EC2, beware of EBS. Even the SSD-backed EBS options are often slower
than local instance storage.
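
For the multiple-drives tip, `path.data` accepts a list of directories in
`elasticsearch.yml`; the mount points below are purely illustrative:

[source,yaml]
----
path.data: /mnt/disk1/es,/mnt/disk2/es,/mnt/disk3/es
----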

[[segments-and-merging]]
==== Segments and Merging

Segment merging is computationally expensive,((("indexing", "performance tips", "segments and merging")))((("merging segments")))((("segments", "merging"))) and can eat up a lot of disk I/O.
Merges are scheduled to operate in the background because they can take a long
time to finish, especially large segments. This is normally fine, because
large segment merges are relatively rare.

But sometimes merging falls behind the ingestion rate. If this happens, Elasticsearch
will automatically throttle indexing requests to a single thread. This prevents
a _segment explosion_ problem, in which hundreds of segments are generated before
they can be merged. Elasticsearch will log `INFO`-level messages stating `now
throttling indexing` when it detects merging falling behind indexing.

Elasticsearch defaults here are conservative: you don't want search performance
to be impacted by background merging. But sometimes (especially on SSD, or logging
scenarios), the throttle limit is too low.

The default is 20 MB/s, which is a good setting for spinning disks. If you have
SSDs, you might consider increasing this to 100–200 MB/s. Test to see what works
for your system:

[source,js]
----
PUT /_cluster/settings
{
    "persistent" : {
        "indices.store.throttle.max_bytes_per_sec" : "100mb"
    }
}
----

If you are doing a bulk import and don't care about search at all, you can disable
merge throttling entirely. This will allow indexing to run as fast as your
disks will allow:

[source,js]
----
PUT /_cluster/settings
{
    "transient" : {
        "indices.store.throttle.type" : "none" <1>
    }
}
----
<1> Setting the throttle type to `none` disables merge throttling entirely. When
you are done importing, set it back to `merge` to reenable throttling.

If you are using spinning media instead of SSD, you need to add this to your
`elasticsearch.yml`:

[source,yaml]
----
index.merge.scheduler.max_thread_count: 1
----

Spinning media has a harder time with concurrent I/O, so we need to decrease
the number of threads that can concurrently access the disk per index. This setting
will allow `max_thread_count + 2` threads to operate on the disk at one time,
so a setting of `1` will allow three threads.

For SSDs, you can ignore this setting. The default is
`Math.min(3, Runtime.getRuntime().availableProcessors() / 2)`, which works well
for SSD.

Finally, you can increase `index.translog.flush_threshold_size` from the default
512 MB to something larger, such as 1 GB. This allows larger segments to accumulate
in the translog before a flush occurs. By letting larger segments build, you
flush less often, and the larger segments merge less often. All of this adds up
to less disk I/O overhead and better indexing rates. Of course, you will need
the corresponding amount of heap memory free to accumulate the extra buffering
space, so keep that in mind when adjusting this setting.
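
As a sketch, the translog threshold is an index-level setting and should be
adjustable through the index settings API; the index name `my_index` below is
only a placeholder:

[source,js]
----
PUT /my_index/_settings
{
    "index.translog.flush_threshold_size" : "1gb"
}
----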

==== Other

Finally, there are some other considerations to keep in mind:

- If you don't need near real-time accuracy on your search results, consider
dropping the `index.refresh_interval` of((("indexing", "performance tips", "other considerations")))((("refresh_interval setting"))) each index to `30s`. If you are doing
a large import, you can disable refreshes by setting this value to `-1` for the
duration of the import. Don't forget to reenable it when you are finished! A
combined settings sketch for this and the replica tip below appears after this list.

- If you are doing a large bulk import, consider disabling replicas by setting
`index.number_of_replicas: 0`.((("replicas, disabling during large bulk imports"))) When documents are replicated, the entire document
is sent to the replica node and the indexing process is repeated verbatim. This
means each replica will perform the analysis, indexing, and potentially merging
process.
+
In contrast, if you index with zero replicas and then enable replicas when ingestion
is finished, the recovery process is essentially a byte-for-byte network transfer.
This is much more efficient than duplicating the indexing process.

- If you don't have a natural ID for each document, use Elasticsearch's auto-ID
functionality.((("id", "auto-ID functionality of Elasticsearch"))) It is optimized to avoid version lookups, since the autogenerated
ID is unique.

- If you are using your own ID, try to pick an ID that is http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html[friendly to Lucene]. ((("UUIDs (universally unique identifiers)"))) Examples include zero-padded
sequential IDs, UUID-1, and nanotime; these IDs have consistent, sequential
patterns that compress well. In contrast, IDs such as UUID-4 are essentially
random, which offer poor compression and slow down Lucene.
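
As a sketch of the first two tips above, both `refresh_interval` and
`number_of_replicas` are dynamic index settings, so they can be lowered for the
duration of an import and restored afterward; the index name `my_index` is only
a placeholder:

[source,js]
----
PUT /my_index/_settings <1>
{
    "index" : {
        "refresh_interval" : "-1",
        "number_of_replicas" : 0
    }
}

PUT /my_index/_settings <2>
{
    "index" : {
        "refresh_interval" : "30s",
        "number_of_replicas" : 1
    }
}
----
<1> Before the import: disable refresh and drop replicas to zero.
<2> After the import: restore the refresh interval (here to `30s`) and reenable replicas.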