-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a new index setting to skip recovery source when synthetic source is enabled #114618
Add a new index setting to skip recovery source when synthetic source is enabled #114618
Conversation
Hi @jimczi, I've created a changelog YAML for you. |
@martijnvg, here are the complete results we discussed offline. I've disabled the recovery source only for logsdb to allow for flexibility if needed. This is just a draft, so nothing is integrated yet. We begin with a benchmark to evaluate the various options. |
Nice work @jimczi! I need some time to take a good look at this to see how we can move this forward, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look Jim and this looks very exciting!
Some thoughts:
- Would be good to get feedback of @elastic/es-distributed-indexing team around this draft PR.
- The micro benchmark is a good start to asses how reading translog operations from Lucene index are affected. Next step is that we see how performance or shard recovery and cross cluster replication is affected by this change. I'm not sure what benchmarking tools we have for this. Maybe we can just start by benchmarkmarking shard changes action?
- The current micro benchmark results suggest to me that not storing and indexing recovery source at the cost of (sometimes) slower reading of translog operations is a reasonable tradeoff.
- I think If we want to proceed with a change like this we properly want to enable this first behind a feature flag.
- I think we should always have a setting that acts as an escape hatch and falls back to
LuceneChangesSnapshot
implementation.
server/src/main/java/org/elasticsearch/index/engine/LuceneBatchChangesSnapshot.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/LuceneBatchChangesSnapshot.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/elasticsearch/index/mapper/SortedNumericDocValuesSyntheticFieldLoader.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/LuceneBatchChangesSnapshot.java
Outdated
Show resolved
Hide resolved
@jimczi I benchmarked this change using the elastic/logs track, and the results are great. I think we don't need a high default batch_size; 256 or lower should be sufficient. The approach in the PR looks promising. Could we update it to be ready for review? Thank you! |
@dnhatn @martijnvg The PR is now ready for a more thorough review. The latest changes incorporate the new behavior as an index setting. We’re still discussing whether this should be the default for synthetic sources or just for logsdb, so I haven’t finalized that aspect in the code yet. When the new setting is enabled (only at index creation), synthetic sources are retrieved in batches using the new The maximum memory size is currently set to 4MB, but we could adjust it based on the available JVM memory on the node if needed. |
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
…t' into lucene_changes_synthetic_snapshot
…t' into lucene_changes_synthetic_snapshot
@elasticmachine run Elasticsearch Serverless Checks |
… is enabled (elastic#114618) This change adds a new undocumented index settings that allows to use synthetic source for recovery and CCR without storing a recovery source.
The new method with the overloaded chunk size should be used instead. Relates elastic#114618
The new method with the overloaded chunk size should be used instead. Relates #114618
This PR draft proposes a method to skip the recovery source when LogsDB is enabled.
The recovery source is only used for peer recovery so two alternatives are introduced to handle the removal in
Translog.Snapshot
:batchSize
).The new
LuceneBatchChangesSnapshot
holds batchSize operations (including source and metadata fields) in memory simultaneously, unlike the currentLuceneChangesSnapshot
, which keeps only one operation at a time.The batch size is configurable to limit the memory usage.
Benchmark Overview
A benchmark comparing these two methods is added to the PR. It includes sources extracted from three Elastic integrations:
logs-kafka-log
)logs-endpoint-events-process
)logs-endpoint-events-security
)The benchmark retrieves 10,000 documents for recovery through the
Translog.Snapshot
. Document throughput per millisecond is recorded for the following recovery strategies:The benchmark is executed with two scenarios:
While sequential tests are included for completeness, these are not expected in real-world scenarios, especially in LogsDB mode where index sorting is used. Therefore, only the non-sequential results are considered relevant.
Non-Sequential Benchmark Results
Kafka Logs
The default mode shows a much higher throughput (22.5 ops/ms) compared to LogsDB (7.5 ops/ms). However, using batch sizes in the LogsDB mode significantly improves throughput, with the highest results occurring at batch sizes of 512 and 1024.
Endpoint Events Process Logs
The default mode for endpoint events process logs performs better than LogsDB, as seen with Kafka logs. Small batch sizes improves throughput significantly, reaching 12.2 ops/ms with a batch size of 1024.
Endpoint Events Security Logs
Endpoint events security logs follow a similar pattern as the other logs. Default mode performs best, but batching closes the gap significantly.
Appendix: Sequential Benchmark Results
Sequential execution should rarely occur in practice, particularly in LogsDB mode where index sorting is typically enabled. For completeness, the sequential results are reported but are not relevant to the typical use case.
Kafka Logs (Sequential)
Endpoint Events Process Logs (Sequential)
Endpoint Events Security Logs (Sequential)
Relates to #116726