HDDS-11949. Update Recon OM Sync default configs #7600

Open
wants to merge 1 commit into
base: master

devmadhuu
Contributor

What changes were proposed in this pull request?

This PR updates the default values of the Recon OM sync configs.

The new defaults are recommended based on a recent performance test and an evaluation of the Recon OM sync process and the execution speed of its underlying tasks.

Recommended default values:

ozone.recon.om.snapshot.task.interval.delay -> 5s
recon.om.delta.update.limit -> 50000
recon.om.delta.update.loop.limit -> 50

These are the recommended defaults for a high-write workload of approx 5K TPS, chosen to achieve near-real-time sync between Recon and OM data.
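
As a rough illustration of what these defaults imply, the sketch below renders the three values as Hadoop-style `<property>` elements (the layout used by `ozone-site.xml`) and computes an upper bound on the events fetched in one sync cycle. The property names and values come from the list above. The helper names `render_properties` and `max_events_per_sync` are hypothetical, and the bound assumes that one sync cycle runs at most `recon.om.delta.update.loop.limit` fetch loops of at most `recon.om.delta.update.limit` events each, which this PR does not state explicitly.

```python
from xml.sax.saxutils import escape

# Default values proposed in this PR (names and values copied from the list above).
RECON_OM_SYNC_DEFAULTS = {
    "ozone.recon.om.snapshot.task.interval.delay": "5s",
    "recon.om.delta.update.limit": "50000",
    "recon.om.delta.update.loop.limit": "50",
}


def render_properties(overrides):
    """Render key/value pairs as Hadoop-style <property> elements."""
    blocks = []
    for name, value in overrides.items():
        blocks.append(
            "<property>\n"
            f"  <name>{escape(name)}</name>\n"
            f"  <value>{escape(value)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)


def max_events_per_sync(update_limit, loop_limit):
    # ASSUMPTION: a sync cycle performs at most `loop_limit` fetches,
    # each returning at most `update_limit` OM DB events.
    return update_limit * loop_limit


site_xml_fragment = render_properties(RECON_OM_SYNC_DEFAULTS)
events_cap = max_events_per_sync(50_000, 50)

print(site_xml_fragment)
print(events_cap)  # 2500000
```

Under that assumption, a single sync cycle can pull up to 2.5M events.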

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11949

How was this patch tested?

Tested manually with the existing JUnit test cases.

@devmadhuu devmadhuu marked this pull request as draft December 19, 2024 09:28
@devmadhuu devmadhuu marked this pull request as ready for review December 19, 2024 09:29
@adoroszlai adoroszlai changed the title HDDS-11949. Ozone Recon - Update Recon OM Sync default configs and docker configs HDDS-11949. Update Recon OM Sync default configs Dec 19, 2024
@errose28
Contributor

Thanks for adding this @devmadhuu. Could you share the benchmarks that were used to arrive at these numbers with the community?

@devmadhuu
Contributor Author

devmadhuu commented Dec 20, 2024

> Thanks for adding this @devmadhuu. Could you share the benchmarks that were used to arrive at these numbers with the community?

Yes, sure. Thanks @errose28. We ran the following performance benchmark on the Recon OM sync process flow.

The test workload ran at 5K TPS (create/commit operations) on the cluster:

ozone freon ockrw -n 10000000 -t 100 --percentage-read 0 --size 0 -r 1000000 -v voltest -b buckettest -p performanceTest

With the following configs:

Recon heap allocation - 31 GB
ozone.recon.om.snapshot.task.interval.delay - 5s
recon.om.delta.update.limit - 50k
recon.om.delta.update.loop.limit - 50

The test ran for 39 mins and generated approx 10M OM DB events. The 5K create/commit key operations per sec produced a mix of the following events:

- insert in open key (PUT)
- update bucket info (UPDATE)
- delete from open key (DELETE)
- update bucket info (UPDATE)
- insert to key (PUT)
- insert to delete key (PUT)
- removal from deleted key table (DELETE)
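
A quick way to relate the operation rate to an event rate is sketched below. It assumes that every create/commit operation emits each of the seven events listed above exactly once; that is an assumption made for illustration, not something measured in the test. The constant names are hypothetical.

```python
OPS_PER_SEC = 5_000   # workload rate from the test description
EVENTS_PER_OP = 7     # ASSUMPTION: one of each listed event per create/commit

# OM DB events generated per minute under the assumption above.
events_per_min = OPS_PER_SEC * EVENTS_PER_OP * 60

print(events_per_min)  # 2100000
```

Under that assumption the workload produces 2.1M events per min, which lines up with the per-minute rate reported in the observations.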

Further observations:

  • The workload generated approx 2.1M OM DB events per min for the whole duration of the test run.
    
  • No JVM pauses or GC pauses were detected.
    
  • Recon OM data lagged by approx 330k OM DB events in one sync interval, so the sync stayed near real time while the test workload was in progress.
    
  • Recon OM sync is divided into the following sub-tasks:
    - Get from OM
    - Perform DB update in batch
    - Prepare events based on the DB update in batch (these 3 tasks together took 16 secs)
    - Process those DB events in each of the 4 background tasks concurrently (30 secs)
    

So, based on the above perf stats, Recon was processing 1.4M events per min end to end while OM was generating 2.1M per min. Grafana metrics also confirmed this data. Given these numbers, increasing the delta update limit further will not help much, because the processing time will increase and each run is still followed by a delay of 1 min. We therefore need to reduce the delay to 5s, so that the lag between Recon and OM stays at a minimum, in the range of just 330k events (one sync cycle will also catch up on this once the test workload finishes).
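
The effect of the delay can be illustrated with a simplified model. It assumes that the two measured phases run back to back (16 s + 30 s per cycle), that this working time does not change with the delay, and that OM keeps generating 2.1M events per min throughout. The function names are hypothetical, and the model is a sketch of the reasoning rather than a description of how Recon schedules its work.

```python
OM_EVENTS_PER_MIN = 2_100_000  # generation rate reported in the test
FETCH_AND_PREPARE_SECS = 16    # get from OM + DB update + prepare events
TASK_PROCESSING_SECS = 30      # background tasks processing the events


def backlog_during_delay(delay_secs):
    """Events OM generates while Recon waits between two sync runs."""
    return OM_EVENTS_PER_MIN * delay_secs // 60


def busy_fraction(delay_secs):
    """Share of wall-clock time Recon spends syncing instead of waiting.

    ASSUMPTION: each cycle does a fixed 46 s of work, then sleeps.
    """
    work_secs = FETCH_AND_PREPARE_SECS + TASK_PROCESSING_SECS
    return work_secs / (work_secs + delay_secs)


for delay in (60, 5):
    print(delay, backlog_during_delay(delay), round(busy_fraction(delay), 2))
```

In this model a 1 min delay lets 2.1M new events accumulate while Recon is idle and leaves it working about 43% of the time. A 5s delay cuts the idle backlog to 175k events and raises the busy share to about 90%.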

Our next task is to work out how to optimize the processing speed of the background tasks. The scope is limited by the nature of the data: Recon's background tasks must process all events in sequence and cannot process them concurrently. We still need to look at what can be optimized, including whether each background task can run parts of its processing logic for a single event concurrently.
Raised HDDS-11688 and HDDS-11953 to handle further optimization.
