Benchmark and Optimize Raft #2975

Closed
manishrjain opened this issue Feb 5, 2019 · 2 comments

Labels: area/performance (Performance related issues.) · kind/maintenance (Maintenance tasks, such as refactoring, with no impact in features.) · priority/P1 (Serious issue that requires eventual attention; can wait a bit.) · status/accepted (We accept to investigate/work on it.)

Comments

@manishrjain
Contributor

I'm seeing that after proposing a mutation, it takes over 130ms just to see it back in CommittedEntries. We should figure out why it takes this long for the mutation to come back via Raft. Where is the slowdown: SaveToStorage, the comms, or something else?

(Screenshot: 2019-02-04-174622_1243x422_scrot)
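A minimal sketch, not Dgraph's actual Raft loop, of how one might instrument the propose-to-commit round trip with the etcd raft library. The `proposalKey` helper and the 100ms tick are assumptions for illustration, and the import path depends on the etcd version in use; a real node would also persist HardState, apply snapshots, and send `rd.Messages` to peers.

```go
package raftbench

import (
	"context"
	"log"
	"sync"
	"time"

	"go.etcd.io/etcd/raft"
	"go.etcd.io/etcd/raft/raftpb"
)

// proposalKey is hypothetical: in a real server the proposal payload would
// carry a unique id that could be recovered here to match entries back to
// their proposals.
func proposalKey(data []byte) string { return string(data) }

func runLoop(n raft.Node, storage *raft.MemoryStorage) {
	var mu sync.Mutex
	pending := map[string]time.Time{} // proposal key -> time it was proposed

	// propose would be called from the mutation path.
	propose := func(data []byte) {
		mu.Lock()
		pending[proposalKey(data)] = time.Now()
		mu.Unlock()
		if err := n.Propose(context.Background(), data); err != nil {
			log.Printf("propose: %v", err)
		}
	}
	_ = propose

	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			n.Tick()
		case rd := <-n.Ready():
			saveStart := time.Now()
			// Stand-in for the SaveToStorage step: persist new entries.
			if err := storage.Append(rd.Entries); err != nil {
				log.Printf("append: %v", err)
			}
			log.Printf("SaveToStorage took %s", time.Since(saveStart))

			// Measure how long each proposal took to come back committed.
			for _, ent := range rd.CommittedEntries {
				if ent.Type != raftpb.EntryNormal || len(ent.Data) == 0 {
					continue
				}
				mu.Lock()
				if start, ok := pending[proposalKey(ent.Data)]; ok {
					log.Printf("propose -> committed latency: %s", time.Since(start))
					delete(pending, proposalKey(ent.Data))
				}
				mu.Unlock()
			}
			n.Advance()
		}
	}
}
```

Splitting the timing this way would show whether the 130ms is spent persisting entries, in the network round trip, or in the application loop itself.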

@manishrjain manishrjain added optimization priority/P1 Serious issue that requires eventual attention (can wait a bit) labels Feb 5, 2019
@manishrjain
Contributor Author

One possibility is that raft.Advance is slow and needs optimization -- in particular, we could add a cache in front of the LastEntry API call in raft storage.
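A minimal sketch of the caching idea, assuming a hypothetical DiskStorage type and diskLastIndex helper standing in for the Badger-backed read; this is not the actual raftwal code. The point is that the raft library consults the last index/entry very frequently, so keeping it in memory and refreshing it on every append avoids repeated disk lookups.

```go
package raftwal

import (
	"sync/atomic"

	"go.etcd.io/etcd/raft/raftpb"
)

// DiskStorage is a hypothetical stand-in for a Badger-backed raft storage.
type DiskStorage struct {
	lastIdx uint64 // cached index of the last raft entry; 0 means "not cached yet"
}

// LastIndex is the hot call: serve it from the in-memory cache and only fall
// back to disk when the cache is cold.
func (s *DiskStorage) LastIndex() (uint64, error) {
	if idx := atomic.LoadUint64(&s.lastIdx); idx > 0 {
		return idx, nil
	}
	idx, err := s.diskLastIndex()
	if err != nil {
		return 0, err
	}
	atomic.StoreUint64(&s.lastIdx, idx)
	return idx, nil
}

// Append persists new entries and refreshes the cache, so subsequent
// LastIndex calls never touch disk.
func (s *DiskStorage) Append(entries []raftpb.Entry) error {
	if len(entries) == 0 {
		return nil
	}
	// ... write the entries to Badger here ...
	atomic.StoreUint64(&s.lastIdx, entries[len(entries)-1].Index)
	return nil
}

// diskLastIndex is a placeholder for iterating Badger to find the last entry.
func (s *DiskStorage) diskLastIndex() (uint64, error) {
	return 1, nil
}
```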

manishrjain added a commit that referenced this issue Feb 11, 2019
I've spent the last few days looking at how to optimize the live mutation path in the Dgraph server. While trying many things in the server (past commits included), I realized my server-side optimizations were not improving things much, with throughput saturating at 20-30K NQuads/sec.

It turns out the live loader was causing the saturation: the XID to UID assigner was the bottleneck keeping throughput stagnant, despite the server being underutilized.

This PR fixes that by optimizing the assigner. In particular, I've removed the slow LRU cache, added a buffer to the `newRanges` channel so we always have a range handy when the current one runs out, made passing the Badger DB instance optional so we can avoid disk writes when they aren't required, and made other optimizations around how we lock. I also added benchmarks for the assigner, which show each allocation (tested via a parallel benchmark) takes 350 ns/op on my desktop. A rough sketch of the sharded-map idea follows the change list below.

With these changes, the live loader throughput jumps to 100K-120K NQuads/sec on my desktop. In particular, pre-assigning UIDs to the RDF/JSON file yields maximum throughput. I can load 140M friend graph RDFs in 25 mins.

Helps with #2975 .

Changes:
* Work on optimizing XidToUid map.
* Add the test and benchmark for xid to uid map
* Working code with decreased memory usage. Includes a new BumpUp API.
* Working live loader, which can optionally just keep all the mapping in memory.
* Adding shards back to XidMap speeds up operations by a huge factor. Benchmark shows each allocation is 300ns.
* Make BumpTo much faster by calling Zero directly, instead of looping through the newRanges channel.
* Improve how BumpTo() happens by using a maxSeenUid variable.
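
A rough sketch of the sharded xid-to-uid map described above, assuming hypothetical names (`uidRange`, `AssignUid`, `AddRange`) and a caller that keeps the ranges channel filled from Zero; it is not the actual XidMap code in this PR.

```go
package xidmap

import (
	"hash/fnv"
	"sync"
)

// uidRange is a hypothetical block of UIDs handed out by Zero: [Start, End].
type uidRange struct{ Start, End uint64 }

type shard struct {
	sync.Mutex
	uids map[string]uint64
	cur  uidRange // current range this shard allocates from
	next uint64
}

// XidMap shards the xid -> uid mapping so concurrent assignments mostly
// contend on different locks. newRanges is buffered so a fresh range is
// usually ready the moment a shard runs out.
type XidMap struct {
	shards    []*shard
	newRanges chan uidRange
}

func New(numShards, rangeBuffer int) *XidMap {
	xm := &XidMap{
		shards:    make([]*shard, numShards),
		newRanges: make(chan uidRange, rangeBuffer),
	}
	for i := range xm.shards {
		xm.shards[i] = &shard{uids: make(map[string]uint64)}
	}
	return xm
}

// AddRange is called by whatever goroutine fetches UID ranges from Zero.
func (xm *XidMap) AddRange(r uidRange) { xm.newRanges <- r }

// AssignUid returns the uid for xid, allocating a new one if needed.
func (xm *XidMap) AssignUid(xid string) uint64 {
	h := fnv.New32a()
	h.Write([]byte(xid))
	sh := xm.shards[h.Sum32()%uint32(len(xm.shards))]

	sh.Lock()
	defer sh.Unlock()
	if uid, ok := sh.uids[xid]; ok {
		return uid
	}
	if sh.next == 0 || sh.next > sh.cur.End {
		sh.cur = <-xm.newRanges // blocks only if the buffer is empty
		sh.next = sh.cur.Start
	}
	uid := sh.next
	sh.next++
	sh.uids[xid] = uid
	return uid
}
```

Keeping the ranges channel buffered plays the role described in the PR text: a shard that exhausts its range can usually pull the next one immediately instead of waiting on a round trip to Zero.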
@campoy campoy added area/performance Performance related issues. and removed optimization labels May 31, 2019
dna2github pushed a commit to dna2fork/dgraph that referenced this issue Jul 19, 2019
@shekarm shekarm added the status/accepted We accept to investigate/work on it. label Feb 18, 2020
@MichelDiz MichelDiz added the kind/maintenance Maintenance tasks, such as refactoring, with no impact in features. label Feb 26, 2020
@minhaj-shakeel
Contributor

GitHub issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.

