Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record perf contexts to the write procedure #227

Merged
merged 6 commits into from
Jun 24, 2022

Conversation

sticnarf
Copy link
Contributor

@sticnarf sticnarf commented Jun 22, 2022

This PR tries to add RocksDB-like perf contexts to Raft Engine.

Now, I only add these statistics:

  • log_populating_nanos
  • write_leader_wait_nanos
  • log_write_nanos
  • log_rotate_nanos
  • log_sync_nanos
  • write_apply_nanos

I'm willing to know if there are other important metrics that is worth recording here.

@sticnarf sticnarf requested a review from tabokie June 22, 2022 06:16
@codecov
Copy link

codecov bot commented Jun 22, 2022

Codecov Report

Merging #227 (f5a6920) into master (a950ec5) will increase coverage by 0.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
+ Coverage   97.13%   97.22%   +0.08%     
==========================================
  Files          30       30              
  Lines        9216     9368     +152     
==========================================
+ Hits         8952     9108     +156     
+ Misses        264      260       -4     
Impacted Files Coverage Δ
src/lib.rs 100.00% <ø> (ø)
src/engine.rs 97.58% <100.00%> (+0.07%) ⬆️
src/file_pipe_log/log_file.rs 99.28% <100.00%> (ø)
src/file_pipe_log/pipe.rs 99.15% <100.00%> (+<0.01%) ⬆️
src/log_batch.rs 97.04% <100.00%> (+0.44%) ⬆️
src/metrics.rs 98.21% <100.00%> (+0.55%) ⬆️
src/purge.rs 97.43% <100.00%> (ø)
src/write_barrier.rs 100.00% <100.00%> (ø)
tests/failpoints/test_engine.rs 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a950ec5...f5a6920. Read the comment docs.

src/write_barrier.rs Outdated Show resolved Hide resolved
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@tabokie
Copy link
Member

tabokie commented Jun 23, 2022

@sticnarf It's getting a little messy, I'm thinking something like this:

It basically decouples perf context from prometheus with a new interface fn observe(t: &mut Instant) -> Duration.

src/metrics.rs Outdated Show resolved Hide resolved
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf
Copy link
Contributor Author

sticnarf commented Jun 23, 2022

It basically decouples perf context from prometheus with a new interface fn observe(t: &mut Instant) -> Duration.

I think StopWatch is sometimes more convenient. Now I still keep it possible to use StopWatch with perf context but add an observe_since for cases suitable to pass in just an Instant.

@sticnarf
Copy link
Contributor Author

I think it important to pass the breakdown to the followers. For each thread, it still gives mostly accurate information about how long is spent inside raft-engine for each reason.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Copy link
Member

@tabokie tabokie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the changelog and add a simple test?

src/engine.rs Outdated Show resolved Hide resolved
src/engine.rs Outdated Show resolved Hide resolved
src/engine.rs Outdated Show resolved Hide resolved
src/file_pipe_log/log_file.rs Outdated Show resolved Hide resolved
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Copy link
Member

@tabokie tabokie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LG.

CHANGELOG.md Outdated
@@ -10,6 +10,7 @@

* Add `is_empty` to `Engine` API.
* Add metadata deletion capability to `FileSystem` trait. Users can implement `exists_metadata` and `delete_metadata` to clean up obsolete metadata from older versions of Raft Engine.
* Add `PerfContext` which records detailed time breakdown of the write process to thread-local storage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move it to new feature.

src/metrics.rs Outdated
Comment on lines 49 to 50
/// Time spent waiting for becoming the write leader.
pub write_leader_wait_duration: Duration,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Time spent waiting for becoming the write leader.
pub write_leader_wait_duration: Duration,
/// Time spent waiting for the write group to form and get processed.
pub write_wait_duration: Duration,

}
let entered_time = writer.entered_time.unwrap();
ENGINE_WRITE_PREPROCESS_DURATION_HISTOGRAM
.observe(entered_time.saturating_duration_since(start).as_secs_f64());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug assert perf_context_diff.write_wait_duration == 0.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@tabokie tabokie merged commit 1a9dd43 into tikv:master Jun 24, 2022
ti-chi-bot added a commit to tikv/tikv that referenced this pull request Jun 28, 2022
ref tikv/raft-engine#227, ref #12362

We used to record perf contexts for Raft RocksDB and KV RocksDB with
the same PerfContext. But we also have raft-engine now. So, we will miss
perf contexts if we still use RocksDB perf contexts.

This commit adds PerfContext support to RaftEngine and distinguish it
from the perf context used for applying. Then, we'll record correct perf
statistics for both raft engine and KV DB.

Updated raft-engine to include tikv/raft-engine#227

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ethercflow pushed a commit to ethercflow/tikv that referenced this pull request Jun 28, 2022
ref tikv/raft-engine#227, ref tikv#12362

We used to record perf contexts for Raft RocksDB and KV RocksDB with
the same PerfContext. But we also have raft-engine now. So, we will miss
perf contexts if we still use RocksDB perf contexts.

This commit adds PerfContext support to RaftEngine and distinguish it
from the perf context used for applying. Then, we'll record correct perf
statistics for both raft engine and KV DB.

Updated raft-engine to include tikv/raft-engine#227

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants