feat: add basic profiling to benchmarks (#116)
# Problem

The benchmarks currently return the number of instructions it took to execute each benchmark. While this number is useful for measuring performance, it doesn't provide insight into where those instructions are spent or where the performance bottlenecks are. Without this information, making informed performance optimizations would require a lot of trial and error.

# Solution

The typical solution to this problem is to use some kind of profiler. `ic-repl` already supports profiling and can output a flamegraph of where instructions are being spent, but it has a few drawbacks that make it difficult to use:

1. The names of Rust methods are mangled, even when `debug = 1` is turned on, making it hard to make sense of the output.
2. Each benchmark includes setup logic, and we only want to profile what runs after setup, so we'd need a way to programmatically tell the profiler to reset its measurements.
3. Often we'd like to benchmark blocks of code that aren't functions.

To address the issues above, this commit introduces a "poor man's profiler". This profiler is manual, in the sense that the developer adds hints to the code for the parts they care about profiling. In this PR, I added some basic hints, and the benchmarks now return output that looks like this:

```
Benchmarking btreemap_insert_blob_64_1024_v2: Warming up for 1.0000 ms
2023-08-23 07:26:53.560585 UTC: [Canister rwlgt-iiaaa-aaaaa-aaaaa-cai] {
    "node_load_v2": "5_182_358_668 (80%)",
    "node_save_v2": "786_197_957 (12%)",
}
Benchmarking btreemap_insert_blob_64_1024_v2: Collecting 10 samples in estimated 345.63 s (165 iterations
btreemap_insert_blob_64_1024_v2
                        time:   [6474.1 M Instructions 6474.1 M Instructions 6474.1 M Instructions]
                        change: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
                        No change in performance detected.
```

This approach is simple and effective, but it has the drawback of making the instruction count slightly inaccurate, as the profiling logic itself consumes instructions. I think we can limit this inaccuracy by making the `profiler` crate internally account for its own overhead and deduct it from its measurements.
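The overhead deduction suggested above is not implemented in this commit. One possible way to approach it is to calibrate the cost of an empty measurement once and subtract that from every later measurement. The sketch below runs off-chain against a simulated counter; `FAKE_COUNTER`, `burn`, `PROBE_COST`, `measure_raw`, `calibrate`, and `measure_adjusted` are all hypothetical names chosen for illustration, and the fixed cost of 50 is an arbitrary assumption.

```rust
use std::cell::Cell;

thread_local! {
    // Stand-in for the canister instruction counter (assumption: on the IC
    // this would be read via `ic_cdk::api::performance_counter(0)`).
    static FAKE_COUNTER: Cell<u64> = Cell::new(0);
}

// Hypothetical number of instructions the probe itself spends per measurement.
const PROBE_COST: u64 = 50;

/// Pretends to execute `n` instructions.
fn burn(n: u64) {
    FAKE_COUNTER.with(|c| c.set(c.get() + n));
}

fn instruction_count() -> u64 {
    FAKE_COUNTER.with(|c| c.get())
}

/// Measures `work`, including the probe's own bookkeeping cost.
fn measure_raw(work: impl FnOnce()) -> u64 {
    let start = instruction_count();
    burn(PROBE_COST); // models the instructions spent by the profiling logic
    work();
    instruction_count() - start
}

/// Estimates the probe overhead by measuring an empty closure.
fn calibrate() -> u64 {
    measure_raw(|| {})
}

/// Measures `work` and deducts the calibrated overhead, never going below zero.
fn measure_adjusted(overhead: u64, work: impl FnOnce()) -> u64 {
    measure_raw(work).saturating_sub(overhead)
}
```

With this simulated counter, `measure_raw(|| burn(1_000))` reports 1,050 while `measure_adjusted(calibrate(), || burn(1_000))` reports 1,000. On a real canister the overhead would likely vary with map size and nesting, so a single calibration value is only an approximation.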
Showing 11 changed files with 137 additions and 6 deletions.

    @@ -0,0 +1,9 @@
    [package]
    name = "profiler"
    version = "0.1.0"
    edition = "2021"

    # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

    [dependencies]
    ic-cdk = "0.6.8"


    @@ -0,0 +1,64 @@
    //! A module for profiling canisters.
    use std::cell::RefCell;
    use std::collections::BTreeMap;

    thread_local! {
        static PROFILING: RefCell<BTreeMap<&'static str, u64>> = RefCell::new(BTreeMap::new());
    }

    /// Starts profiling the instructions consumed.
    ///
    /// Instructions are counted and recorded under the given name until the
    /// `Profile` object returned is dropped.
    pub fn profile(name: &'static str) -> Profile {
        Profile::new(name)
    }

    /// Clears all profiling data.
    pub fn reset() {
        PROFILING.with(|p| p.borrow_mut().clear());
    }

    /// Returns the number of instructions used for each of the profile names.
    pub fn get_results() -> std::collections::BTreeMap<&'static str, u64> {
        PROFILING.with(|p| p.borrow().clone())
    }

    pub struct Profile {
        name: &'static str,
        start_instructions: u64,
    }

    impl Profile {
        fn new(name: &'static str) -> Self {
            Self {
                name,
                start_instructions: instruction_count(),
            }
        }
    }

    impl Drop for Profile {
        fn drop(&mut self) {
            let instructions_count = instruction_count() - self.start_instructions;

            PROFILING.with(|p| {
                let mut p = p.borrow_mut();
                let entry = p.entry(&self.name).or_insert(0);
                *entry += instructions_count;
            });
        }
    }

    fn instruction_count() -> u64 {
        #[cfg(target_arch = "wasm32")]
        {
            ic_cdk::api::performance_counter(0)
        }

        #[cfg(not(target_arch = "wasm32"))]
        {
            // Consider using cpu time here.
            0
        }
    }

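To illustrate how the guard-based API above might be used inside a benchmark, here is a minimal sketch that runs off-chain. It assumes a simulated instruction counter (`FAKE_COUNTER` and `burn` are hypothetical stand-ins for `ic_cdk::api::performance_counter(0)`), and it carries a condensed, non-public copy of the profiler so the example is self-contained. `run_benchmark` and the names `"setup"`, `"node_load"`, and `"node_save"` are made up for the example.

```rust
use std::cell::{Cell, RefCell};
use std::collections::BTreeMap;

thread_local! {
    // Simulated instruction counter so the sketch runs outside a canister.
    static FAKE_COUNTER: Cell<u64> = Cell::new(0);
    static PROFILING: RefCell<BTreeMap<&'static str, u64>> = RefCell::new(BTreeMap::new());
}

/// Pretends to execute `n` instructions (hypothetical helper).
fn burn(n: u64) {
    FAKE_COUNTER.with(|c| c.set(c.get() + n));
}

/// Condensed copy of the guard from the diff, reading the fake counter.
struct Profile {
    name: &'static str,
    start: u64,
}

fn profile(name: &'static str) -> Profile {
    Profile { name, start: FAKE_COUNTER.with(|c| c.get()) }
}

impl Drop for Profile {
    fn drop(&mut self) {
        let used = FAKE_COUNTER.with(|c| c.get()) - self.start;
        PROFILING.with(|p| {
            *p.borrow_mut().entry(self.name).or_insert(0) += used;
        });
    }
}

fn reset() {
    PROFILING.with(|p| p.borrow_mut().clear());
}

fn get_results() -> BTreeMap<&'static str, u64> {
    PROFILING.with(|p| p.borrow().clone())
}

/// Hypothetical benchmark: setup first, then the part we want profiled.
fn run_benchmark() -> BTreeMap<&'static str, u64> {
    {
        let _p = profile("setup");
        burn(500);
    }
    reset(); // discard everything recorded during setup

    for _ in 0..3 {
        // Plain blocks, not functions: each guard records when its scope ends.
        {
            let _p = profile("node_load");
            burn(80);
        }
        {
            let _p = profile("node_save");
            burn(12);
        }
        burn(8); // unprofiled work is not attributed to any name
    }
    get_results()
}
```

Note that the guard must be bound to a named variable such as `_p`. Writing `let _ = profile("node_load");` drops the guard immediately, so the block that follows would not be measured. Repeated scopes with the same name accumulate, which is why `"node_load"` ends up with 240 in this sketch.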