Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track time spent in datastore methods; use this to charge energy #1957

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

gefjon
Copy link
Contributor

@gefjon gefjon commented Nov 7, 2024

Description of Changes

TxId and MutTxId now read from the Instant clock around all of their public methods, and accumulate the total Duration spent in those methods throughout the life of the transaction. Then, when committing or aborting the transaction, the RelationalDB charges energy via a new EnergyMonitor method, record_datastore_time.

This necessitates having RelationalDB hold an Arc<dyn EnergyMonitor>, as previously it had no mechanism by which to access the energy monitor.

Note that this approach is intended as a stopgap, rather than a permanent solution. Notable flaws include:

  • Energy consumed by datastore operations is not included in the WS broadcasts sent to clients; these contain only WASM execution energy. This is because we construct the ModuleEvent before downgrading the TX and running incremental queries, and the total datastore energy cost is only available after those operations.

  • Wall time is a very noisy measurement. It is not necessarily incorrect to pass this noise on to our customers, since many of the unpredictable costs are caused by events like page allocations which are caused by the user code, and in general we expect any variance not caused by user code to disappear as the total time grows large. Still, it's not wonderful that we will end up charging different amounts of energy to run the same operations against the same data sets multiple times.

  • This patch does not track time spent in iterator or RowRef operations, only methods on the (Mut-) TxId (which are generally accessed through the RelationalDB).

  • We also need to take care that this does not significantly regress performance, since it involves a potentially large number of added clock reads and relaxed atomic increments. My hope is that the atomics are uncontended and so it's not a significant cost, but hey, this is why we have benchmarks, right?

    • Update: Callgrind benchmarks look fine.

API and ABI breaking changes

Will require private PR, which I will do after we determine that this is an acceptable approach and that the overhead is low. No user-facing interfaces change.

Expected complexity level and risk

2 - performance risks, and may under- or over-charge energy, but seems unlikely to break functionality.

Testing

Describe any testing you've done, and any testing you'd like your reviewers to do,
so that you're confident that all the changes work as expected!

  • None - curious to hear reviewers' thoughts on how best to test this!

`TxId` and `MutTxId` now read from the `Instant` clock around all of their public methods,
and accumulate the total `Duration` spent in those methods
throughout the life of the transaction.
Then, when committing or aborting the transaction,
the `RelationalDB` charges energy via a new `EnergyMonitor` method,
`record_datastore_time`.

This necessitates having `RelationalDB` hold an `Arc<dyn EnergyMonitor>`,
as previously it had no mechanism by which to access the energy monitor.

Note that this approach is intended as a stopgap, rather than a permanent solution.
Notable flaws inclue:

- Energy consumed by datastore operations is not included
  in the WS broadcasts sent to clients; these contain only WASM execution energy.
  This is because we construct the `ModuleEvent` before downgrading the TX
  and running incremental queries,
  and the total datastore energy cost is only available after those operations.

- Wall time is a very noisy measurement.
  It is not necessarily incorrect to pass this noise on to our customers,
  since many of the unpredictable costs are caused by events like page allocations
  which are caused by the user code,
  and in general we expect any variance not caused by user code
  to disappear as the total time grows large.
  Still, it's not wonderful that we will end up charging different amounts of energy
  to run the same operations against the same data sets multiple times.

- This patch does not track time spent in iterator or `RowRef` operations,
  only methods on the (`Mut`-) `TxId`
  (which are generally accessed through the `RelationalDB`).

- We also need to take care that this does not significantly regress performance,
  since it involves a potentially large number of added clock reads
  and relaxed atomic increments.
  My hope is that the atomics are uncontended and so it's not a significant cost,
  but hey, this is why we have benchmarks, right?
@gefjon gefjon requested a review from cloutiertyler as a code owner November 7, 2024 18:30
@gefjon gefjon self-assigned this Nov 7, 2024
@gefjon
Copy link
Contributor Author

gefjon commented Nov 7, 2024

Closes #666 . Nice.

@gefjon
Copy link
Contributor Author

gefjon commented Nov 8, 2024

benchmarks please

Copy link

github-actions bot commented Nov 8, 2024

Benchmarking failed. Please check the workflow run for details.

Copy link

github-actions bot commented Nov 8, 2024

Callgrind benchmark results

Callgrind Benchmark Report

These benchmarks were run using callgrind,
an instruction-level profiler. They allow comparisons between sqlite (sqlite), SpacetimeDB running through a module (stdb_module), and the underlying SpacetimeDB data storage engine (stdb_raw). Callgrind emulates a CPU to collect the below estimates.

Measurement changes larger than five percent are in bold.

In-memory benchmarks

callgrind: empty transaction

db total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw 6635 6399 3.69% 6735 6491 3.76%
sqlite 5579 5579 0.00% 5971 6047 -1.26%

callgrind: filter

db schema indices count preload _column data_type total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str no_index 64 128 1 u64 77170 76593 0.75% 77610 77049 0.73%
stdb_raw u32_u64_str no_index 64 128 2 string 119668 119091 0.48% 120394 119793 0.50%
stdb_raw u32_u64_str btree_each_column 64 128 2 string 25319 25084 0.94% 25877 25578 1.17%
stdb_raw u32_u64_str btree_each_column 64 128 1 u64 24287 24051 0.98% 24707 24507 0.82%
sqlite u32_u64_str no_index 64 128 2 string 144695 144695 0.00% 146243 146159 0.06%
sqlite u32_u64_str no_index 64 128 1 u64 124044 124044 0.00% 125350 125370 -0.02%
sqlite u32_u64_str btree_each_column 64 128 1 u64 131361 131361 0.00% 132751 132819 -0.05%
sqlite u32_u64_str btree_each_column 64 128 2 string 134494 134494 0.00% 136086 136086 0.00%

callgrind: insert bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 64 128 887737 878319 1.07% 913259 929371 -1.73%
stdb_raw u32_u64_str btree_each_column 64 128 1038684 1026724 1.16% 1074864 1056184 1.77%
sqlite u32_u64_str unique_0 64 128 398320 398320 0.00% 414982 416162 -0.28%
sqlite u32_u64_str btree_each_column 64 128 983637 983637 0.00% 1016279 1024023 -0.76%

callgrind: iterate

db schema indices count total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 154134 153726 0.27% 154220 153834 0.25%
stdb_raw u32_u64_str unique_0 64 17159 16751 2.44% 17221 16843 2.24%
sqlite u32_u64_str unique_0 1024 1067255 1067255 0.00% 1070651 1070643 0.00%
sqlite u32_u64_str unique_0 64 76201 76201 0.00% 77247 77187 0.08%

callgrind: serialize_product_value

count format total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
64 json 47528 47528 0.00% 50282 50282 0.00%
64 bsatn 25509 25509 0.00% 27787 27719 0.25%
16 bsatn 8200 8200 0.00% 9594 9526 0.71%
16 json 12188 12188 0.00% 14160 14194 -0.24%

callgrind: update bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 1024 20678646 20085072 2.96% 21435048 20703480 3.53%
stdb_raw u32_u64_str unique_0 64 128 1296886 1285086 0.92% 1377924 1359386 1.36%
sqlite u32_u64_str unique_0 1024 1024 1802182 1802182 0.00% 1811256 1811420 -0.01%
sqlite u32_u64_str unique_0 64 128 128528 128528 0.00% 131378 131350 0.02%
On-disk benchmarks

callgrind: empty transaction

db total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw 6640 6404 3.69% 6752 6512 3.69%
sqlite 5621 5621 0.00% 6069 6157 -1.43%

callgrind: filter

db schema indices count preload _column data_type total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str no_index 64 128 1 u64 77175 76598 0.75% 77559 77034 0.68%
stdb_raw u32_u64_str no_index 64 128 2 string 119673 119096 0.48% 120335 119850 0.40%
stdb_raw u32_u64_str btree_each_column 64 128 2 string 25324 25088 0.94% 25782 25518 1.03%
stdb_raw u32_u64_str btree_each_column 64 128 1 u64 24292 24056 0.98% 24672 24496 0.72%
sqlite u32_u64_str no_index 64 128 1 u64 125965 125965 0.00% 127467 127535 -0.05%
sqlite u32_u64_str no_index 64 128 2 string 146616 146616 0.00% 148380 148352 0.02%
sqlite u32_u64_str btree_each_column 64 128 2 string 136616 136616 0.00% 138666 138614 0.04%
sqlite u32_u64_str btree_each_column 64 128 1 u64 133457 133457 0.00% 135249 135329 -0.06%

callgrind: insert bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 64 128 836925 826091 1.31% 861209 876309 -1.72%
stdb_raw u32_u64_str btree_each_column 64 128 988008 976812 1.15% 1053446 1035968 1.69%
sqlite u32_u64_str unique_0 64 128 415857 415857 0.00% 431815 433009 -0.28%
sqlite u32_u64_str btree_each_column 64 128 1021898 1021898 0.00% 1054634 1062154 -0.71%

callgrind: iterate

db schema indices count total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 154139 153731 0.27% 154205 153823 0.25%
stdb_raw u32_u64_str unique_0 64 17164 16756 2.43% 17246 16844 2.39%
sqlite u32_u64_str unique_0 1024 1070323 1070323 0.00% 1074173 1074097 0.01%
sqlite u32_u64_str unique_0 64 77973 77973 0.00% 79307 79267 0.05%

callgrind: serialize_product_value

count format total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
64 json 47528 47528 0.00% 50282 50282 0.00%
64 bsatn 25509 25509 0.00% 27787 27719 0.25%
16 bsatn 8200 8200 0.00% 9594 9526 0.71%
16 json 12188 12188 0.00% 14160 14194 -0.24%

callgrind: update bulk

db schema indices count preload total reads + writes old total reads + writes Δrw estimated cycles old estimated cycles Δcycles
stdb_raw u32_u64_str unique_0 1024 1024 19198881 18998827 1.05% 19994017 19715619 1.41%
stdb_raw u32_u64_str unique_0 64 128 1250890 1237909 1.05% 1330418 1313287 1.30%
sqlite u32_u64_str unique_0 1024 1024 1809743 1809743 0.00% 1818345 1818309 0.00%
sqlite u32_u64_str unique_0 64 128 132654 132654 0.00% 135704 135620 0.06%

@joshua-spacetime joshua-spacetime linked an issue Nov 8, 2024 that may be closed by this pull request
@kazimuth
Copy link
Contributor

kazimuth commented Nov 8, 2024

Uhhh keep in mind callgrind might not simulate delays due to atomics. But if they're uncontended it probably doesn't matter.

@gefjon
Copy link
Contributor Author

gefjon commented Nov 12, 2024

Uhhh keep in mind callgrind might not simulate delays due to atomics. But if they're uncontended it probably doesn't matter.

Ah, I hadn't considered that. Are the wall-time benchmarks working again? Do I need to merge something into this branch?

@bfops bfops added release-any To be landed in any release window breaks library compatibility This change breaks the SpacetimeDB library interface release-1.0 and removed release-any To be landed in any release window labels Dec 2, 2024
@gefjon
Copy link
Contributor Author

gefjon commented Dec 6, 2024

@Centril has offered to run some benchmarks. This is low-priority among his work.

@gefjon gefjon requested a review from Centril December 6, 2024 18:38
Copy link
Collaborator

@joshua-spacetime joshua-spacetime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in providing feedback here.

We're most likely going to have to rethink this approach since the majority of energy usage is going to come from reads, i.e. iterators. There are really only two operations that determine the cost of reads (in the datastore):

  1. Index seeks
  2. Row ID fetches, or RowPointer dereferences.

Of course the cost of both of these operations is highly data dependent, but just counting them should give us an initial estimate that's good enough modulo some constant factor. Ultimately we'll want to take the table schema into account to more accurately estimate the cost of a fetch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaks library compatibility This change breaks the SpacetimeDB library interface release-1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Track energy for database compute
4 participants