-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve storage access patterns in ReportsProcessed DO #384
Conversation
8b1345a
to
f2ecdd8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're changing the data we're storing, we need to document the new format in the commit message.
@@ -1,6 +1,7 @@ | |||
# SPDX-License-Identifier: BSD-3-Clause | |||
|
|||
[workspace] | |||
resolver = "2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so cargo can properly resolve the feature sets and not pull in tokio when building for wasm (since tokio doesn't support wasm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a comment to clarify?
@@ -45,26 +43,56 @@ pub(crate) const DURABLE_REPORTS_PROCESSED_MARK_AGGREGATED: &str = | |||
/// where `<report_id>` is the hex-encoded report ID. | |||
#[durable_object] | |||
pub struct ReportsProcessed { | |||
#[allow(dead_code)] | |||
state: State, | |||
env: Env, | |||
config: DaphneWorkerConfig, | |||
touched: bool, | |||
alarmed: bool, | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document the meaning of this constant. How did you pick 131,072? Also, is this the best name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's the max amount a value can have inside a Durable Object, I got it from the error message. I'll document
.into_iter() | ||
.flatten() | ||
.collect::<Vec<ReportId>>(); | ||
let mut seen = self.load_seen_reports().await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High level question: Isn't this going to get expensive if the number of reports processed gets large? Do we have a sense of how many is too many? Once we reach that point, what should we do at the protocol level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested for more than 8k reports and it behaved just fine. But perhaps we need to measure what a real workload looks like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test this in wrangler dev
or on Workers?
To avoid creating a DoS vector here, we're going to want to enforce a cap on the number of reports we're willing to store in a single instance. If we overflow this, then we'll need to reject reports for which we aren't tracking replay (with PrepareError "report_dropped"). Note that the "mark aggregated" call back will have to handle the rejection logic.
a94664e
to
4c904e4
Compare
The format processed reports are now stored in is an array of bytes with the 16 byte wide ids contiguous in memory
4c904e4
to
a1dc740
Compare
No description provided.