
Improve storage access patterns in ReportsProcessed DO #384

Closed · mendess wants to merge 1 commit from the mendess/improve-mark-aggregated branch

Conversation

@mendess (Collaborator) commented Aug 9, 2023

No description provided.

@mendess mendess self-assigned this Aug 9, 2023
@mendess mendess force-pushed the mendess/improve-mark-aggregated branch 3 times, most recently from 8b1345a to f2ecdd8 Compare August 10, 2023 13:34
@cjpatton (Contributor) left a comment

Since we're changing the data we're storing, we need to document the new format in the commit message.

@@ -1,6 +1,7 @@
# SPDX-License-Identifier: BSD-3-Clause

[workspace]
resolver = "2"
@cjpatton (Contributor): What does this do?

@mendess (Collaborator, Author): This is so that Cargo can properly resolve the feature sets and not pull in tokio when building for wasm (tokio doesn't support wasm).
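For context, a minimal sketch of how such a clarifying comment could look in the workspace Cargo.toml (the comment wording here is illustrative, not taken from the PR):

```toml
# SPDX-License-Identifier: BSD-3-Clause

[workspace]
# Use the version 2 feature resolver so that features enabled for native
# targets (e.g. tokio) are not unified into wasm builds, which tokio does
# not support.
resolver = "2"
```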

@cjpatton (Contributor): Would you mind adding a comment to clarify?

daphne_worker/src/durable/reports_processed.rs — 3 review threads (outdated, resolved)
@@ -45,26 +43,56 @@ pub(crate) const DURABLE_REPORTS_PROCESSED_MARK_AGGREGATED: &str =
/// where `<report_id>` is the hex-encoded report ID.
#[durable_object]
pub struct ReportsProcessed {
    #[allow(dead_code)]
    state: State,
    env: Env,
    config: DaphneWorkerConfig,
    touched: bool,
    alarmed: bool,
}

@cjpatton (Contributor): Please document the meaning of this constant. How did you pick 131,072? Also, is this the best name?

@mendess (Collaborator, Author): That's the maximum size a value can have inside a Durable Object; I got it from the error message. I'll document it.
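A rough sketch of the kind of doc comment being asked for (the constant name here is hypothetical; the actual name in the PR may differ):

```rust
/// Maximum size, in bytes, of a single value stored in a Durable Object
/// (128 KiB). Larger writes are rejected by the Workers runtime, so the set
/// of seen report IDs has to be split across multiple storage keys once it
/// grows past this limit.
const MAX_DO_VALUE_SIZE: usize = 131_072;
```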

        .into_iter()
        .flatten()
        .collect::<Vec<ReportId>>();
    let mut seen = self.load_seen_reports().await?;
@cjpatton (Contributor): High-level question: isn't this going to get expensive if the number of reports processed gets large? Do we have a sense of how many is too many? Once we reach that point, what should we do at the protocol level?

@mendess (Collaborator, Author): I tested with more than 8k reports and it behaved just fine, but perhaps we need to measure what a real workload looks like.

@cjpatton (Contributor): Did you test this in wrangler dev or on Workers?

To avoid creating a DoS vector here, we're going to want to enforce a cap on the number of reports we're willing to store in a single instance. If we overflow this cap, then we'll need to reject reports for which we aren't tracking replay (with PrepareError "report_dropped"). Note that the "mark aggregated" callback will have to handle the rejection logic.
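A minimal sketch of that cap-and-reject idea (the constant, the `MarkOutcome` enum, and the function below are hypothetical illustrations, not the PR's actual API):

```rust
use std::collections::HashSet;

/// Hypothetical cap on how many report IDs one ReportsProcessed instance tracks.
const MAX_REPORTS_PER_INSTANCE: usize = 100_000;

/// Possible outcomes of trying to mark a report as aggregated.
enum MarkOutcome {
    /// Not seen before; now marked as aggregated.
    Marked,
    /// Already aggregated; this is a replay.
    Replayed,
    /// Instance is full; reject with PrepareError "report_dropped".
    Dropped,
}

fn mark_aggregated(seen: &mut HashSet<[u8; 16]>, id: [u8; 16]) -> MarkOutcome {
    if seen.contains(&id) {
        MarkOutcome::Replayed
    } else if seen.len() >= MAX_REPORTS_PER_INSTANCE {
        MarkOutcome::Dropped
    } else {
        seen.insert(id);
        MarkOutcome::Marked
    }
}
```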

@mendess mendess force-pushed the mendess/improve-mark-aggregated branch 6 times, most recently from a94664e to 4c904e4 Compare August 11, 2023 12:49
Processed reports are now stored as an array of bytes, with the 16-byte-wide report IDs laid out contiguously in memory.
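For illustration, a sketch of what packing and unpacking that layout could look like (assuming report IDs are plain `[u8; 16]` values; the PR's actual `ReportId` type and helpers may differ):

```rust
/// Pack 16-byte report IDs into one contiguous byte buffer for storage.
fn pack_report_ids(ids: &[[u8; 16]]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(ids.len() * 16);
    for id in ids {
        buf.extend_from_slice(id);
    }
    buf
}

/// Recover report IDs from a packed buffer; returns None if the length is
/// not a multiple of 16.
fn unpack_report_ids(buf: &[u8]) -> Option<Vec<[u8; 16]>> {
    if buf.len() % 16 != 0 {
        return None;
    }
    Some(
        buf.chunks_exact(16)
            .map(|chunk| <[u8; 16]>::try_from(chunk).expect("chunk is 16 bytes"))
            .collect(),
    )
}
```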
@mendess mendess force-pushed the mendess/improve-mark-aggregated branch from 4c904e4 to a1dc740 Compare August 11, 2023 14:29
@cjpatton cjpatton closed this Sep 13, 2023
@mendess mendess deleted the mendess/improve-mark-aggregated branch December 4, 2023 11:39