Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Miniflare 3] Implement new storage system #555

Merged
merged 7 commits into from
Apr 28, 2023
Merged

[Miniflare 3] Implement new storage system #555

merged 7 commits into from
Apr 28, 2023

Conversation

mrbbot
Copy link
Contributor

@mrbbot mrbbot commented Apr 13, 2023

This PR implements a new underlying storage system for Miniflare, based off #525 and internal discussions. The storage system is split into 2 parts: an SQLite-backed metadata store, and a streaming file-system backed/in-memory blob store. Blobs use unguessable identifiers to support atomic transactions with the SQLite metadata store. See BlobStore for more details. This is a breaking change to the persistence format.

This PR also re-implements the KV gateway using the new storage system, as an example of what a data store consuming this system might look like. Notably, this implementation uses an intermediate expiring key-value-metadata store, that will be shared by the Cache gateway in the future.

Best reviewed commit-by-commit.

Closes DEVX-583, DEVX-584, DEVX-585, DEVX-586, DEVX-587, DEVX-588 and DEVX-589.

@mrbbot mrbbot added the tre Relating to Miniflare 3 label Apr 13, 2023
@mrbbot mrbbot requested a review from a team April 13, 2023 20:30
@changeset-bot
Copy link

changeset-bot bot commented Apr 13, 2023

⚠️ No Changeset found

Latest commit: df0b949

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@mrbbot
Copy link
Contributor Author

mrbbot commented Apr 14, 2023

Whoops! 4214fee completely broke Workers Sites. 🥲 I've fixed this in 661dd60 (and manually tested), but haven't ported the tests over yet. We'll do this in a follow-up PR, when we switch to integration tests for KV too.

@mrbbot
Copy link
Contributor Author

mrbbot commented Apr 14, 2023

The CI failure is the same as #551, and should be fixed once that's merged.

packages/tre/src/shared/types.ts Outdated Show resolved Hide resolved
packages/tre/src/storage/file.ts Show resolved Hide resolved
packages/tre/src/storage2/blob/store.ts Show resolved Hide resolved
Implements file-system backed and in-memory `ReadableStream`s
supporting single and multiple ranges, and a file-system backed
`WritableStream`.

Closes DEVX-583 and DEVX-584
packages/tre/src/plugins/kv/gateway.ts Show resolved Hide resolved
packages/tre/src/plugins/kv/gateway.ts Show resolved Hide resolved
packages/tre/src/plugins/kv/gateway.ts Outdated Show resolved Hide resolved
packages/tre/src/plugins/kv/gateway.ts Outdated Show resolved Hide resolved
Comment on lines +120 to +123
// Validate cacheTtl, but ignore it as there's only one "edge location":
// the user's computer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if at some point we might want to add some random jitter to KV writes, to simulate the real consistency guarantees?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've so far avoided adding random failures to Miniflare, because I wanted tests using it to be as deterministic as possible, but it would definitely be interesting to explore. Could try simulate things like transient network failures too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep it deterministic, the API could perhaps include a way to set the next operation (specific or any) to fail the next one or more times?

packages/tre/src/storage2/keyvalue.ts Show resolved Hide resolved
packages/tre/src/storage2/keyvalue.ts Outdated Show resolved Hide resolved
packages/tre/src/storage2/keyvalue.ts Show resolved Hide resolved
packages/tre/src/storage2/keyvalue.ts Show resolved Hide resolved
packages/tre/src/storage2/keyvalue.ts Outdated Show resolved Hide resolved
@mrbbot mrbbot requested a review from penalosa April 26, 2023 10:30
delete(id: BlobId): Promise<void>;
}

function generateBlobId(): BlobId {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about just using a UUIDv4? Or a ULID if the time component is important?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do, just wanted something practically guaranteed unique, without adding another dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not remotely blocking for this PR, but I think crypto.randomUUID() should be a dependency-free way to get a UUID

@mrbbot mrbbot requested a review from penalosa April 28, 2023 09:13
Copy link
Contributor

@penalosa penalosa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really really good—I'm very excited to see this in Miniflare! Two remaining questions, and then this is good to go from my perspective:

  • It would be great to see some tests for this system (as far as I can tell only the blob storage system is tested right now)
  • What's the migration story here? What happens when someone runs this new storage system in a project that previously had the old one? Can we add some messaging around that?

@mrbbot
Copy link
Contributor Author

mrbbot commented Apr 28, 2023

It would be great to see some tests for this system

We have tests for the blob stores and the streams. We don't have tests for the SQL stuff since it's more or less better-sqlite3 passed straight through. Were you thinking about tests for the createMemoryStorage/createFileStorage functions and KeyValueStorage classes? I'd argue the existing plugin integration tests are sufficient for those. We have tests for file-system persistence too. What do you think?

What's the migration story here?

Miniflare should ignore the old data, unless there are keys like db.sqlite or blobs. However, this is only when using Miniflare directly. In Wrangler, we can change the persistence directories to ./.wrangler/v3/kv or something, and warn if we detect the old paths.

Unfortunately, there's no easy way for us to detect the old system. We could check if the directory is non-empty and doesn't contain db.sqlite?

@penalosa
Copy link
Contributor

We have tests for the blob stores and the streams. We don't have tests for the SQL stuff since it's more or less better-sqlite3 passed straight through. Were you thinking about tests for the createMemoryStorage/createFileStorage functions and KeyValueStorage classes? I'd argue the existing plugin integration tests are sufficient for those. We have tests for file-system persistence too. What do you think?

I was specifically thinking about KeyValueStorage , but I take your point that the gateway tests should be sufficient. I still think there's some value in testing KeyValueStorage directly, but testing the observed behaviour through the gateway should be sufficient.

In Wrangler, we can change the persistence directories to ./.wrangler/v3/kv or something, and warn if we detect the old paths.

This seems like a reasonable tradeoff—lets make sure we add this messaging when upgrading to this version of Miniflare in Wrangler

mrbbot added 6 commits April 28, 2023 13:49
Closes DEVX-585 and DEVX-586
KV and cache have similar storage requirements: they both need an
expiring-key-value-metadata store. To simplify their implementation,
implement an abstract store on top of the new storage system to
provide this functionality.

Closes DEVX-588
Whilst we are migrating gateways over to the new storage system, we'd
like the `GatewayConstructor` type to use the old `Storage` type to
keep TypeScript happy. This change adds a `getNewStorage()` method
that gateways implemented using new storage should call to get an
appropriate storage implementation.
Our new storage system is incompatible with how Workers Sites used to
work. This change implements `sitesGatewayGet` and `sitesGatewayList`
functions that behave like unsanitised `FileStorage` used to.

We serve directly from disk with Workers Sites to ensure we always
send the most up-to-date files. Otherwise, we'd have to copy files
from disk to our own SQLite/blob store whenever any of them changed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tre Relating to Miniflare 3
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants