Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(vstorage): Disambiguate empty strings and deletions #6971

Merged
merged 6 commits into from
Mar 1, 2023

Conversation

mhofman
Copy link
Member

@mhofman mhofman commented Feb 10, 2023

refs: #5542

As usual, best reviewed commit-by-commit

Description

As preliminary changes for #5542, we need to support empty strings as actual values in vstorage, potentially for both leaf and ancestors. golang non-nullable values leaked through the design of vstorage, and setting a value of '' previously meant deletion.

To support empty strings as actual values, we need to:

  • Support explicit deletion
  • Encode noData in the cosmos KV store differently than empty string when an entry must exist (ancestor key)

Security Considerations

None

Scaling Considerations

An alternative would have been to encode or hilbert the state-sync values to avoid empty strings, but that would have been costly as state sync export is likely to become the biggest data generator, and would have leaked further a golang design choice.

Furthermore changing the interface allows batch setting multiple entries to avoid repeated call across the JS -> golang boundary. (Current vstorage users do not yet take advantage of this change, the mailbox probably could with some further refactoring)

Documentation Considerations

vstorage clients need to be updated if they iterate over keys, as they may encounter a value with an unexpected prefix. The casting library is updated as part of this PR.

A migration (or export/import) will be needed at the next upgrade containing this change. The migration was implemented as part of this change, and a note left in the current upgrade handler highlighting the need to call migration at the next upgrade.

Testing Considerations

Added unit tests, updated existing ones.

@mhofman mhofman force-pushed the mhofman/5542-refactor-vstorage branch from db9df6e to 0887c2c Compare February 11, 2023 04:47
@mhofman mhofman marked this pull request as ready for review February 11, 2023 04:48
Copy link
Member

@michaelfig michaelfig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few questions around the Golang code. JS looks good to me!

@gibson042 feel free to abstain if that's good enough for you, otherwise I won't approve until you've weighed in.

golang/cosmos/app/app.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/types/types.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/types/types.go Outdated Show resolved Hide resolved
packages/internal/src/lib-chainStorage.js Outdated Show resolved Hide resolved
@mhofman mhofman requested a review from JimLarson February 13, 2023 18:35
@mhofman mhofman force-pushed the mhofman/5542-refactor-vstorage branch 2 times, most recently from 023f68b to 75c211d Compare February 22, 2023 00:41
@mhofman
Copy link
Member Author

mhofman commented Feb 22, 2023

After a discussion with @warner and @gibson042 this morning, I realized that this change relied too much on JS producing entry tuples of the right length. For example in JS JSON.stringify(["foo", undefined]) === '["foo", null]' and then in golang var entry []string; json.Unmarshal(msg, &entry) results in ["foo", ""].

I have thus changed the Unmarshal path to explicitly check the type of the value, and added a couple constructors for StorageEntry, at which point I decided there was no reason to keep going with the []string representation.

I am still very uncertain whether any of this if colloquial golang, and I'm looking for experts opinions.

Copy link
Member

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the overall purpose of differentiating empty-string values and nil values? I'm not against it, but I'm not clear on what such a change supports—what clients care about the difference. Is this for something like cascading deletion?

golang/cosmos/x/swingset/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/swingset/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/types/path_keys.go Show resolved Hide resolved
Comment on lines 60 to +78
case "set":
keeper.SetStorageAndNotify(cctx.Context, msg.Path, msg.Value)
for _, arg := range msg.Args {
var entry types.StorageEntry
entry, err = types.UnmarshalStorageEntry(arg)
if err != nil {
return
}
keeper.SetStorageAndNotify(cctx.Context, entry)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading correctly, "set" and all the other write methods now affect an arbitrary number of entries. Would it instead make sense to separate single-entry vs. multi-entry methods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see a reason to have different methods for single vs multiple entry updates if the length of the args indicates that intent.

golang/cosmos/x/vstorage/vstorage.go Outdated Show resolved Hide resolved
return "null", nil
}
//fmt.Printf("Keeper.GetStorage gave us %bz\n", value)
bz, err := json.Marshal(value)
//fmt.Printf("Keeper.GetStorage gave us %bz\n", entry.Value())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line has been carried forward long enough.

Suggested change
//fmt.Printf("Keeper.GetStorage gave us %bz\n", entry.Value())

DataPrefixBytes: string(keeper.GetDataPrefix()),
NoDataValue: string(keeper.GetNoDataValue()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, it stinks to leak these implementation details. What would be the consequence of not exporting NoDataValue here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoding details were unfortunately already leaking to anyone that reads vstorage data directly from the chain, both in the path encoding, and in the data prefixing. It was a very unfortunate realization I couldn't get away with an internal only change. I don't fully grok how external clients like the casting package fetch data, and maybe we could have an adapter on the RPC server side to avoid leaking these details to clients? However I'd consider this out of scope for this PR. Ping @michaelfig

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way an external client can verify a Merkle proof for some on-chain data from an RPC node (trusting the validators but not the integrity of the RPC node) is by querying the exact IAVL key/value representation in bytes, and then knowing how they correspond to the application layer. This verification process necessarily crosses abstractions in an uncomfortable way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're proposing to remove getStoreKey and nothing in the SDK uses it, that's fine with me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This verification process necessarily crosses abstractions in an uncomfortable way.

Makes sense

If you're proposing to remove getStoreKey and nothing in the SDK uses it, that's fine with me.

I remember seeing some potential uses. I didn't feel comfortable enough changing that wiring.

packages/casting/src/casting-spec.js Show resolved Hide resolved
Copy link
Member Author

@mhofman mhofman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the overall purpose of differentiating empty-string values and nil values? I'm not against it, but I'm not clear on what such a change supports—what clients care about the difference. Is this for something like cascading deletion?

The Swingstore replication into IAVL / the MultiStore will be done through a new vstorage path, and it requires being able to store empty strings. An empty string is a valid string value that is simply different than null in JS :)

golang/cosmos/x/swingset/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/types/types.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/types/types.go Outdated Show resolved Hide resolved
Path string `json:"key"` // TODO: rename JSON to "path"
Value string `json:"value"`
Method string `json:"method"`
Args []json.RawMessage `json:"args"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not backwards compatible, but the JS layer is updated in the same commit. There are no external clients.

Comment on lines 60 to +78
case "set":
keeper.SetStorageAndNotify(cctx.Context, msg.Path, msg.Value)
for _, arg := range msg.Args {
var entry types.StorageEntry
entry, err = types.UnmarshalStorageEntry(arg)
if err != nil {
return
}
keeper.SetStorageAndNotify(cctx.Context, entry)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see a reason to have different methods for single vs multiple entry updates if the length of the args indicates that intent.

golang/cosmos/x/vstorage/vstorage.go Outdated Show resolved Hide resolved
DataPrefixBytes: string(keeper.GetDataPrefix()),
NoDataValue: string(keeper.GetNoDataValue()),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoding details were unfortunately already leaking to anyone that reads vstorage data directly from the chain, both in the path encoding, and in the data prefixing. It was a very unfortunate realization I couldn't get away with an internal only change. I don't fully grok how external clients like the casting package fetch data, and maybe we could have an adapter on the RPC server side to avoid leaking these details to clients? However I'd consider this out of scope for this PR. Ping @michaelfig

@mhofman mhofman requested a review from gibson042 February 23, 2023 00:35
@mhofman
Copy link
Member Author

mhofman commented Feb 23, 2023

This PR has grown to a LOT of fixups to address review feedback. Here is what the rebased with squashing looks like: https://github.com/Agoric/agoric-sdk/compare/mhofman/5542-refactor-vstorage-rebase

Happy to push that (ultimately identical) version to this branch if all reviewers prefer.

Copy link
Member

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after resolving remaining comments.

golang/cosmos/x/swingset/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/types/path_keys.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/vstorage.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/vstorage.go Outdated Show resolved Hide resolved
golang/cosmos/x/vstorage/vstorage.go Outdated Show resolved Hide resolved
@mhofman
Copy link
Member Author

mhofman commented Feb 28, 2023

Ugh I forgot to push the changes addressing feedback yesterday. PTAL @gibson042

golang/cosmos/x/vstorage/keeper/keeper.go Outdated Show resolved Hide resolved
@mhofman mhofman force-pushed the mhofman/5542-refactor-vstorage branch from cd1cf14 to 14ed493 Compare March 1, 2023 17:41
@mhofman
Copy link
Member Author

mhofman commented Mar 1, 2023

Clean squash/fixup of all the review feedback: cd1cf14..14ed493

@mhofman mhofman force-pushed the mhofman/5542-refactor-vstorage branch from 14ed493 to 3f32351 Compare March 1, 2023 17:46
@mhofman mhofman added the automerge:rebase Automatically rebase updates, then merge label Mar 1, 2023
@mergify mergify bot merged commit 3080ea7 into master Mar 1, 2023
@mergify mergify bot deleted the mhofman/5542-refactor-vstorage branch March 1, 2023 18:23
Copy link
Contributor

@JimLarson JimLarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One potential future cleanuup.

Comment on lines +33 to +54
var generic [2]interface{}
err = json.Unmarshal(msg, &generic)

if err != nil {
return
}

path, ok := generic[0].(string)
if !ok {
err = fmt.Errorf("invalid storage entry path: %q", generic[0])
return
}

switch generic[1].(type) {
case string:
entry = NewStorageEntry(path, generic[1].(string))
case nil:
entry = NewStorageEntryWithNoData(path)
default:
err = fmt.Errorf("invalid storage entry value: %q", generic[1])
}
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do this instead:

var generic [2]*string{}
err = json.Unmarshal(msg, &generic)
if err != nil {
    return
}
if generic[0] == nil {
    err = fmt.Errorf("storage entry path cannot be null")
    return
}
if generic[1] == nil {
    entry = NewStorageEntry(generic[0], *generic[1])
} else {
    entry = NewStorageEntryWithNoData(generic[9])
}
return

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh json.Unmarshal understands string pointers? Yeah that'd have been simpler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge:rebase Automatically rebase updates, then merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants