Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested Value Codec Access #990

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

EasterTheBunny
Copy link
Collaborator

@EasterTheBunny EasterTheBunny commented Jan 9, 2025

This commit introduces the ability to use a codec to encode/decode individual fields of a struct, including traversal to nested structs.

NONEVM-1063

reductionista
reductionista previously approved these changes Jan 11, 2025
@reductionista reductionista dismissed their stale review January 11, 2025 03:29

Dismissing due to test failure

ntcwt, ok := c[itemType]
if !ok {
return fmt.Errorf("%w: cannot find type %s", types.ErrInvalidType, itemType)
ntcwt, err := getCodec(c, itemType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have already realized this, but unfortunately this doesn't work. You can't use a codec for a subfield to decode raw, since raw is the entire encoded struct, not just one subfield.

I wish this had occurred to me on Friday, but I think there's a fundamental problem with this interface--of having CreateType return only the subfield type and then passing that to Decode.

Stepping back a moment: The only reason we need LogPoller to decode the raw data at all is because, unlike with EVM-encoded data, there's no way to predict what offset into a Borsh-encoded struct a subfield will be at without Borsh-decoding the entire struct.

So whatever we do to solve this problem, we have to decode the whole struct--not just a subfield. Because Decode doesn't have type information for the whole struct, it can't do that. So returning only the subfield type from CreateType is not useful.

We could fix this by having CreateType return the type of the entire struct, and passing that to Decode. Then having Decode decode the whole struct, and then extract the requested element out of it based on the itemType path. But that's an unnecessarily convoluted way to solve a much simpler problem.

The first two steps of what we need to solve it in the most straightforward way are already working without this PR:

  1. Pass eventType as itemType to CreateType, and have it return the type of the whole struct.
  2. Pass that into Decode and have that return the decoded struct.

All we need that's new is the third step:
3. A function which takes a struct as input and returns a nested subfield of that struct based on a dot-separated path.

This is what I originally implemented using reflect, and could probably dig up again. I'm leaning now towards that approach, which wouldn't require modifying the codec... since doing this part doesn't require any knowledge of registered types or codecs, and doesn't involve any translation from on chain to off chain or vice versa. It still seems like something the codec package might be able to do more easily than writing it from scratch, but if so I have no idea how.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this only works if you want to encode/decode a single field. It was never expected to be able to extract a single field from a fully encoded struct without encoding that single field first.

Example:

type TypeA struct {
    A int
    B string
}

You can decode the whole struct with Decode(ctx, allData, &structInstance, "") where allData is the encoded value of the entire struct. However, you cannot do Decode(allData, &fieldA, "A") and get only the int data.

Alternatively you can do something like the following:

// this would be the entire encoded struct
var allData []byte

var structInstance TypeA
_ = theCodec.Decode(allData, &structInstance, "TypeA")

// this will encode the value of the B field
strBts, _ := theCodec.Encode(ctx, structInstance.B, "TypeA.B")

I would envision in LogPoller that you could have different encoding and decoding codecs where you want to decode a struct and encode a single field, but use different encodings for each.

Copy link
Contributor

@amit-momin amit-momin Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to mention that Silas wrote something that could be used for step 3 for CW here. If we go this route, it might be worth pulling it into a util file so we can use it for CR too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amit-momin Very interesting! Yes, that seems pretty similar to the helper I had to implement... we should def look at combining them when we get a chance. This is the one I needed to get everything working:
https://github.com/smartcontractkit/chainlink-solana/pull/1002/files#diff-43599bcdcf8c088e3eac4f93fd79202871475fcdeb2dcc9494a8935b9f991a07R439-R491

Copy link
Contributor

@reductionista reductionista Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// this would be the entire encoded struct
var allData []byte

var structInstance TypeA
_ = theCodec.Decode(allData, &structInstance, "TypeA")

// this will encode the value of the B field
strBts, _ := theCodec.Encode(ctx, structInstance.B, "TypeA.B")

That's very different from anything LogPoller does. Even if we completely redesigned it to work this way, it wouldn't be able to work. The reason we're decoding the data at all is because we need to know the offset of where the subkeys are in order to search for them in queries... if you don't know the offset, then any decoding or encoding you do of it (in place) is pointless--either way there is no way to search for it. There would also be no way to build an index on something like that, since every subkey would be at a different offset. That's why it has to extract them and index them separately in their own column.

Copy link
Contributor

@reductionista reductionista Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, you cannot do Decode(allData, &fieldA, "A") and get only the int data.

FWIW, this part actually did work. The test case I used is the same as what you have: it was able to decode A successfully, but only because it was the first element in the struct so the offset happened to be 0. That's why I thought it was working at one point. It's passing "TypeA.B" as the itemType that failed. It couldn't decode the string because it thought the string started at the beginning of the data representing the encoded data instead of the middle. So it interpreted the integer A as a string length instead of a value!

Copy link
Contributor

@ilija42 ilija42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general approach looks valid, I'll do a more thorough review after modifiers are done

Comment on lines 151 to 152

type extendedItemType string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
type extendedItemType string
// extendedItemType is used for parsing through nested item types
type extendedItemType string

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up making this an exported function named ItemTyper.

Comment on lines 25 to 43
// RetypeToOffChain attempts to apply a modifier using the provided itemType. To allow access to nested fields, this
// function applies no modifications if a modifier by the specified name is not found.
Copy link
Contributor

@ilija42 ilija42 Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To allow access to nested fields, this function applies no modifications if a modifier by the specified name is not found.

This comment makes it sound as if the func won't throw an error if a modifier is not found

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this comment to have more clarity.

Comment on lines 53 to 83
if mod, ok := b.modByitemType[head]; ok {
return transform(mod, val, tail)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it now okay for the modifier to not exist?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to the way it was to maintain backward compatibility.

return nil, fmt.Errorf("%w: cannot find type %s", types.ErrInvalidType, itemType)
}

codec := s.fields[idx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a length check to prevent panic, although these should always be the same size

}

codec := s.fields[idx]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// if itemType wasn't referencing a nested field

@@ -113,3 +120,46 @@ func (s *structCodec) SizeAtTopLevel(numItems int) (int, error) {
}
return size, nil
}

func (s *structCodec) FieldCodec(itemType string) (TypeCodec, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt about adding a test that checks if a proper FieldCodec has been retrieved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants