Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow deserialization of candid values with unknown types #555

Merged
merged 13 commits into from
Apr 7, 2022

Conversation

frederikrothenberger
Copy link
Contributor

@frederikrothenberger frederikrothenberger commented Apr 7, 2022

Description

This change introduces IDL.Unknown to the candid library to specify that the type of the value is not known. IDL.Unknown can be use for deserialisation only.

The deserialisation of IDL.Unknown will be based on the type table included within the candid encoded value. This means that all field names will only be available in hashed form. The decoded value has a function type() that returns it's actual candid type.

How Has This Been Tested?

Unit tests have been added.

Checklist:

  • My changes follow the guidelines in CONTRIBUTING.md.
  • The title of this PR complies with Conventional Commits.
  • I have edited the CHANGELOG accordingly.
  • I have made corresponding changes to the documentation.

@frederikrothenberger frederikrothenberger changed the title Allow deserialization of candid values with unknown types feat: Allow deserialization of candid values with unknown types Apr 7, 2022
@frederikrothenberger
Copy link
Contributor Author

frederikrothenberger commented Apr 7, 2022

Open questions / things I'm unsure about:

  • Primitives are boxed, because otherwise the type() function cannot be added.
  • IDL.Null does not deserialize to null but to an empty object with just the type function.
  • The types do not always exactly correspond with the original definitions due to the structure of reference types. The resulting types are however compatible with the original types.
  • The candid type encoding seems to be lossy regarding Service and Func. Add a disclaimer, that those are not complete?
  • I did not add a disclaimer to discourage usage by other developers. Is this ok? Imho, IDL.Unknown should never break, as long as Candid remains self-describing (which it hopefully will stay).

@nomeata : For some reason, I cannot add you as a reviewer. That's why I'm pinging you this way.
I implemented this in order to be able to implement dfinity/interface-spec#20 in the service worker (specifically for parsing /* application-specific type */). What's your opinion on this?

@nomeata
Copy link
Contributor

nomeata commented Apr 7, 2022

I’m not a member of this repo, so you can’t add me as a reviewer. But pinging works.

My opinion is that the HTTP Gateway Spec is flawed by trying to use “generic Candid” in this ad-hoc and unsupported way. It is actually impossible to implement this correctly in in full generality, in the presence of “future types” as pointed out elsewhere.

But as long as the “service specific type” is “simple enough” it may be possible… so with this discouraging rant out of the way, let’s actually answer your questions :-)

IDL.Unknown can be use for deserialisation only.

Doesn’t that mean it can’t be used for the token, because you need to serialize that again? Or am I misreading that sentence (I must, because you later ask about encoding). So I am a bit confused.

@frederikrothenberger
Copy link
Contributor Author

Doesn’t that mean it can’t be used for the token, because you need to serialize that again?

The idea is to use IDL.Unknown for deserialization. This will yield some value for token. This value will have a function type() which returns the actual type that IDL.Unknown turned out to be. This type should be used to serialize the value again.

Take a look at the unit test, they should illustrate how to do this.

@nomeata
Copy link
Contributor

nomeata commented Apr 7, 2022

Ah, I see. Values of this type can be used for serialization (IDL.encode([value.type()], [value])) but they can only be created by deserialization. So you can’t (or shouldn’t) convert other types and values to IDL.Unknown. Is that correct? (I don’t really know the JS library, so I may be asking stupid questions).

@frederikrothenberger
Copy link
Contributor Author

Ah, I see. Values of this type can be used for serialization (IDL.encode([value.type()], [value])) but they can only be created by deserialization. So you can’t (or shouldn’t) convert other types and values to IDL.Unknown. Is that correct?

Yes, exactly.

@nomeata
Copy link
Contributor

nomeata commented Apr 7, 2022

Yeah, then it indeed looks like a reasonable best effort approach.

@frederikrothenberger frederikrothenberger marked this pull request as ready for review April 7, 2022 13:38
@chenyan-dfinity
Copy link
Contributor

chenyan-dfinity commented Apr 7, 2022

Generally LGTM. My sentiment is the same as @nomeata that this can only be a best effort support from Candid side, as this is not properly spec'ed.

  • Primitives are boxed, because otherwise the type() function cannot be added.
  • IDL.Null does not deserialize to null but to an empty object with just the type function.
  • The types do not always exactly correspond with the original definitions due to the structure of reference types. The resulting types are however compatible with the original types.

As it's not in the spec, you are free to choose the representation.

  • I did not add a disclaimer to discourage usage by other developers. Is this ok? Imho, IDL.Unknown should never break, as long as Candid remains self-describing (which it hopefully will stay).

Please do add a comment that this is for internal use only. Once we have proper support for generics, we want to migrate away from Unknown. Even before that, we don't want to encourage the use of Unknown unless you have no choice. It's easy to abuse this feature. For example, people can use Unknown to decode any message, thus avoid the subtyping guarantee.

Comment on lines +337 to +339
writable: true,
enumerable: false,
configurable: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these properties used?

docs/generated/changelog.html Outdated Show resolved Hide resolved
@nomeata
Copy link
Contributor

nomeata commented Apr 7, 2022

Technically it should be possible to represent a value of Unknown by the concrete byte sequence that you read from the incoming message, and then just put that into the output stream as is. Not sure if that'a easier or harder to implement.

And you still have to decose the type table, of course, that'd be the same

@chenyan-dfinity
Copy link
Contributor

Technically it should be possible to represent a value of Unknown by the concrete byte sequence that you read from the incoming message, and then just put that into the output stream as is. Not sure if that'a easier or harder to implement.

I like this approach better, because we can then still do the subtype check when we know the concrete expected type. Implementation-wise, the current approach is probably easier.

@krpeacock krpeacock enabled auto-merge (squash) April 7, 2022 17:44
@krpeacock krpeacock merged commit 65e17f9 into main Apr 7, 2022
@krpeacock krpeacock deleted the frederik/idl-unknown branch April 7, 2022 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants