Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Recursive Values #430

Open
jdegoes opened this issue Dec 13, 2024 · 4 comments
Open

Support Recursive Values #430

jdegoes opened this issue Dec 13, 2024 · 4 comments

Comments

@jdegoes
Copy link

jdegoes commented Dec 13, 2024

There are quite a few cases that require defining recursive values:

package wit:json@0.1.0;

interface types {
  variant json {
    null,
    boolean(bool),
    number(float64),
    string(string),
    array(list<json>),
    object(list<tuple<string, json>>)
  }
}

Currently, these are illegal due to the difficulty of supporting them, including, presumably, the challanges of defining an ABI for recursive values. However, there exists a simple translation that could form the basis of the ABI without adding significant complexity.

This transformation involves:

  1. Encoding the recursive references as an unsigned integer, so the type is no longer recursive, and which represents a simple and flat node.
  2. Wrapping the data type in a higher level data type, which has a list of nodes, where the first entry in the list is guaranteed to exist and which represents the top-level structure.

For the JSON example, we might call the node structure json-node and the top-level type json:

package wit:json@0.1.0;

interface types {
  variant json-node {
    null,
    boolean(bool),
    number(float64),
    string(string),
    array(list<u32>),
    object(list<tuple<string, u32>>)
  }

  record json {
    nodes: list<json-node>,
  }
}

This transformation can be generalized to mutually recursive structures.

Now, while it's possible to do this transformation manually, using such a data structure directly from any programming language suffers from poor ergonomics and the possibility of bugs that lead to runtime errors. So when encoding recursive values in this way, one should take care to define a translation from a higher-level, truly recursive structure, in the guest language, to the lower-level structure, and then back again, by using a hash map based on reference or pointer equality.

Perhaps recursive values could be supported by allowing higher-level syntax, but creating a formal specification as to how these recursive values are mapped to a lower-level subset of types, which could form the basis of the ABI?

@primoly
Copy link

primoly commented Dec 13, 2024

One alternative way the ABI could work is by representing recursive values via lazy value lowering as described here for lists: #383.

Maybe if we had lazy<T> (not my preferred name) we would’t need to special case the ABI for recursive variants, so you could use it for other values that want boxing.

@lukewagner
Copy link
Member

Definitely agreed that recursive types are valuable. There is an early issue filed in #56 which has a similar use case and in that issue I mention some of the implementation challenges (in the runtime and in bindings generators). I don't think these challenges are insurmountable, but it's a tricky enough feature that we've been focusing on other use cases first. But once the other more foundational use cases are addressed, I think this would be a great addition to prioritize.

@dicej
Copy link
Collaborator

dicej commented Dec 13, 2024

Another way to model this within the current state of WIT and the CM:

  variant json {
    null,
    boolean(bool),
    number(float64),
    %string(string),
    array(list<lazy-json>),
    object(list<tuple<string, lazy-json>>)
  }

  resource lazy-json {
    get: func() -> json;
  }

It's not necessarily better than the index-based approach, but might be more efficient e.g. when passing a large JSON document from one component to another such that the receiver only cares about parts of it.

@jdegoes
Copy link
Author

jdegoes commented Dec 14, 2024

Thanks for the updates and feedback--just knowing that recursive values will eventually be supported is encouraging.

There are many use cases we have encountered so far:

  • We have ADTs for types and values in the component model (wit-value and wit-type), which helps facilitate various examples of generic programming (including, for example, having a WIT interface that describes the ability to generically process input and output for any other function; and another WIT interface representing generic RPC invocation).
  • We are building wasm-postgres, wasm-mysql, etc., and recursive types bubble up from these. For example, a complex type in Postgres may be recursive; and even an array type is a limited form of a recursion.
  • We will soon run into many more recursive values as we attempt to bring gRPC and OpenAPI support into Golem.

@dicej Whoa, this is such a cool trick! It seems it would be highly usable from the "read" side and may not require a re-wrapper. I'm a touch concerned about performance implications in cases where the whole value is needed (e.g. RPC, database access, etc.), but it still seems useful enough for us to try it in some places.

@primoly Interesting, basically generalizing Joel's trick.

FWIW, in our use cases, either guest or host needs the "full" value in order to proceed, so using lazy values for the general-purpose solution would impose some overhead on these use cases (I don't know how much overhead but for finely recursive structures, I imagine it would be measurable).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants