Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Namespacing for Worlds #177

Closed
eduardomourar opened this issue Mar 17, 2023 · 18 comments
Closed

Namespacing for Worlds #177

eduardomourar opened this issue Mar 17, 2023 · 18 comments

Comments

@eduardomourar
Copy link

My understanding is that right now there is a single namespace in the context with WIT. That can cause some confusion as you can see in this discussion here. For the case of wasi:http/outgoing-handler, the imports identifiers are resolved as follows:

  (import "types" "new-fields" (func (;4;) (type 5)))
  (import "types" "fields-entries" (func (;5;) (type 6)))
  (import "types" "drop-outgoing-request" (func (;6;) (type 2)))
  (import "types" "new-outgoing-request" (func (;7;) (type 7)))
  (import "types" "outgoing-request-write" (func (;8;) (type 6)))
  (import "types" "drop-incoming-response" (func (;9;) (type 2)))
  (import "types" "incoming-response-status" (func (;10;) (type 8)))
  (import "types" "incoming-response-headers" (func (;11;) (type 8)))
  (import "types" "incoming-response-consume" (func (;12;) (type 6)))
  (import "types" "future-incoming-response-get" (func (;13;) (type 6)))
  (import "default-outgoing-HTTP" "handle" (func (;14;) (type 9)))

Is there a plan to add a namespace based on the world name to avoid name collisions?

@eduardomourar
Copy link
Author

I expected those import to actually be something like this (by including the namespace at the world level and converting the identifiers to snake case):

  (import "wasi_http_types" "new_fields" (func (;4;) (type 5)))
...
  (import "wasi_http_default_outgoing_http" "handle" (func (;14;) (type 9)))

@lukewagner
Copy link
Member

Great question! So when compiling a component, the names used for the core imports and exports are completely encapsulated inside the component and so they can be arbitrarily chosen by the toolchain. Rather, it is the imports and exports in the containing component that are meant to identify the imported/exported interface uniquely to the outside world (specifically, using the id subfield of externname). Thus, the final component (produced by, e.g., wit-component) would look something like this:

(component
  (import "types" (id "wasi:http/types") (instance
    (export "new-fields" (func ...))
    (export "fields-entries" (func ...))
    ...
  ))
  (import "default-outgoing-HTTP" (id "wasi:http/outgoing-handler") (instance
    (export "handle" (func ...))
  ))
  ...
  (core module
    (import "types" "new-fields" (func ...))
    (import "types" "fields-entries" (func ...))
    ...
    (import "default-outgoing-HTTP" "handle" (func ...))
  )
  ...
)

That being said, if hosts wanted to implement core imports directly (so that they could accept core modules that were not wrapped by components), that would be an argument for somehow mangling the id into the core module import/export strings like you've suggested above. (And then wit-component would have the freedom to rewrite them into something shorter when wrapping the core module into a component as an unobservable size optimization.) This could even be part of what we specify when defining a single canonical mapping from Wit to core ABI (re-adding part of what was removed in #114, this time without the mangling of types). I could do this in the short term if it made sense to others.

(cc @peterhuene @alexcrichton @pchickey @sunfishcode b/c this relates to what we've been talking about w.r.t ids recently)

@alexcrichton
Copy link
Collaborator

I agree with Luke that at the component model level there's enough namespacing and such that I don't think there's any changes in the spec necessary. At the tooling level though thinking about this it seems like we may want different conventions around naming and/or transformation from WIT to the component model. For example while I think it makes sense for a wasi:http/types interface to exist I don't think it makes sense for it to claim the "types" import name because then nothing else can.

In some sense I would expect the name "wasi-http" to be claimed or maybe "wasi-http-types". These are possible but require different idioms in the upstream WIT files (or a different translation from the component model to WIT)

@eduardomourar
Copy link
Author

@alexcrichton, taking into consideration the union of worlds proposal, we might need to consider the target world namespace and renamed imports while mangling them, right? E.g. cloud-core proposal.

@alexcrichton
Copy link
Collaborator

Sorry I'm not quite sure what you mean? Can you elaborate more?

@lann
Copy link
Contributor

lann commented Apr 20, 2023

If I have a world:

default world http-server {
  export handler: wasi-http.incoming-handler
}

I implicitly get several imports to go with it:

default world http-server {
  ...
  import streams: ... // io/streams
  import types: ...   // http/types
  export handler: wasi-http.incoming-handler
}

If I'm understanding the situation here, the main issue is that "streams" and "types" aren't good import names for the purposes of avoiding name conflicts or identifying the original source of an interface in a component.

As an incremental improvement on the current situation, could we adopt a convention of including a prefix in every interface name, i.e. interface types -> interface wasi-http-types? Given the default interface feature I think it wouldn't even require a huge amount of renaming in existing wit files, though many generated bindings would see breaking changes.

@eduardomourar
Copy link
Author

I was thinking more along the lines of adding a new keyword to WIT (following other IDL examples like in the Apex lang here and in Smithy here):

namespace wasi:http

interface ...

And, then, the WIT tooling and codegen could automatically prefix the shapes with the namespace by default.

@lukewagner
Copy link
Member

Although transitive interface dependencies get implicitly imported with their default kebab-name, worlds have the option to override that default by explicitly importing the interface with whatever kebab-name they want. So w.r.t @lann's example, you could write:

default world http-server {
  import my-streams: io.streams
  import my-types: http.types
  export handler: http.incoming-handler
}

and there would be no streams or types kebab-names in this world.

That all being said, I can see how types is a highly conflict-prone default kebab-name even within WASI and thus it is probably a good idea to change the default name to HTTP-types.

@alexcrichton
Copy link
Collaborator

I've been trying to figure out the best way to solve this and I'm not coming up with something that "feels right", unfortunately, but I do have a few ideas perhaps. To me the problem here is the automatic flat injection of transitive dependencies of interfaces into a world, meaning now everyone's interface blocks have to care about everyone else's. That feels not great because that's generally not a good property of a package system where everyone should only ever have to worry about their local context.

One option, as recommended here, is to mangle more "stuff" into the name. WIT, for example, has a package namespace and a document namespace that are currently entirely unused when converting to components. That could mean, for example, that the wasi-http package with a types document that has default interface types could become wasi-http-types as a "default name for this interface". This would be an update to elaboration of world blocks where:

default world http-server {
  export handler: wasi-http.incoming-handler
}

would become:

default world http-server {
  ...
  import wasi-io-streams: ... 
  import wasi-http-types: ...   
  export handler: wasi-http.incoming-handler
}

which would largely solve the problem I think. This means that authors of packages, as usual, are cognizant of their own naming choices but need not concern themselves with the naming choices of others.

This still doesn't feel like a perfect solution, though. For example package names themselves could still conflict (e.g. same package from two different registries). Another option for this perhaps would be to translate worlds to WIT differently. For example the above world could be translated to a component as:

(component
  (import "wasi-http" (instance
    (export "types" (instance
      (export "request" ...)
    ))
  ))

  ;; ...
)

where top-level imports are package imports which have their own structure internally. This means that the "elaboration" of a world still has to worry about clashes between same-named packages but that somehow to me feels a bit better to resolve in this context (or maybe it's the same, unsure). This is a more significant change, however.

Neither way really feels like a great way forward to me. I'm having difficulty figuring out a better possibility though.

@lukewagner
Copy link
Member

Thinking about this some more, perhaps there's a nice solution if we revisit the structure of externname and update some of the original assumptions which have changed since it was originally merged. So currently an externname is defined as:

externname ::= <name> (id <URL>)?

The original expectation was that the id field could take many forms, including live http URLs resolving live to implementations, warg URLs referring to registry packages, or wasi URLs resolved by the standard. Because the URL could be long and gnarly, the assumption was that we'd always want this kebab-name as an alternative short name to be used in any source-code bindings.

More-recently, though, we've been talking about splitting out the separate cases of <URL> so that we're never needing to do ad hoc string analysis to pull out metadata like version, package name, etc. And indeed, some of these cases aren't real URLs at all: when identifying interfaces, it seems like we want something more like namespace:package/interface where wasi:http/handler is an example, but not because wasi: is a special case, but because we support any organization/project/company-defined namespace. But once we have the namespace:package/interface fixed structure as the id field, we basically get a nicely-prefixed kebab-name for free, so maybe we don't need or want to worry about a separate <name> at all.

To me, this suggests considering an alternative factoring like:

externname::= (id <ID>)
            | <name> (url <URL>)?

where <ID> is some grammar covering namespace:package/interface (e.g, maybe '"' <label> ':' (<label> '/')* <label> '"') that could soon be followed by version info.

Fleshing this idea out a bit more, in a Wit context, we could mostly start writing the <ID>s directly in place of the kebab-name, e.g.:

interface "wasi:http/types" {
  resource incoming-request { ... }
  resource response-outparam { ... }
}
interface "wasi:http/incoming-handler" {
  use "wasi:http/types".{incoming-request, response-outparam}
  handle: func(request: incoming-request, response: response-outparam)
}
default world "wasi:http/server" {
  export "wasi:http/incoming-handler"
}

Note that in a world context, the name: can be omitted from imports/exports. The expectation here is that language bindings generators would, as they already do with kebab-names, "localize" the ids into the source language. Also, I think this would be an addition, not change, to existing Wit; I think we still want to retain the existing option of using kebab-names to define and refer to interfaces and worlds in the case where you aren't defining or referring to a standardized or published interface. Lastly, I'm ambivalent on the bikeshedding question of whether to require quotes around the <ID> in Wit.

WDYT?

@eduardomourar
Copy link
Author

For the Binary format, I like the proposed changes of theexternname.

For the WIT IDL, I believe it will get verbose really quick besides requiring the current interface definitions to be updated.

@eduardomourar
Copy link
Author

eduardomourar commented Apr 26, 2023

I think we can follow the import map approach from the HTML spec to keep a little bit simpler and allow evolving over time.

The way I see it they can be placed in two locations:

  • Use an extra component.toml file that can contain that map (possibly it can be combined with the information being used by the wit-deps and cargo component)
  • Have a top-level metadata keyword that can hold that map within the WIT file itself

@lukewagner
Copy link
Member

I wonder if, due to the fact that Wit will mostly be (1) not super long, (2) written infrequently and read many many times, if the usual approach we'd take for source code programs (indirecting through a map) isn't worth the complexity. Additionally, it seems like the inline-id approach makes the Wit file more readable in isolation and with less context.

@alexcrichton
Copy link
Collaborator

Personally I don't think we should go the route of the inline ids like you mentioned with interface "wasi:http/types" { .. } in WIT files. I feel that this requires a bit too much repetition within WIT files which duplicates context already available elsewhere. For example currently you might have something like deps/http/types.wit which contains default interface types { ... }. From that we can infer the package name is http, the interface name is types, so the only thing missing is "wasi:" as the namespace. In that sense given today's structure of WIT we're almost able to entirely infer an ID of "wasi:http/types", so I'm not sure it needs to be duplicated from the existing hierarchy into each name.

Additionally I think another drawback is that if everything is written inline then it makes it harder to fork a package and modify it. For example if someone wanted to experiment with an updated version of wasi-http it wouldn't want to be published in the "wasi:.." namespace but that would require editing all of the files to ensure nothing was forgotten.

Finally I feel like the ID here is somewhat, although not entirely, orthogonal to the namespacing issue at hand with how imports are modelled. Even if IDs were all present that's still not solving the problem of "what's the kebab-string". That's where I think we'd still need something like I mentioned above where the package/document structure is perhaps reflected through nested instances, meaning that everything becomes namespaced. In such a world the "id" is only really necessary in the top-level package and while everything else inside it could have an "id" it may not be necessary.

@lukewagner
Copy link
Member

I guess there's two separable design changes I suggested in my last comment and it might be useful to start by just focusing on the first:

  1. saying that, in a component (AST and binary), when you have an id, you don't also have a kebab name, that the id is the (expanded) kebab-name
  2. allowing ids to be written inline in Wit

Just focusing on #1, and setting aside precisely how we derive the id from Wit, it seems like this gives source code generation the hierarchical information it needs and avoids the need to define a separate-but-potentially-different nesting/pathing scheme for a separate kebab name. Does that make sense as a starting point?

@guybedford
Copy link
Collaborator

As far as I understood the concept of extending externname metadata, we were including version constraints, and package resolution information in this structured format. I like that framing, but when considering this adjustment would be concerned about singular IDs firstly being representable and secondly being used correctly in the WIT format.

Portability benefits for developer files are important - and this does create a difference between WIT and component binaries in this regards.

Perhaps there's another way to remove unnecessary internal naming, while still aligning on structured metadata and unified internal pathing conventions. I can try and put together some further suggestions if that would be useful.

@alexcrichton
Copy link
Collaborator

I've made a proposal to help solve this issue at #193 which I'm hoping will address a number of related issues at the same time. I'd appreciate it if others could take the time to review (apologies it's long) and provide feedback!

@lukewagner
Copy link
Member

I think this has since been resolved by #198 and #205. Thanks again for bringing this up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants