New targets #38

Open
Diggsey opened this issue Jan 30, 2018 · 23 comments

Comments

@Diggsey

Diggsey commented Jan 30, 2018

Currently there is just a wasm32-unknown-unknown target. There are a lot of people with different goals for wasm, and I think as long as there's a single wasm target, there's going to be a lot of conflict about how it should behave.

Therefore, before any new features are added to rust itself, we should add new targets for the obvious cases.

Based on the current interest in compatibility with NPM packages, a good start might be wasm32-npm-unknown.

This would be a target for building NPM packages. It is unspecified what environment these packages will run in (node/browser/etc.)

@aturon
Contributor

aturon commented Jan 30, 2018

While I agree that we may want to have multiple targets in the long run (and that they could reduce conflict), I'm also wary of trying to anticipate targets before we have clear needs.

Do you have thoughts about what this "npm" target would provide? I'm a bit surprised to see a target related to a package manager, rather than host environment.

@Pauan

Pauan commented Jan 30, 2018

@aturon I agree.

There are npm packages that work on Node only, there are npm packages that work in the browser only, and there are npm packages that work on both Node and the browser.

I think the only way an "npm-only" target makes sense is if compiling WebAssembly as an npm package is very different from compiling WebAssembly for other environments (such as Node or the browser), which seems unlikely to me.


As for the topic of new targets, I think there are two targets which are unambiguously needed (for obvious reasons): wasm32-node-unknown and wasm32-web-unknown

@Diggsey
Author

Diggsey commented Jan 30, 2018

I chose "npm" because that seemed to be the goal - to write tooling to support deploying rust as an npm package. I'm using "npm" here in reference to the conventions it set around packages, such as package.json, rather than as a specific package manager or repository; obviously, yarn would work just fine with this target too.

Perhaps it would be better to name it after the module system "es6", or just "javascript" given how common npm is now? Although the javascript ecosystem moves so quickly it's difficult to say that anything is de-facto standard and be sure that it will continue to be that way for very long...

> As for the topic of new targets, I think there are two targets which are unambiguously needed (for obvious reasons): wasm32-node-unknown and wasm32-web-unknown

It doesn't make sense to jump straight to the most specific targets: if you do that then there's no easy way to write code that doesn't care whether it's in the browser or on node.

The initial changes that have been proposed so far are things like how webassembly modules are packaged, deployed, and then loaded by a javascript client. Until we get to the point where we're actually implementing libstd for these targets (which I think is a way off yet), there's no need to disambiguate the browser from node. We just need to distinguish these packaging conventions from the totally language/platform/etc-agnostic wasm target.

@Pauan

Pauan commented Jan 30, 2018

> if you do that then there's no easy way to write code that doesn't care whether it's in the browser or on node.

#[cfg(all(target_arch = "wasm32", any(target_os = "node", target_os = "web")))]

If that's too verbose (and it probably is), then there can be a shorter synonym (similar to how we have the cfg(windows) and cfg(unix) synonyms). For example, we could have cfg(javascript). But it would just be a shorthand for the longer and more precise form.
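
A shorthand of this kind can be approximated today with a small macro that attaches the long predicate to every item it wraps. This is only a sketch: `target_os = "node"` and `target_os = "web"` are the values proposed in this thread, not values any released rustc target sets, and the macro and function names are made up for illustration. Recent toolchains may warn about the unknown `cfg` values.

```rust
// Hypothetical: no released rustc target sets target_os to "node" or "web";
// these are the values proposed in this thread.
macro_rules! cfg_javascript {
    ($($item:item)*) => {
        $(
            #[cfg(all(target_arch = "wasm32", any(target_os = "node", target_os = "web")))]
            $item
        )*
    };
}

// The complement, so that every build still gets exactly one definition.
macro_rules! cfg_not_javascript {
    ($($item:item)*) => {
        $(
            #[cfg(not(all(target_arch = "wasm32", any(target_os = "node", target_os = "web"))))]
            $item
        )*
    };
}

cfg_javascript! {
    // Compiled only when the (hypothetical) JavaScript predicate holds.
    pub fn host_kind() -> &'static str {
        "javascript"
    }
}

cfg_not_javascript! {
    // Compiled everywhere else, which today means every existing target.
    pub fn host_kind() -> &'static str {
        "other"
    }
}
```

Because no existing target satisfies the predicate, `host_kind()` currently always resolves to the second definition; the point is only that the verbose `cfg` is written once.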

> until we get to the point where we're actually implementing libstd for these targets (which I think is a way off yet)

That's true, but we will need it, there's no doubt about that. I don't see a problem with discussing long-term plans (with the obvious caveat that it's hard to precisely predict the future, and some details will change).

Also, it's not just about libstd, targets can be used by libraries too: it would benefit things like stdweb as well.

Applications can also benefit: I might want to write a Rust application which can work on either Node or the browser, and it selects the best / most performant APIs depending on which target it is compiled for. Right now I have to use --features to get something similar.

@aturon
Contributor

aturon commented Jan 30, 2018

So in my mind, the use of package.json should be "external", not baked into the target. In particular, the way that wasm is set up to do imports is:

  • You provide import specifications which give a module and function name. (Today in Rust we can't control the module name, but that will change soon.)
  • Separately, you provide some means of actually linking those imports to real function definitions.

On the second bullet, the idea is to eventually move away from manual wasm instantiations and instead use the host package manager/bundler/etc to do this "linking". There's a bit more detail here.

In my mind, then, what happens within pure Rust code is strictly the definition of the imports, via the evolving ABI. The specification of how those imports should be fulfilled is then managed by out-of-band tooling. Not unlike the Cargo.toml vs extern crate split.
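
The split described above can be sketched in Rust using the `#[link(wasm_import_module = "...")]` attribute that rustc gained after this comment was written. The module name `host_console` and the function `log_number` are hypothetical; the non-wasm stub exists only so the sketch also builds and runs on a native host.

```rust
// Import side: the Rust code only declares *what* it needs, as a
// (module name, function name) pair. How the import is fulfilled is
// decided outside this file by the host, bundler, or loader.
#[cfg(target_arch = "wasm32")]
#[link(wasm_import_module = "host_console")] // hypothetical module name
extern "C" {
    fn log_number(value: i32); // hypothetical imported function
}

#[cfg(target_arch = "wasm32")]
pub fn report(value: i32) -> i32 {
    // Calls whatever definition the host linked in for the import.
    unsafe { log_number(value) };
    value
}

// Native stand-in so the example is runnable without a wasm host.
#[cfg(not(target_arch = "wasm32"))]
pub fn report(value: i32) -> i32 {
    println!("log_number({})", value);
    value
}
```

Nothing in the Rust source says where `log_number` comes from, which mirrors the Cargo.toml vs extern crate analogy: the declaration lives in code, the resolution lives in tooling.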

@Diggsey
Author

Diggsey commented Jan 30, 2018

> #[cfg(all(target_arch = "wasm32", any(target_os = "node", target_os = "web")))]

That's not what I mean: there's no reason you should have to rebuild your rust code to run it in the browser compared to in node - you should just be able to build a host-agnostic npm package. Your solution doesn't allow for that.

Eventually we may have browser/node targets in addition to the javascript/es6 target, but I imagine those targets will inherit the semantics that have already been established for the javascript/es6 target.

> In my mind, then, what happens within pure Rust code is strictly the definition of the imports, via the evolving ABI. The specification of how those imports should be fulfilled is then managed by out-of-band tooling. Not unlike the Cargo.toml vs extern crate split.

That makes sense - in that case it makes more sense to name it just the "javascript" target, or the "es6" target.

@Pauan

Pauan commented Jan 30, 2018

> That's not what I mean: there's no reason you should have to rebuild your rust code to run it in the browser compared to in node - you should just be able to build a host-agnostic npm package. Your solution doesn't allow for that.

That sounds like purely an optimization that could be done in rustc. In other words, if you have some code which specifies that it works with two different targets (e.g. node and web), then rustc would only need to compile the code once, rather than compiling it once for node and then again for web.

Also, you might need it to recompile! Let's say there is a function foo which works on either node or the web, but it uses different APIs on Node compared to the web:

#[cfg(all(target_arch = "wasm32", target_os = "node"))]
fn foo() {
  // Uses Node APIs
}

#[cfg(all(target_arch = "wasm32", target_os = "web"))]
fn foo() {
  // Uses web APIs
}

This is necessary because the APIs on the web are different from the APIs on Node. As an example, you might want to use network APIs, but the browser uses XMLHttpRequest (or the new fetch API), whereas Node uses the https module.

Okay, so far so good. But now let's say you create a bar function which uses the foo function:

#[cfg(all(target_arch = "wasm32", any(target_os = "node", target_os = "web")))]
fn bar() {
  foo()
}

In this case you are saying that bar works in either node or web (which is correct), but at runtime the implementation is different, because when compiling to wasm32-web-unknown it will use the web version of foo, and when compiling to wasm32-node-unknown it will use the node version of foo. This separate compilation is necessary for correctness.

What if you want to instead use some sort of polyfill based on runtime detection? Sure, you can do that too:

#[cfg(all(target_arch = "wasm32", any(target_os = "node", target_os = "web")))]
fn foo() {
  // Polyfill implementation based on runtime feature detection
}

And hopefully rustc will be able to avoid recompiling it repeatedly.

But separate compilation will still be useful, both for correctness and performance.

@Diggsey
Author

Diggsey commented Jan 30, 2018

@Pauan that's just not possible - when you deploy your package, you'll be deploying compiled webassembly - you can't have two different targets and have the same package work for both, unless you compile in two versions of the webassembly, but then you're bloating the package for no reason.

In addition, rustc will likely never be able to cache any information about the build between different targets, that's just not how it works.

If you want to build a package that works in both node and the browser, then you need a single target for both, and either you stick to APIs available on both, or you switch at runtime.
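
A minimal sketch of the "switch at runtime" option, assuming the JavaScript glue can tell the module which host it is running on. The `has_node_process` flag and all names here are hypothetical; the two API names are the ones mentioned earlier in the thread.

```rust
/// The two hosts discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Host {
    Node,
    Browser,
}

/// Hypothetical: `has_node_process` stands in for a check performed by the
/// JavaScript side and passed into the module once at start-up.
pub fn detect_host(has_node_process: bool) -> Host {
    if has_node_process {
        Host::Node
    } else {
        Host::Browser
    }
}

/// One compiled module, two code paths chosen at runtime.
pub fn network_api(host: Host) -> &'static str {
    match host {
        Host::Node => "https module",
        Host::Browser => "fetch",
    }
}
```

The trade-off raised in the replies still applies: both branches are compiled into the single module, so the binary carries code for a host it may never run on.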

@Pauan

Pauan commented Jan 30, 2018

> that's just not possible - when you deploy your package, you'll be deploying compiled webassembly - you can't have two different targets and have the same package work for both, unless you compile in two versions of the webassembly, but then you're bloating the package for no reason.

Yes you'll need to compile it twice, and JavaScript developers already do that. That's normal in the npm ecosystem.

Right now npm developers compile a CommonJS / UMD version of their code and also an ES6 version of their code. It's also somewhat common to have both TypeScript and JavaScript builds. Some packages also contain both browser and Node builds.

And it's common to include both regular and minified versions of the same code (especially code which is intended for the browser). Npm is quite used to bloat and code duplication.

In addition, there's a de-facto standard which uses the browser field in package.json for browser code and main for Node code. This is intended to allow for separate browser and Node builds.

Separate compilation isn't done for "no reason", it's done for increased performance, since it doesn't need to include a complex polyfill which slows down performance.

Another reason to avoid polyfills is that they bloat up the code size of the final compiled code. The code size of the package doesn't matter, what matters is the code size of the final compiled code, and JavaScript developers obsess over code size, almost as much as embedded programmers.

> In addition, rustc will likely never be able to cache any information about the build between different targets, that's just not how it works.

I don't see any technical reason why rustc cannot cache the example I gave, although it might not do so right now. You're right that it's unlikely to happen anytime soon, but in any case that's an optimization, it shouldn't affect our decisions about semantics.

> If you want to build a package that works in both node and the browser, then you need a single target for both, and either you stick to APIs available on both, or you switch at runtime.

Sure, if you don't want to compile twice then you should use the all(target_arch = "wasm32", any(target_os = "node", target_os = "web")) target, and then you only need to compile it once (since the outcome will be the same regardless of whether you compile using wasm32-web-unknown or wasm32-node-unknown)

Of course it's also possible to mix-and-match: you might compile to three separate WebAssembly modules:

  1. This module contains code specific to Node

  2. This module contains code specific to the browser

  3. This module contains target-agnostic code

Then the three modules are linked together at runtime (or during the wasm linking stage, which is separate from the Rust linking stage).

@aturon
Contributor

aturon commented Jan 30, 2018

@Diggsey

> That makes sense - in that case it makes more sense to name it just the "javascript" target, or the "es6" target.

So, I still don't quite understand how you envision this target differing from -unknown today. Can you elaborate a bit?

@Diggsey
Author

Diggsey commented Jan 31, 2018

> Yes you'll need to compile it twice, and JavaScript developers already do that. That's normal in the npm ecosystem.

It's not uncommon, but plenty of packages do not do that: they just expose a javascript module that can be imported anywhere, and tools like webpack mean that it can be optimised later. Without a generic "javascript" target rust can't do that.

> So, I still don't quite understand how you envision this target differing from -unknown today. Can you elaborate a bit?

We will presumably want libstd to work when used from javascript modules: either we can base such an implementation off something similar to my "extensible syscall interface" PR, and have the javascript-specific parts sit outside the compiler, or we have a separate target that can do javascript-specific things.

The former solution is going to be much harder to stabilise, because it requires stabilising the interface between webassembly and the host. A javascript-specific target, by contrast, could be stabilised without committing to a stable contract between rust and the host: we would simply need to provide a javascript package that implements the unspecified interface, ship it with the compiler, and update it if we need to make breaking changes to the interface.

@Diggsey
Author

Diggsey commented Jan 31, 2018

In short, we'd have to solve https://github.com/rust-lang-nursery/rust-wasm/issues/16#issuecomment-358342548 before we could have a working libstd for javascript.

Nobody seems very happy with any of the solutions I presented there, and nobody has suggested any alternatives, so it looks like that's not going to be resolved any time soon.

Alternatively, we have a separate target, and can put off solving that particular issue.

@Pauan

Pauan commented Jan 31, 2018

> Without a generic "javascript" target rust can't do that.

Why can't it?

If you have some code which works on wasm32-node-unknown and wasm32-web-unknown (e.g. because it's generic JavaScript code, or because it has a polyfill), then you can just compile it once with wasm32-node-unknown (or wasm32-web-unknown, it doesn't matter since the WebAssembly output will be the same either way).

In other words, the WebAssembly doesn't care what target it was compiled with, it will be the same with either one.

Is it particularly clean or elegant? No. But it should work.

@Pauan

Pauan commented Feb 1, 2018

I thought about this some more, and I think these should be the new targets:

wasm32-unknown-javascript-web
wasm32-unknown-javascript-node

This leaves open the possibility of later adding in a wasm32-unknown-javascript target (which is for generic code that works in any JavaScript environment).

If you want to write code that works in any JavaScript environment, you can do this:

#[cfg(all(target_arch = "wasm32", target_os = "javascript"))]

It will now work with the wasm32-unknown-javascript-web, wasm32-unknown-javascript-node, and wasm32-unknown-javascript targets (and any future JavaScript targets as well).

And if you want it to work with only a specific target (such as node) then you can do this:

#[cfg(all(target_arch = "wasm32", target_os = "javascript", target_env = "node"))]

@aturon
Contributor

aturon commented Feb 1, 2018

@Diggsey Thanks for clarifying. If adding a separate target that enables the syscall interface by default would be helpful, that's definitely doable.

I do think we need to resolve the overall syscall question on #16, and I have some thoughts on that, but agree that we shouldn't block.

@Pauan

Pauan commented Feb 10, 2018

I was originally in favor of adding more targets, but now I think we should just have wasm32-unknown-unknown and use @aturon's portability lint.

The reason is that the situation in JavaScript is... complicated. To start with you have the web environments (i.e. web browsers), and you have Node.js, but you also have things like Electron which allow you to use both web and Node APIs at the same time. So we would need a new target for that.

So, what if you have an API which works on Node? You can't use #[cfg(target_env = "node")] because then it won't work on Electron. So you would have to use #[cfg(any(target_env = "node", target_env = "electron"))], which is a massive footgun: it's so easy to use node and forget to add in electron, and it also means every time we add a new target to Rust we have to add the new target to the cfg of all existing code.

Similarly, there are various headless browsers which give you access to both browser and Node APIs, and we don't really want a new target for each of them. And then there's React Native...

So rather than tying it directly to targets, instead we need a more fine-grained way to slice-and-dice the APIs. Ideally we would just use #[cfg(javascript(node))] (or something similar) and it will work on Node, Electron, headless browsers, etc.
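
The footgun can be modelled with plain data, as a rough illustration rather than a proposal for how rustc would implement it. The `Capabilities` struct, the constants, and both functions are hypothetical names; the capability assignments follow the descriptions in this comment.

```rust
/// Which API families a host exposes (illustrative model only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities {
    pub node_apis: bool,
    pub web_apis: bool,
}

pub const NODE: Capabilities = Capabilities { node_apis: true, web_apis: false };
pub const BROWSER: Capabilities = Capabilities { node_apis: false, web_apis: true };
// Electron exposes both families, as described above.
pub const ELECTRON: Capabilities = Capabilities { node_apis: true, web_apis: true };

/// Target-name gating: every new host has to be added to this check by hand.
pub fn runs_node_code_by_name(host_name: &str) -> bool {
    host_name == "node"
}

/// Capability gating: any host that exposes Node APIs qualifies automatically.
pub fn runs_node_code(caps: Capabilities) -> bool {
    caps.node_apis
}
```

Gating by name wrongly rejects Electron until someone remembers to list it, while gating by capability accepts it without any change to existing code.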

@newpavlov

rand has encountered issues with wasm32-unknown-unknown which I think could've been more or less solved with additional WASM targets. See for example rust-random/rand#681.

@moshg

moshg commented Mar 23, 2019

Oh, I have understood this issue. What Diggsey proposed was js-unknown-unknown-npm, wasn't it?
But how about js-unknown-unknown, js-web-unknown and js-nodejs-unknown?
I mean rustc compiles foo.rs as a cdylib into foo.js and foo.wasm on the js targets.
In this case, rustc only has to stabilize how the exposed js files communicate with other js files, not how wasm files communicate with js files.

Without js targets, we cannot use libraries that use, for example, std::fs, even in projects that target node.js. So a library developer must either write filesystem code twice, once for std::fs and once for nodejs-sys::fs (which does not exist yet), or replace std::fs with a crate that is a facade over std::fs and nodejs-sys::fs, because std::fs is not portable.
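
A facade of the kind described could look roughly like the sketch below. Since nodejs-sys does not exist, the Node side is represented by an in-memory stand-in; the trait, the types, and `count_lines` are all hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::io;

/// The facade: library code depends on this trait instead of std::fs.
pub trait FileSystem {
    fn read_to_string(&self, path: &str) -> io::Result<String>;
}

/// Backend for targets where std::fs works.
pub struct StdFs;

impl FileSystem for StdFs {
    fn read_to_string(&self, path: &str) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

/// Stand-in for a Node-backed implementation (hypothetical; a real one
/// would call into Node's filesystem APIs through imports).
pub struct InMemoryFs {
    files: HashMap<String, String>,
}

impl InMemoryFs {
    pub fn with_file(path: &str, contents: &str) -> Self {
        let mut files = HashMap::new();
        files.insert(path.to_string(), contents.to_string());
        InMemoryFs { files }
    }
}

impl FileSystem for InMemoryFs {
    fn read_to_string(&self, path: &str) -> io::Result<String> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

/// Library code written once against the facade.
pub fn count_lines(fs: &dyn FileSystem, path: &str) -> io::Result<usize> {
    Ok(fs.read_to_string(path)?.lines().count())
}
```

The cost is the one pointed out above: every library has to opt into the facade, whereas a js target with a working std::fs would let existing code run unchanged.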

@ashleygwilliams
Member

ashleygwilliams commented Mar 23, 2019

hey! this issue seems to be getting some notice, and the general idea of targets is something we're definitely exploring (albeit slightly differently than in the compiler itself: more as a secondary tier of targets that the workflow tools expose). i think it may be beneficial to discuss this in our next meeting so we can either close this issue or comment on it with the working group's current thinking on the subject. cc @alexcrichton @fitzgen

@moshg

moshg commented Mar 24, 2019

Thank you for the response! I'm looking forward to seeing a comment.

@moshg

moshg commented Mar 25, 2019

I may be wrong, but I'm concerned that if imports are added to std on wasm32-unknown-js (not js-unknown-unknown), it is a breaking change.
If this is acceptable, I agree with wasm32-unknown-js.

@CryZe

CryZe commented Oct 29, 2019

What's the status of this? Not having a proper target for wasm-bindgen is causing major issues across the ecosystem. We should honestly just have a wasm32-web target and possibly an additional wasm32-web-node if anyone really needs the latter.

Motivation: Every crate needs to handle wasm-unknown and wasm-web very differently, but the only way to differentiate them is to pass features around. However cargo features are entirely broken for this use case (or in general really), so this doesn't work.

I'd say the cleanest solution is probably even to add wasm32-web and remove std from wasm32-unknown-unknown. Then the ecosystem can super cleanly handle all the different cases.

Although removing std from wasm32-unknown-unknown may not work just yet, because no_std is equally broken due to cargo's broken features.

@lukechu10

lukechu10 commented May 16, 2021

What is blocking this? Does somebody need to write an RFC (or create a Pre-RFC)? Are there still unanswered questions?

8 participants