-
Notifications
You must be signed in to change notification settings - Fork 806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JS / WebAssembly binding planned ? #63
Comments
Thank you for suggesting this. As far as I know, WebAssembly does not support multithreading so I don't know how this would integrate. We clearly don't want to sacrifice the speed on every other platform for the sake of having WebAssembly, but if we can work around this, it would indeed be really valuable to have such bindings! |
Thank @n1t0 and great job ! I started playing and understanding how it works. I will take a look at WebAssembly binding, if you are ok. |
Any updates on this? I'm also very interested in running the models in the browser. |
I made an experiment several months ago and got it running without multithreading. Here is my branch but totally out of sync: Now it should to be updated to master and make multithreading using webworkers. I need to find time to work on it again. |
Hi @AmitMY , It's definitely possible to do, I have also done in the past, but it cannot be easily done without breaking/slowing things on the compiled side which is the reason there is no support atm. You need to:
Does that help? |
Thanks, @Narsil , the process makes sense. I just don't understand why is it a problem to have it officially supported if it will only slow down the web solution and not the others? Personally I also never used Rust, so as a (mainly) web developer, it feels like a barrier to using hf/tokenizers |
Well So dropping support means not being able to run those tokenizers or maintaining another dependency on the web (large overhead as API is likely to be different) It's isn't about compilation time, or compilation errors, it's that the WASM version won't be able to run properly for some specific tokenizers, and maintaining that difference is going to be important. For context, this library is quite behind already in terms of library integrations in both Node and Python, and upgrading those versions would involve big rework of those bindings, which might have breaking backward compatibility issues, performance issues, and all those would have to be investigated. Maybe that can explain why adding new bindings is not something on the roadmap. But again, if you really want to create a PR (and maintain it, or not) your help is most definitely welcome and we would be happy to route users to your branch. |
Hey @Narsil, given that wasm now has threads (info here), and that there are wasm ports of oniguruma (like this one), do you think it has become more feasible to make a wasm build available? If not, what do you see as the major blockers/difficulties at this point? It would be fine if the wasm version were slower than the others, so conditionally swapping out libs as needed at build time (like |
Hey, I haven't taken the time to deeply look into your links for oniguruma (threads can be ignored anyway in WASM easily), to check for any caveat that might occur, but it definitely is a nice thing to look into. Would you be willing to start a PR on this ? |
@Narsil Thanks for your interest in this! I'd definitely be willing to spend a week or two on this if you think it's feasible to get a JS/web version working within that sort of time frame. If it's likely to take longer than that, I'm still happy to have a shot at solving a chunk of the problem - especially if it could be useful to the OSS community in its own right (e.g. creating a wasm build of a required cpp module). I've got a lot of JS experience, but only a bit of experience with Rust + wasm-pack. I'd probably need you to lay out the overall plan for me and ideally give me a heads up on which aspects might be harder than others. It would also be great if you could give me a few concrete first steps to get me rolling¹. Is that okay? No worries if you're currently too busy! ¹ My guess, just so you're aware of my mental models (which are probably quite bad), is that the best approach would, roughly, be to copy the node bindings, and then port the code to JS module syntax ( Edit: Or (speaking naively still) would it make more sense to just start with the Rust code itself and mark it up with conditional wasm-pack attributes like |
No it's good of you to ask ! I am not sure the various tradeoffs right now, so I think the first thing would be to make something workable. IMO, this was not good enough to get it merged, but you should start there, as it should be the fastest thing to do. THEN, re-enable onigruma with the wasm impl (hardcode it for starters). THEN add a feature-flag for wasm. that's as rarely used as possible in the source code that has at least simple test/example that it works. The goal is to aim for very low maintenance feature flag compatibility IMO. THEN, push a PR and start discussing what has happened (or before, we can discuss while you're iterating on it, especially when there are tradeoffs, or directions you are unsure about). IDEALLY, the
I think the goal for [dependencies]
tokenizers = { version = "0.11", features=["wasm"]} Then they are free to use it as they see fit. And do whatever they want with it. I don't think it should be a |
@Narsil Great, thanks for your advice! Question: Would it be okay to pull the regex out into its own module and then either use I created a pull request here for you to look over: #908 |
I don't think that's a good idea. The main problem with having different regex engines, is that the saved The premise of using Again, looking at my proposed suggestion. Try to first make it work without |
@Narsil Ah, that makes sense. By "make it work without Also, I'm guessing that you don't have access to the original wasm branch that you made? I think it would be helpful if I could look over it. (But again, I don't want to burden you too much, so feel free to put this on hold if you're too busy to provide further guidance at the moment. Just in case you overestimated my ability here 😅) |
I didn't even keep the branch. Weekend thing, trying to find something else than rust-wasm tutorial to chew on, and I decided it was interesting to try this here. It's OK to be very HACKY first, by just removing code you don't like. Perfectly fine in my book. It's not mergeable, but it's not the target you're after, you're after a working version asap. Add everything your scratched back later making clean switches first (like #[cfg(feature)]). And iteratively add support back of everything you can. If you can't add it back and/or it's hard (like threads) it's perfectly fine to just not support that thing on wasm I think. (Threads for instance don't really make sense in WASM imo, on the browser you're going to run inference mainly, where the tokenization is extremely short, so spawning a thread is likely to destroy performance instead of helping anyway) It's easier to add good support later, than adding clunky support now and fixing later (because you now have the backward compatibility problem which will make things harder on you). |
@Narsil Okay I will give this a shot. A couple of questions:
|
Ohhhhhh sneaaky, that's how they achieve it ! Then maybe |
@Narsil Yep! It seems to build fine with Open a Codespace on master, and then: (1) Install Emscripten as explained in docs:
(2) Comment out these
(3) Install
(4) Build:
|
Great start ! Now we need to just find a way to make the cli hidden without breaking it. |
@Narsil Okay, so it turns out the only reason it was building correctly was because emscripten expects a So the first problem that I run into after adding the
This SO answer explains that it's due to an incompatibility in the emscripten (
(Assuming you're in the emsdk folder. But it may be better to But that still didn't work. Someone mentioned in the comments to that SO answer that Rust 1.47 and Emscripten 1.39.20 works, and that did indeed do the trick. So now it compiles the simple hello world. Next I tried adding this line to pub use tokenizers::tokenizer::*; And I get this error:
The same thing happens with
but then I realised that you can only add flags with So that's as far as I've gotten at the moment. Any thoughts/ideas? Click here for the full build logs.
|
Sorry I have no clue out of my head for this, never attempted to target Try to target raw |
@Narsil I don't know enough about the internals of this lib RE regex serialisation, etc. but would another potential pathway be to switch to a pure-rust regex like |
You're welcome to try. But how do you cover all possible cases without actually proving this stuff ? I just did a little google and found this: bminixhofer/nlprule#18 (comment) That will prevent the thing from being merged in this library, but it definitely doesn't prevent you from having a branch that works, we can even maybe advertise that branch with all caveats plain and clear. But we can't add something into the library that might unexpectedly fail to work for users (like different tokenization on wasm than on python). If you manage to make it work, we could also investigate the reasons of why we use
I say this just to give you perspective of what the "switch" implies IMO. |
@Narsil Good summary - thank you!
Yep, I did see this thread. In a follow up comment after making the abstraction (that I copy-pasted in my original pull request) he says:
(I'm not sure what those tests actually involved - e.g. perhaps it was only matching and not serialisation, etc.) But because I don't know how the regex stuff is actually used in I wonder whether there could be some sort of config that is saved along with tokenizer data such that the regex engine that was used is specified? But again, I only have an "inference end-user" level of knowledge of this codebase, so I'm probably not making much sense here. To continue along a hacky approach, just to see what I can learn about wasm compilation, I've made some changes to my regex abstraction branch to make it successfully compile with This might be a case where it's best to just wait for the wasm build tooling to mature a bit more so, especially as someone like myself who has very little knowledge of both this codebase, and the wasm compilation side of things. |
Sorry I didn't reply earlier, I might have missed the notification.
This makes perfect sense. I don't enjoy the added complexity, but this is the safe road.
Definitely missed that it's interesting and might lower the barrier for wasm support (still warrants a big red sticker since we're not 100% sure). Calling the feature
This is the simpler but unsafe road.
Thank you for this, do you think we could add an example to maybe just load a tokenizer and run inference in the browser to provide that it works ? |
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
I see your Node.js binding using Neon. But have you considered WebAssembly ? There are some tools to compile Rust code easily. So you will get a browser compatiblity and node v13 with a low impact on speed.
The text was updated successfully, but these errors were encountered: