WebAssembly implementation of Zarr #14

Open
jakirkham opened this issue Aug 4, 2018 · 26 comments

Comments

@jakirkham
Member

It would be really nice to have a WebAssembly implementation of Zarr. This could make it possible to load Zarr files in the browser for viewing or for computation. It could also be useful for working with in-memory Zarr objects in the browser. Given there has already been some good work to get Python and NumPy into the browser using Emscripten, it may be possible to just run the Python Zarr implementation in the browser. However, compression algorithms that are not in the Python standard library will require getting Blosc and Numcodecs into WebAssembly ( Blosc/c-blosc#238 ).

Note: WebAssembly is well supported. For older browsers one can convert WebAssembly to asm.js, which is pretty well supported. In the worst case, asm.js is still valid JavaScript, so it can be run as plain JavaScript (albeit slowly).

@jakirkham
Member Author

It's worth noting that Rust can compile to WebAssembly. This is a built-in feature of the Rust compiler. There is an N5 implementation for Rust, which could be a good place to start.

cc @aschampion

@aschampion

I'm planning on creating a simple WASM-compatible flag (or new dependent crate) for the Rust N5 implementation at some point. We plan to use this to improve viewing N5 volumes in CATMAID. Initially this should be as simple as adding a new backend that uses the fetch API instead of the filesystem and disabling compression modes that don't compile to WASM.
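
As a rough illustration of the backend swap described above (sketched in Python rather than Rust, and not taken from the N5 crate), the reader can be written against a small key/value interface so that filesystem and HTTP backends are interchangeable. The names `ReadStore`, `FilesystemStore`, `HttpStore`, and the injectable `fetch` callable are all hypothetical; in a browser, the role of `fetch` would be played by the fetch API.

```python
from pathlib import Path
from typing import Callable, Optional, Protocol
from urllib.error import HTTPError
from urllib.request import urlopen


class ReadStore(Protocol):
    """Hypothetical read-only key/value interface (not from any real library)."""

    def get(self, key: str) -> Optional[bytes]: ...


class FilesystemStore:
    """Reads each key as a file below a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def get(self, key: str) -> Optional[bytes]:
        path = self.root / key
        # A missing file means the key (for example a chunk) is absent.
        return path.read_bytes() if path.is_file() else None


def _http_get(url: str) -> Optional[bytes]:
    """Default fetch function: plain HTTP GET, with 404 mapped to None."""
    try:
        with urlopen(url) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 404:
            return None
        raise


class HttpStore:
    """Reads each key as a URL below a base URL."""

    def __init__(
        self,
        base_url: str,
        fetch: Callable[[str], Optional[bytes]] = _http_get,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch  # injectable so the transport can be swapped out

    def get(self, key: str) -> Optional[bytes]:
        return self.fetch(f"{self.base_url}/{key}")
```

Code that only calls `store.get(key)` does not need to know which backend it was handed, which is what keeps the switch from filesystem to fetch small.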

@alimanfoo transferred this issue from zarr-developers/zarr-python on Jul 3, 2019

@dhirschfeld

Related: https://gitter.im/zarr-developers/community?at=5ddc7f1c55bbed7ade461091
https://github.com/gzuidhof/zarr.js

Not WASM but aims to make zarr files accessible from the browser

@jakirkham
Member Author

Also noting there is a Scala implementation that can be compiled to JavaScript. Relevant discussion and more info in issue ( #15 ). That said, I'm not aware of a good path from Scala to WebAssembly.

@vdwees

vdwees commented Apr 7, 2020

I'd be interested in contributing to a Rust and WASM implementation of Zarr. Would anyone like to collaborate on this?

@thewtex

thewtex commented Apr 7, 2020

A wasm implementation of Zarr-Blosc decompression is here:

https://github.com/Kitware/itk-vtk-viewer/blob/7f82bbff02b6e8d847c76457fc07979be07c7ad5/src/bloscZarrDecompress.js

If there is interest, this could be separated out into a new package, and the corresponding compression function added (it has already been implemented in C/Emscripten). It would make sense to use this as the decompression for JavaScript / TypeScript libraries like @gzuidhof's zarr.js or @freeman-lab's zarr-js.

This implementation supports all Blosc codecs. It also uses a pool of web workers to decompress a set of chunks in parallel and optimize wasm compilation.
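
The worker-pool idea can be sketched outside the browser as well. The following is only an analogy in Python, not the itk-vtk-viewer code: threads stand in for web workers, and the standard-library `zlib` codec stands in for Blosc. The function name `decompress_chunks` is hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def decompress_chunks(compressed: Sequence[bytes], max_workers: int = 4) -> List[bytes]:
    """Decompress every chunk using a fixed-size pool of workers.

    zlib is an assumption made for this sketch; a real Zarr reader would
    dispatch on the codec recorded in the array metadata.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands chunks to the workers and yields results in input order.
        return list(pool.map(zlib.decompress, compressed))
```

Results come back in the same order as the input chunks, so the caller can place them into the output array by index.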

Here is what it looks like in action:

https://kitware.github.io/itk-vtk-viewer/app/?fileToLoad=https://thewtex.github.io/allen-ccf-itk-vtk-zarr/average_template_50_chunked.zarr

@alimanfoo
Member

@vdwees

vdwees commented Apr 8, 2020

@thewtex It would be awesome to make it a separate package. I am new to WebAssembly, but I'd be happy to contribute where I can.

@thewtex

thewtex commented Apr 9, 2020

Wow, that is mega cool. What is it?

A brain atlas averaged from 1675 mice 🐭 🐭 🐭

@thewtex It would be awesome to make it a separate package. I am new to WebAssembly, but I'd be happy to contribute where I can.

@vdwees Great, we'll create a package, your help is appreciated.

@manzt
Member

manzt commented May 26, 2020

I just made numcodecs.js public, which has a WASM Blosc codec. Hopefully this will help others use Blosc in their applications!

EDIT: It's a JavaScript module meant to be run in the browser and Node.

@jakirkham
Member Author

I just made numcodecs.js public which has a WASM blosc codec. Hopefully this will help others use blosc in their applications!

Nice work! Thanks for sharing @manzt. I wonder how hard it is to get Zarr usable from WebAssembly then (as it is pure Python at that point).

cc @rth @mdboom (who may be interested ;)

@rth

rth commented May 27, 2020

I wonder how hard it is to get Zarr usable from WebAssembly then (as it is pure Python at that point)

If it is pure Python (and has pure Python wheels) you could install it from PyPI with Pyodide, but you would still need to write some code to interact with those JS/WASM libraries where it currently uses other Python packages with C extensions.

@Farkal

Farkal commented Oct 14, 2020

Hey there, I am very interested in the Zarr format, so I am available to create a WebAssembly/Rust lib, but I would like to implement the v3 spec directly. After reading some of the different topics in the Zarr spec repo, the v3 spec seems pretty great. Do you think I could start on an implementation? Or should I wait for the Python implementation first?

Update: Sorry but I cannot work on this project due to some terms of my employment contract 😞

@jakirkham
Member Author

Hey @Farkal, welcome! That sounds great! 😄

It would be nice to have other people trying out the spec in other languages. This can help inform whether what we have in the spec makes sense or if it needs further modification. FWIW there is a WIP Python implementation here ( https://github.com/alimanfoo/zarrita ). Also, we have been engaging with some folks from QuantStack on the C++ side and with NetCDF on the C side. It would be really interesting to see whether things make sense from the WebAssembly/Rust side.

We also have a weekly spec meeting (details in issue #33) if you would be interested in stopping by. It would be nice to say hi and learn a bit more about what you are working on, as well as how we can help 🙂

@oeway

oeway commented Jan 8, 2021

FYI: zarr-python and numcodecs are compiled into WASM and available as modules in Pyodide after this PR.

Click here to try a live demo with Pyodide + zarr running completely in the browser.
(Only works with Chrome or Firefox)

The next goal is to add a custom storage backend for Pyodide so that we can load Zarr arrays via HTTP. However, due to browser limitations, we cannot use fsspec with its HTTP backend directly. To enable this, we are currently working on the asyncio event loop, and we will likely also need to wait until multi-threading is supported in Pyodide.
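
The dependence on threads comes from a common pattern for exposing async I/O through a synchronous API: run an event loop in a background thread and block the caller until the result is ready. Below is a minimal sketch of that pattern in plain CPython. It does not reproduce fsspec's actual internals, and `LoopThread`, `run_sync`, and `fetch_bytes` are hypothetical names.

```python
import asyncio
import threading
from typing import Any, Coroutine


class LoopThread:
    """Runs an asyncio event loop in a background thread (hypothetical helper)."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()  # this step needs real thread support

    def run_sync(self, coro: Coroutine[Any, Any, Any]) -> Any:
        """Submit a coroutine to the background loop and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result()

    def close(self) -> None:
        """Stop the loop, wait for the thread to exit, then release the loop."""
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def fetch_bytes(key: str) -> bytes:
    """Stand-in for an async network read; it just echoes the key as bytes."""
    await asyncio.sleep(0)
    return key.encode()
```

In an environment without threads, the `threading.Thread(...).start()` step is the part that cannot work, which is why a store whose reads are awaited directly is the alternative.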

@jakirkham
Member Author

Very cool! Thanks for sharing Wei 😄

cc @martindurant (who may be interested in fsspec usage)

@martindurant
Member

This was discussed a bit on Gitter. fsspec for Pyodide seems like a big benefit with or without Zarr, but indeed the sync/thread stuff adds complexity that, in this environment, I don't think I'm in the best place to tackle. Happy to help test, though!

@jakirkham
Member Author

It looks like Pyodide is including Zarr & Numcodecs, which is cool to see 😄

@joshmoore
Member

@jakirkham : where does that leave this issue? :)

@oeway

oeway commented Feb 18, 2022

It looks like Pyodide is including Zarr & Numcodecs, which is cool to see 😄

I added those two libraries to Pyodide a while ago. It works with in-memory data but is still very limited for any real application because we cannot support remote storage backends.

I'm not sure whether this has already been discussed in the Zarr community, but the key feature needed to make that work is support for an async store (with asyncio). The native Python implementation of fsspec uses threading to convert async calls into sync ones, but multi-threading is not yet supported in Pyodide. It will only work if Zarr supports an async store (meaning the getitem function will be async).
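
To make the idea concrete, here is a minimal sketch of a store whose getitem is a coroutine, assuming an in-memory dict in place of a remote backend. `AsyncMemoryStore` and `read_chunks` are hypothetical names and do not correspond to any existing Zarr API.

```python
import asyncio
from typing import Dict, Iterable, List


class AsyncMemoryStore:
    """Hypothetical store with an async getitem; not an existing Zarr class."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = dict(data)

    async def getitem(self, key: str) -> bytes:
        # A remote backend would await a network response here instead.
        await asyncio.sleep(0)
        return self._data[key]


async def read_chunks(store: AsyncMemoryStore, keys: Iterable[str]) -> List[bytes]:
    """Request all chunks concurrently on one event loop, with no threads."""
    return list(await asyncio.gather(*(store.getitem(k) for k in keys)))
```

Because every read is awaited on the single event loop, nothing here needs a second thread. The cost is that callers such as array indexing would have to become async too.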

@martindurant
Member

Working on async zarr at https://github.com/martindurant/async-zarr as part of a company hack week

@martindurant
Member

It already works in normal Python asyncio, and maybe works in PyScript too; I just need to write some HTML or something...

@MSanKeys963
Member

Working on async zarr at https://github.com/martindurant/async-zarr as part of a company hack week

Thanks for sharing the details, @martindurant. May I know the exact dates for the hack week?
I'd like to post publicly about this to invite more contributors.

@jakirkham
Member Author

IIUC it is a hack week Anaconda is running for its employees

@martindurant
Member

That's correct; and the hack is now over.
I made this little video of the current state: https://drive.google.com/file/d/1Ll-Lr_3Ckf_-WIlBkIPx4H8Kmz9lz4b9/view?usp=sharing

@MSanKeys963
Member

Thanks a lot, @martindurant. This is great.
I'll share this across our social media so that we can get the word out and invite new users/contributors.
