Enable build on wasm32-wasi #17

Closed
2 tasks
adamnovak opened this issue Dec 5, 2023 · 4 comments

Comments

@adamnovak
Contributor

For vgteam/sequenceTubeMap#379 I'm trying to get gbz-base to build for WebAssembly. It doesn't at the moment, because simple-sds can't. Here are the first 8 errors it produces:

error[E0433]: failed to resolve: could not find `unix` in `os`
  --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:59:14
   |
59 | use std::os::unix::io::AsRawFd;
   |              ^^^^ could not find `unix` in `os`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:430:44
    |
430 |             MappingMode::ReadOnly => libc::PROT_READ,
    |                                            ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:431:43
    |
431 |             MappingMode::Mutable => libc::PROT_READ | libc::PROT_WRITE,
    |                                           ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:431:61
    |
431 |             MappingMode::Mutable => libc::PROT_READ | libc::PROT_WRITE,
    |                                                             ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find function `mmap` in crate `libc`
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:433:34
    |
433 |         let ptr = unsafe { libc::mmap(ptr::null_mut(), len, prot, libc::MAP_SHARED, file.as_raw_fd(), 0) };
    |                                  ^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:433:73
    |
433 |         let ptr = unsafe { libc::mmap(ptr::null_mut(), len, prot, libc::MAP_SHARED, file.as_raw_fd(), 0) };
    |                                                                         ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find function `munmap` in crate `libc`
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:491:27
    |
491 |             let _ = libc::munmap(self.ptr.cast::<libc::c_void>(), self.len);
    |                           ^^^^^^ not found in `libc`

   Compiling rusqlite v0.29.0
error[E0599]: no method named `as_raw_fd` found for struct `File` in the current scope
   --> /Users/anovak/.cargo/git/checkouts/simple-sds-95484d45b95fb50d/c2f8637/src/serialize.rs:433:90
    |
433 |         let ptr = unsafe { libc::mmap(ptr::null_mut(), len, prot, libc::MAP_SHARED, file.as_raw_fd(), 0) };
    |                                                                                          ^^^^^^^^^ method not found in `File`
   --> /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/std/src/os/fd/raw.rs:65:8
    |
    = note: the method is available for `File` here
    |
    = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
    |
53  + use std::os::fd::AsRawFd;
    |

I think I need to:

  • Replace any std::os::unix-isms that can be replaced with things just in std::os, and make any other ones optional somehow.
  • Enable building without memory mapping support.

@jltsiren
Owner

jltsiren commented Dec 6, 2023

I could make memory mapping a feature that is enabled by default but can be disabled. However, the bigger issue is that simple-sds leans heavily on the assumption that usize and u64 are the same. Many low-level things will probably break with 32-bit integers.

Additionally, Rust uses usize for array indexing, which means that it's difficult to use arrays larger than 2^32 in a 32-bit environment. We can't import GBZs with more than ~4.29 Gbp of sequence, such as human graphs built with PGGB or full Minigraph–Cactus graphs. We also can't import GBZs where the run-length encoded BWT is larger than 4 GB, such as 1000GP graphs (and possibly final HPRC graphs with 700+ haplotypes).
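
The usize/u64 concern can be illustrated with a minimal sketch. The helper names `to_index` and `fits_in_32_bits` are hypothetical and do not come from simple-sds; the second one stands in for a 32-bit target by using u32 in place of usize, so the behavior can be observed on a 64-bit host as well.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: turn a 64-bit length read from a file into an index.
/// Returns None when the value does not fit in this platform's usize.
fn to_index(value: u64) -> Option<usize> {
    usize::try_from(value).ok()
}

/// Emulates the 32-bit case on any host by using u32 in place of usize.
/// Values of 2^32 or more cannot be represented and are rejected.
fn fits_in_32_bits(value: u64) -> bool {
    u32::try_from(value).is_ok()
}
```

A checked conversion like this fails cleanly instead of silently truncating the way an `as usize` cast would, which is one way to locate the places where a 32-bit build would break.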

@adamnovak
Contributor Author

We might be able to get away with the max size limitations. I was thinking we'd convert from GBZ to database outside the browser, so all we really need is to be able to properly decode the blobs in the database files.

And if we want to use the databases in the browser, and if we need simple-sds to decode the blobs in the databases, then I don't know if there's an alternative to painstakingly unwinding the assumption that usize is u64 in the code that actually implements the data structures.

I managed to get simple-sds to build for wasm32-wasi with liberal use of #[cfg(not(target_family = "wasm"))]. Hopefully once I can get the full gbz-base binaries to link and load right I can start identifying places where the two builds can't agree on serialized representations.
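
As a rough sketch of that kind of gating (not the actual changes made to simple-sds), the same `target_family` condition can select between a build that includes a memory-mapped code path and one that does not. The function names `mmap_available` and `load_bytes` are hypothetical, and the fallback simply reads the whole file into memory.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical flag: true in builds that would include the mmap-based code.
/// A real crate would likely also require a Unix-like target here.
#[cfg(not(target_family = "wasm"))]
fn mmap_available() -> bool {
    true
}

/// On wasm targets the mmap-based code is compiled out entirely.
#[cfg(target_family = "wasm")]
fn mmap_available() -> bool {
    false
}

/// Portable fallback: read the whole file into memory instead of mapping it.
fn load_bytes(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    let mut buffer = Vec::new();
    file.read_to_end(&mut buffer)?;
    Ok(buffer)
}
```

Only one of the two `mmap_available` definitions is compiled for a given target, so code that refers to `libc::mmap` or `std::os::unix` can sit behind the first condition without affecting the wasm build.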

@jltsiren
Owner

jltsiren commented Dec 6, 2023

I don't think gbz-base will need anything from simple-sds once the database has been built. The sizes and identifiers of individual objects should fit in 32 bits, because we often do that in vg as well. The blobs are encoded either using gbwt::support::ByteCode / gbwt::support::RLE, which don't care about the size of usize as long as the numbers fit in it, or an internal encoding that packs three bases in a byte.
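
To show why a byte-oriented code need not depend on the width of usize, here is a generic variable-length integer sketch. It is an assumption for illustration only and not necessarily the byte layout used by gbwt::support::ByteCode; the names `encode_varint` and `decode_varint` are hypothetical.

```rust
/// Append `value` as 7-bit groups, low bits first.
/// A set high bit means that more bytes follow.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value & 0x7F) as u8 | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Decode one value starting at `*pos` and advance `*pos` past it.
/// Returns None if the input is truncated or the value needs more than 64 bits.
fn decode_varint(bytes: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        value |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}
```

The decoder works on bytes and a u64 accumulator, so it behaves the same on 32-bit and 64-bit targets. The platform width only matters at the point where a decoded value is used as an index, and that step can use a checked conversion.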

@jltsiren
Owner

I think PR #18 also resolved this.
