-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: represent module source as [u8]
#152
Conversation
`Arc<String>` to `Arc<str>` (double heap pointer -> single heap pointer) Also enforces that BOM handling is a `deno_graph` concern. See denoland/deno_graph#152
src/graph.rs
Outdated
@@ -622,7 +622,7 @@ impl Module { | |||
kind: ModuleKind, | |||
parsed_source: ParsedSource, | |||
) -> Self { | |||
let source = parsed_source.source().text(); | |||
let source = parsed_source.source().text_str().as_bytes().to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will increase our memory usage which might be significant for large files. For maybe_source
, perhaps we should use an enum instead that is either Arc<str>
or Arc<u8>
and it shares the memory from parsed_source.source().text()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arc<str>
and Arc<[u8]>
technically have the same underlying memory layout, so we could convert between the two by just transmuting (this is unsafe though). Very painful that this is really just a Rust type system limitation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could change the internal property to non-public, for the instances when we don't have a parsed module (e.g. JSON modules) and store it as an Arc<[u8]>
only when we don't have a parsed source. Then provide a pub fn source() -> Option<Arc<str>>
and pub fn source_bytes() -> Option<Arc<[u8]>>
which we generate on the fly?
`Arc<String>` to `Arc<str>` (double heap pointer -> single heap pointer) Also enforces that BOM handling is a `deno_graph` concern. See denoland/deno_graph#152
4b380c5
to
6a574a5
Compare
Previously, if a specifier was imported more than once, with conflicting assertions, we'd only really consider the final import. This is wrong: conflicting assertions are an error in-themselves already. I think this behaviour is still wrong, because if a module is imported with incorrect assertions later in the graph after the module has already been loaded by an import with a correct assertion, no error will be raised.
This is a pre-requisite for WASM imports that additionally allows us to minimize clones when doing things like BOM stripping.
Previously module source was represented as `String`. This does not work for non-textual formats like WASM. As such, module source is now represented as `[u8]`.
6a574a5
to
bc1c5ea
Compare
`Arc<String>` to `Arc<str>` (double heap pointer -> single heap pointer) Also enforces that BOM handling is a `deno_graph` concern. See denoland/deno_graph#152
Previously module source was represented as
String
. This does not work fornon-textual formats like WASM. As such, module source is now represented as
[u8]
.In JS, the module source is now represented as a
Uint8Array
.Towards #150