-
-
Notifications
You must be signed in to change notification settings - Fork 780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add sourcemap generation to JavaScript codegen target #3675
base: main
Are you sure you want to change the base?
Conversation
compiler-core/src/codegen.rs
Outdated
sourcemap.to_writer(&mut output).expect("Failed to write sourcemap to memory. This is a bug in sourcemap, please report."); | ||
let output = String::from_utf8(output).expect("Sourcemap did not generate valid UTF-8. This is a bug in sourcemap, please report."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really not a fan of those expect
- opened to any suggestion here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the bit about opening a bug ticket, all panics have this information printed for them
/// | ||
/// Used to produce SourceMaps. | ||
#[derive(Debug)] | ||
pub struct CursorPositionWriter<'a, W> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or move this to compiler-core/src/io/cursor_position_writer.rs
?
Also not a fan of the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I've left a bunch of notes inline
compiler-core/src/config.rs
Outdated
@@ -662,6 +662,8 @@ pub struct ErlangConfig { | |||
pub struct JavaScriptConfig { | |||
#[serde(default)] | |||
pub typescript_declarations: bool, | |||
#[serde(default)] | |||
pub sourcemap: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub sourcemap: bool, | |
pub sourcemaps: bool, |
compiler-core/src/javascript.rs
Outdated
type_reference, | ||
"export {}", | ||
line(), | ||
sourcemap_reference, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come this is not with the type reference at the top?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The declaration of the SourceMap URL does not have to be at a specific location, it can be anywhere in the JS code. It is something that's separate from the type reference (which is just TS-specific in my understanding), so I think it's best to keep them as separate declaration blocks for symmetry.
I take it from this comment that you'd rather have the sourcemap declaration at the top, so I moved it :)
WithSourceMapLocation { | ||
document: Box<Self>, | ||
start: LineColumn, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module should not know anything about code generation or Gleam, but here it knows about JavaScript specific implementation details. Any changes to this module should not be specific to anything other than pretty printing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When building a sourcemap, for each mapping we want to add, we need two things :
- The line+column of the mapping in the generated (JS) file
- The line+column inside the source (Gleam) file it maps to.
In my implementation, the field start
in this enum variant stores the information of where the Document
is in the source file (a SrcSpan
passed through the LineNumbers
util). The only place we know the line+column on the generated JS file is when the pretty printer is writing the Document
inside a file - then we know as we're writing "The document is being printed at the line x, column y", and this info is lost afterwards.
I agree that conceptually the pretty
module shouldn't know anything about JS, but given that the JS files that Gleam produces are pretty-printed, I don't see how it's possible to generate sourcemaps without hooking into the pretty-printer itself. Any idea I could explore ?
compiler-core/src/sourcemap.rs
Outdated
match self { | ||
SourceMapEmitter::Null => (), | ||
SourceMapEmitter::Emit(source_map) => { | ||
tracing::debug!("emitting one sourcemap entry"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this please
compiler-core/src/sourcemap.rs
Outdated
} | ||
|
||
impl SourceMapEmitter { | ||
pub fn add_mapping(&mut self, dst_line: u32, dst_col: u32, src_location: LineColumn) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No abbreviations please
compiler-core/src/format.rs
Outdated
writer: &mut impl Utf8Writer, | ||
src: &EcoString, | ||
path: &Utf8Path, | ||
source_map_emitter: &mut SourceMapEmitter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting never uses a source map emitter, there should be no change to this API.
|
||
impl<'a, W: Utf8Writer + std::fmt::Write> Utf8Writer for CursorPositionWriter<'a, W> { | ||
/// A wrapper around `fmt::Write` that has Gleam's error handling. | ||
fn str_write(&mut self, str: &str) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This iterates the string twice. Can we do it once instead?
The pretty printer also iterates over and counts the characters, can that work be reused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reused LineColumn
inside of this now. As long as LineNumber does not change behavior (the TODOs mention graphemes which confuses me a bit), then this should work.
Whenever the behavior of LineColumn
changes, the test should break :)
compiler-core/src/io.rs
Outdated
self.line += newline_count; | ||
if newline_count > 0 { | ||
let lastline = str.lines().last().expect("Should have at least one line"); | ||
self.column = lastline.len(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is number of bytes and not unicode graphemes or characters. Is what sourcemaps expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A sourcemap works with byte indices, in my understanding this is correct. I'll add a test with emojis inside the code, see if that breaks anything.
assert_sourcemap!( | ||
"fn add_2(x) { | ||
x + 2 | ||
}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this single case enough to test all the given functionality? I presume it is not only implemented for addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it isn't. I have tested the sourcemaps with all examples from the Gleam tour on a personal project, I'm going to add those test cases in this PR, but I wanted to get some feedback on my snapshot testing before doing that. Indeed your comment below indicates that it's not clear how to test this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I add more test cases, let's agree on how to test this feature
source: compiler-core/src/javascript/tests/sourcemaps.rs | ||
expression: sourcemap_viz | ||
--- | ||
note: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This snapshot is rather verbose, to the extent of being hard to read. A more concise format would be very helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree, but discussions with Gleam folks on discord didn't really lead to a precise conclusion. What exactly would you be happy with?
Maybe each test could be "what is the Gleam location of this specific location in the JS file" - then the snapshot would contain the whole JS + Gleam code on top of each other, with the one position you're trying to assert?
As an example, say you write a test case that asks "What is the position in the Gleam code of the character 1 line 1 in the generated JS file ?", the snapshot would look like this:
┌─ original.gleam:1:1
│
1 │ fn add_2(x) {
│ ^ This code
│ x + 2
│ }
│
┌─ generated.js:2:1
│
2 │ function add_2(x) {
│ ^ Gets mapped to this
│ return x + 2;
│ }
}
When you are ready for a review please un-draft this PR. Thank you! |
e76e450
to
2c5e729
Compare
I think I need your opinion on stuff before I can write a bunch of unit tests or refactor the PR, so I'm marking this as ready. |
Hello! Are you waiting on something from me? I can't see anything here but wasn't sure if I missed anything. |
Hi @lpil, this PR got a bit old and so it has a few merging conflicts that I'll resolve today after work. There are 2 points which needs clarification before I proceed to add more unit tests : #3675 (comment) #3675 (comment) |
I noticed that if i already have a compiled build cache, then i switch to this compiler with sourcemaps turned on, maps arent emitted. i have to run a clean first, presumably to bypass a cache hit that does not consider this flag as input to the cache? |
@cdaringe, I believe the issue you ran into is orthogonal to this PR? I don't remember touching anything to do with caching. Do we need a special edge-case handling here ? I see the branch has conflicts, I'll rebase this :) |
…he sourcemap module
c747d28
to
56d12a1
Compare
Im on mobile, thus didn’t search the source, but iff the build cache algo can consider new inputs, im suggesting that the config options added here are missing in that input. It may ir may not be open for extension. I sorta kinda would think that a hash on gleam.toml would invalidate the build cache? |
Here goes!
The PR is still kind of WIP - mainly, I'd like to talk about how to approach testing (and actually add way more snapshot tests before this gets a chance to be merged).
I also have 2 expects on calls to
sourcemap
- I'm sure this is not the right way to do it, so I'm wondering what you guys think is the best approach ? Is it fine to have a "sourcemap error" variant in the big GleamError
enum ?Context
I highly recommend watching the following talk if you're unfamiliar with Sourcemaps : https://www.youtube.com/watch?v=6LI0BJIiamg
Long story short, a sourcemap is a file that defines mappings from specific places in a generated file (the generated javascript code - which needs to be referring to its sourcemap file) to a source file (the gleam module).
It's used for example :
Content
This PR adds sourcemap generation for the JavaScript codegen target.
Sourcemap generation is disabled by default, and needs to be manually enabled using the following config in the gleam.toml manifest :
This is done out of symmetry with other languages and tools - for example TypeScript does not automatically generate them. I imagine as your codebase grows large, it can end up being quite a heavy process. Also I'm assuming if this ever gets released, it should maybe be marked experimental.