Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(swc): Add comment preservation #3815

Merged
merged 18 commits into from
Mar 11, 2022
Merged

Conversation

dpchamps
Copy link
Contributor

@dpchamps dpchamps commented Mar 2, 2022

Problem

Generated code will drop comments in a variety of situations. Some of these situations may include:

  • Comments that precede types
  • Comments that precede / trail nodes that are transformed / moved / removed

The JS Ecosystem is (unfortunately) reliant on maintaining comment continuity between source and compiled code. There are many existing tools that use comments as annotations for various functionality, the most entrenched probably being istanbul coverage annotations. Not maintaining good parity between source and compiled output causes adopting SWC for test runners to be non-tenable for a lot of customers.

Solution

Implement a "comment fixer" visitor, dropped_comments_preserver, and introduce a feature flag preserve_dropped_comments into the jsc configuration object.

Overview of changes

Given the way SWC internals track comments, the comment fixer makes a best effort to preserve comments that may have otherwise been dropped. The output may not be 1:1 with what babel or other transpilers output, but it will preserve all expected functionality from existing js tooling where possible.

When preserve_dropped_comments is enabled, all trailing and leading comments will be shifted to the first non-dummy span that exists in the ast. This guarantees that all existing comments will be transported to the compiled output, orphaned comments will precede comments that would otherwise be preserved, and comments that would be preserved remain in a consistent position.


I tried my best to organize things as I thought they should be according to the current standards of the repo. Open to any and all suggestions 🙏 would really like to support this feature as it is currently blocking us over here at Walmart from adopting the SWC jest transformer fully. The gains are too good to ignore.

BREAKING CHANGE:

**Related issue (if exists): #2964 **


Closes #2964

@dpchamps dpchamps changed the title Issue 2964 dpchamps feat(swc): Add comment preservation Mar 2, 2022
noop_visit_mut_type!();

fn visit_mut_span(&mut self, span: &mut Span) {
if span.is_dummy() || self.is_first_span {
Copy link
Contributor Author

@dpchamps dpchamps Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When rearranging comments, we have to stop and think what we should do with the module / script span. A comment at the top could be dropped, could be preserved in place, or could be transported to the next available span. Example of this ambiguity:

On the one hand, I could have something like this:

/* Some doc block, this very clearly should remain at the top of the file */
type X = number;
...

Or I could have something like this:

/* istanbul ignore next-line */
type X = number;
const somethingFancy = (...x)  => x;

Because during transpilation, code such as the second example will be shifted down to support utility functions, I opted to solve this ambiguity by assuming that the comment must be moved to the next nearest existing span.

This way at the very minimum tooling is supported.

Copy link
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes looks fine, but I think we need some tests to ensure that we does not use comments of other modules.

Can you add some tests js test suite at https://github.com/swc-project/swc/tree/ce78344a7c3597559ef32b0ee727dc0c7512c492/node-swc/__tests__ ?

Also, please fix the clippy errors.

@kdy1 kdy1 marked this pull request as draft March 3, 2022 06:46
@dpchamps
Copy link
Contributor Author

dpchamps commented Mar 3, 2022

@kdy1 happy to add the tests, will start early tomorrow pst.

Could you help me understand what you mean by "use comments of other modules"?

Lets say I call transformSync(someSource, config) with config including preserveDroppedComments: true

I'd expect to see all comments within someSource to be preserved. Is there a case where there's potentially other comments that might get included from somewhere else? I'm having a hard time envisioning it, but I'm certain you're calling out an important edge case.

I'll also update the node-swc config types in the follow-up commit.

Thanks!

@kdy1
Copy link
Member

kdy1 commented Mar 3, 2022

SWC has a global storage for comments, and the storage is shared among files. And the storage uses BytePos to store/get comments.
The problem I'm worrying about is BytePos used incorrectly so the storage takes comments of other module, or stores comments in other module.

This may change in the future, but for now, we have to ensure that BytePos is used correctly.

@@ -1021,6 +1028,9 @@ pub struct JscConfig {

#[serde(default)]
pub lints: LintConfig,

#[serde(default)]
pub preserve_dropped_comments: bool,
Copy link
Member

@magic-akari magic-akari Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary?
I suppose swc should always preserve comments.
And It's ok to remove comments at the minify phase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because this patch tries to preserve comments on dropped nodes.

Btw, I think preserve_all_comments is better name. (Even if it does not preserve all comments)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@magic-akari I added this because it was requested here: #2964 (comment)

@kdy1 : agreed, see 3ce4af7

@dpchamps dpchamps force-pushed the issue-2964-dpchamps branch from 986242e to 8b971af Compare March 3, 2022 23:45
@dpchamps
Copy link
Contributor Author

dpchamps commented Mar 3, 2022

@kdy1 lmk if this is what you had in mind: https://github.com/swc-project/swc/blob/8b971afa46f9359b7b7ac8fa5d843b35f3de4dd0/node-swc/__tests__/preserve_comments.mjs

Clippy errors fixed, unmarking this as a draft and requesting another pass. thanks!

@dpchamps dpchamps marked this pull request as ready for review March 3, 2022 23:51
@kdy1
Copy link
Member

kdy1 commented Mar 3, 2022

Yes, that's what I meant.
Thank you!

Copy link
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the way to fix this, so I'll think about it more.

let mut leading_comments = Vec::new();
let mut trailing_comments = Vec::new();

for idx in (self.comment_position..=span.lo.0).map(BytePos) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this will be extremely slow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It removes whole advantage of storing comments in a shared hash map.

Copy link
Contributor Author

@dpchamps dpchamps Mar 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kdy1 Sorry, it was an extremely busy day for me. I'm just now sitting down and thinking about how to make this more this more performant.

Reviewing the discord convo from last night, your suggestion to use SingleThreadedComments is unclear to me. I don't see a way to depart from the SwcComments data structure that is used top-level in Compiler without replacing the compiler type, because this needs to be implemented as a pass baked into existing compiler.

Instead, I think we need to pull hashing operations out of the visitor entirely.

This seems to be a more appropriate way to solve this problem:

  • Collect all candidate spans with a visitor
  • iterate exactly once over the comments entries, shifting orphaned comments to candidate spans

I understand the reasons that iterating over the concurrent hashmaps is undesirable, but reducing that iteration to one time outside of the visitor sounds like an acceptable concession. Do you agree?

If not, could you provide some guidance on how exactly you intend single threaded comments to be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because this needs to be implemented as a pass baked into existing compiler

No, it should not and it does not have to.
It will used inside fn transform and it will be dropped once fn transform() is finished.

@kdy1 kdy1 marked this pull request as draft March 5, 2022 04:13
@dpchamps dpchamps force-pushed the issue-2964-dpchamps branch from 8b971af to 891a316 Compare March 9, 2022 04:55
@dpchamps
Copy link
Contributor Author

dpchamps commented Mar 9, 2022

@kdy1 addressed the performance concerns you had by reworking the approach to comment shifting. Also followed the general feedback from our discord chat.

The summary of the comment shifting refactor is:

  • Use SingleThreadedComments
  • Shift known comment positions to known spans only, no unnecessary checks

Otherwise, moved dropped_comments_preserver into the swc crate as per your suggestion.

Unmarking as draft and requesting another pass, thanks a ton!

@dpchamps dpchamps marked this pull request as ready for review March 9, 2022 04:59
@dpchamps dpchamps force-pushed the issue-2964-dpchamps branch from 891a316 to bcd52f0 Compare March 9, 2022 05:00
Copy link
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

let (comments_to_move, next_byte_positions): (CommentEntries, CommentEntries) =
existing_comments
.drain(..)
.partition(|(bp, _)| bp <= &span.lo);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use *bp <= span.lo instead

/// comments.

pub fn dropped_comments_preserver(
comments: Option<&SingleThreadedComments>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need a lifetime here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are cheap to clone, and it's part of the api.

/// similar location
fn shift_trailing_comments(&mut self, remaining_comment_entries: CommentEntries) {
let last_trailing =
self.known_spans.iter().fold(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use .iter().copied() instead.
Span is Copy, so we don't need indirection here.

@dpchamps
Copy link
Contributor Author

Feedback addressed 7b3d8e3

Copy link
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!


swc-bump:

  • swc --breaking

@kdy1 kdy1 added this to the v1.2.153 milestone Mar 11, 2022
@kdy1 kdy1 enabled auto-merge (squash) March 11, 2022 07:06
@kdy1 kdy1 merged commit c5a0c9a into swc-project:main Mar 11, 2022
dpchamps added a commit to dpchamps/website that referenced this pull request Mar 11, 2022
Docs update following swc-project/swc#3815

Will be valid post `v1.2.153` milestone release
kdy1 pushed a commit to swc-project/website that referenced this pull request Mar 12, 2022
Docs update following swc-project/swc#3815

Will be valid post `v1.2.153` milestone release
bestlucky0825 pushed a commit to bestlucky0825/swc-website that referenced this pull request Jun 1, 2022
Docs update following swc-project/swc#3815

Will be valid post `v1.2.153` milestone release
@swc-project swc-project locked as resolved and limited conversation to collaborators Oct 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Comments on types are stripped
3 participants