fix(linter/plugins): handle utf16 characters within comment spans #14768

lilnasy · 2025-10-19T04:13:19Z

Part of #14564. Corrects comment start and end offsets when the source contains multi-byte characters. The conversion from UTF-8 offsets to UTF-16 offsets takes place on the Rust side for performance.
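To illustrate the offset arithmetic involved, here is a minimal TypeScript sketch. It is not the code in this PR (the real conversion is done in Rust), and `utf8ToUtf16Offset` is a hypothetical helper name chosen for this example. It assumes the byte offset falls on a character boundary.

```typescript
// Hypothetical helper, not part of oxc: maps a UTF-8 byte offset into
// `source` to the equivalent UTF-16 code-unit offset (what JS strings use).
function utf8ToUtf16Offset(source: string, utf8Offset: number): number {
  let bytes = 0; // UTF-8 bytes consumed so far
  let units = 0; // UTF-16 code units consumed so far
  // Iterating a string with for...of yields whole code points.
  for (const ch of source) {
    if (bytes >= utf8Offset) break;
    const cp = ch.codePointAt(0)!;
    // UTF-8 encoded length of this code point.
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    // 1 code unit for BMP characters, 2 for surrogate pairs (e.g. emoji).
    units += ch.length;
  }
  return units;
}

// "x" sits at UTF-8 byte offset 10 but at UTF-16 offset 7.
const exampleSource = "// é🎉\nx";
const xOffset = utf8ToUtf16Offset(exampleSource, 10);
console.log(xOffset, exampleSource[xOffset]);
```

A linear scan like this is fine for a one-off lookup; converting many spans in one file would normally reuse a single pass over the source, which is presumably part of why the PR does the work on the Rust side.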

graphite-app · 2025-10-19T04:13:25Z

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

0-merge - adds this PR to the back of the merge queue
hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

codspeed-hq · 2025-10-19T04:19:36Z

CodSpeed Performance Report

Merging #14768 will not alter performance

Comparing lilnasy:fix/linter/plugins/utf16 (baf7192) with main (c6395c7)¹

Summary

✅ 4 untouched
⏩ 33 skipped²

¹ No successful run was found on main (cd266b4) during the generation of this report, so c6395c7 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
² 33 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copilot

Pull Request Overview

This PR fixes comment span offsets in the linter plugin system for sources containing non-ASCII characters. When source code contains characters that occupy more than one byte in UTF-8 (such as emojis or non-Latin scripts), comment span offsets need to be converted from UTF-8 byte offsets to UTF-16 code-unit offsets to ensure accurate positioning on the JS side.

Key Changes:

Added UTF-16 conversion for comment spans in the linter
Introduced comprehensive test coverage for Unicode characters in comments

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
crates/oxc_linter/src/lib.rs	Added call to convert comment spans to UTF-16 offsets
apps/oxlint/test/fixtures/unicode-comments/plugin.ts	New test plugin that reports all comments with their types and values
apps/oxlint/test/fixtures/unicode-comments/output.snap.md	Expected snapshot output showing correctly extracted comments with Unicode characters
apps/oxlint/test/fixtures/unicode-comments/files/unicode-comments.js	Test fixture file containing various Unicode characters in comments (emojis, Chinese, Greek, Hebrew, etc.)
apps/oxlint/test/fixtures/unicode-comments/.oxlintrc.json	Configuration file enabling the unicode-comments test plugin
apps/oxlint/test/e2e.test.ts	Added end-to-end test case for UTF-16 character handling in comments

overlookmotel

Great stuff!

All looks correct, so merging. But I'd suggest as follow-up to expand the tests:

Add start and end of each comment to snapshot.
Add range to snapshot (or assert comment.range[0] === comment.start && comment.range[1] === comment.end for all comments).
Add loc to snapshot for each comment, and eyeball that they look correct.

It might also be good to call context.report() for each comment individually, passing the Comment as node property. That'd:

Test the translation of offsets back to UTF-8 when reporting errors (that happens on Rust side).
Make the snapshot more readable and easier to eyeball for errors.
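The per-comment reporting suggested above could look roughly like the following sketch. The types are declared locally and are assumed to mirror the ESLint-style rule API (`context.sourceCode.getAllComments()`, `context.report({ node, message })`); they are not copied from oxlint, and `reportEachComment` is a hypothetical rule name.

```typescript
// Minimal local types, ASSUMED to mirror the ESLint-style rule API.
// They are illustrative only and not verified against oxlint's own types.
interface PluginComment {
  type: "Line" | "Block";
  value: string;
  start: number;
  end: number;
  range: [number, number];
}

interface ReportDescriptor {
  message: string;
  node: PluginComment;
}

interface RuleContext {
  sourceCode: { getAllComments(): PluginComment[] };
  report(descriptor: ReportDescriptor): void;
}

// Hypothetical test rule: emits one diagnostic per comment, passing the
// comment itself as `node` so its offsets drive the reported location.
const reportEachComment = {
  create(context: RuleContext) {
    return {
      Program(): void {
        for (const comment of context.sourceCode.getAllComments()) {
          context.report({
            message: `${comment.type} comment [${comment.start}, ${comment.end}]: ${JSON.stringify(comment.value)}`,
            node: comment,
          });
        }
      },
    };
  },
};
```

With one diagnostic per comment, each snapshot entry would carry its own location, so an offset that is wrong in either direction should show up as a misplaced label rather than being hidden inside one combined message.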

lilnasy · 2025-10-19T11:59:04Z

@overlookmotel I left out testing start and end explicitly for brevity. The comments' values being correct implies the offsets were correct when used to slice sourceText. I can add the assertions and loc right away!
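As a rough sketch of the checks being discussed, the snippet below verifies that slicing the source text with a comment's offsets reproduces the comment, and that `range` agrees with `start` and `end`. `CommentLike` and `checkComment` are hypothetical names for this example. It assumes the ESTree convention that `value` excludes the `//` or `/* */` delimiters.

```typescript
// Hypothetical shape for illustration; assumes `value` excludes delimiters.
interface CommentLike {
  type: "Line" | "Block";
  value: string;
  start: number;
  end: number;
  range: [number, number];
}

// Returns a list of problems; an empty list means the comment is consistent.
function checkComment(sourceText: string, c: CommentLike): string[] {
  const problems: string[] = [];
  if (c.range[0] !== c.start || c.range[1] !== c.end) {
    problems.push("range does not match start/end");
  }
  // Rebuild the full comment text from its type and value.
  const expected = c.type === "Line" ? `//${c.value}` : `/*${c.value}*/`;
  // JS strings are indexed in UTF-16 code units, so this only matches
  // when start/end are UTF-16 offsets rather than UTF-8 byte offsets.
  const actual = sourceText.slice(c.start, c.end);
  if (actual !== expected) {
    problems.push(
      `slice mismatch: expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`,
    );
  }
  return problems;
}

const text = "/* 🎉 */ x // é";
// UTF-16 offsets: the emoji counts as 2 code units, so the block ends at 8.
console.log(checkComment(text, { type: "Block", value: " 🎉 ", start: 0, end: 8, range: [0, 8] }));
// UTF-8 byte offsets: the emoji counts as 4 bytes, so the end would be 10.
console.log(checkComment(text, { type: "Block", value: " 🎉 ", start: 0, end: 10, range: [0, 10] }));
```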

github-actions bot added the A-linter (Area - Linter), A-cli (Area - CLI), and C-bug (Category - Bug) labels Oct 19, 2025

fix(linter/plugins): handle utf16 characters within comment spans

baf7192

lilnasy force-pushed the fix/linter/plugins/utf16 branch from ad44a8d to baf7192 October 19, 2025 04:20

lilnasy marked this pull request as ready for review October 19, 2025 04:29

lilnasy requested a review from camc314 as a code owner October 19, 2025 04:29

Copilot AI review requested due to automatic review settings October 19, 2025 04:29

Copilot AI reviewed Oct 19, 2025

lilnasy mentioned this pull request Oct 19, 2025

Linter plugins: Add comments-related APIs #14564

Closed

overlookmotel approved these changes Oct 19, 2025

overlookmotel merged commit 78ee7b8 into oxc-project:main Oct 19, 2025
21 checks passed

lilnasy deleted the fix/linter/plugins/utf16 branch October 19, 2025 11:59

Boshen mentioned this pull request Oct 22, 2025

release(oxlint): v1.24.0 #14893

Merged
