Skip to content
This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

feat(rome_css_parser): CSS lexer #4682 #4684

Merged
merged 2 commits into from
Jul 11, 2023
Merged

Conversation

denbezrukov
Copy link
Contributor

@denbezrukov denbezrukov commented Jul 10, 2023

@netlify
Copy link

netlify bot commented Jul 10, 2023

Deploy Preview for docs-rometools canceled.

Name Link
🔨 Latest commit 94b7174
🔍 Latest deploy log https://app.netlify.com/sites/docs-rometools/deploys/64ad0ae8d9a1a40007244a2c

@github-actions github-actions bot added A-Parser Area: parser A-Tooling Area: our own build, development, and release tooling labels Jul 10, 2023
@github-actions
Copy link

Parser conformance results on ubuntu-latest

js/262

Test result main count This PR count Difference
Total 48863 48863 0
Passed 47810 47810 0
Failed 1053 1053 0
Panics 0 0 0
Coverage 97.84% 97.84% 0.00%

jsx/babel

Test result main count This PR count Difference
Total 40 40 0
Passed 37 37 0
Failed 3 3 0
Panics 0 0 0
Coverage 92.50% 92.50% 0.00%

symbols/microsoft

Test result main count This PR count Difference
Total 6212 6212 0
Passed 1764 1764 0
Failed 4448 4448 0
Panics 0 0 0
Coverage 28.40% 28.40% 0.00%

ts/babel

Test result main count This PR count Difference
Total 639 639 0
Passed 573 573 0
Failed 66 66 0
Panics 0 0 0
Coverage 89.67% 89.67% 0.00%

ts/microsoft

Test result main count This PR count Difference
Total 17224 17224 0
Passed 13121 13121 0
Failed 4103 4103 0
Panics 0 0 0
Coverage 76.18% 76.18% 0.00%

// Interpret the hex digits as a hexadecimal number. If this number is zero, or
// is for a surrogate, or is greater than the maximum allowed code point, return
// U+FFFD REPLACEMENT CHARACTER (�).
let hex = match hex {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we need to convert the escaped sequence to a char. swc uses two buffers for this: raw and buf.
https://github.com/swc-project/swc/blob/e4f9f734ad1f92c6f05e8cb4c2d799679cca9f79/crates/swc_css_parser/src/lexer/mod.rs#L692-L723

I'm not sure if we can implement this with our CST, but we can potentially implement such logic at the syntax level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, in CSS we need two values 🤔 we can think about it later, as you suggested

COMMENT
}
}
Some(b'/') => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// comments are forbidden in CSS, but I prefer to keep it.
We can implement an option to turn on/off such comments.

@denbezrukov denbezrukov force-pushed the feat/css-lexer-string branch from 5771f83 to f71436e Compare July 10, 2023 20:25
@denbezrukov denbezrukov force-pushed the feat/css-lexer-string branch from f71436e to ae13e2a Compare July 10, 2023 20:32
Copy link
Contributor

@ematipico ematipico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A great start! Thank you @denbezrukov

Comment on lines 24 to 25
quickcheck = "1.0.3"
quickcheck_macros = "1.0.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point we could move this crate in the workspace, what do you think?

/// Lexes the next token.
///
/// ## Return
/// Returns its kind and any potential error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns its kind and any potential error.
/// Returns its kind

It doesn't seem to turn any error

// Interpret the hex digits as a hexadecimal number. If this number is zero, or
// is for a surrogate, or is greater than the maximum allowed code point, return
// U+FFFD REPLACEMENT CHARACTER (�).
let hex = match hex {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, in CSS we need two values 🤔 we can think about it later, as you suggested

@github-actions github-actions bot added A-Core Area: core A-Formatter Area: formatter labels Jul 11, 2023
@denbezrukov denbezrukov merged commit 18b6a2a into main Jul 11, 2023
@denbezrukov denbezrukov deleted the feat/css-lexer-string branch July 11, 2023 08:13
@DeadlyMissile
Copy link

Is there a reason why you are creating a css lexer from scratch? Why not use one that exists already? This feels like ALOT of work

@ematipico
Copy link
Contributor

Is there a reason why you are creating a css lexer from scratch? Why not use one that exists already? This feels like ALOT of work

The lexers that already exists are not suitable for CSTs, and tree-sitter APIs are very poor

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Core Area: core A-Formatter Area: formatter A-Parser Area: parser A-Tooling Area: our own build, development, and release tooling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants