Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse Style from (and serialize to?) CSS syntax #440

Open
SimonSapin opened this issue Apr 14, 2023 · 14 comments · May be fixed by #460
Open

Parse Style from (and serialize to?) CSS syntax #440

SimonSapin opened this issue Apr 14, 2023 · 14 comments · May be fixed by #460
Labels
enhancement New feature or request

Comments

@SimonSapin
Copy link
Contributor

SimonSapin commented Apr 14, 2023

This is a roadmap item (#345) that I’m interesting in working on. (Though I make no promise in terms of timeline or my future availability.) Below is an opinionated proposal of how this could be done.

User-visible changes

The taffy crate gains a new Cargo feature named css, disabled by default. Enabling it:

  • Adds a dependency from taffy to https://crates.io/crates/cssparser
  • Adds a dependency from taffy to taffy-css-derive, a new proc macro crate that lives in the same repository. Since many Rust types are involved, manually implementing everything without proc-macros would be quite repetitive and prone copy-paste errors. This crate is similar to Servo’s style_derive. It depends on syn and quote.
  • Adds these APIs to taffy::style:
    impl Style {
        pub fn parse_css(input: &str) -> (Self, Vec<CssParseError>) {..}
        pub fn serialize_css(&self) -> String {..} // maybe?
    }
    
    #[derive(Clone, Debug)]
    pub struct CssParseError {..}
    impl Display for CssParseError {..}
    impl Error for CssParseError {..}

Specifications

As an "input" of layout, taffy::style::Style is the internal representation of computed values for a given element. Style::DEFAULT is effectively initial values.

Style::parse_css returns computed values as if the input string was a list of declarations from an HTML style attribute (so with syntax like display: flex; margin: 10%) and as if there were no other applicable declarations and no parent element for inheritance. Parsing is infallible: a Style struct is always returned. Unspecified properties get their initial value. Parse errors (reported on the side) cause parts of the input to be ignored until the parser can recover. (For a declaration list, recovery is at the next semicolon.) Parse errors include invalid declaration syntax, unsupported property, invalid/unsupported value, etc.

Relative length units (font-relative like 1.5em or viewport-relative like 80vw) are not supported. (Per the above, using them cause the declaration to be ignored and to emit a parse error.)

Style::serialize_css is similar to serialize a CSS declaration block. It omits properties that have their initial values (so Style::DEFAULT.serialize_css() is the empty string). It round-trips without errors: Style::parse_css(x.serialize_css()) == (x, vec![]) for any x: Style

Testing

Use snapshot testing with https://insta.rs/ to check the derive(Debug) output for various CSS inputs

Open API and behavior questions

  • API naming. Feel free to propose any change. (from_css / to_css, parse / impl Display, …)

  • Should serialization to CSS syntax be included? It’s relatively easy to do while also doing parsing, but maybe not as useful.

  • How are absolute length units anchored? CSS has seven of them. For historical reasons they all have a fixed ratio to each other: 1in is 96px is 72pt is 6pc is 2.54cm is 25.4mm is 101.6Q. See reference pixel discussion.

    Taffy documents Dimension::Points as: “Points are abstract absolute units. Users of Taffy may define what they correspond to in their application (pixels, logical pixels, mm, etc) as they see fit.” The enum variant name suggests 100pt should parse as Dimension::Points(100_f32), but on the web the px unit is much more common and getting Dimension(75_f32) from 100px might be unexpected. I don’t really like this idea but maybe to force users to consider units, all units except pt could be artificially unsupported?

    (By the way, I didn’t find this explicitly documented but I assume Dimension::Points(100_f32) maps to 100_f32 in taffy::layout::Layout?)

    Or there could be a configurable "DPI" or "device pixel ratio". A taffy::node::Taffy method seems like the natural place for it (next to enable_rounding / disable_rounding), but would require either Taffy::parse_css() or Style::parse_css(&Taffy) instead of Style::parse_css().

Open licensing questions

cssparser is licensed under MPL-2.0. Is that OK for a dependency?

Similarly, would it be OK for taffy-css-derive to be licensed under MPL-2.0? This would allow copying some code from style_derive. Alternatively taffy-css-derive could be entirely new code, MIT-licensed like taffy, obviously at the cost of more time and effort.

Disclaimer: although it’s been years since I’ve looked at it closely, I’ve contributed to style_derive in the past.

@SimonSapin SimonSapin added the enhancement New feature or request label Apr 14, 2023
@nicoburns
Copy link
Collaborator

Units

Taffy's Dimension::Points is best thought of as pixels. I believe every consumer of Taffy is using it as such. However, there is a slight complication in that it could be interpreted as either logical pixels (as in CSS) or physical device pixels. At least bevy is passing Taffy physical device pixels. And Taffy will optionally round the final layout to the nearest pixel which makes more sense to do with physical pixels.

I would design your code to parse px to Taffy points, and either use the releveant ratio for other absolute units or only support px/%. I definitely agree that vw, em, etc shouldn't be supported. Prior art has added a second "advanced" API that takes a configuration struct (in that case for named border widths). That could be a good place to add a device_pixel_ratio. I feel like it'd be ok to have it set separate for parsing and Taffy. We could always add a Taffy::parse_css method that does passes it automatically.

I didn’t find this explicitly documented but I assume Dimension::Points(100_f32) maps to 100_f32 in taffy::layout::Layout

Yes, correct.

Serialization

I'm definitely keen to see serialization support. Use cases:

  • Debug tooling
  • As an actual serialization format. CSS is a well defined format. There's no reason not to use it.

I think both parsing and serialization are useful on their own, and I'd be happy to merge either both at once or one without the other.

API

You mention a derive macro. Presumably the idea is that this would be used on the Style struct (and perhaps other style types?) And perhaps also that users would be able to use it on their own style types? Could you perhaps give some indication of how that would work / example usage?

It would be cool if it could be made to be somewhat extensible, as I know a lot of Taffy's users are working with systems that also use non-layout related CSS properties for various purposes. But I also wouldn't consider this necessary.

Licensing

We prefer MIT, but I think MPL is fine (but CC @alice-i-cecile). My understanding is that MPL can be used in proprietary code, and as cssparser is already MPL so it seems like it would be hard to avoid. I'd ideally like to keep any MPL code in it's own crate so as to keep the licensing situation clear for users.

Prior art

  • I have an incomplete PR for this, which uses lightningcss (which builds on top of cssparser) for parsing. That was paused partly because of concerns around binary size of WASM builds (I'm not sure how much of that is due to cssparser and how much comes from lightningcss). However: on reflection I'm not sure how much we need to care about this as I'd anticipate WASM Taffy being used primarily in a node/deno context rather than a web context (web already has CSS layout!). It'd still be good to at least somewhat keep this in mind though.
  • This was based on code from dioxus-native-core
  • This branch contains a parser based on swc_css (used in the context of our test runner). This one was abandoned as swc_css is nightly-only and the authors are not willing to change that.

Note: I'm not actively working on any of these, so very happy for you to take this on.

@nicoburns
Copy link
Collaborator

I quick experiment shows that lightningcss balloons wasm binary size from ~300kb to ~2.1mb, even with default features disabled. cssparser by itself seems to have no affect (although obviously that will increase once actual parsing code is implemented).

@SimonSapin
Copy link
Contributor Author

It sounds like there should a configurable pixel ratio of Taffy length units per CSS px that defaults to 1. To enable that, should Style::parse_css(&str) -> … be Taffy::parse_style(&self, &str) -> … instead? Something else?

To avoid confusion with CSS pt, what do you think of renaming the Points variant (of enums that have one) to Length?

I was thinking that derive macro(s) would be for internal use only and implement private trait(s). If there’s interest for a making it a public API that could also be considered, but be mindful that having &mut cssparser::Parser in public trait method parameter makes cssparser a public dependency so its breaking changes become Taffy breaking changes.

Sketch of a trait to be derived:

trait CssValue {
    fn parse(parser: &mut cssparser::Parser) -> Result<Self,>;
    fn serialize(&self, dest: &mut impl std::fmt::Write) -> std::fmt::Result;
}

If f32 values representing a length can be replaced by a struct Length(pub f32); newtype, the macro could then interpret field-less enum variants as keywords (automatically generating the CSS kebab-case from the Rust CamelCase variant name) and forward calls on single-field variants to the field type. And maybe other conventions like this.

If taffy-css-derive is MPL2 then the crate boundary (which is necessary anyway for proc macros) would be the licensing boundary.

@SimonSapin
Copy link
Contributor Author

It looks like lightningcss aims to implement support for "everything". This is fine for things like a minifier, but when implementing CSS layout like Taffy does the spec requires that unsupported properties and values should not be parsed: https://drafts.csswg.org/css-syntax-3/#w3c-partial

This is why cssparser ends up with this trait-heavy API that lets its users define value parsing.

@SimonSapin
Copy link
Contributor Author

Oh. Shorthand v.s. longhand properties is nothing insurmountable but a headache I’d forgotten about 😅

@nicoburns
Copy link
Collaborator

when implementing CSS layout like Taffy does the spec requires that unsupported properties and values should not be parsed: https://drafts.csswg.org/css-syntax-3/#w3c-partial
This is why cssparser ends up with this trait-heavy API that lets its users define value parsing.

Couldn't you produce an equivalent result (and therefore spec compliance) by parsing everything and then later (before returning from the parse function) throwing away values you don't understand?

In any case, the fact that lightningcss includes support for every CSS property, but doesn't allow you to compile only a subset seems like a big strike against it. It seems pretty ridiculous to include a LUT for named color values when Taffy doesn't support colours for example!

@nicoburns
Copy link
Collaborator

I'm somewhat questioning the value of a derive macro here. We only support ~20 properties, and even fewer of those are enums of keywords. It seems to me that it would be pretty easy to just handwrite parse/serialize function for those. And this would presumably be better for compile times. I guess a derive macro would make adding new properties easier though, so 🤷

I believe that properties that have a value of:

  • LengthPercentage
  • LengthPercentageAuto
  • Dimension
  • MinTrackSizingFunction
  • MaxTrackSizingFunction

could be implemented using a single parsing function for MaxTrackSizingFunction + TryInto impls from MaxTrackSizingFunction to the other types. That would leave:

  • aspect-ratio
  • `grid-{row,column}-{start,end}
  • `grid-{auto,template}-{rows,columns}

needing parsers. I think parsing the css function syntax for grid-template-columns is going to be by far the most complex thing here.

On the topic of grid styles, we currently don't support:

  • Named grid lines
  • Named grid areas, grid-template-areas or grid-template syntax

I'm not sure how that would interact with not parsing unsupported features, but I thought it bore mentioned.

@Weibye
Copy link
Collaborator

Weibye commented Apr 15, 2023

Just for context, over the last few days I started exploring parsing / validating html in taffy. My goal are the following:

  1. Validate the contents of our test fixtures. This would be for development and testing.
  2. Parse html into a taffy node-tree. This would be as a user facing feature.

My findings so far is that parsing html into a taffy node tree is not very useful unless we can also parse the accompanying css into Style. I will try to get things into a working state then leave it as a draft that can be pulled in once css parsing is up and running.

@Weibye
Copy link
Collaborator

Weibye commented Apr 15, 2023

If you take a look at #422, we found that we need to be able to control which elements are supported or not. Any parsing or validation we do in taffy needs to correspond tightly with whatever html or css properties we support in taffy.

@SimonSapin
Copy link
Contributor Author

FWIW I wrote this issue with the scope in mind of only parsing "a list of CSS declaration" for creating a single taffy::style::Style struct. In HTML terms, this would help support style="…" attributes but not <style> elements.

If you want to parse entire stylesheets together with entire HTML documents, presumably you also to apply selectors and run the whole CSS cascade, inheritance, etc. This is a lot more involved than just parsing. In my opinion this would add an order of magnitude to the scope of the entire Taffy project, especially if you want it to be fast and support incremental updates.

@nicoburns
Copy link
Collaborator

nicoburns commented Apr 16, 2023

@SimonSapin So I believe @Weibye's primary objective here is to write a lint script for our test fixtures as we've a had a couple of cases of silly typos which caused a style not to apply. For this purpose a declaration list parser (and specifically the ability to parse inline styles from a style"..." attribute should be sufficient. We will also need basic HTML parsing (but I believe @Weibye is intending to implement that part themselves). Resolving the CSS cascade or any kind of updates (incremental or otherwise) will not be required.

Our test fixtures do use an external CSS file, however it's tiny (68 lines) and the one file is shared across all test fixtures so it's pretty easy to audit manually. The actual test cases are just inline CSS, but there are a lot of them so automated checking would be quite helpful.

@Weibye
Copy link
Collaborator

Weibye commented Apr 17, 2023

Yes, the primary idea was to solve validation of our test fixtures.

If you want to parse entire stylesheets together with entire HTML documents, presumably you also to apply selectors and run the whole CSS cascade, inheritance, etc. This is a lot more involved than just parsing. In my opinion this would add an order of magnitude to the scope of the entire Taffy project, especially if you want it to be fast and support incremental updates.

The secondary "goal" is to be able to parse html + css into a fully featured taffy tree, but as you point out that is complicated and perhaps not something worth pursuing at this time. It might also be that this functionality is better suited to a separate crate that depend on taffy, if this ever is implemented.

@SimonSapin
Copy link
Contributor Author

I'm somewhat questioning the value of a derive macro here.

Alright, I’ll start without and see how it goes.

Note however that cssparser already has proc-macros that use syn and quote, so that "compile time tax" is already paid either way when enabling the new taffy/css feature flag.

SimonSapin added a commit to SimonSapin/taffy that referenced this issue Apr 19, 2023
The old name for this one-dimensional length:

* Suggests PostScript points, but is documented as an abstract unit:
  “Users of Taffy may define what they correspond to in their application
   (pixels, logical pixels, mm, etc) as they see fit.”

* Is spelled similar to the two-dimensional `Point` though its meaning is unrelated.

* In CSS syntax, `1px = 0.75pt` but the former is much more commonly used.
  When [parsing CSS into a `Style`](DioxusLabs#440)
  we’ll probably want the default mapping to be `1px` => 1f32 abstract unit,
  so naming the abstract unit "points" would be confusing.
SimonSapin added a commit to SimonSapin/taffy that referenced this issue Apr 19, 2023
⚠️ This is a fairly disruptive breaking changes,
although the fix for users is mostly search-and-replace.

The old name for this one-dimensional length:

* Suggests PostScript points, but is documented as an abstract unit:
  “Users of Taffy may define what they correspond to in their application
   (pixels, logical pixels, mm, etc) as they see fit.”

* Is spelled similar to the two-dimensional `Point` though its meaning is unrelated.

* In CSS syntax, `1px = 0.75pt` but the former is much more commonly used.
  When [parsing CSS into a `Style`](DioxusLabs#440)
  we’ll probably want the default mapping to be `1px` => 1f32 abstract unit,
  so naming the abstract unit "points" would be confusing.
SimonSapin added a commit to SimonSapin/taffy that referenced this issue Apr 19, 2023
⚠️ This is a fairly disruptive breaking changes,
although the fix for users is mostly search-and-replace.

The old name for this one-dimensional length:

* Suggests PostScript points, but is documented as an abstract unit:
  “Users of Taffy may define what they correspond to in their application
   (pixels, logical pixels, mm, etc) as they see fit.”

* Is spelled similar to the two-dimensional `Point` though its meaning is unrelated.

* In CSS syntax, `1px = 0.75pt` but the former is much more commonly used.
  When [parsing CSS into a `Style`](DioxusLabs#440)
  we’ll probably want the default mapping to be `1px` => 1f32 abstract unit,
  so naming the abstract unit "points" would be confusing.
SimonSapin added a commit to SimonSapin/taffy that referenced this issue Apr 21, 2023
⚠️ This is a fairly disruptive breaking changes,
although the fix for users is mostly search-and-replace.

The old name for this one-dimensional length:

* Suggests PostScript points, but is documented as an abstract unit:
  “Users of Taffy may define what they correspond to in their application
   (pixels, logical pixels, mm, etc) as they see fit.”

* Is spelled similar to the two-dimensional `Point` though its meaning is unrelated.

* In CSS syntax, `1px = 0.75pt` but the former is much more commonly used.
  When [parsing CSS into a `Style`](DioxusLabs#440)
  we’ll probably want the default mapping to be `1px` => 1f32 abstract unit,
  so naming the abstract unit "points" would be confusing.
nicoburns pushed a commit that referenced this issue Apr 21, 2023
⚠️ This is a fairly disruptive breaking changes,
although the fix for users is mostly search-and-replace.

The old name for this one-dimensional length:

* Suggests PostScript points, but is documented as an abstract unit:
  “Users of Taffy may define what they correspond to in their application
   (pixels, logical pixels, mm, etc) as they see fit.”

* Is spelled similar to the two-dimensional `Point` though its meaning is unrelated.

* In CSS syntax, `1px = 0.75pt` but the former is much more commonly used.
  When [parsing CSS into a `Style`](#440)
  we’ll probably want the default mapping to be `1px` => 1f32 abstract unit,
  so naming the abstract unit "points" would be confusing.
@nicoburns nicoburns mentioned this issue Apr 24, 2023
37 tasks
@devongovett
Copy link

devongovett commented Apr 30, 2023

hey, I'm the maintainer of lightningcss and stumbled across this issue. If there's something I can do to help here please let me know. Adding more feature flags is totally an option, for example. Parsing all the CSS properties/values/rules is a lot of work so if there's something that makes sense to reuse that might be good. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

4 participants