CSS content property and unicode escape sequences #399

Dakror · 2022-12-28T12:23:15Z

This PR adds the content property in a very simple first version. Only text literals are supported, and they are instantiated as inner RML. Unicode unescaping according to the CSS spec is added as well.
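For readers unfamiliar with the CSS escape rules, here is a minimal sketch of what such unescaping could look like. It is not the code from this PR: `UnescapeCss`, `AppendUtf8` and `HexValue` are hypothetical names, and the sketch only covers the core rule from the CSS Syntax spec (a backslash followed by one to six hex digits and an optional single whitespace character, with invalid code points replaced by U+FFFD). Line continuations inside strings are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: append the code point `cp` to `out` as UTF-8.
static void AppendUtf8(std::string& out, char32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Returns the value of a hex digit, or -1 if `c` is not a hex digit.
static int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical sketch: resolve CSS backslash escapes in `in`, returning UTF-8.
std::string UnescapeCss(const std::string& in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] != '\\') {
            out += in[i++];
            continue;
        }
        ++i; // Skip the backslash.
        if (i == in.size()) {
            AppendUtf8(out, 0xFFFD); // Backslash at end of input.
            break;
        }
        if (HexValue(in[i]) < 0) {
            out += in[i++]; // Non-hex escape such as \" yields the character itself.
            continue;
        }
        // Consume between one and six hex digits.
        char32_t cp = 0;
        int digits = 0;
        while (i < in.size() && digits < 6 && HexValue(in[i]) >= 0) {
            cp = cp * 16 + static_cast<char32_t>(HexValue(in[i]));
            ++i;
            ++digits;
        }
        // A single whitespace character terminating the escape is swallowed.
        if (i < in.size() && (in[i] == ' ' || in[i] == '\t' || in[i] == '\n'))
            ++i;
        // Zero, surrogates and out-of-range values become U+FFFD.
        if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;
        AppendUtf8(out, cp);
    }
    return out;
}
```

With this sketch, an icon-font rule value such as `\f015` becomes the three UTF-8 bytes of U+F015, and `\41 b` becomes `Ab` because the space only terminates the escape.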

I wanted this feature so that I can use icon web fonts out of the box, using their stylesheets with as little modification as possible (I think we currently don't support ::before, do we?).

After implementing the backslash escaping, however, I noticed that backslashes are currently handled like Windows path delimiters, which of course clashes with the escaping syntax. As a result, the current escaping tests fail.

So maybe we need to introduce a spec-diverging escape sequence like \u or something. But that would still cause unwanted behavior in the rare cases where a Windows folder name starts with a u.

Dakror · 2022-12-28T17:19:00Z

Speaking of my desired use case: now that I have the (giant) stylesheet loaded, I noticed that startup takes significantly longer. Measuring it shows that the StyleSheetNode CompoundSelector equality check takes a long time, which makes sense considering that all nodes lie on a single level of the hierarchy, with different class names.

@mikke89 Do you know how other engines deal with this? I've tried googling possible data structures or approaches for hashing the compound selector into some non-string-based form that would be easier to compare. Just plainly hashing the compound selector components would probably be very collision-prone.

mikke89 · 2022-12-30T16:00:23Z

Thanks for the PR!

From what I can tell, it seems like CSS doesn't actually support the content property for normal elements, except to add images. So this would be a non-standard use of it, which isn't ideal for a new property. And I don't think it's supposed to affect the DOM like it does here. We don't currently support ::before and ::after, which would be the proper place for generated content, but I do have some larger plans for improving the node tree, which would serve as a foundation for supporting exactly these kinds of features.

I wonder if we really need the unicode escape support? Our assumption is that users have complete control over their assets and environment, so it's really just a matter of storing the style sheets or documents as UTF-8? I think the backward incompatibility is an issue.

By the way, we already have a UTF8-encoder in StringUtilities.

We've made a lot of effort in optimizing the element style rule selection, see a detailed discussion here: #293. However, we've probably made less effort for the style sheet merging, which seems to be the case here. I guess it's just because I haven't really seen this pop up in any benchmarks before.

I did some more benchmarking, and I thought at least it would appear in the ElementDocument benchmark. But for me it is only around 3%. Perhaps you could design a benchmark better suited for this situation?
I guess we might get far by some hashing? At least if there are a lot of false negatives (calls to operator== that return false), perhaps measure that first.
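As a rough illustration of the hashing idea, the sketch below caches a hash per selector and uses it only to reject unequal selectors quickly. Matching hashes still fall through to the full comparison, so hash collisions cost time but never produce wrong results. The `CompoundSelector` struct here is a simplified, hypothetical stand-in and not RmlUi's actual type; `MakeSelector`, `ComputeHash` and `HashCombine` are likewise made up for the example, and class names are assumed to be kept sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a compound selector.
struct CompoundSelector {
    std::string tag;
    std::string id;
    std::vector<std::string> class_names; // Assumed to be kept sorted.
    std::size_t hash = 0;                 // Cached once at construction.
};

// Mix the hash of `value` into `seed` (boost-style hash combine).
static void HashCombine(std::size_t& seed, const std::string& value)
{
    seed ^= std::hash<std::string>{}(value) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

static std::size_t ComputeHash(const CompoundSelector& s)
{
    // Sorted class names make the hash independent of declaration order.
    assert(std::is_sorted(s.class_names.begin(), s.class_names.end()));
    std::size_t seed = 0;
    HashCombine(seed, s.tag);
    HashCombine(seed, s.id);
    for (const std::string& name : s.class_names)
        HashCombine(seed, name);
    return seed;
}

CompoundSelector MakeSelector(std::string tag, std::string id, std::vector<std::string> class_names)
{
    std::sort(class_names.begin(), class_names.end());
    CompoundSelector s;
    s.tag = std::move(tag);
    s.id = std::move(id);
    s.class_names = std::move(class_names);
    s.hash = ComputeHash(s);
    return s;
}

bool operator==(const CompoundSelector& a, const CompoundSelector& b)
{
    // Fast reject: different hashes imply different selectors.
    if (a.hash != b.hash)
        return false;
    // Equal hashes may still be a collision, so compare the actual contents.
    return a.tag == b.tag && a.id == b.id && a.class_names == b.class_names;
}
```

Whether this helps in practice depends on how many comparisons actually return false, which is why measuring that ratio first seems worthwhile.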

It's hard to find detailed, good references for how other browsers deal with these things, but at least here's a pretty good reference for how they apply style rules in general. Otherwise, you'll probably have to dive into their source code.

Dakror · 2022-12-30T16:19:41Z

Oh damn, I completely overlooked the fact that content does not work for real DOM elements.
I'm definitely advocating for some form of Unicode unescaping, because from a development perspective, pasting raw UTF-8 characters for an icon-font glyph into a document is a worse experience than using a human-readable escape sequence that represents the code point.

Yes, the performance issue occurs at the stage of merging the style sheet and upon media-query re-evaluation.

Dakror · 2022-12-30T16:21:15Z

I've switched to using the library's proper UTF-8 encoder function.

mikke89 · 2022-12-30T16:50:23Z

> Oh damn, I completely overlooked the fact that content does not work for real DOM elements. I'm definitely advocating for some form of Unicode unescaping, because from a development perspective, pasting raw UTF-8 characters for an icon-font glyph into a document is a worse experience than using a human-readable escape sequence that represents the code point.

Yeah, that's a good argument for unicode escapes. We'll have to wait for this until the next major version, due to the backward incompatibility. The content property also needs ::before and ::after which won't happen until a new major version.

With that said, I think the unescape conversion logic should happen at an earlier stage in the pipeline, perhaps in the StyleSheetParser or its utilities?

Also, while we're at it, we might want to add support for html unicode escapes for the same reason?

> Yes, the performance issue occurs at the stage of merging the style sheet and upon media-query re-evaluation.

Yeah, we should continue the performance discussion in #400.

Dakror added 6 commits December 28, 2022 11:58

initial definitions

2342882

Property working, test succeeding

028bd7e

unicode test

60c8a90

unicode escape parser

26d29d9

additional test case

cf95a0e

fix indentation

a9e0ae4

Dakror mentioned this pull request Dec 28, 2022

Sorted StyleSheetNodes #400

Closed

better decode function

4969dae

mikke89 added the enhancement New feature or request label Dec 30, 2022

switch to library utf-8 encode function

46b481e

Dakror closed this Dec 31, 2022

Dakror mentioned this pull request Jan 6, 2023

HTML unicode escaping sequences #401

Merged

mikke89 mentioned this pull request Oct 3, 2024

Question: How to use font icon in generated elements? #679

Closed
