Add "associated data" to support namespaces #29
Comments
This is true for parsing the HTML syntax, but it's not quite true within Servo. For example, we need to recognize that

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns:bleh="http://www.w3.org/1999/xhtml">
<bleh:img src="http://www.rust-lang.org/logos/rust-logo-256x256-blk.png" />
</html>

contains an img element.

match node.name() {
    atom!(img) => ...

will produce pretty complicated machine code. (And even getting the macro to work at all will be tricky.) A name that matches could be a static/inline atom, or it could be a dynamic atom with any prefix or none. The best hope is to destructure

For this reason I think atoms should include the namespace URI, which is canonical and part of the atom's identity, but not the prefix, which is a detail about how that URI was obtained. html5ever will have its own "atom with prefix" type, which Servo can also use.

A remaining problem is that the default namespace for attributes is

I don't think this design makes sense conceptually for prefixes; prefixes are completely independent of the namespace outside the HTML parser.

I’m also not convinced about this proposal, and we’ve been doing OK without it all this time. And with #178, string-cache is becoming a more general-purpose library that knows nothing about HTML or the DOM. Closing as wontfix.

The conceptual model for Atom becomes

Any atom where assoc.is_some() is interned in the dynamic table. html5ever and Servo will use this to store namespace and prefix (see #26), with None representing the HTML namespace and an empty prefix. This shrinks a qualified name to a single word, without any performance penalty to HTML names.

One complication is that we need two different notions of equality on this associated data. An XML document can contain nodes which use different prefixes to produce the same qualified name. We can't combine these in the interning table, because it's possible to read the original prefix out of the DOM. But when we're comparing atoms for equality we need to ignore the prefix.

Also, the global interning table needs to be aware of the type T somehow. We could have a set of tables indexed by the type T, hopefully resolving the polymorphism at compile time. (It should really be more like the entire crate is parametrized on T.) In C++ I would use a static member variable in a class template, but I don't know of any analogous mechanism in Rust (I checked, and static muts inside generic functions get combined).
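
One way to get "a set of tables indexed by the type T" is a single global registry keyed by TypeId, sketched below. Note that this is a swapped-in technique: it resolves the table at run time with a map lookup and a downcast, not at compile time as hoped for above, and it relies on standard-library items (such as OnceLock) that are newer than this issue. Table, TABLES, and intern are hypothetical names, and the returned usize stands in for whatever an atom would actually store.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Mutex, OnceLock};

/// One interning table for a given associated-data type `T` (hypothetical layout).
struct Table<T> {
    index: HashMap<(String, Option<T>), usize>,
}

/// Process-wide registry: one boxed `Table<T>` per concrete `T`, keyed by `TypeId`.
static TABLES: OnceLock<Mutex<HashMap<TypeId, Box<dyn Any + Send>>>> = OnceLock::new();

/// Interns `(name, assoc)` in the table belonging to `T` and returns its slot index.
fn intern<T>(name: &str, assoc: Option<T>) -> usize
where
    T: Eq + Hash + Send + 'static,
{
    let mut tables = TABLES
        .get_or_init(|| Mutex::new(HashMap::new()))
        .lock()
        .unwrap();

    // Find (or lazily create) the table for this `T`, then recover its concrete type.
    let table = tables
        .entry(TypeId::of::<T>())
        .or_insert_with(|| Box::new(Table::<T> { index: HashMap::new() }) as Box<dyn Any + Send>)
        .downcast_mut::<Table<T>>()
        .expect("the entry for TypeId::of::<T>() always holds a Table<T>");

    // Reuse the existing slot if this pair was seen before, otherwise take the next one.
    let next = table.index.len();
    let slot = *table.index.entry((name.to_string(), assoc)).or_insert(next);
    slot
}
```

Each distinct T gets its own table, so slot indices from different instantiations are unrelated. Every call pays for the mutex and the TypeId lookup, which is exactly the overhead a compile-time mechanism would avoid.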