Typed XML
Scala brought XML into the language, a controversial feature that has nonetheless been used to great advantage for *ML-intensive tasks (the Lift Framework being the most outstanding example).
Still, the current feature has some important shortcomings:
-
No XML validation: well-formedness is the only check performed by the compiler. Applications requiring specific XML tags or grammar (most, I would say) can only detect programmer errors at test- or run-time.
-
Bound to the scala.xml classes, which at least part of the Scala community doesn't seem very happy with.
-
No typing: since all XML literals and expressions belong to the same scala.xml classes, there is effectively no type checking (one type == no types).
This is a proposal to improve the situation while maintaining backward compatibility.
The proposed changes can be separated into four sets of functionality:
-
Decouple the compiler from a specific XML representation
Instead of generating code to construct scala.xml objects, the compiler generates calls to a provided "unmarshaller" object, which will do the construction -- or whatever is appropriate to the application. The programmer can provide an unmarshaller via a scala processing instruction (e.g.
<?scala new my.Unmarshaller()?>
or similar).
More:
Declaring and using unmarshallers -- processing instruction or implicit? If PI: which syntax?
Unmarshaller API -- lexical, semantic, or keep the current mix? How to handle attributes? Is "unmarshaller" the right name?
Implementation -- discussion of implementation details. Plug-in or patch?
Status: experimentally implemented for literals and patterns. Backward compatibility implemented (all tests pass, including quite a few new ones) but not verifiable by reading code or changes. There are some serious implementation detail leaks in compiler error messages.
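To make the decoupling concrete, here is a minimal sketch of what an unmarshaller and the generated calls could look like. The trait `Unmarshaller`, its methods `element` and `text`, and `StringUnmarshaller` are all hypothetical names chosen for illustration; the actual API is still one of the open questions listed above.

```scala
// Hypothetical API: none of these names come from the proposal itself.
trait Unmarshaller[N] {
  def element(name: String, attributes: Map[String, String], children: Seq[N]): N
  def text(value: String): N
}

// One possible unmarshaller: it renders to a String instead of building scala.xml nodes.
// No escaping is performed; this is only meant to show the shape of the calls.
object StringUnmarshaller extends Unmarshaller[String] {
  def element(name: String, attributes: Map[String, String], children: Seq[String]): String = {
    val attrs = attributes.map { case (k, v) => " " + k + "=\"" + v + "\"" }.mkString
    "<" + name + attrs + ">" + children.mkString + "</" + name + ">"
  }
  def text(value: String): String = value
}

object Demo {
  val u = StringUnmarshaller
  // What the compiler might emit for <p class="x">hi</p> under this assumed API.
  val rendered: String = u.element("p", Map("class" -> "x"), Seq(u.text("hi")))
}
```

Swapping `StringUnmarshaller` for another implementation would change what the same literal produces without touching the compiler, which is the point of this first set of changes.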
-
Allow unmarshallers to provide type-based XML validation and type checking
If the methods used to call the unmarshaller contain the names of tags, attributes, and namespaces, the unmarshaller will be able to deliver different types of objects depending on the XML and its structure. Any regular tree grammar can be enforced by creating one class for each non-terminal, one method for each terminal. The power of this scheme for XML validation is close to that of RELAX, except that implementing interleaving is difficult to impossible -- though I am sure that clever use of generics and type algebra will enable even that problem to be resolved.
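As an illustration of the "one class per non-terminal, one method per tag" idea, the sketch below hand-writes the calls that the compiler might generate for a small table grammar. The grammar, the classes `Td`, `Tr`, `Table`, and the object `TableUnmarshaller` are assumptions made for this example, and the real method naming scheme is still undecided.

```scala
// Hypothetical grammar: table ::= tr*, tr ::= td*, td ::= text
final case class Td(text: String)
final case class Tr(cells: Seq[Td])
final case class Table(rows: Seq[Tr])

// Hypothetical typed unmarshaller: one method per tag, one result type per non-terminal.
object TableUnmarshaller {
  def td(text: String): Td = Td(text)
  def tr(cells: Td*): Tr = Tr(cells)
  def table(rows: Tr*): Table = Table(rows)
}

object TypedDemo {
  import TableUnmarshaller._
  // Stands in for <table><tr><td>Amount</td><td>10.00</td></tr></table>.
  val ok: Table = table(tr(td("Amount"), td("10.00")))
  // table(td("Amount")) does not compile: table expects Tr arguments, not Td.
}
```

The rejected call fails with an ordinary type mismatch, which also shows how implementation details end up in the error messages.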
More:
Unmarshaller API -- method names for applyDynamic
Handling of error messages -- need ideas to stop the leak!
Status: implemented for literals and patterns. Namespaces not yet supported (but trivial to do). Requires using the currently experimental scala.Dynamic / applyDynamic feature. Verification against the Scala Language Specification is possible; verification of backward compatibility is more difficult. There is a performance issue with stack space in the Typer (there is a plan to fix this). Another problem is that compiler error messages seriously leak implementation details... they already do in 2.9: try
<a x={1}/>
(the plan to fix this latter one is unclear).
-
Support multiple unmarshallers via namespace prefixes
Extended syntax for the scala processing instruction would allow declaring different unmarshallers for different name prefixes (or namespace URIs).
Declaring and using unmarshallers -- which syntax? prefixes or URIs?
Status: idea stage.
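Since this part is only an idea, the following is just a runtime model of the lookup, assuming prefixes (not URIs) are used as keys. In the proposal the choice would be made by the compiler, and `TagHandler`, `HtmlHandler`, `I18nHandler`, and `PrefixDemo` are hypothetical names.

```scala
// Hypothetical, deliberately tiny handler interface for this illustration only.
trait TagHandler {
  def element(localName: String, text: String): String
}

// Handler for unprefixed tags: renders the element as markup.
object HtmlHandler extends TagHandler {
  def element(localName: String, text: String): String =
    "<" + localName + ">" + text + "</" + localName + ">"
}

// Handler for the i18n prefix: treats the text as a message key.
object I18nHandler extends TagHandler {
  private val messages = Map("greeting" -> "Hello")
  def element(localName: String, text: String): String =
    messages.getOrElse(text, text)
}

object PrefixDemo {
  // Assumed declaration: the empty prefix maps to the default handler.
  val byPrefix: Map[String, TagHandler] =
    Map("" -> HtmlHandler, "i18n" -> I18nHandler)

  // Splits "prefix:local" and dispatches to the handler registered for the prefix.
  def unmarshal(qualifiedName: String, text: String): String = {
    val (prefix, local) = qualifiedName.split(":", 2) match {
      case Array(p, l) => (p, l)
      case _           => ("", qualifiedName)
    }
    val handler = byPrefix.getOrElse(prefix, sys.error("no unmarshaller for prefix: " + prefix))
    handler.element(local, text)
  }
}
```

Keying on namespace URIs instead would only change how the map key is computed, at the cost of resolving prefix bindings first.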
-
Real validation
A schema can be (optionally) associated with each unmarshaller via annotations. The compiler can then use it to validate the XML as it compiles it. This will make validation more powerful and, assuming correctness of the schema-unmarshaller pair, stop the implementation details leaking through compiler error messages.
Validation -- RELAX or other? How to handle scala expressions and patterns?
Status: idea stage.
Use Anti-XML non-violently
<?scala com.codecommit.antixml.Unmarshaller ?>
Status: idea stage.
val chance = <i18n:msg name="chance">You have won second prize in a beauty contest, collect <currency/>.</i18n:msg>
println(chance(10.00))
With strong typing via HLists
Status: prototyped.
<table><td>Amount</td><td>10.00</td></table>
       ^
<td> must appear inside a <tr>...</tr> block
This would require schema validation.
Status: idea stage.
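A rough sketch of the kind of check that could produce the `<td>` message above, run here over a plain tree rather than inside the compiler. The `Node` type and the `requiredParent` mini-schema are assumptions for this example; a real implementation would work from an actual schema language such as RELAX.

```scala
// Hypothetical tree type; the proposal does not define one.
final case class Node(tag: String, children: List[Node] = Nil)

object SchemaCheck {
  // Assumed mini-schema: each listed tag names the only parent it may appear under.
  val requiredParent: Map[String, String] = Map("tr" -> "table", "td" -> "tr")

  // Collects one message per misplaced element, walking the tree depth-first.
  def validate(node: Node): List[String] =
    node.children.flatMap { child =>
      val misplaced = requiredParent.get(child.tag).filter(_ != node.tag)
      val here = misplaced.map { p =>
        "<" + child.tag + "> must appear inside a <" + p + ">...</" + p + "> block"
      }.toList
      here ++ validate(child)
    }
}

object SchemaDemo {
  val cells = List(Node("td"), Node("td"))
  // <table><td/><td/></table>: both cells are misplaced.
  val bad: List[String] = SchemaCheck.validate(Node("table", cells))
  // <table><tr><td/><td/></tr></table>: no errors.
  val good: List[String] = SchemaCheck.validate(Node("table", List(Node("tr", cells))))
}
```

Because the message is built from the schema rather than from a failed method lookup, nothing about the unmarshaller's implementation leaks into it.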
Status: not even an idea.