Skip to content

Fix #44 SI-9060 XML 5th edition name characters #93

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 24, 2017

Conversation

ashawley
Copy link
Member

Refactored PR from #44 by @som-snytt to only make minimal changes to tokenization code, namely the acceptance of the "MIDDLE DOT" character, #xB7, from the original ticket.

To follow the standard, a colon should be allowed to start a name. It seems that a colon could always start an XML name -- in addition to a letter or an underscore. Not sure why it was avoided, but it's probably pretty rare, anyway.

https://www.w3.org/TR/2008/REC-xml-20081126
https://www.w3.org/TR/2006/REC-xml-20060816
https://www.w3.org/TR/2004/REC-xml-20040204
https://www.w3.org/TR/2000/REC-xml-20001006
https://www.w3.org/TR/1998/REC-xml-19980210

@som-snytt
Copy link
Contributor

xml parsing used to be shared with compiler, or cut & pasted from it.

The compiler has to know when to enter "xml mode". (To answer your question why colon was disallowed.)

val <: = 42
val `<:` = 42

I don't see why you'd want to make something minimally correct with a minimal change.

@ashawley
Copy link
Member Author

Thanks for the feedback, @som-snytt.

I have added a new issue, #94, for looking further at names starting with a colon. I added some examples, but please add more so I can understand how the colon should ideally work with XML names.

Thanks again,
Aaron

@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch from 695fa5e to de786b1 Compare June 12, 2016 22:22
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch 2 times, most recently from 578d372 to b26bc19 Compare September 20, 2016 22:08
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch from b26bc19 to 457972e Compare October 18, 2016 15:45
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch from 457972e to 8943924 Compare December 1, 2016 20:54
@SethTisue
Copy link
Member

I don't see why you'd want to make something minimally correct with a minimal change.

@som-snytt I don't understand what you mean...?

@som-snytt
Copy link
Contributor

@SethTisue I don't understand why my other PR was cannibalized instead of merged. I think my comment meant, Why take correctness piecemeal instead of whole hog?

@ashawley
Copy link
Member Author

ashawley commented Feb 4, 2017

@som-snytt It's been a while, but I recall that the minimal changes pass your unit test you provided. I was attempting to improve the likelihood that a fix would get merged, by making a more minimal change based on your work. Evidently, that didn't speed anything up.

@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch 4 times, most recently from 6af8f0a to 986874f Compare February 7, 2017 22:21
@SethTisue
Copy link
Member

needs rebase because of tests moving around

@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch from 986874f to d1ba890 Compare February 15, 2017 23:53
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch 5 times, most recently from 55e9887 to 74f8b7a Compare April 28, 2017 18:28
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch 4 times, most recently from 9b65348 to 445ac6c Compare May 12, 2017 10:14
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch from 445ac6c to b4e883f Compare May 24, 2017 15:27
som-snytt and others added 2 commits May 24, 2017 11:48
* jvm/src/main/scala/scala/xml/parsing/TokenTests.scala (isNameChar):
Add middle dot, #xB7, as specified at [4a] of XML 1.0 5th edition.

* jvm/src/test/scala/scala/xml/XMLTest.scala (t9060): New test.

* jvm/src/test/scala/scala/xml/parsing/ConstructingParserTest.scala
(t9060): New test.
* jvm/src/main/scala/scala/xml/parsing/TokenTests.scala (isNameChar):
Add middle dot.

* jvm/src/main/scala/scala/xml/parsing/TokenTests.scala (isNameStart):
Allow colon to start a name.

* jvm/src/test/scala/scala/xml/UtilityTest.scala (isNameStart): Colon
should be able to start a name.
@ashawley ashawley force-pushed the fix-44-si-9060-xml-5ed-names branch from b4e883f to ee91a55 Compare May 24, 2017 15:49
@ashawley ashawley merged commit 7d0e442 into scala:master May 24, 2017
@ashawley ashawley deleted the fix-44-si-9060-xml-5ed-names branch June 7, 2017 20:07
@ashawley ashawley mentioned this pull request Oct 9, 2017
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants