PrettyPrinter strips newlines from text in nodes, even pcdata #4303
Comments
Imported From: https://issues.scala-lang.org/browse/SI-4303?orig=1 |
@axel22 said: |
Francois Armand (fanf) said: So bad that the doPreserve method is private... |
Michael Beckerle (mbeckerle.dfdl) said: The XML 1.1 spec is very clear that if you insert a CR into text via an "entity value literal", that character must be preserved. This suggests to me that the only reasonable implementation would not do any whitespace normalization on output, since all the various Unicode line-ending characters can be inserted by the same mechanism. The following is from the XML 1.1 spec (this clarification is not in the original XML 1.0 spec, but I suggest it is the "right thing" to do for XML 1.0 implementations anyway):
2.3 Common Syntactic Constructs
This section defines some symbols used widely in the grammar. S (white space) consists of one or more space (#x20) characters, carriage returns, line feeds, or tabs.
Note: The presence of #xD in the above production is maintained purely for backward compatibility with the First Edition. As explained in 2.11 End-of-Line Handling, all #xD characters literally present in an XML document are either removed or replaced by #xA characters before any other processing is done. The only way to get a #xD character to match this production is to use a character reference in an entity value literal. |
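The input-side rule quoted above can be sketched in plain Scala. This is only an illustration, assuming the line-break sequences listed in section 2.11 of the XML 1.1 spec; `EolSketch` and `normalizeOnInput` are hypothetical names, not part of scala.xml. It covers only characters literally present in the document. A character reference such as `&#xD;` is expanded after this step, which is why it survives.

```scala
object EolSketch {
  // Line-break sequences that an XML 1.1 processor translates to a single #xA
  // on input (assumed from section 2.11). Two-character sequences come first
  // so that they are matched as one unit.
  private val LineBreak = "\r\n|\r\u0085|\u0085|\u2028|\r".r

  // Hypothetical helper: normalizes only line breaks literally present in the text.
  def normalizeOnInput(raw: String): String =
    LineBreak.replaceAllIn(raw, "\n")
}
```

Nothing in this rule calls for collapsing or rewriting line breaks on output, which is the point being made in the comment above.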
@som-snytt said:
scala> import xml._
import xml._
scala> val n = new PCData("hi there.")
n: scala.xml.PCData = <![CDATA[hi there.]]>
scala> val p = new PrettyPrinter(80,5)
p: scala.xml.PrettyPrinter = scala.xml.PrettyPrinter@c86b9e3
scala> p format n
res0: String = <![CDATA[hi there.]]>
scala> val n = new PCData("""hi there,
| is there any way to fix this?""")
n: scala.xml.PCData =
<![CDATA[hi there,
is there any way to fix this?]]>
scala> p format n
res1: String =
<![CDATA[hi there,
is there any way to fix this?]]>
scala> p format <a>{n}</a>
res2: String = <a><![CDATA[hi there, is there any way to fix this?]]></a>
Footnote, you don't get incomplete parses from embedded Scala blocks:
scala> <a>{ PCData("""
<console>:1: error: in XML literal: expected end of Scala block
<a>{ PCData("""
^ |
@som-snytt said (edited on Dec 23, 2014 9:46:38 PM UTC):
scala> val xx = <a>{ PCData("Here is some very long text\nto split.") }</a>
xx: scala.xml.Elem =
<a><![CDATA[Here is some very long text
to split.]]></a>
scala> val pp = new PrettyPrinter(1000,2)
pp: scala.xml.PrettyPrinter = scala.xml.PrettyPrinter@13275d8
scala> pp format xx
res7: String = <a><![CDATA[Here is some very long text to split.]]></a>
scala> val pp = new PrettyPrinter(10,2)
pp: scala.xml.PrettyPrinter = scala.xml.PrettyPrinter@673919a7
scala> pp format xx
res8: String =
<a>
<![CDATA[Here is some very long text
to split.]]>
</a>
scala> val pp = new PrettyPrinter(2,2)
pp: scala.xml.PrettyPrinter = scala.xml.PrettyPrinter@41853299
scala> pp format xx
res9: String =
"<a><![CDATA[Here is some very long text to split.]]></a>
"
|
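A possible workaround, when the exact text matters more than indentation, is to skip `PrettyPrinter` and use the node's own `toString`, which the transcripts above show keeps the newline inside the CDATA section. This is a minimal sketch, assuming the scala-xml module is on the classpath. `PreserveNewlines` is a hypothetical name, and the `PrettyPrinter` result may differ between library versions.

```scala
import scala.xml._

object PreserveNewlines {
  // Same element as in the transcript above: PCData with an embedded newline.
  val xx: Elem = <a>{ PCData("Here is some very long text\nto split.") }</a>

  // Node.toString serializes the tree without re-flowing text,
  // so the newline inside the CDATA section is kept.
  val raw: String = xx.toString

  // PrettyPrinter re-flows text to fit the requested width; in the
  // transcripts above this replaced the newline with a space.
  val pretty: String = new PrettyPrinter(1000, 2).format(xx)
}
```

The trade-off is that `toString` does no indentation at all, so this only helps when pretty-printing is optional.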
Michael Beckerle (mbeckerle.dfdl) said: |
@som-snytt said: Maybe a student co-majoring in History. The "digital humanities" are huge these days. |
@SethTisue said: Interested community members: if you consider this issue significant, feel free to open a new issue for it on GitHub, with links in both directions. |
Michael Beckerle (mbeckerle.dfdl) said: |
=== What steps will reproduce the problem ===