scala.xml.Utility.trim() doesn't properly handle adjacent Text nodes #3062

Closed
scabug opened this issue Feb 16, 2010 · 6 comments

Comments

scabug commented Feb 16, 2010

If Text("My name is ") is followed by Text("Harry"), the space following the word "is" is incorrectly trimmed out. Adjacent Text nodes need to be combined before whitespace is removed.

scala> import scala.xml._
import scala.xml._

scala> <div>{Text("My name is ")}{Text("Harry")}</div>
res0: scala.xml.Elem = <div>My name is Harry</div>

scala> Utility.trim(res0)
res1: scala.xml.Node = <div>My name isHarry</div>

This is important when modifying XML and then trimming it. For example we might start with

My name is <user:name/>
and then replace the <user:name/> Elem with "Harry", which leaves two adjacent Text nodes.
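
The fix suggested above (combine adjacent Text nodes, then trim) can be sketched as follows. This is only an illustration, not the library's implementation: `coalesceText` is a hypothetical helper that does not exist in scala.xml, and the sketch assumes a scala-xml version in which `Elem.copy` accepts a named `child` argument.

```scala
import scala.xml._

// Hypothetical helper (not part of scala.xml): merges each run of adjacent
// Text children into a single Text node, recursing into child elements.
def coalesceText(node: Node): Node = node match {
  case e: Elem =>
    val merged = e.child.foldLeft(Vector.empty[Node]) {
      // Previous child and current child are both Text: join them.
      case (init :+ Text(prev), Text(next)) => init :+ Text(prev + next)
      // Anything else is appended as-is (elements are processed recursively).
      case (acc, child)                     => acc :+ coalesceText(child)
    }
    e.copy(child = merged)
  case other => other
}

val div = <div>{Text("My name is ")}{Text("Harry")}</div>

// Trimming after coalescing keeps the space between "is" and "Harry".
val trimmed = Utility.trim(coalesceText(div))
println(trimmed) // <div>My name is Harry</div>
```

Coalescing first means the whitespace decision is made on the full text run rather than on each fragment in isolation.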

scabug commented Feb 16, 2010

Imported From: https://issues.scala-lang.org/browse/SI-3062?orig=1
Reporter: Harry Heymann (harryh)

scabug commented Mar 16, 2012

@acruise said:
Is it ever important to retain the separateness of the Text nodes, or is it OK to coalesce them?

scabug commented Mar 19, 2012

@acruise said (edited on Mar 19, 2012 4:27:24 AM UTC):
So, just to be clear, what we want is that when an element contains more than one consecutive text node, the first one should only trim at the beginning, the last one only at the end, and the middle ones should be left alone?

If there are two text nodes, should the left one be trimmed at the beginning and the right one at the end?

Should singleton text nodes be trimmed at both beginning and end?
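
One reading of the rule being asked about, sketched on the string contents of a single run of consecutive text nodes. `trimRun` is a hypothetical name used only for illustration, and the sketch assumes "trim" means stripping leading or trailing whitespace without touching interior whitespace.

```scala
// Hypothetical helper: applies the proposed rule to one run of consecutive
// text nodes, represented here by their string contents.
def trimRun(run: Seq[String]): Seq[String] = run match {
  case Seq()       => run
  // A singleton is trimmed at both ends.
  case Seq(single) => Seq(single.replaceAll("^\\s+|\\s+$", ""))
  case _ =>
    val first = run.head.replaceAll("^\\s+", "") // first: trim the beginning only
    val last  = run.last.replaceAll("\\s+$", "") // last: trim the end only
    (first +: run.slice(1, run.length - 1)) :+ last // middle nodes are left alone
}

println(trimRun(Seq("My name is ", "Harry"))) // List(My name is , Harry)
println(trimRun(Seq("  a", " ", "b  ")))      // List(a,  , b)
```

Under this reading, the answers to the two follow-up questions would both be yes: with two nodes the left is trimmed at the beginning and the right at the end, and a singleton is trimmed at both ends.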

scabug commented May 2, 2014

Michael Beckerle (mbeckerle.dfdl) said (edited on May 2, 2014 1:27:22 AM UTC):
I think the semantics that were intended for trim are "collapse the whitespace". That means you can't traverse the tree node by node making decisions about one at a time. You have to examine sequences of text nodes.

  @Test def testTrim(): Unit = {
    val a = new Text("a")
    val sp = new Text(" ")
    val b = new Text("b")
    val data = <data>{ Seq(a, sp, b) }</data>
    println(data) // prints <data>a b</data>
    val trimmed = Utility.trim(data)
    println(trimmed) // prints <data>ab</data> - wrong: the space between a and b is lost
  }

You can't trim anything off anything here.
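
The difference between deciding node by node and examining the whole sequence can be shown on plain strings. This is a minimal sketch: `collapse` is a hypothetical helper, and its definition (strip both ends, reduce each whitespace run to one space) is an assumption about what "collapse the whitespace" means here.

```scala
// Hypothetical helper: strips both ends and reduces every run of whitespace
// to a single space.
def collapse(s: String): String = s.trim.replaceAll("\\s+", " ")

val parts = Seq("a", " ", "b")

// Node by node: the lone " " collapses to "", so the separator disappears.
val perNode = parts.map(collapse).mkString
println(perNode) // ab

// Whole sequence: concatenate first, then collapse once.
val combined = collapse(parts.mkString)
println(combined) // a b
```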

scabug commented Jul 17, 2015

@SethTisue said:
The scala-xml library is now community-maintained. Issues with it are now tracked at https://github.com/scala/scala-xml/issues instead of here in the Scala JIRA.

Interested community members: if you consider this issue significant, feel free to open a new issue for it on GitHub, with links in both directions.

scabug closed this as completed Jul 17, 2015

scabug commented Jul 29, 2015

Michael Beckerle (mbeckerle.dfdl) said:
Please see scala/scala-xml#73 on GitHub scala-xml.
