XML::Document#xpath inconsistent use of assumed namespaces #357

Closed
jrochkind opened this issue Oct 26, 2010 · 6 comments

Comments

@jrochkind
Contributor

Report as a script, as requested. Ooh, Ruby code doesn't paste into a ticket well, so here's a gist instead:

http://gist.github.com/647802

@flavorjones
Member

The problem with this code isn't about implicit namespaces, it's about the query's path.

Here's some code:

    # your original example
    xml = %Q{
      <root xmlns:example="http://example.org">
        <example:element>
          <example:child/>
        </example:element>
      </root>
    }
    doc = Nokogiri::XML xml

    puts doc.root.xpath("example:element").size   #=> 1
    puts doc.root.xpath("./example:element").size #=> 1
    puts doc.root.xpath("//example:element").size #=> 1

    puts doc.xpath("example:element").size   #=> 0
    puts doc.xpath("./example:element").size #=> 0
    puts doc.xpath("//example:element").size #=> 1

    puts "---"

    # move example:element from root into a child of the root
    xml = %Q{
      <root xmlns:example="http://example.org">
        <div>
          <example:element>
            <example:child/>
          </example:element>
        </div>
      </root>
    }
    doc = Nokogiri::XML xml

    puts doc.root.xpath("example:element").size   #=> 0
    puts doc.root.xpath("./example:element").size #=> 0
    puts doc.root.xpath("//example:element").size #=> 1

    puts doc.xpath("example:element").size   #=> 0
    puts doc.xpath("./example:element").size #=> 0
    puts doc.xpath("//example:element").size #=> 1

If the query doesn't start with any explicit context (e.g., "/", "//" or ".") then a search on a node will implicitly search from the context node. A query on a document without an explicit context is simply a failure to write a good xpath query.

You might ask, "why doesn't an xpath query on a document use the root node as its context node?" Let me get back to you on that.

@jrochkind
Contributor Author

Aha! You're absolutely right, thanks, and sorry for the false bug report. Namespaces are fine; I got confused because xpath on a document doesn't use the root node as its context (I'm not really sure whether it should, but if it doesn't, a developer like me has to be aware of that).

So, wait, any xpath query on a document without an explicit context (in fact, any xpath query on a document that doesn't begin with //, I think) will always return the empty set, right? In that case, yeah, it seems like it might as well use #root as the context node, or raise an exception for any xpath that doesn't begin with //. But yeah, nothing to do with namespaces.

@flavorjones
Member

I'm investigating the "implicit context" for documents. A similar problem exists for fragments. I'll probably open a new issue for it and close this one. Will let you know.

@jrochkind
Contributor Author

Hmm, it's more complicated. It's not true, as I suggested before, that ANY query without a context on a document will return the empty set. There is ONE query that won't: a query for the root element itself.

    doc.xpath("root") #=> works

or for that matter

    doc.xpath("*") #=> single root node returned

So, I dunno, I honestly have no idea what the least confusing behavior for the developer is here. If the implicit context were changed, that query would stop working, breaking backwards compatibility for anyone depending on it. I do know this isn't the first time I've gotten confused about this, yet I failed to remember what I learned the last time(s), and I'm probably not alone. But breaking backwards compatibility is probably bad. I dunno.

@flavorjones
Member

OK, after writing some tests to characterize current behavior, I believe that Nokogiri is doing the right thing. From a document, we need to be able to access the root node, so:

    doc.xpath("./root")

needs to return the root node. But the same XPath query from the root node:

    doc.root.xpath("./root")

should not return the root node. Therefore, the document context is different from the root node's context.

I'm sure you have questions. Fire away.

@jrochkind
Contributor Author

Nope, no questions actually, I think you are quite right. Thanks, sorry for the mistaken bug report.

This issue was closed.