Querying first result with chained xpath returns entire array #1156

Closed
davidsulc opened this issue Sep 7, 2014 · 2 comments
@davidsulc

Hopefully, this is in fact unexpected behavior and not me misunderstanding xpath...

This might in some way be related to #370

#!/usr/bin/ruby

require 'nokogiri'

html_doc = <<DOC
    <span>Stuff we don't want</span>
    <h2>Start selection</h2>
        <p>Content we're interested in</p>
        <p>More content</p>
    <h2>End selection</h2>
        <p>Not interested in this bit</p>
    <h2>Another title</h2>
        <p>Not interested in this bit either</p>
DOC

page = Nokogiri::HTML(html_doc)
selection_start = page.xpath('//h2[ . = "Start selection"]/following::node()')
content = selection_start.xpath('./following-sibling::h2[1]')
puts content.inspect

The second-to-last line returns the entire result set for following-sibling::h2 instead of (as I would expect) only the first following-sibling match. Interestingly, following-sibling::h2[2] behaves as expected and returns the second following-sibling match.

I'm attempting this because I want to select the content after the h2 with the text "Start selection", but only up to the next h2 element.

@knu
Member

knu commented Nov 6, 2014

I think the current behavior is correct. //h2[ . = "Start selection"]/following::node() matches every node that follows the first h2 element, including the third p element. That p element has just one following sibling with a local name of h2, which is the last h2 element. So the last h2 element is indeed following-sibling::h2[1] for the third p element, which is one of the nodes matched by //h2[ . = "Start selection"]/following::node().

I think changing the first XPath expression to //h2[ . = "Start selection"]/following::node()[1] will fix your problem.

@knu knu closed this as completed Nov 6, 2014
@davidsulc
Author

Thanks for the clarification!
