#!/usr/bin/ruby
require 'nokogiri'
html_doc = <<DOC
<span>Stuff we don't want</span>
<h2>Start selection</h2>
<p>Content we're interested in</p>
<p>More content</p>
<h2>End selection</h2>
<p>Not interested in this bit</p>
<h2>Another title</h2>
<p>Not interested in this bit either</p>
DOC
page = Nokogiri::HTML(html_doc)
selection_start = page.xpath('//h2[ . = "Start selection"]/following::node()')
content = selection_start.xpath('./following-sibling::h2[1]')
puts content.inspect
The second-to-last line returns every match for following-sibling::h2 instead of, as I would expect, only the first following-sibling match. Interestingly, the XPath following-sibling::h2[2] behaves as expected and returns the second following-sibling match.
The reason I'm attempting this is that I want to select the content starting at the h2 with the text "Start selection", but only up to the next h2 element.
I think the current behavior is correct. //h2[ . = "Start selection"]/following::node() matches every node that follows the first h2 element, including the third p element. That p element has exactly one following sibling with a local-name of h2, namely the last h2 element. The second expression is evaluated once for each matched node and the results are combined, so the last h2 element is indeed following-sibling::h2[1] for the third p element that matches //h2[ . = "Start selection"]/following::node().
I think changing the first XPath expression to //h2[ . = "Start selection"]/following::node()[1] will fix your problem.
Hopefully, this is in fact unexpected behavior and not me misunderstanding xpath...
This might in some way be related to #370