Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DocumentFragment#search and #xpath #370

Closed
flavorjones opened this issue Nov 3, 2010 · 11 comments
Closed

DocumentFragment#search and #xpath #370

flavorjones opened this issue Nov 3, 2010 · 11 comments

Comments

@flavorjones
Copy link
Member

DocumentFragment#search and #xpath should work properly. The current 'test_fragment_search' should be replaced with the following test cases:

  def test_fragment_css
    frag = Nokogiri::XML::Document.new.fragment '<p id="content">hi</p>'

    p_tag = frag.css('#content').first
    assert_equal 'p', p_tag.name

    assert_equal p_tag, frag.children.first
  end

  def test_fragment_search
    frag = Nokogiri::XML::Document.new.fragment '<p id="content">hi</p>'

    p_tag = frag.search('#content').first
    assert_equal 'p', p_tag.name

    assert_equal p_tag, frag.children.first
  end

  def test_fragment_xpath
    frag = Nokogiri::XML::Document.new.fragment '<p id="content">hi</p>'

    p_tag = frag.xpath('.//*[@id="content"]').first
    assert_equal 'p', p_tag.name

    assert_equal p_tag, frag.children.first
  end

note that #331 and #357 might be related.

@SeanTAllen
Copy link
Contributor

I've implemented search w/ tests. However, things get weird with xpath. I emailed the list about that. Once that is worked out on the list, I'll submit the fixes for this plus for the bug that the fragment search uncovered.

@SeanTAllen
Copy link
Contributor

a couple of tests are missing... test_fragment_css and test_fragment_search dont check for depth searches of children. will add tests for those.

@SeanTAllen
Copy link
Contributor

should this apply to xpaths? if i ask for xpath search current node then searching children is wrong as it would be out of context. that would be an issue for search if it is an xpath.

@flavorjones
Copy link
Member Author

Sean Allen said on the mailing list:


I'm underway on this ( aka issue 370 ) and stumbled across something else I think might be a bug that is getting in the way of xpath support working.
I think xpath support is broken right now so I'm not going to be able to get this working without fixing the other issues...

I dont think this is issue 357 but might be related in some fashion.

require 'rubygems'
require 'nokogiri'

frag = Nokogiri::XML.fragment '<p id="header">header</p><p id="content">content</p><p id="footer">footer</p>'

puts "First we search for p: "
puts frag.xpath( ".//p" )

puts "Then we search for id header: "
puts frag.xpath( ".//*[@id='header']" )

puts "Ok... nothing there, lets see if we can find ids..."
puts frag.xpath( ".//@id" )

puts "Interesting... can we find all p's with an id?..."
puts frag.xpath( ".//p[@id]" )
puts "Nope..."

Is this a libxml bug?

more xpath weirdness...

require 'rubygems'
require 'nokogiri'

xml = "<p id='content'>hi</p>"

frag = Nokogiri::XML.fragment xml

puts "Lets find p..."
puts frag.xpath( ".//p" )
puts "Good.."

puts "Lets find @id..."
puts frag.xpath( ".//@id" )
puts "Good.."

puts "Lets find p on first child..."
puts frag.children.first.xpath( ".//p" )
puts "Not so good.."

puts "Lets find @id on first child..."
puts frag.children.first.xpath( ".//@id" )
puts "Interesting, something seems to be happening in Node.xpath..."

@flavorjones
Copy link
Member Author

Hi Sean,

I appreciate your work on this issue! If I don't get back to you right away, it's because I have a day job, not because I don't care. :)

There are definite weirdnesses going on. I need to track them down. If you want to continue working, please feel free, but understand that I can't give you good guidance without understanding the root issue, and if I understood that I'd fix it. :) :)

I will be working on this.

@SeanTAllen
Copy link
Contributor

I've gotten at least as far as knowing that the 'more xpath weirdness' is XPathContext related in some fashion but haven't gotten into the libxml code to try and see why... small demonstration in this gist ( I am incapable of using gh markdown for some reason ):

https://gist.github.com/663525

@SeanTAllen
Copy link
Contributor

Follow up question... I have search for css which was my primary immediate concern..
if I do a quick hack to keep xpaths processing 'in the old way' so that they don't get broken, can I pull request over the just search for css and get that in a hopeful soon version of nokogiri? The project I'm working on requires that fix to function so I'm highly interested in getting that out there asap.

I'm also going to need xpath support soon, so I'll keep banging away at that although I might be a little slow as I'm not real familiar with the code at this point ( but I am way more familiar than I was 24-48 hours ago ).

@flavorjones
Copy link
Member Author

If you need something immediately, my recommendation would be to monkeypatch NokogirI::XML::DocumentFragment with your current fix.

Please feel free to send pull requests.

@rtomayko
Copy link

Another strange behavior that seems somewhat related:

require 'pp'
require 'nokogiri'

doc = Nokogiri::HTML::DocumentFragment.parse("hello world")
pp doc.search('text()')
# => []

doc = Nokogiri::HTML::DocumentFragment.parse("<div>hello world</div>")
pp doc.search('text()')
# => [#<Nokogiri::XML::Text:0x83423adc "hello world">]

DocumentFragment should probably have the same basic behavior as an element node when it contains text, yeah?

@flavorjones
Copy link
Member Author

@rtomayko - Yes, definitely related. You can work around by using more specific xpath:

doc = Nokogiri::HTML::DocumentFragment.parse("hello world")
pp doc.xpath('./text()')
# => [#<Nokogiri::XML::Text:0x3ff06b9e25c8 "hello world">]

doc = Nokogiri::HTML::DocumentFragment.parse("<div>hello world</div>")
pp doc.xpath('./div/text()')
# => [#<Nokogiri::XML::Text:0x3ff06b9e25c8 "hello world">]
pp doc.xpath('.//text()')
# => [#<Nokogiri::XML::Text:0x3ff06b9e25c8 "hello world">]

@flavorjones
Copy link
Member Author

Folding this into #572, the underlying issue is the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants