Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pathname special case on parse fixes #2110, #1821 #2111

Merged

Conversation

doriantaylor
Copy link
Contributor

@doriantaylor doriantaylor commented Nov 12, 2020

What problem is this PR intended to solve?

The mangled behaviour of Pathname objects; closing #2110 and #1821.

Have you included adequate test coverage?

I hope so? The logic on Nokogiri.parse is a little funky but it seemed a little funky to begin with.

Does this change affect the C or the Java implementations?

From what I can tell this change operates above the underlying implementation.

@codeclimate
Copy link

codeclimate bot commented Nov 12, 2020

Code Climate has analyzed commit 88479cb and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (80% is the threshold).

This pull request will bring the total coverage in the repository to 94.3% (0.0% change).

View more on Code Climate.

@doriantaylor
Copy link
Contributor Author

lol code climate you are dumb

@AppVeyorBot
Copy link

@flavorjones
Copy link
Member

Don't worry about Code Climate, I marked it as WONTFIX. Looking into the other build failure now.

@flavorjones
Copy link
Member

The failing test is a flake, I kicked it and it should go green shortly. Looking at the proposed changes now.

@doriantaylor
Copy link
Contributor Author

doriantaylor commented Nov 12, 2020

I'm kinda lukewarm on having three distinct sites where is_a? Pathname is tested but I fear consolidating it would entail rethinking the overall parser dispatch logic. That said I can probably get rid of the one in nokogiri.rb. As for Code Climate, I'm just trolling yr robots. ;)

@flavorjones
Copy link
Member

@doriantaylor I agree, and the reason I haven't moved yet on my promise in #1821 to write my own PR is because I intended to overhaul the parsing logic and try to consolidate it into fewer places; and I haven't had the time; and it's been a low priority because of the ease of the workaround.

I'm cleaning up a couple of issues and will force-push to this branch in a bit along with an explanation of what I'm changing. Hope that's OK, and I'll keep your name on the commits.


if string_or_io.is_a?(Pathname)
# resolve the Pathname to the file and open it as an IO object, see #2110
string_or_io = string_or_io.expand_path.open
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tenderlove can I get your eyes on this? I'm pretty sure File objects autoclose when they are GCed (and I tested that), but figured it was worth asking explicitly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(My inclination was to throw expand_path in there because you can have legitimate stuff like Pathname('~/foo.xml') and the overhead is minimal.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh btw putting the url assignment after the Pathname test was deliberate because Pathname does not respond to path but Pathname#open will produce a File which will respond to path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@doriantaylor I don't have any concerns about #expand_path, I just want to confirm with @tenderlove that the File object we're creating will get cleaned up and file descriptors will get closed during object finalization.

Your url comment makes sense -- what confused me was that after your initial patch set (which I've squashed), it was done one way in HTML::Document and the other way in XML::Document. I'll add some test coverage to make sure it's working the way we expect.

@AppVeyorBot
Copy link

doriantaylor and others added 2 commits November 12, 2020 00:56
- no need to resolve Pathname in Nokogiri.parse
- added test coverage for XML::Document and HTML::Document#parse
- moved resolution of Pathname to after `url` is set in HTML::Document#parse
- open files with implict "r" because "b" assumes ASCII-8BIT encoding

Part of sparklemotion#2110, sparklemotion#1821
- added test coverage for objects that respond to `#path`
- added test coverage for `Pathname` specifically

Related to sparklemotion#2110.
@AppVeyorBot
Copy link

@flavorjones flavorjones merged commit f7edd3a into sparklemotion:master Nov 13, 2020
@flavorjones
Copy link
Member

@doriantaylor Merged! Thanks for your contribution!

@doriantaylor
Copy link
Contributor Author

Nice! Thanks for merging so promptly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants