Blank spaces being accepted #83

badosu · 2019-05-21T01:00:30Z

After #74 got merged, the following happens:

pry> new_record.url
=> "http://this is not a URL"
pry> new_record.valid?
=> true

#74 was merged as a fix for #73. That issue complains that a certain URL should be valid, but the URL contains an unescaped non-ASCII character (ç), which is invalid per RFC 3986.
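To illustrate the RFC 3986 point: a non-ASCII character such as ç has to be percent-encoded before it can appear in a URL. The sketch below is only an illustration and not code from this gem. The helper name `percent_encode_segment` and the sample segment are made up, and it uses `URI.encode_www_form_component`, which targets form encoding (for example, it turns a space into `+`), so treat it as an approximation for path segments.

```ruby
require "uri"

# Hypothetical helper (not part of the gem): percent-encode a single
# path segment so that it contains only ASCII characters.
# Caveat: encode_www_form_component is meant for form data and
# encodes a space as "+", not "%20".
def percent_encode_segment(segment)
  URI.encode_www_form_component(segment)
end

raw     = "cabeça"                      # made-up example segment
escaped = percent_encode_segment(raw)

puts raw.ascii_only?      # false: the raw segment contains a non-ASCII character
puts escaped              # cabe%C3%A7a
puts escaped.ascii_only?  # true
```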


minifast-winston · 2019-10-17T21:11:40Z

Made a pull request (#85) to revert the URL encoding.

minifast-winston · 2019-10-21T17:58:36Z

It looks like this gem doesn't properly validate URL paths in some cases. For example, the default validator reports http://example.com/some/? doodads=ok as valid, when it should be invalid because of the space.

It seems to be caused by the combination of a ? and a space character in the path.

minifast-winston · 2019-10-21T18:12:51Z

It looks like this is behavior of Ruby's URI.parse(). http://ex ample.com throws an error, but http://example.com/? fun encodes to http://example.com/?%20fun.

It seems that either there is a problem with URI.parse(), or its authors do not intend for it to validate query strings, which seems odd.

minifast-winston · 2019-10-21T18:23:49Z

It looks like this is part of the implementation of Generic.parse() (https://github.com/ruby/ruby/blob/trunk/lib/uri/generic.rb#L75)

# At first, tries to create a new URI::Generic instance using
# URI::Generic::build. But, if exception URI::InvalidComponentError is raised,
# then it does URI::Escape.escape all URI components and tries again.

minifast-winston · 2019-10-21T18:33:49Z

It seems like relying on URI.parse() alone for validation isn't ideal, since the authors have decided it should try to encode the URL in the event of an exception. This means that some invalid strings will pass validation.

minifast-winston · 2019-10-21T21:14:43Z

Added another commit to the pull request to handle the issue of spaces in the query string. In addition to the usual checks, we now match the raw url value against the URI::regexp pattern for its scheme.
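For readers who want to see the idea, here is a rough sketch of matching the raw value against a scheme-specific pattern before parsing. It is not the code from the pull request. The names `HTTP_URL_PATTERN` and `strictly_valid_url?` are hypothetical, the scheme list is an assumption, and it calls `URI::RFC2396_Parser#make_regexp` directly instead of the obsolete `URI::regexp`. The returned pattern is unanchored, so the sketch wraps it in `\A` and `\z`.

```ruby
require "uri"

# Hypothetical sketch, not the gem's implementation.
# make_regexp returns an unanchored pattern, so anchor it to force
# the whole raw string to match.
HTTP_URL_PATTERN =
  /\A#{URI::RFC2396_Parser.new.make_regexp(%w[http https])}\z/

# True only if the raw, unmodified string matches the pattern and
# URI.parse also accepts it.
def strictly_valid_url?(raw)
  return false unless raw.is_a?(String) && raw.match?(HTTP_URL_PATTERN)

  URI.parse(raw)
  true
rescue URI::InvalidURIError
  false
end

puts strictly_valid_url?("http://example.com/some/?doodads=ok")   # true
puts strictly_valid_url?("http://example.com/some/? doodads=ok")  # false
puts strictly_valid_url?("http://this is not a URL")              # false
```

Because the check runs against the raw string, nothing gets escaped on the caller's behalf before the match, so a space anywhere in the value causes a rejection.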

minifast-winston mentioned this issue Oct 17, 2019

Don't encode URLs during validation #85

Merged
