matching URL with .* less correct than with .+ #854

hallvors · 2015-12-11T00:56:57Z

In this regular expression:
https://github.com/webcompat/webcompat.com/blob/master/webcompat/helpers.py#L368
when we write (.*) we do not actually intend to match "0 or more characters": a zero-character URL isn't a URL. We should use + instead of *.
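
A minimal sketch of the difference, for illustration only. The "URL: " prefix and the names STAR and PLUS are assumptions made up for this example; this is not the actual expression in helpers.py.

```python
import re

# Hypothetical patterns (not the ones in helpers.py): capture whatever
# follows a made-up "URL: " prefix.
STAR = re.compile(r"URL: (.*)")  # zero or more characters
PLUS = re.compile(r"URL: (.+)")  # one or more characters

empty = "URL: "    # prefix present, but no URL after it
filled = "URL: t"  # prefix followed by a one-character value

# With *, the empty input still matches and captures an empty string.
star_on_empty = STAR.search(empty)

# With +, the empty input does not match at all.
plus_on_empty = PLUS.search(empty)

# Both patterns capture the same text when something is actually there.
star_on_filled = STAR.search(filled).group(1)
plus_on_filled = PLUS.search(filled).group(1)
```

With the * version, callers have to check for an empty capture themselves; with the + version, a missing URL shows up as no match.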


miketaylr · 2015-12-11T14:18:06Z

Is a 1 character long URL a valid URL? If not, I'm not sure it makes much of a difference, but a good first patch indeed. 😜

hallvors · 2015-12-11T15:10:41Z

Well, I think there's code elsewhere to handle URLs without an http(s):// prefix. At Opera, "t" was certainly a URL, and a pretty important one ;)
We could always use {3,} or something like that to require three or more characters, but who knows what the minimum valid URL length will be down the road? I am, however, pretty sure that a 0-character string is going to remain invalid. It's good to have some certainties in life :D
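
For comparison, a rough sketch of what a {3,} quantifier would do. The "URL: " prefix and the name MIN3 are again assumptions for illustration, not code from the project.

```python
import re

# Hypothetical pattern: require at least three characters after a
# made-up "URL: " prefix.
MIN3 = re.compile(r"URL: (.{3,})")

# A one-character value such as "t" is rejected by this stricter pattern.
short_match = MIN3.search("URL: t")

# A three-character value is accepted and captured whole.
long_match = MIN3.search("URL: a.b")
captured = long_match.group(1) if long_match else None
```

This shows the trade-off raised above: any fixed minimum would reject short inputs like "t", while + only rules out the empty string.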

hallvors added the prio: good first bug label Dec 11, 2015

miketaylr closed this as completed in 99df1c2 Mar 3, 2016
