Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify regex and test on Turkish locale #270

Merged
merged 2 commits into from
Jul 14, 2015

Conversation

kant2002
Copy link
Contributor

I remove RegexOptions.IgnoreCase since somehow this break the whole recognition on Turkish locale and I don't see any point have it. We already define both uppercase and lower case letters in the regex. Alternative for A-Za-z is to use \w and for 0-9 use \d, so maybe this will lead to more reliable version recognizer. If strict fragment rules needed they should be placed to RegexOptionalFragment, but I was unable to make this work reliable so far.

See #253

@billti
Copy link
Member

billti commented Jul 10, 2015

Great job figuring this out! The docs do actually call out the ignoreCase can cause issues on Turkish locale (see https://msdn.microsoft.com/en-us/library/yd1hzczs.aspx#Invariant ). Using \w or \d would include non-ASCII characters in a .NET RegEx unless ECMAScript mode is enabled, so using the character ranges explicitly is still the best way to go here I think. Other comments on the code.

+ "(?:-(?<prerelease>[…0-9A-Za-z-]+(\\.[…0-9A-Za-z-]+)*))?"
+ "(?:\\+(?<buildmetadata>[…0-9A-Za-z-]+(\\.[…0-9A-Za-z-]+)*))?$",
RegexOptions.IgnoreCase | RegexOptions.Singleline);
+ "(?:-(?<prerelease>[…0-9A-Za-z-.]+))?"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get adding the "." to allow multi-part pre-release tags (i.e. "1.0.1-beta.1"), but doesn't this also allow invalid formats such as "1.0.1-..5.."? Also, I though the "-" had to be immediately after the opening "[" or before the closing "]" to be treated as a regular character rather than the range operator.

@mousetraps
Copy link
Contributor

Nice job tracking this down! In addition to Bill's comments... let's go ahead and catch this exception during GetCatalog (and print to the output window) so that people aren't left empty handed if parsing fails (as we do with ArgumentException)
https://github.com/Microsoft/nodejstools/blob/master/Nodejs/Product/Npm/SPI/NpmGetCatalogCommand.cs#L143-L148

@kant2002 kant2002 force-pushed the issue-253 branch 3 times, most recently from 3b30052 to 31948ef Compare July 12, 2015 09:59
@kant2002
Copy link
Contributor Author

Probably was too late to debug this issue, now patch is much more clean. I just have to remove RegexOptions.IgnoreCase
@mousetraps and @billti please review.

builder.LatestVersion = SemverVersion.Parse(latestVersion);
try {
builder.LatestVersion = SemverVersion.Parse(latestVersion);
} catch (SemverVersionFormatException) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also print out a message here - we want to know about the issue, but don't want it to block the user.

@mousetraps
Copy link
Contributor

minor comment, but lgtm other than that!

@kant2002 kant2002 closed this Jul 14, 2015
@kant2002 kant2002 deleted the issue-253 branch July 14, 2015 03:10
@kant2002 kant2002 restored the issue-253 branch July 14, 2015 03:10
@mousetraps mousetraps reopened this Jul 14, 2015
@mousetraps
Copy link
Contributor

@kant2002 I presume you mistakenly closed this PR since it looks like you restored your branch too, so I went ahead ad re-opened it 😄

kant2002 added 2 commits July 14, 2015 10:49
1. I remove RegexOptions.IgnoreCase since somehow this break the whole recognition on Turkish locale and I don't see any point have it. We already define both uppercase and lower case letters in the regex. Alternative for A-Za-z is to use \w and for 0-9 use \d, so maybe this will lead to more reliable version recognizer.
2. Catch SemverVersionFormatException inside NpmGetCatalogCommand
@kant2002
Copy link
Contributor Author

True. I just add reporting of the message in case of invalid package semversion

mousetraps added a commit that referenced this pull request Jul 14, 2015
Fix #253 Simplify regex and test on Turkish locale
@mousetraps mousetraps merged commit 8bacc39 into microsoft:master Jul 14, 2015
@kant2002 kant2002 deleted the issue-253 branch July 14, 2015 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants