Skip to content

Fix feet inches #803

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from
Closed

Conversation

tmilnthorp
Copy link
Collaborator

I think we need to fix #794 by constructing the Regex with the separators.

This now passes except for two tests:

  • ParseWithCultureUsingDotAsThousandSeparators_ThrowsExceptionOnInvalidString actually now passes (more forgiving regex parses successfully)
  • ParseLengthToMetersUsEnglish with 1e-3 km fails (regex doesn't handle engineering notation).

I feel it's closer, but we should handle engineering notation.

@tmilnthorp tmilnthorp mentioned this pull request Jun 1, 2020
new object[]{"-1′1″", -1.08333333, new CultureInfo("en-US")}, // Without space
new object[]{"-1 ft 1 in", -1.08333333, new CultureInfo("en-US")},
new object[]{"-1ft 1in", -1.08333333, new CultureInfo("en-US")},
new object[]{"1’000′", 1000, new CultureInfo("de-CH")}, // Feet only, with seperator
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new object[]{"1’000′", 1000, new CultureInfo("de-CH")},             // Feet only, with seperator
new object[]{"1’000′ 6\"", 1000.5, new CultureInfo("de-CH")},       // Normal form, using separators for culture

Are the new tests with culture specific separators. The rest are all en-US format so being explicit.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgrawher also pointed out that in some Windows/Linux/Mac versions, the apostrophe symbol can vary for the same culture. But, unless we actually run into this issue on our own machines and the AppVeyor VMs, then I think a hard coded string is better. I really don't expect this character to change much over time for a given culture unless they are fixing a consistency bug in one of the OS versions, but I am merely speculating.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of reviewing this PR, it would be better if you did the refactoring to memberdata in a separate PR, but it helps to know that these were the only two new cases plus the new culture parameter.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you missing the problematic test case? Where the thousand separator character exactly matches the foot or inches unit? Requires a culture that has this exact character though, can construct our own if necessary.

new object[]{"1′000′", 1000, new CultureInfo("de-CH")},

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use new CultureInfo("...", false) everywhere. This makes sure that at least when developers use a similarly current version of Windows, the settings will be the same. (This ignores any manual overrides in control panel)

new object[]{"-1′1″", -1.08333333, new CultureInfo("en-US")}, // Without space
new object[]{"-1 ft 1 in", -1.08333333, new CultureInfo("en-US")},
new object[]{"-1ft 1in", -1.08333333, new CultureInfo("en-US")},
new object[]{"1’000′", 1000, new CultureInfo("de-CH")}, // Feet only, with seperator
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgrawher also pointed out that in some Windows/Linux/Mac versions, the apostrophe symbol can vary for the same culture. But, unless we actually run into this issue on our own machines and the AppVeyor VMs, then I think a hard coded string is better. I really don't expect this character to change much over time for a given culture unless they are fixing a consistency bug in one of the OS versions, but I am merely speculating.

new object[]{"-1′1″", -1.08333333, new CultureInfo("en-US")}, // Without space
new object[]{"-1 ft 1 in", -1.08333333, new CultureInfo("en-US")},
new object[]{"-1ft 1in", -1.08333333, new CultureInfo("en-US")},
new object[]{"1’000′", 1000, new CultureInfo("de-CH")}, // Feet only, with seperator
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of reviewing this PR, it would be better if you did the refactoring to memberdata in a separate PR, but it helps to know that these were the only two new cases plus the new culture parameter.

// Match entire string exactly
string pattern = $@"^(?<negativeSign>\-?)(?<feet>{footRegex})\s?(?<inches>{inchRegex})$";
var feetMatch = new Regex(footRegex, RegexOptions.Singleline).Match(str);
var inchesMatch = new Regex(inchRegex, RegexOptions.Singleline).Match(str);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this allow invalid formats like inches first, feet after? 5" 1'
Or is this actually an acceptable form?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it wouldn't. I'll make sure to enforce ordering.

new object[]{"-1′1″", -1.08333333, new CultureInfo("en-US")}, // Without space
new object[]{"-1 ft 1 in", -1.08333333, new CultureInfo("en-US")},
new object[]{"-1ft 1in", -1.08333333, new CultureInfo("en-US")},
new object[]{"1’000′", 1000, new CultureInfo("de-CH")}, // Feet only, with seperator
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you missing the problematic test case? Where the thousand separator character exactly matches the foot or inches unit? Requires a culture that has this exact character though, can construct our own if necessary.

new object[]{"1′000′", 1000, new CultureInfo("de-CH")},

@tmilnthorp
Copy link
Collaborator Author

Indeed I did miss the problematic case!

I wonder if it would help simplify everything if we just search for the abbreviation ending, and just do a double.Parse or double.TryParse on the prefix.

In other words, parse the abbreviation suffix ourselves and let .NET parse everything before it. They've already done all the work - no sense in recreating the parsing regex.

@angularsen
Copy link
Owner

In other words, parse the abbreviation suffix ourselves and let .NET parse everything before it.

Isn't that how the regex is designed though? To only split the string on known abbreviations and use double.TryParse on whatever was in front of the abbreviation?

I may remember it wrong.

@tmilnthorp
Copy link
Collaborator Author

In other words, parse the abbreviation suffix ourselves and let .NET parse everything before it.

Isn't that how the regex is designed though? To only split the string on known abbreviations and use double.TryParse on whatever was in front of the abbreviation?

I may remember it wrong.

It is, kind of. It fails when there's an extra apostrophe as this defect shows :)

My regex knowledge fails me here. We want to take everything until the last abbreviation string at the end. Almost as if to reverse the string or do a maximum look-ahead of some sort.

@angularsen
Copy link
Owner

angularsen commented Jun 18, 2020

Are we simply overcomplicating this?
Do we really need to support 1'500' 2" ? Is this a realistic usecase?
Can we throw if there is more than two matches of each foot/inch abbreviation?

I'm sure we can get it working, but it just feels very complicated for something perhaps no one will ever benefit from. Maybe we can implement it properly if someone complains about this exception for scenarios like that?
At least we don't give out incorrect values, which is a whole lot worse.

In my feeble metric mind, I speculate these are the typical use cases:

  • Integers: 5' 2"
  • Fractions: 5 1/4"
  • Decimal point for either foot or inch, not both: 1.5' or 1.5", but I think fractions are more common?
  • Thousand separators for either foot or inch, not both: 1'500' or 1'500" (one thousand five hundred feet)

@pgrawehr
Copy link
Contributor

I agree that we do not need the perfect solution here. I think it's ok to throw if something is ambiguous. But we should a) throw in a defined way and b) have all the unit tests pass regardless of the dev's culture settings. The fact that b) is currently not true was the reason I started all this.

@tmilnthorp
Copy link
Collaborator Author

I disagree, we should handle any culture (or grouping separator) gracefully.

We can't even round-trip right now for some cultures. The following throws a FormatException:

var feetInches = new FeetInches( 1234, 5 );

var culture = new CultureInfo( "de-CH" );

var deCHString = feetInches.ToString( culture );
var parsed = Length.Parse( deCHString, culture );

deCHString is: 1’234 ft 5 in

@angularsen
Copy link
Owner

angularsen commented Jun 18, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants