strict mode for parsing numbers #6

russaa · 2018-09-05T19:28:23Z

Added a strict mode for parsing numbers (see issue #4).

The basic idea is the same as parsing for "numbers only" in jednano/parse-css-font#11:
manipulate the input string so that parseFloat().toString() returns the exact same string.

There are some special cases where this does not work, namely e-notation: here the code only makes sure that the exponent is an integer and does not compare the complete evaluated number against the input string. Other cases need more elaborate manipulation of the input string (zeros at the beginning or the end) before a valid number can be matched against the parseFloat() result.

This approach favors handling the standard cases efficiently and applies more involved processing only to the special cases.

It does not yet support numbers with leading zeros, such as 00056.
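
To illustrate the round-trip idea described above, here is a minimal sketch. The helper name `roundTrips` is hypothetical and is not taken from the PR; it only shows which inputs pass a bare parseFloat().toString() comparison and which ones need normalization or separate handling first.

```typescript
// Hypothetical helper (not the PR's code): accept a string only if
// parseFloat() followed by toString() reproduces it exactly.
function roundTrips(input: string): boolean {
	return parseFloat(input).toString() === input;
}

// Inputs that already survive the round trip.
console.log(roundTrips('1.5'));   // true
console.log(roundTrips('-0.25')); // true

// Trailing garbage is rejected, which is the point of the check.
console.log(roundTrips('12abc')); // false ("12" !== "12abc")

// Valid numbers that fail without normalization or special handling.
console.log(roundTrips('1.50'));  // false (trailing zero)
console.log(roundTrips('.5'));    // false (missing leading zero)
console.log(roundTrips('+5'));    // false (explicit plus sign)
console.log(roundTrips('1e3'));   // false (e-notation evaluates to "1000")
console.log(roundTrips('00056')); // false (leading zeros)
```

The last group is what the normalization step and the e-notation check in this PR are meant to cover.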

I would like some general feedback before continuing to work on this ;-)

codecov · 2018-09-05T19:31:45Z

Codecov Report

Merging #6 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master     #6   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files           1      1           
  Lines          57     91   +34     
  Branches       11     25   +14     
=====================================
+ Hits           57     91   +34

Impacted Files	Coverage Δ
src/index.ts	`100% <100%> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 77735aa...3321734.

jednano · 2018-09-06T03:26:54Z

src/index.test.ts

+	'23e+0.07',
+];
+
+test('throws in mode when a number with invalid e-notation is provided', (t) => {


Did you mean throws "in strict mode"?

jednano · 2018-09-06T03:33:05Z

src/index.ts

 		if (/%$/.test(value)) {
 			this.type = 'percentage';
-			this.value = tryParseFloat(value);
+			this.value = tryParseNumber(value.substring(0, value.length - 1), strict);


What's going on here with the substring?

It removes the percent sign so that only the number itself is parsed. In non-strict mode the removal could be omitted.
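
A small sketch of why the percent sign has to go in strict mode. The helper name `stripPercent` is an assumption made for illustration; the PR does the substring call inline.

```typescript
// Hypothetical helper: drop a trailing percent sign, if present.
function stripPercent(value: string): string {
	return /%$/.test(value) ? value.substring(0, value.length - 1) : value;
}

// Lenient parsing ignores the trailing "%", so it works either way.
console.log(parseFloat('50%')); // 50

// A strict string comparison does not: "50" !== "50%".
console.log(parseFloat('50%').toString() === '50%'); // false

// After stripping, the comparison can succeed.
const numeric = stripPercent('50%');
console.log(parseFloat(numeric).toString() === numeric); // true
```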

jednano · 2018-09-06T04:25:57Z

src/index.ts

@@ -8,37 +8,46 @@ const cssResolutionUnits: string[] = require('css-resolution-units');
 const cssFrequencyUnits: string[] = require('css-frequency-units');
 const cssTimeUnits: string[] = require('css-time-units');

+const numberPrefixPattern = /^(\+|-)?(\.)?\d/;


Why is this so far away from the implementation? Also, I can remove the trailing \d and all the tests still pass.

Yes, a test needs to be added for this:
without the trailing \d you could enter "NaN%" and it would be falsely validated as a number.
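
The following sketch shows why the trailing `\d` matters. `numberPrefixPattern` is copied from the diff; `patternWithoutDigit` and `survivesRoundTrip` are names invented here for illustration.

```typescript
// Pattern from the diff: optional sign, optional dot, then a digit.
const numberPrefixPattern = /^(\+|-)?(\.)?\d/;
// Same pattern with the trailing \d removed (hypothetical variant).
const patternWithoutDigit = /^(\+|-)?(\.)?/;

// Hypothetical stand-in for the parseFloat().toString() comparison.
function survivesRoundTrip(input: string): boolean {
	return parseFloat(input).toString() === input;
}

// parseFloat('NaN') is NaN, and NaN.toString() is "NaN",
// so the round trip alone accepts these strings.
console.log(survivesRoundTrip('NaN'));      // true
console.log(survivesRoundTrip('Infinity')); // true

// Without \d every part of the pattern is optional, so it matches anything.
console.log(patternWithoutDigit.test('NaN')); // true

// With \d the prefix check rejects them before the round trip runs.
console.log(numberPrefixPattern.test('NaN'));      // false
console.log(numberPrefixPattern.test('Infinity')); // false
console.log(numberPrefixPattern.test('-.5'));      // true
```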

jednano · 2018-09-06T04:26:39Z

src/index.ts

+		dots = countDots(value);
+	}
+	if (dots > 0) {
+			if (!allowDot) {


Some weird indentation in this block.

jednano · 2018-09-06T04:49:27Z

src/index.ts

@@ -82,6 +95,69 @@ function tryParseFloat(value: string) {
 	return result;
 }

+function normalizeNumber(value: string, allowDot: boolean = true): string {


I was able to crunch this function down to the following without breaking tests:

function normalizeNumber(value: string, allowDot = true) {
	const match = numberPrefixPattern.exec(value);
	if (!match) {
		throw new Error(`Invalid number: ${value}`);
	}
	const [, sign, dot] = match;
	if (sign === '+') {
		value = value.substr(1);
	}
	if (dot) {
		if (!allowDot) {
			throw new Error(`Invalid number (too many dots): ${value}`);
		}
		if (sign === '-') {
			value = '-0' + value.substr(1);
		} else {
			value = '0' + value;
		}
	}
	return (dot || countDots(value)) ? value.replace(/\.?0+$/, '') : value;
}

Yes. The previous version focused on computing as little as possible, e.g. running countDots() only if necessary and applying replace(/\.?0+$/, '') only to input where it would have an effect, but the gain from that is probably negligible.

Yeah and it still has to do the check before doing the replace, so it probably ends up being about the same performance there.
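
A short sketch of why the dot check has to come before the replace. The function name `stripTrailingZeros` is an assumption for illustration; the regular expression is the one quoted above.

```typescript
// Regular expression from the discussion: an optional dot plus trailing zeros.
const trailingZeros = /\.?0+$/;

// Hypothetical helper: strip only when the value has a fractional part.
function stripTrailingZeros(value: string): string {
	return value.indexOf('.') !== -1 ? value.replace(trailingZeros, '') : value;
}

console.log(stripTrailingZeros('1.500')); // "1.5"
console.log(stripTrailingZeros('1.0'));   // "1"
console.log(stripTrailingZeros('100.0')); // "100"
console.log(stripTrailingZeros('100'));   // "100" (untouched, no dot)

// Without the guard, integers lose significant zeros.
console.log('100'.replace(trailingZeros, '')); // "1"
```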

jednano · 2018-09-06T04:51:30Z

src/index.ts

+function tryParseStrict(value: string): number {
+	const nval = normalizeNumber(value);
+	const result = parseFloat(nval);
+	if (result.toString() !== nval) {


Simplify:

if (result.toString() !== nval && !verifyZero(value) && !verifyENotation(value)) {
	throw new Error(`Invalid number: ${value}`);
}
return result;

jednano · 2018-09-06T04:51:40Z

src/index.ts

+	return value;
+}
+
+function tryParseStrict(value: string): number {


Please remove the return type here.

jednano · 2018-09-06T04:54:00Z

src/index.ts

+}
+
+function verifyZero(value: string) {
+	return /^[-+]?0\.0+$/.test(value);


The following works w/o breaking tests:

return parseFloat(value) === 0;

I guess so, but parseFloat() always ignores trailing non-numeric parts of the string. I have not thought too hard about it in this instance, but there may be a case where it fails here, i.e. falsely accepts invalid input as a number.

Yes, this is the discussion I was trying to spark here. Those other cases might be worth testing. You could also consider Math.abs(value) === 0.
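
The three zero checks discussed here behave differently on edge cases, which a few candidate inputs make visible. This is only a comparison sketch: the names `zeroPattern`, `viaRegex`, `viaParseFloat`, and `viaMathAbs` are invented here, and `Number(value)` stands in for the string-to-number coercion that `Math.abs(value)` performs in plain JavaScript (TypeScript does not accept a string argument there).

```typescript
// Pattern from the diff.
const zeroPattern = /^[-+]?0\.0+$/;

function viaRegex(value: string): boolean {
	return zeroPattern.test(value);
}

function viaParseFloat(value: string): boolean {
	return parseFloat(value) === 0;
}

function viaMathAbs(value: string): boolean {
	// Number() rejects trailing garbage (NaN) but turns '' into 0.
	return Math.abs(Number(value)) === 0;
}

for (const candidate of ['0.0', '-0.00', '0', '0abc', '0.0px', '']) {
	console.log(
		JSON.stringify(candidate),
		viaRegex(candidate),
		viaParseFloat(candidate),
		viaMathAbs(candidate),
	);
}
// "0.0"   true  true  true
// "-0.00" true  true  true
// "0"     false true  true
// "0abc"  false true  false
// "0.0px" false true  false
// ""      false false true
```

So parseFloat() alone accepts input with trailing garbage, while the coercing variant accepts the empty string. Whether either matters depends on what the prefix check has already ruled out before this point.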

jednano · 2018-09-06T15:20:13Z

src/index.ts

+	const nval = normalizeNumber(value);
+	const result = parseFloat(nval);
+	if (result.toString() !== nval) {
+		if (verifyZero(value) || verifyENotation(value)) {


Honestly, I think if we just inlined Math.abs(value) === 0 here it would be pretty clear.

do you mean Math.abs(parseFloat(value)) === 0?

No I don't mean that. Did you have issues w/o the parseFloat? Because I wasn't having issues in my console.

russaa · 2018-09-06T16:25:09Z

I made the changes where I did not have any questions and pushed them to the branch/PR.

Also:
I thought about a different approach for parsing the number: not using parseFloat().toString(), but instead using a regular expression and parsing according to the number-token grammar rule in the CSS spec.

I'll open another PR for this alternative approach and maybe you can say which one is more appropriate here.
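
For comparison, a regular-expression check along the lines of the CSS number grammar might look like the sketch below. It is not the code from the follow-up PR; `cssNumberPattern` and `parseCssNumberStrict` are hypothetical names, and the pattern is written from the CSS Syntax number-token rule (optional sign, digits with an optional fractional part or a bare fraction, optional integer exponent).

```typescript
// Assumed pattern: [+-]? (digits | digits? "." digits) ([eE] [+-]? digits)?
const cssNumberPattern = /^[+-]?(?:\d+|\d*\.\d+)(?:[eE][+-]?\d+)?$/;

// Hypothetical strict parser: validate the whole string, then convert.
function parseCssNumberStrict(value: string): number {
	if (!cssNumberPattern.test(value)) {
		throw new Error(`Invalid number: ${value}`);
	}
	return parseFloat(value);
}

console.log(parseCssNumberStrict('00056')); // 56
console.log(parseCssNumberStrict('+.5'));   // 0.5
console.log(parseCssNumberStrict('1e3'));   // 1000

for (const invalid of ['23e+0.07', '1.', 'NaN', '12abc']) {
	try {
		parseCssNumberStrict(invalid);
	} catch (error) {
		console.log((error as Error).message); // Invalid number: <input>
	}
}
```

With this shape, leading zeros, explicit signs, and e-notation need no normalization step, because the string is validated as a whole before parseFloat() is called.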

strict mode for parsing numbers

87d15c6

jednano requested changes Sep 6, 2018


russaa added 7 commits September 6, 2018 16:05

added test for invalid 'NaN' input

604707f

simplified normalizeNumber function as suggested by @jedmao

9464034

simplified tryParseStrict function as suggested by @jedmao

707532f

removed return-type declaration from functions

8cffa15

refactor: return NULL in normalizeNumber() instead of throwing error

1a379c6

extended test cases

8f6cdab

added support for numbers with leading zeros in strict mode

3321734

jednano reviewed Sep 6, 2018


russaa mentioned this pull request Sep 6, 2018

strict mode for parsing numbers via regular expressions #7

Open
