Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(NODE-5648): add Long.fromStringStrict() #675

Merged
merged 21 commits into from
Apr 30, 2024
Merged
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
c0f5350
initial blah
aditi-khare-mongoDB Apr 18, 2024
272e4f1
another tempp commit
aditi-khare-mongoDB Apr 18, 2024
4483ac4
finished validateStringCharacters
aditi-khare-mongoDB Apr 18, 2024
a87dade
feat(NODE-5648): add Long.fromStringStrict method
aditi-khare-mongoDB Apr 18, 2024
76801e8
EOD progress
aditi-khare-mongoDB Apr 18, 2024
a001590
ready for review
aditi-khare-mongoDB Apr 19, 2024
c4818d9
make error message for overflows more accurate
aditi-khare-mongoDB Apr 19, 2024
59f3911
Merge branch 'main' into NODE-5648/long-validateString
aditi-khare-mongoDB Apr 19, 2024
4c7d5b5
add whitespace test case
aditi-khare-mongoDB Apr 19, 2024
8939fe9
fix test naming
aditi-khare-mongoDB Apr 19, 2024
667168e
requested changes
aditi-khare-mongoDB Apr 22, 2024
88fb767
lint fix
aditi-khare-mongoDB Apr 22, 2024
e616e88
requested changes team review
aditi-khare-mongoDB Apr 24, 2024
70c6162
add argument parsing check within fromStringStrict
aditi-khare-mongoDB Apr 24, 2024
1d261c5
removed extraneous functionality from _fromString and moved it to fro…
aditi-khare-mongoDB Apr 24, 2024
60b2272
remove unneeded param from _fromString
aditi-khare-mongoDB Apr 24, 2024
9f308aa
requested changes: alter support for infinity/nan cases and add expli…
aditi-khare-mongoDB Apr 25, 2024
63beeea
optimize removeLeadingZerosAndExplicitPlus function and fix name typo
aditi-khare-mongoDB Apr 25, 2024
d212f7e
requested changes - check str length in removeLeadingZerosAndExplicit…
aditi-khare-mongoDB Apr 29, 2024
c98b513
remove fromString fix
aditi-khare-mongoDB Apr 29, 2024
152db8f
lint fix
aditi-khare-mongoDB Apr 29, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 2 additions & 5 deletions src/int_32.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import { BSON_INT32_MAX, BSON_INT32_MIN } from './constants';
import { BSONError } from './error';
import type { EJSONOptions } from './extended_json';
import { type InspectFn, defaultInspect } from './parser/utils';
import { removeLeadingZerosAndExplicitPlus } from './utils/string_utils';

/** @public */
export interface Int32Extended {
Expand Down Expand Up @@ -48,11 +49,7 @@ export class Int32 extends BSONValue {
* @param value - the string we want to represent as an int32.
*/
static fromString(value: string): Int32 {
const cleanedValue = !/[^0]+/.test(value)
? value.replace(/^0+/, '0') // all zeros case
: value[0] === '-'
? value.replace(/^-0+/, '-') // negative number with leading zeros
: value.replace(/^\+?0+/, ''); // positive number with leading zeros
const cleanedValue = removeLeadingZerosAndExplicitPlus(value);

const coercedValue = Number(value);

Expand Down
147 changes: 136 additions & 11 deletions src/long.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import { BSONError } from './error';
import type { EJSONOptions } from './extended_json';
import { type InspectFn, defaultInspect } from './parser/utils';
import type { Timestamp } from './timestamp';
import * as StringUtils from './utils/string_utils';

interface LongWASMHelpers {
/** Gets the high bits of the last operation performed */
Expand Down Expand Up @@ -246,29 +247,24 @@ export class Long extends BSONValue {
}

/**
* @internal
* Returns a Long representation of the given string, written using the specified radix.
* Throws an error if `throwsError` is set to true and any of the following conditions are true:
* - the string contains invalid characters for the given radix
* - the string contains whitespace
* @param str - The textual representation of the Long
* @param unsigned - Whether unsigned or not, defaults to signed
* @param radix - The radix in which the text is written (2-36), defaults to 10
* @returns The corresponding Long value
*/
static fromString(str: string, unsigned?: boolean, radix?: number): Long {
private static _fromString(str: string, unsigned: boolean, radix: number): Long {
if (str.length === 0) throw new BSONError('empty string');
if (str === 'NaN' || str === 'Infinity' || str === '+Infinity' || str === '-Infinity')
return Long.ZERO;
if (typeof unsigned === 'number') {
// For goog.math.long compatibility
(radix = unsigned), (unsigned = false);
} else {
unsigned = !!unsigned;
}
radix = radix || 10;
if (radix < 2 || 36 < radix) throw new BSONError('radix');

let p;
if ((p = str.indexOf('-')) > 0) throw new BSONError('interior hyphen');
else if (p === 0) {
return Long.fromString(str.substring(1), unsigned, radix).neg();
return Long._fromString(str.substring(1), unsigned, radix).neg();
}

// Do several (8) digits each time through the loop, so as to
Expand All @@ -291,6 +287,135 @@ export class Long extends BSONValue {
return result;
}

/**
* Returns a signed Long representation of the given string, written using radix 10.
* Will throw an error if the given text is not exactly representable as a Long.
* Throws an error if any of the following conditions are true:
* - the string contains invalid characters for the radix 10
* - the string contains whitespace
* - the value the string represents is too large or too small to be a Long
* Unlike Long.fromString, this method does not coerce '+/-Infinity' and 'NaN' to Long.Zero
* @param str - The textual representation of the Long
* @returns The corresponding Long value
*/
static fromStringStrict(str: string): Long;
/**
* Returns a Long representation of the given string, written using the radix 10.
* Will throw an error if the given parameters are not exactly representable as a Long.
* Throws an error if any of the following conditions are true:
* - the string contains invalid characters for the given radix
* - the string contains whitespace
* - the value the string represents is too large or too small to be a Long
* Unlike Long.fromString, this method does not coerce '+/-Infinity' and 'NaN' to Long.Zero
* @param str - The textual representation of the Long
* @param unsigned - Whether unsigned or not, defaults to signed
* @returns The corresponding Long value
*/
static fromStringStrict(str: string, unsigned?: boolean): Long;
/**
* Returns a signed Long representation of the given string, written using the specified radix.
* Will throw an error if the given parameters are not exactly representable as a Long.
* Throws an error if any of the following conditions are true:
* - the string contains invalid characters for the given radix
* - the string contains whitespace
* - the value the string represents is too large or too small to be a Long
* Unlike Long.fromString, this method does not coerce '+/-Infinity' and 'NaN' to Long.Zero
* @param str - The textual representation of the Long
* @param radix - The radix in which the text is written (2-36), defaults to 10
* @returns The corresponding Long value
*/
static fromStringStrict(str: string, radix?: boolean): Long;
/**
* Returns a Long representation of the given string, written using the specified radix.
* Will throw an error if the given parameters are not exactly representable as a Long.
* Throws an error if any of the following conditions are true:
* - the string contains invalid characters for the given radix
* - the string contains whitespace
* - the value the string represents is too large or too small to be a Long
* Unlike Long.fromString, this method does not coerce '+/-Infinity' and 'NaN' to Long.Zero
* @param str - The textual representation of the Long
* @param unsigned - Whether unsigned or not, defaults to signed
* @param radix - The radix in which the text is written (2-36), defaults to 10
* @returns The corresponding Long value
*/
static fromStringStrict(str: string, unsigned?: boolean, radix?: number): Long;
static fromStringStrict(str: string, unsignedOrRadix?: boolean | number, radix?: number): Long {
let unsigned = false;
if (typeof unsignedOrRadix === 'number') {
// For goog.math.long compatibility
(radix = unsignedOrRadix), (unsignedOrRadix = false);
} else {
unsigned = !!unsignedOrRadix;
}
radix ??= 10;

if (str.trim() !== str) {
throw new BSONError(`Input: '${str}' contains leading and/or trailing whitespace`);
}
if (!StringUtils.validateStringCharacters(str, radix)) {
throw new BSONError(`Input: '${str}' contains invalid characters for radix: ${radix}`);
}

// remove leading zeros (for later string comparison and to make math faster)
const cleanedStr = StringUtils.removeLeadingZerosAndExplicitPlus(str);

// check roundtrip result
const result = Long._fromString(cleanedStr, unsigned, radix);
if (result.toString(radix).toLowerCase() !== cleanedStr.toLowerCase()) {
nbbeeken marked this conversation as resolved.
Show resolved Hide resolved
throw new BSONError(
`Input: ${str} is not representable as ${result.unsigned ? 'an unsigned' : 'a signed'} 64-bit Long ${radix != null ? `with radix: ${radix}` : ''}`
);
nbbeeken marked this conversation as resolved.
Show resolved Hide resolved
}
return result;
}

/**
* Returns a signed Long representation of the given string, written using radix 10.
* @param str - The textual representation of the Long
* @returns The corresponding Long value
*/
static fromString(str: string): Long;
/**
* Returns a signed Long representation of the given string, written using radix 10.
* @param str - The textual representation of the Long
* @param radix - The radix in which the text is written (2-36), defaults to 10
* @returns The corresponding Long value
*/
static fromString(str: string, radix?: number): Long;
/**
* Returns a Long representation of the given string, written using radix 10.
* @param str - The textual representation of the Long
* @param unsigned - Whether unsigned or not, defaults to signed
* @returns The corresponding Long value
*/
static fromString(str: string, unsigned?: boolean): Long;
/**
* Returns a Long representation of the given string, written using the specified radix.
* @param str - The textual representation of the Long
* @param unsigned - Whether unsigned or not, defaults to signed
* @param radix - The radix in which the text is written (2-36), defaults to 10
* @returns The corresponding Long value
*/
static fromString(str: string, unsigned?: boolean, radix?: number): Long;
static fromString(str: string, unsignedOrRadix?: boolean | number, radix?: number): Long {
let unsigned = false;
if (typeof unsignedOrRadix === 'number') {
// For goog.math.long compatibility
(radix = unsignedOrRadix), (unsignedOrRadix = false);
} else {
unsigned = !!unsignedOrRadix;
}
radix ??= 10;
if (str === 'NaN' && radix < 24) {
// radix does not support n, so coerce to zero
return Long.ZERO;
} else if ((str === 'Infinity' || str === '+Infinity' || str === '-Infinity') && radix < 35) {
// radix does not support y, so coerce to zero
return Long.ZERO;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we checking the radix here? This changes the behavior of Long.fromString - something we want to avoid in this PR. The logic was previously:

if (str === 'NaN' || str === 'Infinity' || str === '+Infinity' || str === '-Infinity')
      return Long.ZERO;

Copy link
Contributor Author

@aditi-khare-mongoDB aditi-khare-mongoDB Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does change the behavior of Long.fromString, but its another bug fix, (see updated release notes).

We previously did not allow valid inputs to radices, because we interpreted them as 'not a number,' but in radix >= 24, 'nan' is a valid number (since 'n' and 'a' are in the character set for numbers).

Example:
Long.fromString('NaN', 27); // new Long(17060)

The original Long.fromString would return Long.ZERO on '+/- Infinity' and 'NaN' cases, and then check for radix later. This behavior is still preserved, other than in the case that the user is not actually inputting 'not a number' or an infinite value, but a valid non-infinite Long written in a high radix.

return Long._fromString(str, unsigned, radix);
}

/**
* Creates a Long from its byte representation.
* @param bytes - Byte representation
Expand Down
45 changes: 45 additions & 0 deletions src/utils/string_utils.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
/**
* @internal
* Removes leading zeros and explicit plus from textual representation of a number.
*/
export function removeLeadingZerosAndExplicitPlus(str: string): string {
if (str === '') {
return str;
}

let startIndex = 0;

const isNegative = str[startIndex] === '-';
const isExplicitlyPositive = str[startIndex] === '+';

if (isExplicitlyPositive || isNegative) {
startIndex += 1;
}

let foundInsignificantZero = false;

while (str[startIndex] === '0') {
baileympearson marked this conversation as resolved.
Show resolved Hide resolved
foundInsignificantZero = true;
startIndex += 1;
}

if (!foundInsignificantZero) {
return isExplicitlyPositive ? str.slice(1) : str;
}

return `${isNegative ? '-' : ''}${str.length === startIndex ? '0' : str.slice(startIndex)}`;
}

/**
* @internal
* Returns false for an string that contains invalid characters for its radix, else returns the original string.
* @param str - The textual representation of the Long
* @param radix - The radix in which the text is written (2-36), defaults to 10
*/
export function validateStringCharacters(str: string, radix?: number): false | string {
radix = radix ?? 10;
const validCharacters = '0123456789abcdefghijklmnopqrstuvwxyz'.slice(0, radix);
// regex is case insensitive and checks that each character within the string is one of the validCharacters
const regex = new RegExp(`[^-+${validCharacters}]`, 'i');
return regex.test(str) ? false : str;
}
4 changes: 3 additions & 1 deletion test/node/int_32_tests.js
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,9 @@ describe('Int32', function () {
['a string with zero with leading zeros', '000000', 0],
['a string with positive leading zeros', '000000867', 867],
['a string with explicity positive leading zeros', '+000000867', 867],
['a string with negative leading zeros', '-00007', -7]
['a string with negative leading zeros', '-00007', -7],
['a string with explicit positive zeros', '+000000', 0],
['a string explicit positive no leading zeros', '+32', 32]
];
const errorInputs = [
['Int32.max + 1', '2147483648', 'larger than the maximum value for Int32'],
Expand Down
120 changes: 120 additions & 0 deletions test/node/long.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -163,4 +163,124 @@ describe('Long', function () {
});
});
});

describe('static fromString()', function () {
const successInputs: [
name: string,
input: string,
unsigned: boolean | undefined,
radix: number | undefined,
expectedStr?: string
][] = [
['radix 36 Infinity', 'Infinity', false, 36],
['radix 36 -Infinity', '-Infinity', false, 36],
['radix 36 +Infinity', '+Infinity', false, 36, 'infinity'],
['radix < 35 Infinity', 'Infinity', false, 34, '0'],
['radix < 35 -Infinity', '-Infinity', false, 23, '0'],
['radix < 35 +Infinity', '+Infinity', false, 12, '0'],
['radix < 24 NaN', 'NaN', false, 16, '0'],
['radix > 24 NaN', 'NaN', false, 25]
];

for (const [testName, str, unsigned, radix, expectedStr] of successInputs) {
context(`when the input is ${testName}`, () => {
it(`should return a Long representation of the input`, () => {
expect(Long.fromString(str, unsigned, radix).toString(radix)).to.equal(
expectedStr ?? str.toLowerCase()
);
});
});
}
});

describe('static fromStringStrict()', function () {
const successInputs: [
name: string,
input: string,
unsigned: boolean | undefined,
radix: number | undefined,
expectedStr?: string
][] = [
['basic no alphabet low radix', '1236', true, 8],
['negative basic no alphabet low radix', '-1236', false, 8],
['valid upper and lower case letters in string with radix > 10', 'eEe', true, 15],
['hexadecimal letters', '126073efbcdADEF', true, 16],
['negative hexadecimal letters', '-1267efbcdDEF', false, 16],
['negative leading zeros', '-00000032', false, 15, '-32'],
['leading zeros', '00000032', false, 15, '32'],
['explicit positive leading zeros', '+00000032', false, 15, '32'],
nbbeeken marked this conversation as resolved.
Show resolved Hide resolved
['max unsigned binary input', Long.MAX_UNSIGNED_VALUE.toString(2), true, 2],
['max unsigned decimal input', Long.MAX_UNSIGNED_VALUE.toString(10), true, 10],
['max unsigned hex input', Long.MAX_UNSIGNED_VALUE.toString(16), true, 16],
['max signed binary input', Long.MAX_VALUE.toString(2), false, 2],
['max signed decimal input', Long.MAX_VALUE.toString(10), false, 10],
['max signed hex input', Long.MAX_VALUE.toString(16), false, 16],
['min signed binary input', Long.MIN_VALUE.toString(2), false, 2],
['min signed decimal input', Long.MIN_VALUE.toString(10), false, 10],
['min signed hex input', Long.MIN_VALUE.toString(16), false, 16],
['signed zeros', '+000000', false, 10, '0'],
['unsigned zero', '0', true, 10],
['explicit positive no leading zeros', '+32', true, 10, '32'],
// the following inputs are valid radix 36 inputs, but will not represent NaN or +/- Infinity
['radix 36 Infinity', 'Infinity', false, 36],
['radix 36 -Infinity', '-Infinity', false, 36],
['radix 36 +Infinity', '+Infinity', false, 36, 'infinity'],
['radix 36 NaN', 'NaN', false, 36],
['overload no unsigned and no radix parameter', '-32', undefined, undefined],
['overload no unsigned parameter', '-32', undefined, 12],
['overload no radix parameter', '32', true, undefined]
];

const failureInputs: [
name: string,
input: string,
unsigned: boolean | undefined,
radix: number | undefined
][] = [
['empty string', '', true, 2],
['invalid numbers in binary string', '234', true, 2],
['non a-z or numeric string', '~~', true, 36],
['alphabet in radix < 10', 'a', true, 9],
['radix does not allow all alphabet letters', 'eee', false, 14],
['over max unsigned binary input', Long.MAX_UNSIGNED_VALUE.toString(2) + '1', true, 2],
['over max unsigned decimal input', Long.MAX_UNSIGNED_VALUE.toString(10) + '1', true, 10],
['over max unsigned hex input', Long.MAX_UNSIGNED_VALUE.toString(16) + '1', true, 16],
['over max signed binary input', Long.MAX_VALUE.toString(2) + '1', false, 2],
['over max signed decimal input', Long.MAX_VALUE.toString(10) + '1', false, 10],
['over max signed hex input', Long.MAX_VALUE.toString(16) + '1', false, 16],
['under min signed binary input', Long.MIN_VALUE.toString(2) + '1', false, 2],
['under min signed decimal input', Long.MIN_VALUE.toString(10) + '1', false, 10],
['under min signed hex input', Long.MIN_VALUE.toString(16) + '1', false, 16],
['string with whitespace', ' 3503a ', false, 11],
['negative zero unsigned', '-0', true, 9],
['negative zero signed', '-0', false, 13],
['radix 1', '12', false, 1],
['negative radix', '12', false, -4],
['radix over 36', '12', false, 37],
// the following inputs are invalid radix 16 inputs
// this is because of the characters, not because of the values they commonly represent
['radix 10 Infinity', 'Infinity', false, 10],
['radix 10 -Infinity', '-Infinity', false, 10],
['radix 10 +Infinity', '+Infinity', false, 10],
['radix 10 NaN', 'NaN', false, 10],
['overload no radix parameter and invalid sign', '-32', true, undefined]
];

for (const [testName, str, unsigned, radix, expectedStr] of successInputs) {
context(`when the input is ${testName}`, () => {
it(`should return a Long representation of the input`, () => {
expect(Long.fromStringStrict(str, unsigned, radix).toString(radix)).to.equal(
nbbeeken marked this conversation as resolved.
Show resolved Hide resolved
expectedStr ?? str.toLowerCase()
);
});
});
}
for (const [testName, str, unsigned, radix] of failureInputs) {
context(`when the input is ${testName}`, () => {
it(`should throw BSONError`, () => {
expect(() => Long.fromStringStrict(str, unsigned, radix)).to.throw(BSONError);
});
});
}
});
});
1 change: 1 addition & 0 deletions test/node/release.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ const REQUIRED_FILES = [
'src/utils/byte_utils.ts',
'src/utils/node_byte_utils.ts',
'src/utils/number_utils.ts',
'src/utils/string_utils.ts',
'src/utils/web_byte_utils.ts',
'src/utils/latin.ts',
'src/validate_utf8.ts',
Expand Down
Loading