Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client returns integers beyond JavaScript precision #32

Closed
penberg opened this issue May 1, 2023 · 11 comments
Closed

Client returns integers beyond JavaScript precision #32

penberg opened this issue May 1, 2023 · 11 comments
Assignees
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@penberg
Copy link
Contributor

penberg commented May 1, 2023

Numnumberry reports:

it seems the libsql client will return integers beyond what JavaScript can precisely represent
pg's solution to this was to return 64-bit integers as strings https://stackoverflow.com/questions/39168501/pg-promise-returns-integers-as-strings

@penberg penberg added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed labels May 1, 2023
@CodingDoug
Copy link
Contributor

CodingDoug commented May 1, 2023

If I'm understanding the linked SO issue correctly, the rendering as a string is applied only when the type of the column is bigint, so it's predictable when the query returns a string, even if the actual number value is within range a normal JS number. However, SQLite's integer type doesn't have any sort of indicator about the potential size of the integer in its column types (it could be 0 to 8 bytes).

IMO it would be odd to always get a string where a SQLite integer was defined (otherwise I might unknowingly assume to do math on a string). It would still be odd if they were always rendered as bigints if most of my app's math was using numbers. And, if a ResultSet contained mixed types based on the magnitude of the integer value, it would be inconvenient if I had to check each those types on every row before using the value correctly.

It might be a good idea to let the caller say what happens with SQLite integer column values. I think in most cases, the caller is going to be satisfied with a JS number, accepting inaccuracy if the actual value was OOB. If the caller needs to defend against that, they could either 1) configure the client for all queries or 2) configure the individual query to perform a conversion of all integer values in a ResultSet to strings or bigints so the rendered types are uniformly predictable.

So, for example, option 1 could show up like this in the API:

createClient({
    url: string,
    authToken: string | undefined,
    integersAs: "string" | "bigint" | "number" | undefined (default "number")
})

@honzasp
Copy link
Contributor

honzasp commented May 1, 2023

In Hrana (and thus also in HTTP), attempting to read an integer that is outside of the safe range of JavaScript numbers throws an error:
https://github.com/libsql/hrana-client-ts/blob/main/src/value.ts#L55-L62

However, this does not apply to the local client, which uses better-sqlite3. Did numnumberry specify for which client he observed this issue?

@honzasp
Copy link
Contributor

honzasp commented May 1, 2023

The current approach for handling large integers is as follows:

  • When writing to the database, JavaScript number is stored as a SQLite float, and JavaScript bigint is stored as a SQLite text.
  • When reading from the database, a SQLite integer is returned as a JavaScript number, and an error is thrown when the range is exceeded. If the user wants to read larger integers, they need to convert them to strings in SQL.

This behavior is a bit weird. However, I don't like that such core behavior would depend on some configuration value: to understand any piece of code using @libsql/client, I would have to dig up this configuration, and it is also extremely broad (it applies to every value ever returned by the client).

It seems to me that the best solution would be to always return SQLite integers as JavaScript bigints, and store JavaScript bigints as SQLite integers (throwing an error on overflow). The disadvantage of this scheme is that users may be unfamiliar with bigints and may be surprised that they get bigints instead of numbers (i.e., rs = await c.execute("SELECT 1"); x = rs.rows[0][0]; x === 1 would produce false, because 1n !== 1.

(Of course, implementing this behavior would be a breaking change in the client.)

@honzasp honzasp removed the good first issue Good for newcomers label May 1, 2023
@CodingDoug
Copy link
Contributor

My impression is that most JS developers would:

  • Never expect an API to yield a bigint data that wasn't a bigint when they provided it (which will be a vast majority of the time, I predict)
  • Be completely surprised by the difference in behavior between number and bigint at runtime, and not be able to debug what actually happened that ran counter to their expectations
  • Or even know what a bigint is and why it exists

I would prefer to adhere to the principle of least surprise, for a majority of use cases. Anything else, and we are more likely appear to be broken, or we require a ramp-up in knowledge that makes adoption difficult.

@honzasp
Copy link
Contributor

honzasp commented May 2, 2023

The current behavior is that you always get a number from the database, never a bigint. There are two problems, however:

  1. There is no way to pass a SQLite integer to the database: number is a SQLite float, and bigint is converted to SQLite text.
  2. It's not possible to directly read an integer bigger than 2^53, because that would overflow the number type.

But I agree with you, Doug, that bigints might be very confusing for developers; that's why I implemented this bigint-less scheme in the first place :) If we want to avoid bigints, then we can simply do nothing and keep the existing behavior.

@honzasp honzasp self-assigned this May 2, 2023
@CodingDoug
Copy link
Contributor

In #32 (comment) I recommend allowing the developer to choose to change the behavior of the SDK so that they can request useful behavior that would otherwise be unexpected. This would allow someone to opt in to special handling of bigint, understanding how that affects queries on tables with integer types. Do you think that would work?

@honzasp
Copy link
Contributor

honzasp commented May 3, 2023

In general, I don't like when such an important behavior is defined in some configuration option somewhere. When reading and writing code that uses @libsql/client, you would have to be aware of the configuration (or you would risk introducing subtle bugs). It would also affect typing: the type of values possibly returned from the database would include bigint even if the user opts into the number option.

@CodingDoug
Copy link
Contributor

The configuration could be on the client as a while, or specified per-query if that's more clear for the people reading and writing the code that immediately follows.

There is precedence for this sort of thing. Firestore once used to return all Timestamp types as JS Dates, which ended up being problematic some (but not all) of the time because a Date is less precise than a Timestamp, so you effectively lost data with a Date (despite the Date's convenience). To help that, they added a client configuration to allow people to opt into getting Timestamp object instead of Date. Eventually, that setting became the new default, and SDK would not yield Date objects anymore. But the irony of this is that people were then confused about how to turn that Timestamp into a Date, which is what they actually wanted a majority of the time!

The moral of the story is that people generally know what they want by default, and in some cases should be given a chance to change the default they prefer, accepting whatever other problems come with that.

@honzasp
Copy link
Contributor

honzasp commented May 4, 2023

After ruminating about this for a while, I'm starting to like your proposal, Doug :) However, to keep it simple, what about just two models instead of three?

  • "number" is the default behavior that "just works" for people who don't deal with numbers bigger than 2^53: SQLite integers are read as JS numbers (throwing an exception on overflow), and JS numbers are stored as SQLite floats (should be lossless). Bigints are stored as SQLite text.
  • "bigint" is the advanced behavior for people who can work with bigints: SQLite integers are read as JS bigints (always lossless), JS numbers are stored as SQLite floats and JS bigints are stored as SQLite integers (throwing an exception on overflow).

@CodingDoug
Copy link
Contributor

After ruminating about this for a while, I'm starting to like your proposal, Doug :) However, to keep it simple, what about just two models instead of three?

That's fine by me. The original issue pointed out that pg uses strings as the rendering, so I don't know if that adds specific value to some use cases. Probably not though.

@honzasp
Copy link
Contributor

honzasp commented May 30, 2023

With the new major version 0.2, I took the opportunity to include the breaking changes that will be required to fix this issue:

  • JS bigint passed to database is now converted to SQLite integer (potentially throwing an overflow exception), instead of SQLite text
  • The Value type now includes bigint (even though we still never actually return a bigint)

I will implement the optional bigint-returning behavior after 0.2 is stabilized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants