Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(c/driver/postgresql): fix numeric to str #1502

Merged
merged 1 commit into from
Feb 1, 2024
Merged

Conversation

lupko
Copy link
Contributor

@lupko lupko commented Jan 31, 2024

  • conversion from numeric to string had two bugs due to deviations from the original PostgreSQL code
  • leading single zero before decimal would always be dropped
  • in some cases, the numbers after decimal would not be incomplete and instead replaced with 'garbage'

here is the PostgreSQL code: https://github.com/postgres/postgres/blob/9589b038d3203cd5ba708fb4f5c23182c88ad0b3/src/backend/utils/adt/numeric.c#L7443

(the DEC_DIGITS=4 variant)

Fixes #1501.

@lupko lupko requested a review from lidavidm as a code owner January 31, 2024 20:09
@github-actions github-actions bot added this to the ADBC Libraries 0.10.0 milestone Jan 31, 2024
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Is it possible to add a test case for this?

There's an example here:

TEST(PostgresCopyUtilsTest, PostgresCopyReadNumeric) {

Though, I'm not sure how the reference data was generated; it looks to be by running a query and saving the output/translating it to a byte array:

// For full coverage, ensure that this contains NUMERIC examples that:
// - Have >= four zeroes to the left of the decimal point
// - Have >= four zeroes to the right of the decimal point
// - Include special values (nan, -inf, inf, NULL)
// - Have >= four trailing zeroes to the right of the decimal point
// - Have >= four leading zeroes before the first digit to the right of the decimal point
// - Is < 0 (negative)
// COPY (SELECT CAST(col AS NUMERIC) AS col FROM ( VALUES (1000000), ('0.00001234'),
// ('1.0000'), (-123.456), (123.456), ('nan'), ('-inf'), ('inf'), (NULL)) AS drvd(col)) TO
// STDOUT WITH (FORMAT binary);
static const uint8_t kTestPgCopyNumeric[] = {
0x50, 0x47, 0x43, 0x4f, 0x50, 0x59, 0x0a, 0xff, 0x0d, 0x0a, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x01, 0x00,
0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x64, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0a, 0x00,
0x01, 0xff, 0xfe, 0x00, 0x00, 0x00, 0x08, 0x04, 0xd2, 0x00, 0x01, 0x00, 0x00, 0x00,
0x0a, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x01, 0x00, 0x01, 0x00,
0x00, 0x00, 0x0c, 0x00, 0x02, 0x00, 0x00, 0x40, 0x00, 0x00, 0x03, 0x00, 0x7b, 0x11,
0xd0, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
0x03, 0x00, 0x7b, 0x11, 0xd0, 0x00, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00,
0x00, 0xc0, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00,
0x00, 0xf0, 0x00, 0x00, 0x20, 0x00, 0x01, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00,
0x00, 0xd0, 0x00, 0x00, 0x20, 0x00, 0x01, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

@WillAyd
Copy link
Contributor

WillAyd commented Jan 31, 2024

Though, I'm not sure how the reference data was generated; it looks to be by running a query and saving the output/translating it to a byte array:

What I typically do is COPY out to a file then check the file in a hex editor. That way nothing with the terminal or keyring ends mixing up the bytes e.g.

postgres=# COPY (SELECT CAST(col AS NUMERIC) AS col FROM (  VALUES (1000000), ('0.00001234'), ('1.0000'), (-123.456), (123.456), ('nan'), ('-inf'), ('inf'), (NULL)) AS drvd(col)) TO '/tmp/pgdata.out' WITH (FORMAT binary);

Will write the bytes out to /tmp/pgdata.out

@lupko
Copy link
Contributor Author

lupko commented Jan 31, 2024

Added extra test against NUMERIC(16,10). Tried with the fix reverted - test was failing. Both the missing single zero before decimal was missing & there was corruption on the tail end.

the test data does not contain +- infinity - as per PostgreSQL doc, these can be only stored in unconstrained NUMERIC.

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. I'll merge after CI unless Will has comments

- conversion from numeric to string had two bugs due to deviations from the original PostgreSQL code
- leading zero would always be dropped
- in some cases, the numbers after decimal would not be incomplete and instead replaced with 'garbage'
- extra test with NUMERIC(16,10)
@lupko
Copy link
Contributor Author

lupko commented Jan 31, 2024

eih, sorry CI has failed because I did not have env setup with pre-commit so some code failed the format check. corrected.

Copy link
Contributor

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - thanks for the investigation and fix!

@lidavidm lidavidm merged commit 1b04cdf into apache:main Feb 1, 2024
54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

python/postgresql: NUMERIC values corrupted in fetched results
3 participants