Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch SBJsonStreamParser for integral values. #171

Closed
wants to merge 1 commit into from

Conversation

adonoho
Copy link
Contributor

@adonoho adonoho commented Jun 20, 2013

SBJsonStreamParser parses an integer into an NSDecimalNumber. With some
large values, such as Twitter tweet ids, this results in a rounding
error. This pull request makes a one line change to the parser to parse
into a longLongNumber instead of a decimal number. It also modifies the
sample app to use the Apple social media APIs and accounts. The problem
can be demonstrated by uncommenting the original line in the
SBJsonStreamParser file. Then the app will start logging tweets that
have incorrect values for tweet ids.

SBJsonStreamParser parses an integer into an NSDecimalNumber. With some
large values, such as Twitter tweet ids, this results in a rounding
error. This pull request makes a one line change to the parser to parse
into a longLongNumber instead of a decimal number. It also modifies the
sample app to use the Apple social media APIs and accounts. The problem
can be demonstrated by uncommenting the original line in the
SBJsonStreamParser file. Then the app will start logging tweets that
have incorrect values for tweet ids.
@stig
Copy link
Collaborator

stig commented Jun 21, 2013

Hi Andrew,

I'm a little confused about your statements above. Here's Apple's description of NSDecimalNumber:

NSDecimalNumber, an immutable subclass of NSNumber, provides an object-oriented wrapper for doing base-10 arithmetic. An instance can represent any number that can be expressed as mantissa x 10^exponent where mantissa is a decimal integer up to 38 digits long, and exponent is an integer from –128 through 127.

AFAIK you cannot rely on long long to be more than 64 bits, which means ~20 integer digits. And indeed, your patch breaks three tests that tests the limits of integer lengths, including the test for Snowflake (Twitter's ID scheme) compatibility:

Could it be that the way you are reading from the NSDecimalNumbers is causing the rounding you're seeing?

@adonoho
Copy link
Contributor Author

adonoho commented Jun 21, 2013

Stig,

My patch fixes a real, demonstrable problem with interpreting twitter ids as an NSDecimalNumber. It includes modifications to your example code that shows the problem with live data. If you revert the fix to SBJsonStreamParser, the problem is revealed in the log statements.

You may have a better way to fix it than my one line patch. Regardless, I've surfaced a problem in your current scheme. The NSDecimalNumber is introducing a rounding error to some twitter ids. At minimum, the test coverage is incomplete. While NSJSONSerialization is far from the gold standard in JSON parsers, they do return an NSNumber for this parameter. You may wish to revisit your implementation decision.

Andrew

@stig
Copy link
Collaborator

stig commented Jun 21, 2013

You're right.

There is definitively something fishy going on here. It's weird; it almost looks like a bug in NSDecimalNumber itself because it stringifies correctly:

(lldb) po dict[@"id"]
348188955398766593

but we get an off-by-one for longLong (and unsignedLongLong) values:

(lldb) p (long long)[dict[@"id"] longLongValue]
(long long) $9 = 348188955398766592

(lldb) p (unsigned long long)[dict[@"id"] unsignedLongLongValue]
(unsigned long long) $11 = 348188955398766592

In contrast, for the id_str, we get:

(lldb) p (long long) [dict[@"id_str"] longLongValue]
(long long) $18 = 348188955398766593

(lldb) p (long double)[dict[@"id_str"] doubleValue]
(long double) $20 = 348188955398766592

What. Is. Going. On?

Double-precision IEEE floating point, is what. It looks like NSDecimalNumber's conversion to long long goes via double, which has only 17 decimal digits of precision. 348188955398766593 is 18 digits long, and so we get this rounding error.

So yes; you were right. My tests were inadequate. NSDecimalNumber has been a sticking point for a while (see issue #128). This might provide motivation to finally remove it.

Let me mull this over over the weekend.

stig added a commit that referenced this pull request Jun 22, 2013
…per integer type

Integers are now always parsed into `long long` or `unsigned long long` if they fit.
@stig
Copy link
Collaborator

stig commented Jun 22, 2013

Manually merged.

@stig stig closed this Jun 22, 2013
@adonoho
Copy link
Contributor Author

adonoho commented Jun 22, 2013

Stig,

As the TwitterStream app is now officially broken by Twitter's move to the new API, I suggest you adopt my patches to that app or equivalent.

Andrew

@stig
Copy link
Collaborator

stig commented Jun 22, 2013

I already did earlier today. Thanks!

Stig

Sent from my iPhone

On 22 Jun 2013, at 17:03, "Andrew W. Donoho" notifications@github.com wrote:

Stig,

As the TwitterStream app is now officially broken by Twitter's move to the new API, I suggest you adopt my patches to that app or equivalent.

Andrew


Reply to this email directly or view it on GitHub.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants