Fix: sort strings in UTF-8 encoded byte order with lazy encoding #8787
base: main
🦋 Changeset detected. Latest commit: 9653bdc. The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
packages/firestore/src/util/misc.ts
Outdated
}
}

// Compare lengths if all bytes are equal
Under what circumstances would all bytes be equal? If they were all equal, then wouldn't the "if" condition on line 85 have evaluated to false?
Yeah, IIUC, since leftCodePoint !== rightCodePoint, comparison on line 101 should be non-zero for at least one iteration of the loop on line 96.
Nope, there are some cases where a trailing surrogate gets falsely involved and is misrepresented in bytes, which can lead to two equal byte arrays for two different code points. That's why the unit tests were failing in the Android SDK.
😮
Strings should be sorted in UTF-8 encoded byte order. Public document: https://cloud.google.com/firestore/docs/concepts/data-types#data_types
The SDK sorts strings using the built-in comparator, which sorts lexicographically by UTF-16 code unit, and this leads to a mismatch between the server and the SDK when special characters are present. This PR fixes the string order mismatches for document fields, map keys, and document keys.
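To make the mismatch concrete, here is a minimal sketch comparing one pair of strings both ways. The helper name compareBytes and the sample strings are assumptions chosen for illustration; they are not taken from the SDK or from this PR.

```typescript
// Hypothetical helper for illustration only; not part of the SDK.
// Compares two byte arrays as unsigned bytes, shorter array first on a tie.
function compareBytes(left: Uint8Array, right: Uint8Array): number {
  const n = Math.min(left.length, right.length);
  for (let i = 0; i < n; i++) {
    if (left[i] !== right[i]) {
      return left[i] < right[i] ? -1 : 1;
    }
  }
  if (left.length === right.length) {
    return 0;
  }
  return left.length < right.length ? -1 : 1;
}

const encoder = new TextEncoder();

// U+FF5E: a single UTF-16 code unit 0xFF5E, UTF-8 bytes EF BD 9E.
const a = "\uFF5E";
// U+1F600: surrogate pair D83D DE00 in UTF-16, UTF-8 bytes F0 9F 98 80.
const b = "\uD83D\uDE00";

// Built-in string comparison works on UTF-16 code units: 0xFF5E > 0xD83D.
const utf16Order = a < b ? -1 : a > b ? 1 : 0;

// UTF-8 byte order compares the encoded bytes: 0xEF < 0xF0.
const utf8Order = compareBytes(encoder.encode(a), encoder.encode(b));

console.log(utf16Order, utf8Order); // the two orders disagree for this pair
```

Any pair consisting of a character in the U+E000 to U+FFFF range and a character outside the Basic Multilingual Plane should disagree in the same way, because surrogate code units sort below U+E000 in UTF-16 while the encoded character sorts above it in UTF-8.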
The previous fix created a performance issue due to expensive UTF-8 encoding and was reverted (#8774, #8778).
compareUtf8Strings is updated to use lazy encoding instead. b/329441702
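As a rough illustration of what lazy encoding can look like, the sketch below walks both strings code point by code point and only UTF-8 encodes the first pair that differs. It is a sketch under stated assumptions, not the code in this PR: the names compareUtf8Lazy and compareByteArrays are hypothetical, and the fallback to code point comparison is one possible way to handle the equal-bytes case raised in the review thread.

```typescript
// Hypothetical sketch of a lazily encoding comparator; not the SDK's code.
function compareByteArrays(left: Uint8Array, right: Uint8Array): number {
  const n = Math.min(left.length, right.length);
  for (let i = 0; i < n; i++) {
    if (left[i] !== right[i]) {
      return left[i] < right[i] ? -1 : 1;
    }
  }
  if (left.length === right.length) {
    return 0;
  }
  return left.length < right.length ? -1 : 1;
}

function compareUtf8Lazy(left: string, right: string): number {
  let i = 0;
  while (i < left.length && i < right.length) {
    const l = left.codePointAt(i)!;
    const r = right.codePointAt(i)!;
    if (l !== r) {
      // Fast path: two ASCII code points need no encoding at all.
      if (l < 0x80 && r < 0x80) {
        return l < r ? -1 : 1;
      }
      // Encode only the two differing code points, not the whole strings.
      const encoder = new TextEncoder();
      const cmp = compareByteArrays(
        encoder.encode(String.fromCodePoint(l)),
        encoder.encode(String.fromCodePoint(r))
      );
      if (cmp !== 0) {
        return cmp;
      }
      // Unpaired surrogates are all encoded as U+FFFD, so two different
      // code points can yield identical bytes. Fall back to the code points.
      return l < r ? -1 : 1;
    }
    // Advance past both halves of a surrogate pair, or one code unit otherwise.
    i += l > 0xffff ? 2 : 1;
  }
  // One string is a prefix of the other (or they are equal): shorter sorts first.
  if (left.length === right.length) {
    return 0;
  }
  return left.length < right.length ? -1 : 1;
}

console.log(compareUtf8Lazy("\uFF5E", "\uD83D\uDE00")); // UTF-8 byte order
console.log(compareUtf8Lazy("\uD800", "\uD801")); // equal bytes, code point fallback
```

The design choice here is that equal prefixes cost only integer comparisons, and the encoder runs at most once per call, on a single code point from each side.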