BREAKING: fix json marshal unmarshal for namespace > 127 #7810
Looks good. One thing to consider is whether we can represent namespaces in hex instead of decimal, to be consistent. There is no need for 0x in front; just the hex digits should be sufficient.
Reviewed 6 of 6 files at r1.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @pawanrawal and @vvbalaji-dgraph)
We used to store a predicate as <namespace>|<attribute> (the pipe | signifies concatenation), held in a single string. <namespace> is an 8-byte uint64, and marshaling those bytes to JSON messes up the predicate. This is because, for a namespace greater than 127, the UTF-8 encoding might take up several bytes (and if no mapping exists, the byte is replaced with some other rune). This affects three identified places in Dgraph:
- Live loader
- Backup and List Backup
- HTTP clients and Ratel
Fix: use a UTF-8 string when dealing with JSON. A better idea is to use the UTF-8 string even for internal operations, and to convert it into the byte format only when we read from or write to Badger.
New format: <namespace>-<attribute> (- is the hyphen literal)
(cherry picked from commit 90d77f3)
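The corruption described above can be reproduced with a minimal sketch. The helper names `oldPredicate` and `roundTrip` are hypothetical, and the big-endian byte order is an assumption made for illustration; neither is taken from the Dgraph source. The sketch only shows that raw namespace bytes above 127 do not survive a JSON round trip through Go's `encoding/json`.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// oldPredicate builds the pre-change form: 8 raw namespace bytes followed by
// the attribute. Hypothetical helper; big-endian order is an assumption.
func oldPredicate(ns uint64, attr string) string {
	buf := make([]byte, 8, 8+len(attr))
	binary.BigEndian.PutUint64(buf, ns)
	return string(append(buf, attr...))
}

// roundTrip marshals s to JSON and unmarshals it back into a string.
func roundTrip(s string) string {
	data, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	var out string
	if err := json.Unmarshal(data, &out); err != nil {
		panic(err)
	}
	return out
}

func main() {
	for _, ns := range []uint64{1, 127, 128, 255} {
		p := oldPredicate(ns, "name")
		// Bytes >= 0x80 that are not valid UTF-8 are replaced with U+FFFD
		// by json.Marshal, so the predicate no longer matches the original.
		fmt.Printf("ns=%d survives JSON round trip: %v\n", ns, roundTrip(p) == p)
	}
}
```

In this sketch, namespaces 1 and 127 round-trip unchanged, while 128 and 255 come back with the replacement rune in place of the original byte.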
This PR adds the `dgraph updatemanifest` tool. It can be used to migrate the manifest of backups taken on release/v21.03 or their parallels to the state after change #7810.
With the #7810 change, we changed the format of the predicate, but we missed updating the schema and predicate. This PR fixes that.
We used to store a predicate as <namespace>|<attribute> (the pipe | signifies concatenation), held in a single string. <namespace> is an 8-byte uint64, and marshaling those bytes to JSON messes up the predicate. This is because, for a namespace greater than 127, the UTF-8 encoding might take up several bytes (and if no mapping exists, the byte is replaced with some other rune). This affects three identified places in Dgraph:
- Live loader
- Backup and List Backup
- HTTP clients and Ratel
Fix: use a UTF-8 string when dealing with JSON. A better idea is to use the UTF-8 string even for internal operations, and to convert it into the byte format only when we read from or write to Badger.
New format: <namespace>-<attribute> (- is the hyphen literal)

fix(restore): update the schema and type from 2103 (#7838)
With the #7810 change, we changed the format of the predicate, but we missed updating the schema and predicate. This PR fixes that.

fix(state): fix hex to uint64 response of list of namespaces (#8091)
There is an issue in ExtractNamespaceFromPredicate: the parsing assumed that ns in <ns>-<attr> is decimal, when it is actually hexadecimal. This leads to the following issues:
- The predicate a-name was skipped.
- The predicate 11-name was parsed as namespace 11, when it is actually namespace 17 (0x11).

fix(backup): handle manifest version logic, update manifest version to 2105 (#7825)
The backward compatibility of the backup's manifest was broken by #7810, although a tool was added (#7815) that enables smooth migration of the manifest. This PR makes backup backward compatible by updating the manifest (in memory) after reading it.

fix(updatemanifest): update the version of manifest after update (#7828)
We were not updating the manifest version after the update. This PR fixes that.
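The decimal-versus-hexadecimal mix-up reported for ExtractNamespaceFromPredicate can be illustrated with a short sketch. `parseNamespace` is a hypothetical stand-in written for this example, not the actual Dgraph function; it takes the numeric base as a parameter so that both interpretations can be compared side by side.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNamespace extracts the namespace from a "<ns>-<attr>" predicate,
// interpreting <ns> in the given base. Hypothetical helper for illustration.
func parseNamespace(pred string, base int) (uint64, error) {
	i := strings.IndexByte(pred, '-')
	if i < 0 {
		return 0, fmt.Errorf("predicate %q has no namespace separator", pred)
	}
	return strconv.ParseUint(pred[:i], base, 64)
}

func main() {
	for _, pred := range []string{"a-name", "11-name"} {
		dec, decErr := parseNamespace(pred, 10) // the faulty assumption
		hex, hexErr := parseNamespace(pred, 16) // the intended interpretation
		fmt.Printf("%s: decimal=(%d, err=%v) hex=(%d, err=%v)\n",
			pred, dec, decErr != nil, hex, hexErr != nil)
	}
}
```

With base 10, `a-name` fails to parse at all and `11-name` yields 11; with base 16, the same predicates yield 10 and 17 respectively, which matches the two symptoms listed in the commit message.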
…t json marshal issues (#8601)
We used to store a predicate as <namespace>|<attribute> (the pipe | signifies concatenation), held in a single string. <namespace> is an 8-byte uint64, and marshaling those bytes to JSON messes up the predicate. This is because, for a namespace greater than 127, the UTF-8 encoding might take up several bytes (and if no mapping exists, the byte is replaced with some other rune). This affects the following identified places in Dgraph:
- Live loader using guardian of galaxy
- Backup and List Backup
- HTTP clients and Ratel
- Schema and predicate
Fix: use a UTF-8 string when dealing with JSON. A better idea is to use the UTF-8 string even for internal operations, and to convert it into the byte format only when we read from or write to Badger.
New format: <namespace>-<attribute> (- is the hyphen literal), where <namespace> is a hex string such as "81".
We also update the manifest version after the update. This diff ensures that older backups remain compatible and can still be used to restore.
Contains: #7838 #7828 #7825 #7815 #7810
We used to store a predicate as <namespace>|<attribute> (the pipe | signifies concatenation), held in a single string. <namespace> is an 8-byte uint64, and marshaling those bytes to JSON messes up the predicate. This is because, for a namespace greater than 127, the UTF-8 encoding might take up several bytes (and if no mapping exists, the byte is replaced with some other rune). This affects three identified places in Dgraph:
- Live loader
- Backup and List Backup
- HTTP clients and Ratel
Fix: use a UTF-8 string when dealing with JSON. A better idea is to use the UTF-8 string even for internal operations, and to convert it into the byte format only when we read from or write to Badger.
New format: <namespace>-<attribute> (- is the hyphen literal), where <namespace> is a hex string such as "81".
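As a rough sketch of the approach described above, the code below keeps the predicate as a plain UTF-8 string of the form <namespace>-<attribute> and converts to a byte form only at the storage boundary. The function names (`formatPredicate`, `toStorageBytes`, `fromStorageBytes`), the unpadded hex encoding, and the big-endian 8-byte layout are all assumptions made for illustration; the actual Dgraph implementation may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strconv"
	"strings"
)

// formatPredicate renders the JSON-safe string form "<hex namespace>-<attr>".
// Unpadded lowercase hex is an assumption of this sketch.
func formatPredicate(ns uint64, attr string) string {
	return strconv.FormatUint(ns, 16) + "-" + attr
}

// toStorageBytes converts the string form into an assumed storage layout:
// 8 big-endian namespace bytes followed by the attribute bytes.
func toStorageBytes(pred string) ([]byte, error) {
	i := strings.IndexByte(pred, '-')
	if i < 0 {
		return nil, fmt.Errorf("predicate %q has no namespace separator", pred)
	}
	ns, err := strconv.ParseUint(pred[:i], 16, 64)
	if err != nil {
		return nil, err
	}
	buf := make([]byte, 8, 8+len(pred)-i-1)
	binary.BigEndian.PutUint64(buf, ns)
	return append(buf, pred[i+1:]...), nil
}

// fromStorageBytes is the inverse of toStorageBytes.
func fromStorageBytes(b []byte) (string, error) {
	if len(b) < 8 {
		return "", fmt.Errorf("storage key too short: %d bytes", len(b))
	}
	return formatPredicate(binary.BigEndian.Uint64(b[:8]), string(b[8:])), nil
}

func main() {
	pred := formatPredicate(129, "name")
	fmt.Println(pred) // 81-name

	raw, err := toStorageBytes(pred)
	if err != nil {
		panic(err)
	}
	back, err := fromStorageBytes(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(back == pred)
}
```

Because the hex namespace never contains a hyphen, splitting on the first hyphen stays unambiguous even when the attribute itself contains one.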