ReadPostingList skipping over valid keys #4905
Labels: kind/bug, priority/P0, status/accepted
What version of Dgraph are you using?
master
Have you tried reproducing the issue with the latest release?
yes
Steps to reproduce the issue (command/config used to run Dgraph).
In ReadPostingList, there is logic to skip over the parts of a multi-part list.
This logic assumes no other lists sit between a list's main key and its parts. However, I don't think that's true: a flipped bit in the key signals that the key belongs to a multi-part list, but that bit is not the last thing in the key. This means other valid keys fall inside the skipped range, such as count and term index keys.
This bit is not the last in the key because of term indexes. We don't encode the length of the term in the key, so the bit that marks a key as part of a multi-part list cannot be placed at the end. The best long-term solution is to change the key format so that this bit comes last.
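To make the ordering problem concrete, here is a minimal sketch using a toy key layout, `<attr><flag><term>[<startUID>]`. This layout, the flag values, and the names `toyKey`, `skippedWholeKeys`, and `demo` are assumptions made for illustration and are not Dgraph's actual key encoding. It only shows how a flag placed before a variable-length term lets unrelated keys sort between a list's main key and its parts.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
	"sort"
)

// Hypothetical toy layout, not Dgraph's real encoding.
const (
	attr      = "name"
	flagWhole = byte(0x00) // key of a regular (or main) list
	flagSplit = byte(0x01) // key of one part of a multi-part list
)

// toyKey builds <attr><flag><term>, plus an 8-byte start UID for split parts.
// The term length is not encoded, so the flag has to come before the term.
func toyKey(flag byte, term string, startUID uint64) []byte {
	k := append([]byte(attr), flag)
	k = append(k, term...)
	if flag == flagSplit {
		var buf [8]byte
		binary.BigEndian.PutUint64(buf[:], startUID)
		k = append(k, buf[:]...)
	}
	return k
}

// skippedWholeKeys returns the terms of non-split keys that sort after
// `from` and no later than `to`, i.e. the keys a range skip would jump over.
func skippedWholeKeys(keys [][]byte, from, to []byte) []string {
	sorted := append([][]byte(nil), keys...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i], sorted[j]) < 0
	})
	var terms []string
	for _, k := range sorted {
		inRange := bytes.Compare(k, from) > 0 && bytes.Compare(k, to) <= 0
		if inRange && k[len(attr)] == flagWhole {
			terms = append(terms, string(k[len(attr)+1:]))
		}
	}
	return terms
}

// demo skips from the main key of "apple" to past its last possible part.
func demo() []string {
	keys := [][]byte{
		toyKey(flagWhole, "apple", 0),
		toyKey(flagSplit, "apple", 1),
		toyKey(flagSplit, "apple", 100),
		toyKey(flagWhole, "banana", 0),
		toyKey(flagWhole, "cherry", 0),
	}
	from := toyKey(flagWhole, "apple", 0)
	to := toyKey(flagSplit, "apple", math.MaxUint64)
	return skippedWholeKeys(keys, from, to)
}

func main() {
	fmt.Println(demo()) // prints [banana cherry]
}
```

In this toy layout the keys for "banana" and "cherry" sort before the parts of "apple", so a skip from the main key to the last part passes over them.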
The short-term solution would be to remove this code and ensure that multi-part list keys are skipped in the places where the stream framework is called (rollup, backups, etc.).
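The short-term idea could look roughly like the sketch below: instead of seeking past a key range, each caller filters out split-part keys one by one. The toy layout (`<attr><flag><term>...` with a fixed 4-byte attribute) and the names `isSplitPart` and `chooseKeys` are hypothetical stand-ins, not the stream framework's API or Dgraph's key format.

```go
package main

import "fmt"

// Hypothetical toy layout: 4-byte attr, then a flag byte, then the term.
const (
	attrLen   = 4
	flagSplit = byte(0x01) // marks a part of a multi-part list
)

// isSplitPart reports whether key is a part of a multi-part list.
func isSplitPart(key []byte) bool {
	return len(key) > attrLen && key[attrLen] == flagSplit
}

// chooseKeys keeps every key except split parts, checking each key on its
// own so that no unrelated key is dropped along with them.
func chooseKeys(keys [][]byte) [][]byte {
	var kept [][]byte
	for _, k := range keys {
		if isSplitPart(k) {
			continue
		}
		kept = append(kept, k)
	}
	return kept
}

func main() {
	keys := [][]byte{
		[]byte("name\x00apple"),
		[]byte("name\x00banana"),
		[]byte("name\x01apple\x00\x00\x00\x00\x00\x00\x00\x01"),
	}
	for _, k := range chooseKeys(keys) {
		fmt.Printf("%q\n", k)
	}
}
```

Because the check is per key rather than per range, the keys for "apple" and "banana" are both kept and only the split part is dropped.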
Expected behaviour and actual result.
Some keys might not be rolled up or included in backups because they are skipped. All keys should be considered for rollup. This bug does not lead to data loss for rollups, but it could lead to performance issues since the rollup is never done for the skipped keys.