-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial commit for multi-part lists #3105
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, a lot more complex than it needs to be. Have a look at the comments.
Reviewable status: 0 of 9 files reviewed, 16 unresolved discussions (waiting on @martinmr)
posting/list.go, line 55 at r1 (raw file):
ErrStopIteration = errors.New("Stop iteration") emptyPosting = &pb.Posting{} maxListLength = 2000000
Let's do it based on size of the list instead of the number of UIDs.
posting/list.go, line 81 at r1 (raw file):
pendingTxns int32 // Using atomic for this, to avoid locking in SetForDeletion operation. deleteMe int32 // Using atomic for this, to avoid expensive SetForDeletion operation. next *List // If a multi-part list, this is a pointer to the next list.
I think this should only contain the UIDs to identify the keys for the next parts, like a list of key UIDs or something, so they can be used to generate the keys.
If the list is empty, then we know it is one.
posting/list.go, line 85 at r1 (raw file):
} func appendNextStartToKey(key []byte, nextPartStart uint64) []byte {
Don't need two funcs here. Just create a new key instead of trying to micro-optimize key creation.
posting/list.go, line 95 at r1 (raw file):
} func replaceNextStartInKey(key []byte, nextPartStart uint64) []byte {
Same as above. Consolidate into one func, to create a multi-part key, using the base key and the uint64.
posting/list.go, line 104 at r1 (raw file):
} func (l *List) getNextPartKey() []byte {
This should lie within the iterator, and not be part of the list.
Iterator can copy the uid slice from list, if needed. So, it can know which key to retrieve next.
The UID slice can also be used to do a quick binary search for afterUid
option in iterate.
posting/list.go, line 130 at r1 (raw file):
appendToKey := currPl.FirstPart || !currPl.MultiPart if appendToKey { return appendNextStartToKey(currKey, nextPartStart)
Yeah, too much arithmetic for micro-optimization.
posting/list.go, line 135 at r1 (raw file):
} func (l *List) updateMinTs(readTs, minTs uint64) error {
Not sure what this func is for.
posting/list.go, line 711 at r1 (raw file):
} func (l *List) partIterate(readTs uint64, f func(obj *pb.Posting) error) error {
Try not to create a copy of the iterate logic. It is easy to get wrong.
Instead, maybe do normal iteration, and do a checksum or something to figure out if that particular part has changed or not, and if needs to be re-written.
posting/list.go, line 904 at r1 (raw file):
enc := codec.Encoder{BlockSize: blockSize} err := curr.partIterate(readTs, func(p *pb.Posting) error {
This shouldn't need to be changed. You can do normal iteration, but you know where the splits exist -- so all you need is to figure out if you need to write that part back or not.
Also, once the final list is created, you should do a judgement about whether it should be binary split or not.
posting/list.go, line 1232 at r1 (raw file):
txn := pstore.NewTransactionAt(readTs, false) opts := badger.DefaultIteratorOptions
No need to do iteration here. All the iteration should be done when the list is first created, and multi-part keys are picked up.
This should just be a Txn.Get
.
posting/list.go, line 1246 at r1 (raw file):
return err } l.next = nextListPart
list should not be modified by iteration. Multiple iterators can be going on simultaneously on one list structure.
Replace all this linked-list logic with just a bunch of uint64s in the list, if multi-part. And then use those uint64s to generate keys and iterate in the iterator.
posting/list.go, line 1252 at r1 (raw file):
func (l *List) needsSplit(readTs uint64) bool { length := l.partialLength(readTs)
do it based on size. You can get proto size.
posting/list.go, line 1292 at r1 (raw file):
} func (l *List) splitListPart(readTs uint64) []*List {
To split a protobuf list, you don't need all this logic. Just find the midpoint, and break that out into two protobufs.
- Load whole list at once. - Eliminate linked list. - Store all plists inside the List object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 10 files reviewed, 22 unresolved discussions (waiting on @golangcibot and @manishrjain)
posting/list.go, line 55 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Let's do it based on size of the list instead of the number of UIDs.
Done.
posting/list.go, line 81 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
I think this should only contain the UIDs to identify the keys for the next parts, like a list of key UIDs or something, so they can be used to generate the keys.
If the list is empty, then we know it is one.
Done. I did something similar but instead it's a list of posting lists. If the list is empty, then the
posting/list.go, line 85 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Don't need two funcs here. Just create a new key instead of trying to micro-optimize key creation.
Done. Function has been removed.
posting/list.go, line 95 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Same as above. Consolidate into one func, to create a multi-part key, using the base key and the uint64.
Done. Function has been removed and replaced by a simpler function.
posting/list.go, line 104 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This should lie within the iterator, and not be part of the list.
Iterator can copy the uid slice from list, if needed. So, it can know which key to retrieve next.
The UID slice can also be used to do a quick binary search for
afterUid
option in iterate.
Replaced this function by a smaller function that is not part of any interface.
posting/list.go, line 130 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Yeah, too much arithmetic for micro-optimization.
Done.
posting/list.go, line 135 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Not sure what this func is for.
Done. Removed the method. It was used to update minTs in all of the linked list objects but the design has changed so this is not needed anymore.
posting/list.go, line 162 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: field
firstPart
is unused (fromunused
)
Done.
posting/list.go, line 178 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of
l.loadNextPart
is not checked (fromerrcheck
)
Done.
posting/list.go, line 711 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Try not to create a copy of the iterate logic. It is easy to get wrong.
Instead, maybe do normal iteration, and do a checksum or something to figure out if that particular part has changed or not, and if needs to be re-written.
Done. I managed to merge the two functions so the logic is not duplicate anymore.
posting/list.go, line 904 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
This shouldn't need to be changed. You can do normal iteration, but you know where the splits exist -- so all you need is to figure out if you need to write that part back or not.
Also, once the final list is created, you should do a judgement about whether it should be binary split or not.
Done. I simplified the logic a bit. The logic goes through each part (or the entire list if not a multi part list) and applies the changes to each list. I moved the split logic to happen at the end.
posting/list.go, line 956 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
Error return value of
l.updateMinTs
is not checked (fromerrcheck
)
Done.
posting/list.go, line 1232 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
No need to do iteration here. All the iteration should be done when the list is first created, and multi-part keys are picked up.
This should just be a
Txn.Get
.
Done. All parts of a list are read when the list is created.
posting/list.go, line 1246 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
list should not be modified by iteration. Multiple iterators can be going on simultaneously on one list structure.
Replace all this linked-list logic with just a bunch of uint64s in the list, if multi-part. And then use those uint64s to generate keys and iterate in the iterator.
Done. All parts of a list are created when the list is created.
posting/list.go, line 1252 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
do it based on size. You can get proto size.
Done.
posting/list.go, line 1292 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
To split a protobuf list, you don't need all this logic. Just find the midpoint, and break that out into two protobufs.
pb.PostingList is not just a list. It has the encoded and packed list of uids as well as the list of postings. I don't think it can be done as simply as you said.
posting/list.go, line 93 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: func
appendNextStartToKey
is unused (fromunused
)
Done.
posting/list.go, line 103 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: func
replaceNextStartInKey
is unused (fromunused
)
Done.
posting/list.go, line 112 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: func
(*List).getNextPartKey
is unused (fromunused
)
Done.
posting/list.go, line 135 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: func
generateNextPartKey
is unused (fromunused
)
Done.
posting/list.go, line 162 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: field
partIndex
is unused (fromunused
)
Done.
posting/list.go, line 566 at r3 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
U1000: func
(*List).pickPartPostings
is unused (fromunused
)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 13 files reviewed, 22 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)
posting/list.go, line 484 at r12 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
This is the only place where the postings are trimmed. The only place were deleteBelowTs is used is in the iterator but it's to indicate that the immutable layer should be ignored.
Are we making sure that we do create a posting list at the highest commit timestamp so far, which could be from deleteBelowTs?
posting/list.go, line 857 at r12 (raw file):
// readTs and calculate the maxCommitTs. deleteBelow, mposts := l.pickPostings(readTs) maxCommitTs = x.Max(maxCommitTs, deleteBelow)
I'd move the comment I was mentioning earlier here.
Add test to verify this invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 13 files reviewed, 23 unresolved discussions (waiting on @golangcibot and @manishrjain)
dgraph/cmd/debug/run.go, line 509 at r16 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not
goimports
-ed (fromgoimports
)
Done.
posting/list.go, line 484 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Are we making sure that we do create a posting list at the highest commit timestamp so far, which could be from deleteBelowTs?
Rollup is doing the following
maxCommitTs = x.Max(maxCommitTs, deleteBelow)
so I think this case is covered.
posting/list.go, line 857 at r12 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
I'd move the comment I was mentioning earlier here.
Done. I copied it since it seemed relevant in both places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, @martinmr . Also I'm glad that now that you have authored a big change in posting/list.go, you really understand this core piece of tech well.
Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)
posting/list.go, line 893 at r17 (raw file):
// way to get the max commit timestamp is to pick all the relevant postings for the given // readTs and calculate the maxCommitTs. // If deleteBelowTs is greater than zero, there was a delete all marker. The list of postings
100 chars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still running some benchmarks so I'll hold on merging this for a bit.
I am seeing some differences though, the version with split lists finishes in about an hour (with 3 million mutations and fulltext indexes) while master starts crashing my machine after 10-15 min. This seems to be a memory issue so I am changing the index to term to see if that let's the benchmark finish. However, the fact that the benchmark can reliably finish with my changes without crapping out is encouraging.
Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot and @manishrjain)
posting/list.go, line 893 at r17 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
100 chars
Done.
I ran the benchmark with just a term index but I didn't observe any substantial difference. I didn't observe any issues with my changes and all the queries I tried worked fine. I am going to write a different benchmark that just creates large posting lists (without split and with split) and compares the time to create iterate over this lists. I am merging this PR for now since all tests and manual checks are passing. |
Congrats! |
Large posting lists are now being split during rollup if they become too large. A normal list has the following format: <key> -> <posting list with all the data for this list> A multi-part list is stored using multiple keys. The keys for the parts will be generated by appending the first uid in the part to the key of the main part. The main part of a multi-part list will have the following format: <key> -> <posting with list of each part's start uid> The data for this list will be stored in key-value pairs with the format below: <key, 1> -> <first part of the list with all the data for this part> <key, next start uid> -> <second part of the list with the data for this part> ... <key, last start uid> -> <last part of the list with all its data> The first part of a multi-part list always has start uid 1 and will be the last part to be deleted, at which point the entire list will be marked for deletion. As the list grows, existing parts might be split if they become too big.
This change is