Initial commit for multi-part lists #3105

martinmr · 2019-03-06T19:16:26Z

posting/list.go

manishrjain

In general, a lot more complex than it needs to be. Have a look at the comments.

Reviewable status: 0 of 9 files reviewed, 16 unresolved discussions (waiting on @martinmr)

posting/list.go, line 55 at r1 (raw file):

	ErrStopIteration = errors.New("Stop iteration")
	emptyPosting     = &pb.Posting{}
	maxListLength    = 2000000

Let's do it based on size of the list instead of the number of UIDs.

posting/list.go, line 81 at r1 (raw file):

	pendingTxns int32 // Using atomic for this, to avoid locking in SetForDeletion operation.
	deleteMe    int32 // Using atomic for this, to avoid expensive SetForDeletion operation.
	next        *List // If a multi-part list, this is a pointer to the next list.

I think this should only contain the UIDs to identify the keys for the next parts, like a list of key UIDs or something, so they can be used to generate the keys.

If the list is empty, then we know it is one.

posting/list.go, line 85 at r1 (raw file):

}

func appendNextStartToKey(key []byte, nextPartStart uint64) []byte {

Don't need two funcs here. Just create a new key instead of trying to micro-optimize key creation.

posting/list.go, line 95 at r1 (raw file):

}

func replaceNextStartInKey(key []byte, nextPartStart uint64) []byte {

Same as above. Consolidate into one func, to create a multi-part key, using the base key and the uint64.
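A minimal sketch of what a single consolidated key function could look like. The name `splitKey` and the 8-byte big-endian suffix are assumptions for illustration, not the encoding the PR ended up using.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitKey is a hypothetical helper. It returns a new key made of the base
// key followed by the part's start UID as 8 big-endian bytes. It always
// allocates a fresh slice, so the base key is never modified.
func splitKey(baseKey []byte, startUid uint64) []byte {
	key := make([]byte, len(baseKey)+8)
	copy(key, baseKey)
	binary.BigEndian.PutUint64(key[len(baseKey):], startUid)
	return key
}

func main() {
	base := []byte("name")
	fmt.Printf("%x\n", splitKey(base, 1))
	fmt.Printf("%x\n", splitKey(base, 4096))
}
```

Big-endian encoding is chosen here so that part keys sort in the same order as their start UIDs when compared byte by byte.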

posting/list.go, line 104 at r1 (raw file):

}

func (l *List) getNextPartKey() []byte {

This should lie within the iterator, and not be part of the list.

Iterator can copy the uid slice from list, if needed. So, it can know which key to retrieve next.

The UID slice can also be used to do a quick binary search for afterUid option in iterate.
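The suggestion above can be sketched roughly as follows. `partIterator` and `newPartIterator` are hypothetical names; the sketch only assumes that the list exposes the start UID of each part in ascending order.

```go
package main

import (
	"fmt"
	"sort"
)

// partIterator is a hypothetical iterator. It keeps its own copy of the split
// points, so several iterators can walk one list without modifying it.
type partIterator struct {
	splits []uint64 // start UID of each part, ascending
	idx    int      // index of the part to read next
}

// newPartIterator copies splits and positions the iterator on the part that
// can contain the first UID greater than afterUid.
func newPartIterator(splits []uint64, afterUid uint64) *partIterator {
	it := &partIterator{splits: append([]uint64(nil), splits...)}
	// Find the first part whose start UID is strictly greater than afterUid.
	i := sort.Search(len(it.splits), func(i int) bool {
		return it.splits[i] > afterUid
	})
	// The part just before it is the one whose range covers afterUid.
	if i > 0 {
		it.idx = i - 1
	}
	return it
}

func main() {
	splits := []uint64{1, 100, 200}
	for _, after := range []uint64{0, 99, 100, 250} {
		fmt.Println(after, newPartIterator(splits, after).idx)
	}
}
```

If afterUid happens to be the last UID of its part, the iterator starts on a part with nothing left to return and simply moves on to the next one.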

posting/list.go, line 130 at r1 (raw file):

	appendToKey := currPl.FirstPart || !currPl.MultiPart
	if appendToKey {
		return appendNextStartToKey(currKey, nextPartStart)

Yeah, too much arithmetic for micro-optimization.

posting/list.go, line 135 at r1 (raw file):

}

func (l *List) updateMinTs(readTs, minTs uint64) error {

Not sure what this func is for.

posting/list.go, line 711 at r1 (raw file):

}

func (l *List) partIterate(readTs uint64, f func(obj *pb.Posting) error) error {

Try not to create a copy of the iterate logic. It is easy to get wrong.

Instead, maybe do normal iteration, and do a checksum or something to figure out if that particular part has changed or not, and if needs to be re-written.
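One way the "checksum or something" idea could look, as a sketch only. `checksum` and `partChanged` are hypothetical helpers, and FNV-1a is an arbitrary choice of hash; nothing in the thread says which one, if any, was used.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// checksum hashes the UIDs of one part with 64-bit FNV-1a.
func checksum(uids []uint64) uint64 {
	h := fnv.New64a()
	var buf [8]byte
	for _, uid := range uids {
		binary.BigEndian.PutUint64(buf[:], uid)
		h.Write(buf[:]) // Write on a hash.Hash never returns an error.
	}
	return h.Sum64()
}

// partChanged reports whether a part's contents differ after iteration,
// judged only by comparing checksums.
func partChanged(before, after []uint64) bool {
	return checksum(before) != checksum(after)
}

func main() {
	old := []uint64{1, 2, 3}
	fmt.Println(partChanged(old, []uint64{1, 2, 3}))
	fmt.Println(partChanged(old, []uint64{1, 2, 4}))
}
```

A checksum comparison can in principle miss a change when two different parts hash to the same value, so a real implementation would need to decide whether that risk is acceptable.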

posting/list.go, line 904 at r1 (raw file):

		enc := codec.Encoder{BlockSize: blockSize}
		err := curr.partIterate(readTs, func(p *pb.Posting) error {

This shouldn't need to be changed. You can do normal iteration, but you know where the splits exist -- so all you need is to figure out if you need to write that part back or not.

Also, once the final list is created, you should do a judgement about whether it should be binary split or not.

posting/list.go, line 1232 at r1 (raw file):

	txn := pstore.NewTransactionAt(readTs, false)
	opts := badger.DefaultIteratorOptions

No need to do iteration here. All the iteration should be done when the list is first created, and multi-part keys are picked up.

This should just be a Txn.Get.

posting/list.go, line 1246 at r1 (raw file):

		return err
	}
	l.next = nextListPart

list should not be modified by iteration. Multiple iterators can be going on simultaneously on one list structure.

Replace all this linked-list logic with just a bunch of uint64s in the list, if multi-part. And then use those uint64s to generate keys and iterate in the iterator.

posting/list.go, line 1252 at r1 (raw file):

func (l *List) needsSplit(readTs uint64) bool {
	length := l.partialLength(readTs)

do it based on size. You can get proto size.
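A small sketch of a size-based check. The `sizer` interface, `fakeList`, and `exceedsMaxSize` are stand-ins invented for illustration; the only assumption is that the list can report its serialized size in bytes through a `Size()` method, as generated protobuf code often does.

```go
package main

import "fmt"

// sizer is an assumed interface: anything that can report its serialized
// size in bytes.
type sizer interface {
	Size() int
}

// fakeList is a stand-in for a posting list. It pretends each UID costs
// 8 bytes once serialized.
type fakeList struct {
	uids []uint64
}

func (f fakeList) Size() int { return 8 * len(f.uids) }

// exceedsMaxSize reports whether the list has reached the byte threshold,
// regardless of how many UIDs it holds.
func exceedsMaxSize(l sizer, maxSize int) bool {
	return l.Size() >= maxSize
}

func main() {
	l := fakeList{uids: make([]uint64, 1000)}
	fmt.Println(exceedsMaxSize(l, 4096))
	fmt.Println(exceedsMaxSize(l, 1<<20))
}
```

Checking bytes rather than UID count also accounts for lists whose postings carry large values.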

posting/list.go, line 1292 at r1 (raw file):

}

func (l *List) splitListPart(readTs uint64) []*List {

To split a protobuf list, you don't need all this logic. Just find the midpoint, and break that out into two protobufs.

posting/list.go

martinmr

Reviewable status: 0 of 10 files reviewed, 22 unresolved discussions (waiting on @golangcibot and @manishrjain)

posting/list.go, line 55 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Let's do it based on size of the list instead of the number of UIDs.

Done.

posting/list.go, line 81 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I think this should only contain the UIDs to identify the keys for the next parts, like a list of key UIDs or something, so they can be used to generate the keys.

If the list is empty, then we know it is one.

Done. I did something similar but instead it's a list of posting lists. If the list is empty, then the

posting/list.go, line 85 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Don't need two funcs here. Just create a new key instead of trying to micro-optimize key creation.

Done. Function has been removed.

posting/list.go, line 95 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Same as above. Consolidate into one func, to create a multi-part key, using the base key and the uint64.

Done. Function has been removed and replaced by a simpler function.

posting/list.go, line 104 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This should lie within the iterator, and not be part of the list.

Iterator can copy the uid slice from list, if needed. So, it can know which key to retrieve next.

The UID slice can also be used to do a quick binary search for afterUid option in iterate.

Replaced this function by a smaller function that is not part of any interface.

posting/list.go, line 130 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Yeah, too much arithmetic for micro-optimization.

Done.

posting/list.go, line 135 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Not sure what this func is for.

Done. Removed the method. It was used to update minTs in all of the linked list objects but the design has changed so this is not needed anymore.

posting/list.go, line 162 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: field firstPart is unused (from unused)

Done.

posting/list.go, line 178 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

Error return value of l.loadNextPart is not checked (from errcheck)

Done.

posting/list.go, line 711 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Try not to create a copy of the iterate logic. It is easy to get wrong.

Instead, maybe do normal iteration, and do a checksum or something to figure out if that particular part has changed or not, and if needs to be re-written.

Done. I managed to merge the two functions so the logic is not duplicated anymore.

posting/list.go, line 904 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This shouldn't need to be changed. You can do normal iteration, but you know where the splits exist -- so all you need is to figure out if you need to write that part back or not.

Also, once the final list is created, you should do a judgement about whether it should be binary split or not.

Done. I simplified the logic a bit. The logic goes through each part (or the entire list if not a multi part list) and applies the changes to each list. I moved the split logic to happen at the end.

posting/list.go, line 956 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

Error return value of l.updateMinTs is not checked (from errcheck)

Done.

posting/list.go, line 1232 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

No need to do iteration here. All the iteration should be done when the list is first created, and multi-part keys are picked up.

This should just be a Txn.Get.

Done. All parts of a list are read when the list is created.

posting/list.go, line 1246 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

list should not be modified by iteration. Multiple iterators can be going on simultaneously on one list structure.

Replace all this linked-list logic with just a bunch of uint64s in the list, if multi-part. And then use those uint64s to generate keys and iterate in the iterator.

Done. All parts of a list are created when the list is created.

posting/list.go, line 1252 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

do it based on size. You can get proto size.

Done.

posting/list.go, line 1292 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

To split a protobuf list, you don't need all this logic. Just find the midpoint, and break that out into two protobufs.

pb.PostingList is not just a list. It has the encoded and packed list of uids as well as the list of postings. I don't think it can be done as simply as you said.
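To illustrate the point, here is a sketch of a midpoint split that keeps UIDs and postings together. The `part` and `posting` types are simplified stand-ins, not the real `pb` types: the UIDs are assumed to be already decoded into a sorted slice, whereas the real list stores them packed and would need to be decoded and re-encoded around a step like this.

```go
package main

import (
	"fmt"
	"sort"
)

// posting is a simplified stand-in for a posting attached to a UID.
type posting struct {
	Uid   uint64
	Value string
}

// part is a simplified stand-in for one posting list part.
type part struct {
	Uids     []uint64  // sorted, already decoded
	Postings []posting // sorted by Uid; only some UIDs carry a posting
}

// splitAtMidpoint divides p into two parts at the middle UID. Postings are
// partitioned at the same boundary so each posting stays with its UID.
func splitAtMidpoint(p part) (low, high part) {
	mid := len(p.Uids) / 2
	if mid == 0 {
		return p, part{} // nothing to split
	}
	boundary := p.Uids[mid]
	cut := sort.Search(len(p.Postings), func(i int) bool {
		return p.Postings[i].Uid >= boundary
	})
	low = part{Uids: p.Uids[:mid], Postings: p.Postings[:cut]}
	high = part{Uids: p.Uids[mid:], Postings: p.Postings[cut:]}
	return low, high
}

func main() {
	p := part{
		Uids:     []uint64{1, 5, 9, 12},
		Postings: []posting{{5, "a"}, {12, "b"}},
	}
	low, high := splitAtMidpoint(p)
	fmt.Println(low.Uids, low.Postings)
	fmt.Println(high.Uids, high.Postings)
}
```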

posting/list.go, line 93 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func appendNextStartToKey is unused (from unused)

Done.

posting/list.go, line 103 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func replaceNextStartInKey is unused (from unused)

Done.

posting/list.go, line 112 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func (*List).getNextPartKey is unused (from unused)

Done.

posting/list.go, line 135 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func generateNextPartKey is unused (from unused)

Done.

posting/list.go, line 162 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: field partIndex is unused (from unused)

Done.

posting/list.go, line 566 at r3 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func (*List).pickPartPostings is unused (from unused)

Done.

posting/list.go

manishrjain

Reviewable status: 8 of 13 files reviewed, 22 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)

posting/list.go, line 484 at r12 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

This is the only place where the postings are trimmed. The only place where deleteBelowTs is used is in the iterator but it's to indicate that the immutable layer should be ignored.

Are we making sure that we do create a posting list at the highest commit timestamp so far, which could be from deleteBelowTs?

posting/list.go, line 857 at r12 (raw file):

		// readTs and calculate the maxCommitTs.
		deleteBelow, mposts := l.pickPostings(readTs)
		maxCommitTs = x.Max(maxCommitTs, deleteBelow)

I'd move the comment I was mentioning earlier here.

dgraph/cmd/debug/run.go

martinmr

Reviewable status: 6 of 13 files reviewed, 23 unresolved discussions (waiting on @golangcibot and @manishrjain)

dgraph/cmd/debug/run.go, line 509 at r16 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

File is not goimports-ed (from goimports)

Done.

posting/list.go, line 484 at r12 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Are we making sure that we do create a posting list at the highest commit timestamp so far, which could be from deleteBelowTs?

Rollup is doing the following

maxCommitTs = x.Max(maxCommitTs, deleteBelow)

so I think this case is covered.
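The invariant being discussed can be sketched in isolation like this. The `posting` type and `rollupCommitTs` function are hypothetical; the sketch only mirrors the quoted `x.Max` line, where the delete marker's timestamp takes part in the maximum.

```go
package main

import "fmt"

// posting is a stand-in that carries only a commit timestamp.
type posting struct {
	CommitTs uint64
}

// rollupCommitTs returns the timestamp a rolled-up list should be written at:
// the highest commit timestamp among the picked postings, or deleteBelowTs if
// the delete marker is newer than all of them.
func rollupCommitTs(deleteBelowTs uint64, posts []posting) uint64 {
	ts := deleteBelowTs
	for _, p := range posts {
		if p.CommitTs > ts {
			ts = p.CommitTs
		}
	}
	return ts
}

func main() {
	posts := []posting{{CommitTs: 5}, {CommitTs: 8}}
	fmt.Println(rollupCommitTs(0, posts))
	fmt.Println(rollupCommitTs(10, posts))
}
```

A test for the invariant could assert exactly the second case: when the delete marker is the most recent event, the rolled-up list is written at the marker's timestamp.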

posting/list.go, line 857 at r12 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I'd move the comment I was mentioning earlier here.

Done. I copied it since it seemed relevant in both places.

manishrjain

Nice work, @martinmr. Also I'm glad that now that you have authored a big change in posting/list.go, you really understand this core piece of tech well.

Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)

posting/list.go, line 893 at r17 (raw file):

		// way to get the max commit timestamp is to pick all the relevant postings for the given
		// readTs and calculate the maxCommitTs.
		// If deleteBelowTs is greater than zero, there was a delete all marker. The list of postings

100 chars

martinmr

I am still running some benchmarks so I'll hold on merging this for a bit.

I am seeing some differences though: the version with split lists finishes in about an hour (with 3 million mutations and fulltext indexes), while master starts crashing my machine after 10-15 min. This seems to be a memory issue, so I am changing the index to term to see if that lets the benchmark finish. However, the fact that the benchmark can reliably finish with my changes without crapping out is encouraging.

Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)

martinmr

Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot and @manishrjain)

posting/list.go, line 893 at r17 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

100 chars

Done.

martinmr · 2019-04-19T18:51:42Z

I ran the benchmark with just a term index but I didn't observe any substantial difference. I didn't observe any issues with my changes and all the queries I tried worked fine. I am going to write a different benchmark that just creates large posting lists (without split and with split) and compares the time to create and iterate over these lists.

I am merging this PR for now since all tests and manual checks are passing.

manishrjain · 2019-04-19T19:04:57Z

Congrats!

Large posting lists are now being split during rollup if they become too large.

A normal list has the following format:

<key> -> <posting list with all the data for this list>

A multi-part list is stored using multiple keys. The keys for the parts will be generated by appending the first uid in the part to the key of the main part.

The main part of a multi-part list will have the following format:

<key> -> <posting with list of each part's start uid>

The data for this list will be stored in key-value pairs with the format below:

<key, 1> -> <first part of the list with all the data for this part>
<key, next start uid> -> <second part of the list with the data for this part>
...
<key, last start uid> -> <last part of the list with all its data>

The first part of a multi-part list always has start uid 1 and will be the last part to be deleted, at which point the entire list will be marked for deletion.

As the list grows, existing parts might be split if they become too big.
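The layout described above can be sketched with an in-memory map standing in for the key-value store. Everything here is illustrative: `store`, `partKey`, `write`, and `read` are hypothetical names, and parts are cut by a fixed UID count for brevity, whereas the PR splits by size during rollup.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// partKey appends the part's start UID to the main key (assumed encoding).
func partKey(base string, startUid uint64) string {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], startUid)
	return base + string(buf[:])
}

// store is an in-memory stand-in for the key-value store.
type store struct {
	splits map[string][]uint64 // main key -> start UID of each part
	parts  map[string][]uint64 // part key -> UIDs stored in that part
}

func newStore() *store {
	return &store{splits: map[string][]uint64{}, parts: map[string][]uint64{}}
}

// write stores sorted uids as a multi-part list with at most partLen UIDs per
// part. partLen must be greater than zero.
func (s *store) write(base string, uids []uint64, partLen int) {
	var splits []uint64
	for i := 0; i < len(uids); i += partLen {
		end := i + partLen
		if end > len(uids) {
			end = len(uids)
		}
		start := uids[i]
		if i == 0 {
			start = 1 // the first part always has start uid 1
		}
		splits = append(splits, start)
		s.parts[partKey(base, start)] = uids[i:end]
	}
	s.splits[base] = splits
}

// read reassembles the full list by visiting the parts in split order.
func (s *store) read(base string) []uint64 {
	var out []uint64
	for _, start := range s.splits[base] {
		out = append(out, s.parts[partKey(base, start)]...)
	}
	return out
}

func main() {
	s := newStore()
	s.write("k", []uint64{3, 7, 20, 45, 90}, 2)
	fmt.Println(s.splits["k"])
	fmt.Println(s.read("k"))
}
```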

… > 0 (#4204)

Don't iterate over the immutable layer when we have a delete marker. Earlier after doing a S P * deletion, we were still returning the first uid from the immutable layer.

Fixes #4182

This bug was introduced in #3105 as part of the PR which introduced multi-part posting lists.

Initial commit for multi-part lists

e8b96e6

golangcibot reviewed Mar 6, 2019

Decide if list needs to be split based on approxLen.

8aff4a7

manishrjain reviewed Mar 6, 2019

martinmr added 2 commits March 6, 2019 13:04

Use size instead of length to do split decisions.

dd630c4

Several changes.

f6125ba

- Load whole list at once.
- Eliminate linked list.
- Store all plists inside the List object.

golangcibot reviewed Mar 7, 2019

martinmr added 2 commits March 6, 2019 16:21

Increase test size and remove print statements

b572a39

Combine partial and complete iterations into the same function.

e9d3f05

golangcibot reviewed Mar 7, 2019

martinmr added 9 commits March 6, 2019 17:08

Remove unused function and optimize rollup.

9f8d54f

Add extra test.

4dc9397

Add extra tests

fc4c183

Remove first_part field from proto.

75db0f5

Fix bug in Uids function.

e29a15d

Removed unused field

ebd8303

Simplify rollup logic.

0250636

Change order of if-statements.

b0eddc4

Remove unused field.

aad3f20

martinmr commented Mar 7, 2019

Load list parts on demand. Also simpler split logic.

6029630

golangcibot reviewed Mar 7, 2019

martinmr added 3 commits March 7, 2019 16:02

Remove partialIteration field.

78d35fb

Remove unused proto fields.

6847378

Remove unnecessary parameter from method.

29df604

golangcibot reviewed Mar 8, 2019

martinmr added 3 commits March 7, 2019 16:46

Variable rename

d1d6bce

Clean map of uncommitted lists.

a9eca0f

Move logic to clean uncommitted list map at the beginning of rollup.

7670457

golangcibot reviewed Mar 8, 2019

manishrjain suggested changes Apr 12, 2019

martinmr added 9 commits April 12, 2019 12:30

Do not use out.plist.Splits for iteration.

f7debac

Sort splits after removing empty list parts.

4c1ded9

First split should start at 1 instead of zero.

ba74bce

Remove HasStartUid field from ParsedKey.

b5b6a7f

Remove completed todo.

fb04677

First split should not be removed unless entire list is empty.

30c1799

Add test to verify this invariant.

Add comment to Rollup

c13b4cb

Merge remote-tracking branch 'origin/master' into martinmr/multi-part-list

8a18e67

Fix debug tool.

99cc5cb

golangcibot reviewed Apr 13, 2019

martinmr added 3 commits April 15, 2019 10:47

go fmt.

2adc138

Merge remote-tracking branch 'origin/master' into martinmr/multi-part-list

d818bbe

Copied comment.

bca3c05

martinmr commented Apr 15, 2019

Merge remote-tracking branch 'origin/master' into martinmr/multi-part-list

4fb78a6

manishrjain approved these changes Apr 18, 2019

martinmr added 2 commits April 18, 2019 10:54

Fix comment length.

51786ad

Merge remote-tracking branch 'origin/master' into martinmr/multi-part-list

ad39e3b

martinmr commented Apr 18, 2019

martinmr added 2 commits April 19, 2019 11:13

Merge remote-tracking branch 'origin/master' into martinmr/multi-part-list

81f58f0

Remove extra file

76e1a54

martinmr merged commit a3136d3 into master Apr 19, 2019

martinmr deleted the martinmr/multi-part-list branch April 19, 2019 19:03

pawanrawal mentioned this pull request Oct 23, 2019

Don't traverse immutable layer if deleteBelowTs > 0 #4204

Merged
