
Add a limit to the size of the batches sent over a stream. #1412

Merged · 8 commits from martinmr/stream-limit into master · Jul 13, 2020

Conversation

@martinmr (Contributor) commented Jul 10, 2020

Currently, there's no limit to how many KVs can be included in a single
batch. This leads to issues when the size is over a hard limit. For
example, a batch of size > 2GB will cause issues if it has to be sent
over gRPC.
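
For context on that hard limit: a single protobuf message cannot exceed 2GB, and gRPC enforces its own per-message caps on top (the Go client's default receive cap is 4MB). Below is a minimal, hedged Go sketch of raising the gRPC caps — the function and values are illustrative only, and the 2GB protobuf ceiling still applies:

	package main

	import "google.golang.org/grpc"

	// dial is an illustration only: it raises the gRPC per-message caps.
	// Even with these raised, a single protobuf message still cannot
	// exceed 2GB, which is the hard limit referenced above.
	func dial(addr string) (*grpc.ClientConn, error) {
		return grpc.Dial(addr,
			grpc.WithInsecure(), // plaintext, for the sketch only
			grpc.WithDefaultCallOptions(
				grpc.MaxCallRecvMsgSize(1<<30), // 1GB, up from the 4MB default
				grpc.MaxCallSendMsgSize(1<<30),
			),
		)
	}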


@parasssh left a comment


stream.go, line 279 at r1 (raw file):

				// If the size of the batch exceeds maxStreamSize, break from the loop
				// to avoid creating a batch that is so big certain limits are reached.

nit: ... so big that certain limits may be reached.


stream.go, line 315 at r1 (raw file):

			batch = kvs

			// Send the batch immediately if it already exceeds the maximum allowed size.

If the first batch itself is > maxStreamSize, then we shouldn't send it. This is what we are trying to prevent, no? Maybe we should slurp even for the first batch.

@martinmr (Contributor, Author) left a comment


stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

If the first batch itself is > maxStreamSize, then we shouldn't send it. This is what we are trying to prevent, no? Maybe we should slurp even for the first batch.

maxStreamSize is 100MB right now, which is well under the 2GB limit, so exceeding it doesn't necessarily mean we cannot send the batch. What it means is that we'll immediately try to send the batch if it already exceeds the limit, instead of trying to add more keys to it.
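
For reference, the r1 logic being discussed looks like this (reconstructed from the snippets quoted in this thread; sendBatch and slurp are the surrounding helpers in stream.go):

	batch = kvs

	// Send the batch immediately if it already exceeds the maximum allowed size.
	// Calling slurp on this batch will only increase the size further.
	sz := uint64(proto.Size(batch))
	if sz > maxStreamSize {
		if err := sendBatch(batch); err != nil {
			return err
		}
		continue
	}

	// Otherwise, slurp more keys into this batch.
	if err := slurp(batch); err != nil {
		return err
	}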

@martinmr (Contributor, Author) left a comment


stream.go, line 279 at r1 (raw file):

Previously, parasssh wrote…

nit: ... so big that certain limits may be reached.

Done.

@martinmr martinmr requested a review from parasssh July 10, 2020 19:15
@parasssh left a comment


stream.go, line 315 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

maxStreamSize is 100MB right now, which is well under the 2GB limit, so exceeding it doesn't necessarily mean we cannot send the batch. What it means is that we'll immediately try to send the batch if it already exceeds the limit, instead of trying to add more keys to it.

Yes. But it is confusing to allow the first batch through if it exceeds. Why can't we slurp before the check?

@martinmr (Contributor, Author) left a comment


stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

Yes. But it is confusing to allow the first batch through if it exceeds. Why can't we slurp before the check?

Slurping will increase the size of the batch even further. I'll update the comment.

@parasssh left a comment


stream.go, line 315 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Slurping will increase the size of the batch even further. I'll update the comment.

Hmm, I see. However, we are still leaving open the possibility of the first batch being more than 2GB. Maybe the upstream producer for st.kvChan should send <100MB batches?

@martinmr (Contributor, Author) left a comment


stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

Hmm, I see. However, we are still leaving open the possibility of the first batch being more than 2GB. Maybe the upstream producer for st.kvChan should send <100MB batches?

I checked, and there's already a similar check. The limit in that part of the code is 4MB. So this would still be a problem if we had a single value bigger than 2GB. For Dgraph this should not be a problem since we split the keys.
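
A hedged sketch of the kind of producer-side cap described here — pageSize and the exact flow are assumptions for illustration, not the actual badger code:

	package main

	import (
		"github.com/dgraph-io/badger/v2/pb"
		"github.com/golang/protobuf/proto"
	)

	const pageSize = 4 << 20 // the ~4MB producer-side limit mentioned above

	// produce hands batches to kvChan, cutting a new batch once the current
	// one reaches pageSize. Note that a single KV larger than pageSize still
	// goes out whole, which is why one huge value remains the corner case.
	func produce(kvChan chan<- *pb.KVList, kvs []*pb.KV) {
		batch := &pb.KVList{}
		size := 0
		for _, kv := range kvs {
			size += proto.Size(kv)
			batch.Kv = append(batch.Kv, kv)
			if size >= pageSize {
				kvChan <- batch
				batch = &pb.KVList{}
				size = 0
			}
		}
		if len(batch.Kv) > 0 {
			kvChan <- batch
		}
	}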

@parasssh left a comment


stream.go, line 315 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I checked, and there's already a similar check. The limit in that part of the code is 4MB. So this would still be a problem if we had a single value bigger than 2GB. For Dgraph this should not be a problem since we split the keys.

So, to clarify: upstream will always send batches of at most 4MB. But the slurp earlier in the code added more KVs to the batch without bound. With your fix, upstream still sends 4MB batches and we slurp only up to 100MB. If this is correct, then this check two lines below is not needed at all, since "a" batch will never be more than 4MB?

@parasssh left a comment


stream.go, line 317 at r1 (raw file):

			// Send the batch immediately if it already exceeds the maximum allowed size.
			sz := uint64(proto.Size(batch))
			if sz > maxStreamSize {

This check is not needed since upstream will only send 4MB batches.

@martinmr (Contributor, Author) left a comment


stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

So, to clarify: upstream will always send batches of at most 4MB. But the slurp earlier in the code added more KVs to the batch without bound. With your fix, upstream still sends 4MB batches and we slurp only up to 100MB. If this is correct, then this check two lines below is not needed at all, since "a" batch will never be more than 4MB?

It could be more than 4MB if the first list the upstream process puts into the batch is greater than 4MB. I can remove this code but I thought we should check first in case there's a big value.


stream.go, line 317 at r1 (raw file):

Previously, parasssh wrote…

This check is not needed since upstream will only send 4MB batches.

See comment above.

@parasssh left a comment


stream.go, line 315 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

It could be more than 4MB if the first list the upstream process puts into the batch is greater than 4MB. I can remove this code but I thought we should check first in case there's a big value.

But the check is not preventing the big value from being sent (which is the bug we are trying to fix). In fact, it is allowing it through and just preventing additional slurping. Removing the check is not good either, as it will slurp more and make the value even bigger. I don't know how much of a corner case this is, but if we were to fix it, I think we could keep the check and, inside the if, split the batch into 100MB chunks when it is already >100MB. @jarifibrahim, comments?

@martinmr (Contributor, Author) left a comment


stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

But the check is not preventing the big value from being sent (which is the bug we are trying to fix). In fact, it is allowing it through and just preventing additional slurping. Removing the check is not good either, as it will slurp more and make the value even bigger. I don't know how much of a corner case this is, but if we were to fix it, I think we could keep the check and, inside the if, split the batch into 100MB chunks when it is already >100MB. @jarifibrahim, comments?

This would be the case where a few entries are greater than 100MB. For example, a list of 1MB and another of 150MB could be sent in the same batch, but further splits would not help with the 150MB list.

@parasssh left a comment

:lgtm: but run it by Ibrahim



stream.go, line 315 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

This would be the case where a few entries are greater than 100MB. For example, a list of 1MB and another of 150MB could be sent in the same batch, but further splits would not help with the 150MB list.

I see. List = batch, and a list cannot be split further. It's either whole or nothing. So, in that sense, the 100MB is a soft limit. Sounds good. Feel free to resolve, but just run it by @jarifibrahim.

@parasssh left a comment


stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

I see. List = batch, and a list cannot be split further. It's either whole or nothing. So, in that sense, the 100MB is a soft limit. Sounds good. Feel free to resolve, but just run it by @jarifibrahim.

Also, put a comment somewhere that the 100MB is a soft limit.

@jarifibrahim (Contributor) left a comment


stream.go, line 317 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

See comment above.

Wouldn't it be simpler to do this

diff --git a/stream.go b/stream.go
index 7500064..eb7341e 100644
--- a/stream.go
+++ b/stream.go
@@ -270,6 +270,13 @@ func (st *Stream) streamKVs(ctx context.Context) error {
 	slurp := func(batch *pb.KVList) error {
 	loop:
 		for {
+			// Send the batch immediately if it already exceeds the maximum allowed size.
+			// If the size of the batch exceeds maxStreamSize, break from the loop to
+			// avoid creating a batch that is so big that certain limits are reached.
+			sz := uint64(proto.Size(batch))
+			if sz > maxStreamSize {
+				break loop
+			}
 			select {
 			case kvs, ok := <-st.kvChan:
 				if !ok {
@@ -277,13 +284,6 @@ func (st *Stream) streamKVs(ctx context.Context) error {
 				}
 				y.AssertTrue(kvs != nil)
 				batch.Kv = append(batch.Kv, kvs.Kv...)
-
-				// If the size of the batch exceeds maxStreamSize, break from the loop to
-				// avoid creating a batch that is so big that certain limits are reached.
-				sz := uint64(proto.Size(batch))
-				if sz > maxStreamSize {
-					break loop
-				}
 			default:
 				break loop
 			}
@@ -315,16 +315,6 @@ outer:
 			y.AssertTrue(kvs != nil)
 			batch = kvs
 
-			// Send the batch immediately if it already exceeds the maximum allowed size.
-			// Calling slurp on this batch will only increase the size further.
-			sz := uint64(proto.Size(batch))
-			if sz > maxStreamSize {
-				if err := sendBatch(batch); err != nil {
-					return err
-				}
-				continue
-			}
-
 			// Otherwise, slurp more keys into this batch.
 			if err := slurp(batch); err != nil {
 				return err

stream_test.go, line 207 at r2 (raw file):

}

func TestBigStream(t *testing.T) {

This test doesn't fail on master. Also, I don't see any batches bigger than 1MB that would test our threshold.


stream_test.go, line 226 at r2 (raw file):

	var count int
	for _, prefix := range []string{"p0", "p1", "p2"} {
		txn := db.NewTransactionAt(math.MaxUint64, true)

I would've used NewWriteBatchAt. It would be much faster than creating individual transactions (small transactions are costly).
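
A hedged sketch of that suggestion, assuming a test with db opened in managed mode (keys, sizes, and counts here are illustrative, not the actual test's):

	wb := db.NewWriteBatchAt(1) // single write batch at commitTs 1
	defer wb.Cancel()
	for i := 0; i < 100; i++ {
		key := []byte(fmt.Sprintf("p0-%04d", i))
		val := make([]byte, 1<<20) // 1MB values so the stream builds large batches
		if err := wb.Set(key, val); err != nil {
			t.Fatal(err)
		}
	}
	require.NoError(t, wb.Flush())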

@ashish-goswami (Contributor) left a comment

:lgtm:



stream.go, line 317 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Wouldn't it be simpler to do this? (full diff quoted above)

👍

@jarifibrahim (Contributor) left a comment

:lgtm:



stream.go, line 317 at r1 (raw file):

Previously, ashish-goswami (Ashish Goswami) wrote…

👍

Fixed.


stream_test.go, line 207 at r2 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

This test doesn't fail on master. Also, I don't see any batches bigger than 1MB that would test our threshold.

This isn't a failing test. @martinmr used this test to manually verify that his code works.


stream_test.go, line 226 at r2 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

I would've used NewWriteBatchAt. It would be much faster than creating individual transactions (small transactions are costly).

Fixed.

@martinmr (Contributor, Author) left a comment

All changes by Ibrahim look good.



stream.go, line 315 at r1 (raw file):

Previously, parasssh wrote…

Also, put a comment somewhere that the 100MB is a soft limit.

Done.


stream_test.go, line 207 at r2 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

This isn't a failing test. @martinmr used this test to manually verify that his code works.

Done.


stream_test.go, line 226 at r2 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

Fixed.

Done.

@martinmr martinmr merged commit c892251 into master Jul 13, 2020
@martinmr martinmr deleted the martinmr/stream-limit branch July 13, 2020 18:29
martinmr added a commit that referenced this pull request Jul 15, 2020
Currently, there's no limit to how many KVs can be included in a single
batch. This leads to issues when the size is over a hard limit. For
example, a batch of size > 2GB will cause issues if it has to be sent
over gRPC.

Co-authored-by: Ibrahim Jarif <ibrahim@dgraph.io>
(cherry picked from commit c892251)
martinmr added a commit that referenced this pull request Jul 15, 2020
Currently, there's no limit to how many KVs can be included in a single
batch. This leads to issues when the size is over a hard limit. For
example, a batch of size > 2GB will cause issues if it has to be sent
over gRPC.

Fixes DGRAPH-1899

Co-authored-by: Ibrahim Jarif <ibrahim@dgraph.io>
(cherry picked from commit c892251)
jarifibrahim pushed a commit that referenced this pull request Oct 2, 2020
Currently, there's no limit to how many KVs can be included in a single
batch. This leads to issues when the size is over a hard limit. For
example, a batch of size > 2GB will cause issues if it has to be sent
over gRPC.

Co-authored-by: Ibrahim Jarif <ibrahim@dgraph.io>
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Jan 16, 2023
…#1412)

Signed-off-by: thomassong <thomassong2012@gmail.com>
mYmNeo added a commit to mYmNeo/badger that referenced this pull request Feb 13, 2023
…#1412)

Signed-off-by: thomassong <thomassong2012@gmail.com>
@simon28082 (Contributor) commented

Can maxStreamSize be changed? I want to use a smaller size, but currently maxStreamSize = 100MB is fixed.

mYmNeo added a commit to mYmNeo/badger that referenced this pull request Sep 14, 2024
…#1412)

Signed-off-by: thomassong <thomassong2012@gmail.com>