SNOW-902709 Limit the max allowed number of chunks in blob #580
Conversation
Force-pushed from 27a27ec to df04e8c
Why do we need to limit this as well?
// Newly added BDEC file would exceed the max number of chunks in a single registration
// request. We put chunks collected so far into the result list and create a new batch with
// the current blob
result.add(currentBatch);
nit: This probably can't happen as long as the maxChunksInBlob parameter is less than the getMaxChunksInRegistrationRequest parameter, but if it is not, we will add an empty currentBatch to the result list.
Good point, I added validation that maxChunksInBlob is always less than getMaxChunksInRegistrationRequest.
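A minimal sketch of what such a validation could look like (hypothetical method name and error message; the actual check lives in the SDK's parameter handling):

static void validateMaxChunksParameters(int maxChunksInBlob, int maxChunksInRegistrationRequest) {
  // Per the discussion above, the per-blob limit is required to be strictly
  // less than the per-registration-request limit
  if (maxChunksInBlob >= maxChunksInRegistrationRequest) {
    throw new IllegalArgumentException(
        String.format(
            "MAX_CHUNKS_IN_BLOB (%d) must be less than MAX_CHUNKS_IN_REGISTRATION_REQUEST (%d)",
            maxChunksInBlob, maxChunksInRegistrationRequest));
  }
}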
} else if (blobData.size()
    >= this.owningClient.getParameterProvider().getMaxChunksInBlob()) {
  // Create a new blob if the current one already contains max allowed number of chunks
  logger.logInfo(
      "Max allowed number of chunks in the current blob reached. chunkCount={}"
          + " maxChunkCount={} currentBlobPath={}",
      blobData.size(),
      this.owningClient.getParameterProvider().getMaxChunksInBlob(),
      blobPath);
  break;
Should this check be in the shouldStopProcessing() method? If I am not wrong, when leftoverChannelsDataPerTable is not empty, we will ignore this check.
I think shouldStopProcessing is checking on the channel level, but this is on the chunk level; there is no correctness issue if leftoverChannelsDataPerTable is not empty, right?
It should not - the idea is that we process leftover channels first, and only when they are empty and we are about to start a new chunk do we check whether the new chunk would exceed the max chunk limit. I will add a new test that generates some leftoverChannelsDataPerTable to verify this.
Additional test added to FlushServiceTest
Left some comments, PTAL! No need to block on my approval :)
@@ -30,6 +30,9 @@ public class ParameterProvider {
  public static final String MAX_CHUNK_SIZE_IN_BYTES = "MAX_CHUNK_SIZE_IN_BYTES".toLowerCase();
  public static final String MAX_ALLOWED_ROW_SIZE_IN_BYTES =
      "MAX_ALLOWED_ROW_SIZE_IN_BYTES".toLowerCase();
  public static final String MAX_CHUNKS_IN_BLOB = "MAX_CHUNKS_IN_BDEC".toLowerCase();
Please add unit tests for these added parameters
Unit tests added
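For illustration, a unit test for the new parameters might look like the following sketch. It assumes ParameterProvider lives in net.snowflake.ingest.utils, accepts an override map plus a Properties object, and exposes a MAX_CHUNKS_IN_REGISTRATION_REQUEST constant mirroring its getter; the actual test is in the PR.

import java.util.HashMap;
import java.util.Map;
import net.snowflake.ingest.utils.ParameterProvider;
import org.junit.Assert;
import org.junit.Test;

public class MaxChunksParameterTest {
  @Test
  public void testMaxChunksParameterOverrides() {
    // Override both new parameters and verify the getters reflect them
    Map<String, Object> overrides = new HashMap<>();
    overrides.put(ParameterProvider.MAX_CHUNKS_IN_BLOB, 15);
    overrides.put(ParameterProvider.MAX_CHUNKS_IN_REGISTRATION_REQUEST, 50);
    ParameterProvider parameterProvider = new ParameterProvider(overrides, null);
    Assert.assertEquals(15, parameterProvider.getMaxChunksInBlob());
    Assert.assertEquals(50, parameterProvider.getMaxChunksInRegistrationRequest());
  }
}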
@@ -351,6 +351,16 @@ void distributeFlushTasks() {
  if (!leftoverChannelsDataPerTable.isEmpty()) {
    channelsDataPerTable.addAll(leftoverChannelsDataPerTable);
    leftoverChannelsDataPerTable.clear();
  } else if (blobData.size()
Could the logic here be combined with this check in src/main/java/net/snowflake/ingest/streaming/internal/FlushService.java (line 407 at 3a3cbc8)?
if (idx != channelsDataPerTable.size()) {
It could, but then the check could be triggered in the middle of a chunk, which makes sense when we split based on chunk/blob size, but not for the newly added check. The added check only cares about the number of chunks in a blob, so I put the break just before a new chunk is about to start.
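To make the split point concrete, here is a simplified, self-contained simulation of the count-based split (hypothetical class, not the actual FlushService code; in the real code, size-based splits can additionally cut a chunk mid-way via leftoverChannelsDataPerTable):

import java.util.ArrayList;
import java.util.List;

public class BlobSplitSketch {
  // Distributes one chunk per table into blobs, closing a blob only at a
  // chunk boundary once it holds maxChunksInBlob chunks
  static List<List<Integer>> distribute(List<Integer> chunksPerTable, int maxChunksInBlob) {
    List<List<Integer>> blobs = new ArrayList<>();
    List<Integer> blobData = new ArrayList<>();
    for (Integer chunk : chunksPerTable) {
      if (blobData.size() >= maxChunksInBlob) {
        // About to start a new chunk and the blob is full: close it here so a
        // blob never exceeds maxChunksInBlob chunks
        blobs.add(blobData);
        blobData = new ArrayList<>();
      }
      blobData.add(chunk);
    }
    if (!blobData.isEmpty()) {
      blobs.add(blobData);
    }
    return blobs;
  }

  public static void main(String[] args) {
    // 45 tables with the default limit of 20 -> blobs of 20, 20 and 5 chunks
    List<Integer> chunks = new ArrayList<>();
    for (int i = 0; i < 45; i++) {
      chunks.add(i);
    }
    List<List<Integer>> blobs = distribute(chunks, 20);
    System.out.println(blobs.size()); // 3
    blobs.forEach(b -> System.out.println(b.size())); // 20, 20, 5
  }
}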
public static final int MAX_CHUNKS_IN_BLOB_DEFAULT = 20;
public static final int MAX_CHUNKS_IN_REGISTRATION_REQUEST_DEFAULT = 100;
What is the real issue here? Is it the total number of chunks in one blob, or the total number of chunks in one request? If it's the former, why do we need to limit the total number of chunks in one request? If it's the latter, shouldn't these two values be the same?
Fundamentally, the issue is the number of chunks in one request, because that is where the server-side latency and potential timeouts come from. They could be the same; the reasoning behind one being smaller than the other was that it would give the SDK an opportunity to put BDECs with fewer chunks into the same registration request. For example, if the limit of both is 100 and there is another BDEC with just one chunk, it would have to go into its own registration request.
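As a rough illustration of that batching argument (hypothetical helper, not SDK code), consider greedily packing finished blobs, each represented by its chunk count, into registration requests capped at maxChunksPerRequest chunks:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RequestPackingSketch {
  // Packs blobs into registration requests holding at most
  // maxChunksPerRequest chunks in total
  static List<List<Integer>> packIntoRequests(List<Integer> chunksPerBlob, int maxChunksPerRequest) {
    List<List<Integer>> requests = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    int chunkCount = 0;
    for (int chunks : chunksPerBlob) {
      if (chunkCount + chunks > maxChunksPerRequest && !current.isEmpty()) {
        requests.add(current);
        current = new ArrayList<>();
        chunkCount = 0;
      }
      current.add(chunks);
      chunkCount += chunks;
    }
    if (!current.isEmpty()) {
      requests.add(current);
    }
    return requests;
  }

  public static void main(String[] args) {
    // Both limits 100: a full 100-chunk blob plus a 1-chunk BDEC -> 2 requests
    System.out.println(packIntoRequests(Arrays.asList(100, 1), 100)); // [[100], [1]]
    // Per-blob limit 20: smaller blobs give the packer finer granularity,
    // e.g. three 20-chunk blobs plus a 1-chunk BDEC share one request
    System.out.println(packIntoRequests(Arrays.asList(20, 20, 20, 1), 100)); // [[20, 20, 20, 1]]
  }
}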
I see, given that this situation should be rare, I'm not sure we need to be smart here. I'm more in favor of adding fewer configurable parameters with fixed default values, WDYT?
Force-pushed from c6dd85c to 5e261b1
@sfc-gh-tzhang I addressed your PR comments, could you re-review, please?
Force-pushed from 93c1a3e to e93dd17
Left a suggestion, PTAL and feel free to merge if you disagree, thanks!
Force-pushed from e93dd17 to 7166bde
When one client is ingesting into many tables, we are seeing occasional timeouts from blob registration calls. This PR introduces a limit on the number of chunks in a single blob (default 20) and in a single blob registration request (default 100).
Fixes #570
Fixes #567
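For users who want to tune these limits, parameters can be overridden when the client is built. A hedged usage sketch, assuming the client factory builder exposes setParameterOverrides (the parameter key follows the "MAX_CHUNKS_IN_BDEC" constant shown in the diff above; connection properties are elided):

import java.util.Collections;
import java.util.Properties;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestClient;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestClientFactory;

public class ClientConfigSketch {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // ... connection properties (url, user, private_key, role, ...) go here

    // Lower the per-blob chunk limit from the default of 20 to 10
    SnowflakeStreamingIngestClient client =
        SnowflakeStreamingIngestClientFactory.builder("MY_CLIENT")
            .setProperties(props)
            .setParameterOverrides(Collections.singletonMap("MAX_CHUNKS_IN_BDEC", 10))
            .build();
    client.close();
  }
}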