SNOW-1512047 Introduce independent per-table flushes when interleaving is disabled #788
Not sure I understand what this helps with. If someone calls `addChannel` and doesn't add any data for a minute, the first row they add will trigger a flush, since we'll mistakenly think it's been a long time since the last flush.
Yes. Should we change the logic to the following?

1. Leave `lastFlushTime` unset (null) when creating a channel.
2. In `putRow` or `putRows`, if `lastFlushTime` is null, set it to the current time.
3. After a flush, reset `lastFlushTime` to null, then go to step 2.
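The steps proposed above could be sketched roughly as follows. This is only an illustration, not the SDK's code: the class `LazyFlushTimer` and its methods are hypothetical names, the clock is passed in as a plain millisecond value to keep the example deterministic, and it assumes the reset to null happens right after each flush.

```java
/** Hypothetical helper (not part of the SDK): the flush clock starts on the first insert. */
final class LazyFlushTimer {
  private final long maxLagMs;
  // null means no rows have been buffered since channel creation or the last flush.
  private Long lastFlushTime = null;

  LazyFlushTimer(long maxLagMs) {
    this.maxLagMs = maxLagMs;
  }

  /** Step 2: called from putRow/putRows; starts the clock only if it is not already running. */
  synchronized void onInsert(long nowMs) {
    if (lastFlushTime == null) {
      lastFlushTime = nowMs;
    }
  }

  /** A flush is due only if rows are buffered and they have waited at least maxLagMs. */
  synchronized boolean isFlushDue(long nowMs) {
    return lastFlushTime != null && nowMs - lastFlushTime >= maxLagMs;
  }

  /** Step 3: called after a flush; clears the clock so the next insert restarts it. */
  synchronized void onFlush() {
    lastFlushTime = null;
  }
}
```

With this shape, a channel that sits idle for a minute after creation does not report a flush as due when its first row arrives, because the clock only starts at that row.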
As discussed, please track this with a JIRA so we don't forget about it.
Jira created.
The name is confusing. Is this the end time of the previous flush?
Renamed to `prevFlushEndTime`.
Can we do the check before populating `tablesToFlush`?
Discussed offline. Preserve the client-level `lastFlushTime` and `isNeedFlush` to avoid checking table-level flush info when interleaving is enabled, which might change performance. Preserve the old logging format when interleaving is enabled to avoid logging too much information.

cc: @sfc-gh-hmadan
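A rough sketch of that split, for illustration only: `FlushPlanner`, `TableState`, and the parameter names below are hypothetical and not taken from the PR. It assumes that with interleaving enabled only the client-level state is consulted and all tables flush together, while with interleaving disabled each table is checked on its own.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical planner (not the SDK's code) choosing which tables to flush. */
final class FlushPlanner {
  /** Hypothetical per-table flush state. */
  static final class TableState {
    final long lastFlushTime;
    final boolean needFlush;

    TableState(long lastFlushTime, boolean needFlush) {
      this.lastFlushTime = lastFlushTime;
      this.needFlush = needFlush;
    }
  }

  static Set<String> tablesToFlush(
      boolean interleavingEnabled,
      long clientLastFlushTime,
      boolean clientNeedFlush,
      Map<String, TableState> tables,
      long nowMs,
      long maxLagMs) {
    Set<String> result = new TreeSet<>();
    if (interleavingEnabled) {
      // Client-level check only: per-table state is never read on this path.
      if (clientNeedFlush || nowMs - clientLastFlushTime >= maxLagMs) {
        result.addAll(tables.keySet());
      }
      return result;
    }
    // Interleaving disabled: each table is evaluated independently.
    for (Map.Entry<String, TableState> e : tables.entrySet()) {
      TableState s = e.getValue();
      if (s.needFlush || nowMs - s.lastFlushTime >= maxLagMs) {
        result.add(e.getKey());
      }
    }
    return result;
  }
}
```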
Why do we need to do this if the previous code block already picked up the minimal set of tables needing a flush?
+1. Even if interleaving is enabled, I'd prefer to keep the above logic for flushing and wait until the MaxClientLag for each channel.
I aimed to maintain the original interleaving behavior, where all channels are flushed if any channel needs it. With independent flushing intervals, we might miss the chance to combine multiple chunks into the same BDEC. A potential workaround is to discretize timestamps and reduce jitter on `lastFlushTime` in interleaving mode. This can increase the chances of combining multiple chunks into the same blob. What do you think?
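One way the discretization idea might look, as a minimal sketch: `FlushTimeBuckets`, `discretize`, and the bucket width are hypothetical and not part of the PR. Rounding `lastFlushTime` down to a bucket boundary makes channels whose flush times fall in the same bucket become due at the same moment, at the cost of flushing up to one bucket width earlier than the configured lag.

```java
/** Hypothetical helper (not the SDK's code) that aligns flush times to fixed buckets. */
final class FlushTimeBuckets {
  /** Rounds a timestamp down to the start of its bucket. */
  static long discretize(long timestampMs, long bucketMs) {
    if (bucketMs <= 0) {
      throw new IllegalArgumentException("bucketMs must be positive");
    }
    return Math.floorDiv(timestampMs, bucketMs) * bucketMs;
  }

  /** Due check against the bucketed flush time rather than the raw one. */
  static boolean isDue(long lastFlushTimeMs, long nowMs, long maxLagMs, long bucketMs) {
    return nowMs - discretize(lastFlushTimeMs, bucketMs) >= maxLagMs;
  }
}
```

For example, with a 500 ms bucket, flush times of 10030 and 10470 both round down to 10000, so with a 1000 ms lag both channels are due at 11000 and their chunks can go into the same blob.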