-
Notifications
You must be signed in to change notification settings - Fork 75
[NSE-928] Add ARROW_CHECK for batch_size check #973
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other place we use int16_t but it may exceed 64K?
@@ -854,7 +854,7 @@ Splitter::row_offset_type Splitter::CalculateSplitBatchSize( | |||
|
|||
arrow::Status Splitter::DoSplit(const arrow::RecordBatch& rb) { | |||
// buffer is allocated less than 64K | |||
// ARROW_CHECK_LE(rb.num_rows(),64*1024); | |||
ARROW_CHECK_LE(rb.num_rows(), 64 * 1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fails in TPCDS power test if we add the check here. We need to fix the max batch issue before we enable the check here
@@ -1399,7 +1399,9 @@ class SortOnekeyKernel : public SortArraysToIndicesKernel::Impl { | |||
if (nulls_total_ == 0) { | |||
// if all batches have no null value, | |||
// we do not need to check whether the value is null | |||
ARROW_CHECK_LE(num_batches_, 64 * 1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_batches_ is controled by partition size eventually. looks limitation of 64K maximal recordbatch is too small.
If use uint32_t for ArrayItemIndex: array_id, caused many problems: crash and result mismatch. It is noted in #989 . So retain uint16_t ,and troubleshoot later. |
cc @zhouyuan |
Passed Jenkins workloads again. |
…)" This reverts commit c4d4a65.
* remove sort limit Signed-off-by: Yuan Zhou <yuan.zhou@intel.com> * s/uint16/uint32 Signed-off-by: Yuan Zhou <yuan.zhou@intel.com> * Revert "s/uint16/uint32" This reverts commit 35c0909. * Revert "remove sort limit" This reverts commit 6b5738a. * change sort limit Signed-off-by: Yuan Zhou <yuan.zhou@intel.com> * Revert "[NSE-928] Add ARROW_CHECK for batch_size check (#973)" This reverts commit c4d4a65. * fix window sort Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)