This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-417] Sort spill support framework #369

Merged: 19 commits, Jul 23, 2021

Conversation

Collaborator

@zhouyuan zhouyuan commented Jun 20, 2021

What changes were proposed in this pull request?

This patch adds spill support for sort.

  • The Spill() API of Sort is connected to Spark's TaskMemoryManager. When memory runs low, Spark will try to issue a Spill on Sort.
  • When Spill() is called, the executor picks one partition (one task), writes its sort content (keys + payloads) to disk, and then releases the memory held by that content. For subsequent output, sort first loads the spilled content back into memory and then continues to output the results.
  • Please note that Spill is only called from operators downstream of Sort, such as SortMergeJoin, so spilling happens only during the phase in which sort results are output.
  • Currently the spill files are written inside each YARN container; this should be improved to use Spark's BlockManager.
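
The spill and reload cycle described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the code in this patch: the class `SpillableSortBuffer`, its methods, and the spill file path are hypothetical, and the sorted keys and payloads are collapsed into a plain vector of `int64_t` rows.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sort operator's spillable state.
class SpillableSortBuffer {
 public:
  explicit SpillableSortBuffer(std::string spill_path)
      : spill_path_(std::move(spill_path)) {}

  // Best-effort cleanup of the spill file.
  ~SpillableSortBuffer() { std::remove(spill_path_.c_str()); }

  // Takes ownership of rows that are already sorted.
  void SetSorted(std::vector<int64_t> rows) {
    rows_ = std::move(rows);
    spilled_ = false;
  }

  // Called under memory pressure: write the rows to disk, free the memory,
  // and report how many bytes were released through spilled_size.
  bool Spill(int64_t* spilled_size) {
    assert(spilled_size != nullptr);
    *spilled_size = 0;
    if (spilled_ || rows_.empty()) return true;  // nothing resident to spill
    std::ofstream out(spill_path_, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    const int64_t bytes = static_cast<int64_t>(rows_.size() * sizeof(int64_t));
    out.write(reinterpret_cast<const char*>(rows_.data()),
              static_cast<std::streamsize>(bytes));
    out.close();
    if (!out) return false;
    spilled_rows_ = rows_.size();
    std::vector<int64_t>().swap(rows_);  // actually release the memory
    spilled_ = true;
    *spilled_size = bytes;
    return true;
  }

  // Output phase: reload the spilled rows first, then hand them out.
  bool Output(std::vector<int64_t>* out_rows) {
    if (spilled_) {
      std::ifstream in(spill_path_, std::ios::binary);
      if (!in) return false;
      rows_.resize(spilled_rows_);
      in.read(reinterpret_cast<char*>(rows_.data()),
              static_cast<std::streamsize>(spilled_rows_ * sizeof(int64_t)));
      if (!in) return false;
      spilled_ = false;
    }
    *out_rows = rows_;
    return true;
  }

  bool in_memory() const { return !spilled_; }

 private:
  std::string spill_path_;
  std::vector<int64_t> rows_;
  std::size_t spilled_rows_ = 0;
  bool spilled_ = false;
};
```

In this sketch a caller would invoke `Spill(&n)` when the memory manager asks for memory back, and a later `Output(&rows)` transparently reloads the spilled rows before returning them.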

How was this patch tested?

Verified locally.

@github-actions

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/oap-project/native-sql-engine/issues

Then could you also rename the commit message and pull request title to the following format?

[NSE-${ISSUES_ID}] ${detailed message}

See also:

zhouyuan added 3 commits July 19, 2021 16:03
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
@zhouyuan zhouyuan changed the title from "[DNM] Wip sort spill support" to "[NSE-417] Sort spill support" on Jul 19, 2021
@github-actions

#417

zhouyuan added 14 commits July 19, 2021 22:50
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
} else {
  (p_->in_record_batch_holder_).clear();
  *spilled_size = size;
}

This logic should be refined to allow repeated spills.
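
One possible shape for a repeatable spill is sketched below. It is an assumption about how the refinement could look, not what this pull request implements: `RepeatableSpiller`, `Add`, and the per-call spill file naming are hypothetical. The idea is that every call spills only the batches resident at that moment and reports only the bytes released by that call, so a second call with nothing resident reports zero instead of repeating the earlier size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder that tolerates Spill() being called many times.
class RepeatableSpiller {
 public:
  explicit RepeatableSpiller(std::string prefix) : prefix_(std::move(prefix)) {}

  // Best-effort cleanup of every spill file created so far.
  ~RepeatableSpiller() {
    for (const auto& f : files_) std::remove(f.c_str());
  }

  // Keeps one more batch resident in memory.
  void Add(const std::vector<int64_t>& batch) { resident_.push_back(batch); }

  // Spills whatever is resident right now, one file per batch.
  // spilled_size receives only the bytes released by this call.
  bool Spill(int64_t* spilled_size) {
    assert(spilled_size != nullptr);
    *spilled_size = 0;
    for (const auto& batch : resident_) {
      const std::string path = prefix_ + "." + std::to_string(files_.size());
      files_.push_back(path);  // record first so the destructor cleans it up
      std::ofstream out(path, std::ios::binary | std::ios::trunc);
      if (!out) return false;
      const int64_t bytes =
          static_cast<int64_t>(batch.size() * sizeof(int64_t));
      out.write(reinterpret_cast<const char*>(batch.data()),
                static_cast<std::streamsize>(bytes));
      out.close();
      if (!out) return false;
      *spilled_size += bytes;
    }
    resident_.clear();
    return true;
  }

  std::size_t spill_file_count() const { return files_.size(); }
  std::size_t resident_batches() const { return resident_.size(); }

 private:
  std::string prefix_;
  std::vector<std::vector<int64_t>> resident_;
  std::vector<std::string> files_;
};
```

A real implementation would also need to track which spill files belong to which batch when reloading, and to handle a failure partway through the loop; both are left out of this sketch.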

zhouyuan added 2 commits July 23, 2021 11:46
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
@zhouyuan zhouyuan changed the title from "[NSE-417] Sort spill support" to "[NSE-417] Sort spill support framework" on Jul 23, 2021
@zhouyuan zhouyuan merged commit 5d05fd2 into oap-project:master Jul 23, 2021