number table apply top-n (order-by and limit) #2665

junli1026 · 2021-11-05T03:19:02Z

I hereby agree to the terms of the CLA available at: https://databend.rs/policies/cla/

Summary

When both order-by and limit are set, each scan part returns only its top-n rows.
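
As a rough illustration of the idea (not the code in this PR), the sketch below keeps only the top-n values of each scan part and then applies the final sort and limit over the much smaller set of candidates. The function names `top_n_of_part` and `merge_top_n` are hypothetical, and plain `Vec<u64>` stands in for a data block.

```rust
/// Keep only the `limit` largest (descending) or smallest (ascending)
/// values of a single scan part. `Vec<u64>` is a stand-in for a data block.
fn top_n_of_part(mut part: Vec<u64>, limit: usize, descending: bool) -> Vec<u64> {
    if descending {
        part.sort_unstable_by(|a, b| b.cmp(a));
    } else {
        part.sort_unstable();
    }
    // Everything past the first `limit` rows can never be in the final result.
    part.truncate(limit);
    part
}

/// Combine the per-part results and apply the final sort and limit.
/// Each part contributes at most `limit` rows, so far less data is moved.
fn merge_top_n(parts: Vec<Vec<u64>>, limit: usize, descending: bool) -> Vec<u64> {
    let candidates: Vec<u64> = parts
        .into_iter()
        .flat_map(|p| top_n_of_part(p, limit, descending))
        .collect();
    top_n_of_part(candidates, limit, descending)
}
```

With two parts covering 0..500 and 500..1000, a limit of 3 and descending order, each part sends at most 3 rows and the merged result is 999, 998, 997.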

Changelog

Improvement

Related Issues

Fixes #2617

Test Plan

Unit Tests

Stateless Tests

databend-bot · 2021-11-05T03:19:06Z

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

codecov-commenter · 2021-11-05T03:33:41Z

Codecov Report

Merging #2665 (d95fefe) into main (48e4d79) will increase coverage by 0%.
The diff coverage is 47%.

@@          Coverage Diff          @@
##            main   #2665   +/-   ##
=====================================
  Coverage     69%     69%           
=====================================
  Files        608     608           
  Lines      32513   32544   +31     
=====================================
+ Hits       22509   22571   +62     
+ Misses     10004    9973   -31

Impacted Files	Coverage Δ
query/src/datasources/table_func/numbers_stream.rs	`78% <33%> (-17%)`	⬇️
query/src/datasources/table_func/numbers_table.rs	`76% <69%> (-1%)`	⬇️
common/management/src/namespace/namespace_mgr.rs	`77% <0%> (-3%)`	⬇️
metasrv/src/meta_service/raftmeta.rs	`89% <0%> (-1%)`	⬇️
metasrv/src/meta_service/meta_service_impl.rs	`71% <0%> (+1%)`	⬆️
cli/src/error.rs	`27% <0%> (+3%)`	⬆️
metasrv/src/api/http_service_test.rs	`68% <0%> (+4%)`	⬆️
query/src/common/mod.rs	`85% <0%> (+14%)`	⬆️
metasrv/src/api/http_service.rs	`76% <0%> (+67%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 48e4d79...d95fefe.

sundy-li · 2021-11-05T05:45:47Z

query/src/datasources/table_func/numbers_stream.rs

@@ -108,6 +118,11 @@ impl NumbersStream {

            let series = DFUInt64Array::new_from_aligned_vec(av).into_series();
            let block = DataBlock::create_by_array(self.schema.clone(), vec![series]);
+            if !self.sort_columns_descriptions.is_empty() {


This does not help block generation.

select number from numbers(1000) order by number desc limit 3;

SortPartialTransform already applies the limit optimization for us.

We should push the sort/limit down into try_get_one_block.

Sure, will address accordingly.

It is actually inside the function try_get_one_block. I think at least one of the benefits is that it decreases the size of the data block to transmit, right? Every thread sends only limit rows, which I think is the point of top-n push-down. What do you think?

I think at least one of the benefits is that it decreases the size of the data block to transmit, right

Yes, currently it benefits the data transfer to other nodes. It mostly saves I/O, and it is better than before.

But if you cut the block down to the limit at generation time, it saves both I/O and CPU, which is even better.

The number generation in try_get_one_block is sequential, so it is easy to apply the sort & limit pushdown optimization.
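
Because the numbers in a part are generated in sequence, the pushdown can be pictured as shrinking the range before any value is produced. The following is a minimal sketch under that assumption; `narrow_range` is a hypothetical helper, and the half-open `(begin, end)` pair stands in for whatever range type the stream actually uses.

```rust
/// Narrow a half-open range `[begin, end)` of sequential numbers to the
/// at most `limit` values that can survive `ORDER BY number [DESC] LIMIT limit`.
/// Hypothetical helper for illustration; not taken from the PR.
fn narrow_range(begin: u64, end: u64, limit: u64, descending: bool) -> (u64, u64) {
    // Number of values in the range; an empty or inverted range yields 0.
    let len = end.saturating_sub(begin);
    let keep = len.min(limit);
    if descending {
        // The largest values sit at the end of the range.
        (end - keep, end)
    } else {
        // The smallest values sit at the start of the range.
        (begin, begin + keep)
    }
}
```

Only the narrowed range then needs to be generated, which is where the CPU saving comes from. A final sort and limit across all parts is still required, since every part contributes its own candidates.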

Sure, will address.

Other expressions such as order by number + 1 or order by (number * 3) are related to #2343.

For this feature, we keep it simple and just match on the column name.
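
One way to picture the column name match is sketched below. `SortColumn` and `pushdown_direction` are hypothetical stand-ins, not the types used in the codebase, and the sketch assumes that an expression such as number + 1 shows up under a different column name than the bare column.

```rust
/// Hypothetical stand-in for a sort description; the real type may differ.
struct SortColumn {
    column_name: String,
    asc: bool,
}

/// Return `Some(descending)` when the ordering can be pushed down to the
/// numbers scan, and `None` otherwise. Only a single sort key that is
/// exactly the `number` column, combined with a limit, qualifies here;
/// expressions like `number + 1` would need a monotonicity check instead.
fn pushdown_direction(order_by: &[SortColumn], limit: Option<usize>) -> Option<bool> {
    match (order_by, limit) {
        ([only], Some(_)) if only.column_name == "number" => Some(!only.asc),
        _ => None,
    }
}
```

Anything that does not match falls back to the regular sort and limit in the query pipeline, so the check can stay conservative.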

Thanks! Yes, the monotonic check is something I thought about. I thought the logic would be implemented in the DataBlock sort, which is why I chose to use the DataBlock sort instead of just changing the DataRange in the PR.
Thanks for clarifying, will address accordingly.

BTW, @sundy-li @BohuTANG, someone with an investing background sent me a message asking about the creator of this project. I forwarded the message to you on Slack; could you take a look?

Sorry for the late reply, sure :)
Thank you

sundy-li

/LGTM

databend-bot · 2021-11-06T10:12:36Z

Wait for another reviewer approval

databend-bot added the pr-improvement label Nov 5, 2021

BohuTANG requested a review from sundy-li November 5, 2021 05:23

sundy-li reviewed Nov 5, 2021

BohuTANG requested a review from sundy-li November 5, 2021 23:57

databend-bot added the need-review label Nov 6, 2021

junli1026 added 2 commits November 5, 2021 23:34

number table apply top-n (order-by and limit)

e69d171

Address the feedback

d95fefe

junli1026 force-pushed the jun/dev0 branch from 66bf2ea to d95fefe Compare November 6, 2021 06:54

sundy-li approved these changes Nov 6, 2021

sundy-li merged commit d13042f into databendlabs:main Nov 6, 2021
