forked from oap-project/gazelle_plugin
Wip hashagg opt3 #4
Open
zhouyuan wants to merge 91 commits into master from wip_hashagg_opt3
This patch disables SMJ with a local limit as child
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
…roject#847)
This patch adds support for expressions: length, char_length, locate, regexp_extract. The codegen part will be added in next PR
* Enable length/char_length/locate to be workable
* Add regexp_extract expression support
* Correct the return type and add subquery checking
* Change arrow branch for test [will revert at last]
* Let supportColumnarCodegen return false
* Check codegen support for columnar BHJ with condition
* Fallback non-literal regex case
* Remove the assert for bytes read metric in a unit test
* Revert "Change arrow branch for test [will revert at last]"
This reverts commit 19f8942.
* Add substring_index support
* Fix a compile issue
* Change arrow branch for test [will revert at last]
* Revert "Change arrow branch for test [will revert at last]"
This reverts commit e11e9db.
* Return false for checking codegen support
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/native-sql-engine/issues Then could you also rename commit message and pull request title in the following format?
* code gen changes * use logDebug instead of logWarning
* implement replace function * set columnar codegen to false
* [NSE-728] Upgrade to Arrow 7.0.0 (oap-project#729)
Known issues of current Arrow 7.0.0 support:
1. Data Source writing / ORC reading is disabled;
2. Data Source filter pushdown is disabled;
3. FastPFor compression is leading to unexpected concurrent memory writes. Use LZ4 instead.
* fix get_physical_plan issue
* Revert "[NSE-728] Upgrade to Arrow 7.0.0 (oap-project#729)"
This reverts commit e329253.
Co-authored-by: Hongze Zhang <hongze.zhang@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
(oap-project#894)
* merge master and branch shuffle_opt_fillbyreducer. To submit PR to upstream Implemented fill by reducer
* format code
* Allocate large block of memory then slice to each buffer
* wip, rebase to master
* to rebase to master
* return to original
* added memory leak check in test
* Done
* disable alignment allocation in benchmark since arrow doesn't support it
* optimized validity buffer assign. initialize the validity buffer as true once allocated. skip the initialize during split fix validity buffer bug
* fix out of memory test
* fix setbitsto bug remove nullcnt
* add shuffle test
* remove unused variables
* allocate validity buffer from pool
* fix bug set validity buffer after allocation fix bug during of last bits after process validity buffer
* Add arrow check for batch size and part number use uint32 as row number size
* format code
* fix format
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
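The "allocate large block of memory then slice to each buffer" step can be pictured with the sketch below. It is only an illustration under assumptions: `BufferSlice`, `AlignUp`, and `SliceBlock` are hypothetical names, a `std::vector` stands in for whatever memory pool the shuffle code really uses, and the 64-byte alignment is a guess rather than something the commit states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view over part of a shared allocation; not an Arrow type.
struct BufferSlice {
  uint8_t* data;
  size_t size;
};

// Round n up to a multiple of `alignment` (a non-zero power of two).
inline size_t AlignUp(size_t n, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (n + alignment - 1) & ~(alignment - 1);
}

// Carve one block into consecutive slices, one per requested size.
// `block` is resized to hold every slice and must outlive the slices.
// Offsets are multiples of `alignment` relative to the start of the block.
inline std::vector<BufferSlice> SliceBlock(std::vector<uint8_t>& block,
                                           const std::vector<size_t>& sizes,
                                           size_t alignment = 64) {
  size_t total = 0;
  for (size_t s : sizes) total += AlignUp(s, alignment);
  block.assign(total, 0);  // a single allocation instead of one per buffer

  std::vector<BufferSlice> slices;
  slices.reserve(sizes.size());
  size_t offset = 0;
  for (size_t s : sizes) {
    slices.push_back({block.data() + offset, s});
    offset += AlignUp(s, alignment);
  }
  return slices;
}
```

One allocation per batch of buffers trades a little padding for fewer allocator calls, which is presumably the motivation for the change.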
This patch adds the missing columnar expression replacement for substring_index
This patch adds one more configuration for deployments where the jar command is not in PATH:
export JAR=/path/to/jar
--conf spark.executorEnv.JAR=/path/to/jar
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
zhouyuan force-pushed the wip_hashagg_opt3 branch 2 times, most recently from fa1b0ff to d6ff00a (May 10, 2022 03:26)
…rmat) while using ArrowWriteExtension Closes oap-project#889
zhouyuan force-pushed the wip_hashagg_opt3 branch 2 times, most recently from e5d2efe to d5bcc6b (May 10, 2022 14:29)
"INSERT OVERWRITE x SELECT /*+ REPARTITION(2) */ * FROM y LIMIT 2" writes 4 rows into table x when using the Arrow write extension. The issue is that GlobalLimit used a special setNumRows() which only affects the record batch level; the internal vectors are not changed. This patch adds a new API for setting the row count on the vectors.
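A toy model may help show why a batch-level row count alone lets extra rows through. Everything here is hypothetical (`Column`, `Batch`, `SetNumRowsShallow`, `SetNumRowsDeep`, `RowsWritten`); it does not use Arrow or the plugin's classes and only mirrors one reading of the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not Arrow classes.
struct Column {
  std::vector<int> values;
  size_t value_count = 0;  // rows a writer would serialize from this column
};

struct Batch {
  std::vector<Column> columns;
  size_t num_rows = 0;

  // Batch-level only: the columns keep reporting their old counts.
  void SetNumRowsShallow(size_t n) { num_rows = n; }

  // Batch and columns together, which is what the patch describes adding.
  void SetNumRowsDeep(size_t n) {
    num_rows = n;
    for (Column& c : columns) {
      assert(n <= c.values.size());
      c.value_count = n;
    }
  }
};

// A writer that trusts per-column counts rather than the batch count.
inline size_t RowsWritten(const Batch& b) {
  return b.columns.empty() ? 0 : b.columns.front().value_count;
}
```

With a 4-row batch, calling `SetNumRowsShallow(2)` leaves `RowsWritten` at 4, while `SetNumRowsDeep(2)` brings it down to 2.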
…#915) further optimization of validity buffer split. Get 8 bits each time and set the destination.
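One way to read "get 8 bits each time" is sketched below, combined with the earlier idea of initializing destination validity buffers as all-valid so that only null rows need a write. This is an assumption about the approach, not the code from the commit: `PartitionValidity`, `ClearBit`, and `SplitValidity` are hypothetical names, and the bitmaps are assumed to be LSB-first with 1 meaning valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Destination bitmap for one partition. Bits are LSB-first, 1 = valid.
struct PartitionValidity {
  std::vector<uint8_t> bits;  // pre-filled with 0xFF (all valid)
  size_t rows = 0;            // rows appended so far
};

inline void ClearBit(std::vector<uint8_t>& bits, size_t i) {
  assert(i / 8 < bits.size());
  bits[i / 8] &= static_cast<uint8_t>(~(1u << (i % 8)));
}

// Split `src` (num_rows bits) across the partitions given by part_id[row].
// Every destination must already be sized for its rows and filled with 0xFF.
inline void SplitValidity(const uint8_t* src, size_t num_rows,
                          const std::vector<uint32_t>& part_id,
                          std::vector<PartitionValidity>& dst) {
  for (size_t base = 0; base < num_rows; base += 8) {
    const uint8_t byte = src[base / 8];  // eight validity bits at once
    const size_t n = (num_rows - base < 8) ? num_rows - base : 8;
    if (n == 8 && byte == 0xFF) {
      // Fast path: all eight rows are valid and destinations are already 1,
      // so only the per-partition row cursors advance.
      for (size_t k = 0; k < 8; ++k) ++dst[part_id[base + k]].rows;
      continue;
    }
    for (size_t k = 0; k < n; ++k) {
      PartitionValidity& p = dst[part_id[base + k]];
      if (((byte >> k) & 1u) == 0) ClearBit(p.bits, p.rows);
      ++p.rows;
    }
  }
}
```

When most rows are non-null, the fast path skips all bitmap writes for a group of eight rows, which is where the saving would come from.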
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* Add pmod support * Set false for code gen checking * Change arrow branch for test [will revert at last] * Revert "Change arrow branch for test [will revert at last]" This reverts commit 85fd4f3.
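For readers unfamiliar with pmod, the sketch below shows the positive-modulus rule as Spark SQL applies it to integers, to the best of my understanding: take the remainder, and if it is negative, add the divisor and take the remainder again. `Pmod` is a hypothetical helper, not the function added by this commit, and a zero divisor (which Spark handles separately) is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Positive modulus for 32-bit integers, following the rule described above.
// The divisor must be non-zero; that case is outside this sketch.
inline int32_t Pmod(int32_t a, int32_t n) {
  assert(n != 0);
  // Widen to 64 bits so INT32_MIN % -1 and r + n cannot overflow.
  const int64_t wide_n = n;
  const int64_t r = static_cast<int64_t>(a) % wide_n;  // sign follows a
  const int64_t result = (r < 0) ? (r + wide_n) % wide_n : r;
  return static_cast<int32_t>(result);
}
```

For example, `Pmod(-7, 3)` yields 2, whereas the plain C++ expression `-7 % 3` yields -1.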
* Initial commit
* Add unit test
* Fix compile issues
* Fix compile issue in ut and remove decimal support
* Fix runtime issues
* Add separate action class for handling string type input
* Get attribute for first agg func
* Fix bugs in support numeric types
* Add ignoreNulls node in making arrow function
* Handle special case
* Remove a redundant variable
* Exclude first agg in code gen
* Add a unit test for testing group by case
* Format the native code
* Test for removing memset
* merge Numeric type case
* Add #define
* Only remove memset
* Only add macro
* Recover memset and Only add macro
* Use cmov for C2R
* Improve Vector usage
* Remove String case
* Remove memset in Init and add memset in Write
* Add memset for fixedwidth type and add benchmark
* get optimized code from FelixYBW Repo
* Fix int8_t
* Fix String/Binary Buffer
* Fix Multi Rows Buffer Error
* Add native UT and benchmark
* Add Buffer UT in columnar_to_row_converter_test.cc
* Adapt new interfaces
* Fix length and offset in JNI
* Add AVX512 Flags
* Fix GHA
* Add GHA fixes
* make properties enable
* Add CXXFlags
* Fix UT bugs
* Fix clang format
* Add .
* Initial commit * Cover no condition case for BHJ & SHJ
* Initial commit * Implement doColumnarCodeGen * Handle different input types
* Revert "disable row_number() temporary (oap-project#994)" This reverts commit b973977. * improve row_number() Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* [NSE-943] Optimize String/Binary Type for Row2Columnar
* Fix TPCDS queries
* Add __AVX512BW__ Check
* [WIP][NSE-943] Utilize CPU Cache by first-row-second-column and fixed-width type Optimization
* Extract vector from function
* Add optimizations
* Add remaining optimization
* Remove ListType in Native R2C
* Fix Spark UT
* Clean code
* Fix clang format
* s/string/string_view in sort Signed-off-by: Yuan Zhou <yuan.zhou@intel.com> * improve timsort Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
… time zone in handling unix timestamp (oap-project#1021)
* Trim user-specified format in time expression
* Support other formats
* Change arrow branch [will revert at last]
* Fix issues
* Do some converts
* Support more format for from_unixtime
* Align with spark's timezone awareness
* Refine the code
* Add some comment
* Correct the expected results in a UT
* Revert "Change arrow branch [will revert at last]"
This reverts commit 11f0977.
* Initial commit * Change arrow branch [will revert at last] * Refine the code * Ignore a UT * Revert "Change arrow branch [will revert at last]" This reverts commit 60a7b34.
…oject#1017)
* use TimSort for STRING/DECIMAL onekey based sorting
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix sort unit test
std::sort is a stable sort on most times, while Timsort is not stable
this patch changes to sort unit tests to align with Timsort result
gtest repeat 1000 times seems stable
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix format
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* test log
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* remove too many tests
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix sort external test
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
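The unit test adjustment comes down to how rows with equal keys are ordered, which can differ from one sorting routine to another. As a standard-library illustration only (it does not use the project's Timsort, and `Row`, `SortStable`, and `SortedByKey` are hypothetical names), `std::stable_sort` guarantees that ties keep their input order, while `std::sort` makes no such promise. A test that should pass under either can check key order alone.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Row = std::pair<std::string, int>;  // (sort key, original position)

// Sort by key only; ties keep their input order (guaranteed by stable_sort).
inline std::vector<Row> SortStable(std::vector<Row> rows) {
  std::stable_sort(rows.begin(), rows.end(),
                   [](const Row& a, const Row& b) { return a.first < b.first; });
  return rows;
}

// Order-insensitive check for tests: keys must be ascending,
// but rows with equal keys may appear in any relative order.
inline bool SortedByKey(const std::vector<Row>& rows) {
  return std::is_sorted(rows.begin(), rows.end(),
                        [](const Row& a, const Row& b) { return a.first < b.first; });
}
```

Asserting exact row order ties a test to one algorithm's tie-breaking; asserting `SortedByKey` plus the expected set of rows does not.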
* fix: missing WSCG check for keys in join * change comment * remove UnsupportedOperationException in join key check
* Turn on the support for get_json_object * Change arrow branch [will revert at last] * Revert "Change arrow branch [will revert at last]" This reverts commit 53b8fc8.
* Initial commit * Cast short type to int32
* Initial commit * Change arrow branch [will revert at last] * Revert "Change arrow branch [will revert at last]" This reverts commit 545b1b4.
* Initial commit * Ignore some test failures
* Initial commit * Fix bugs * Small fix on code format * Fix bugs for find_in_set * Change arrow branch [will revert at last] * Revert "Change arrow branch [will revert at last]" This reverts commit c727c69.
…oap-project#1054) * support function in substring * remove extra match
…ributeReference (oap-project#1041) * Initial commit * Consider leaf expression * Revert some changes and support to get attr for conv/lpad * Remove the handling for MakeTimestamp
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
zhouyuan force-pushed the wip_hashagg_opt3 branch from 606f2b0 to 6e56197 (August 2, 2022 08:14)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)