fix stats for prefix_eq function #16666
Conflicts: pkg/sql/plan/expr_opt.go pkg/sql/plan/stats.go
* fix bvt test (matrixorigin#16605)
  fix bvt test
  Approved by: @heni02
* remove duplicates in object list for flushing (matrixorigin#16677)
  - reduce the size of tombstone files
  Approved by: @XuPeng-SH
* fix stats for prefix_eq function (matrixorigin#16666)
  Because the index table always serializes the primary key into the index column, its NDV is very high. Selectivity estimates for the index table are therefore badly wrong, and the optimizer misclassifies statements as TP or AP.
  The selectivity of the prefix_eq function is now computed from the selectivity of the original filter condition.
  Approved by: @aunjgr
* Fix condition to ignore delete booking if no transfer needed (matrixorigin#16633)
  Add a log statement to improve traceability when the transfer is not needed
  Approved by: @XuPeng-SH
* make sure all pipeline run in single parallel for tp query (matrixorigin#16685)
  make sure all pipeline run in single parallel for tp query
  Approved by: @ouyuanning, @aunjgr
* [Cherry-pick] handle null in convertRowsIntoBatch (matrixorigin#16676)
  handle null in convertRowsIntoBatch
  Approved by: @daviszhen
* Fix enumtype system variable check (matrixorigin#16691)
  Fix enumtype system variable check
  Approved by: @daviszhen
* support query replica count of special cn (matrixorigin#16642)
  support query replica count of special cn
  Approved by: @reusee, @daviszhen
* split build operator into merge and build operators (matrixorigin#16673)
  Split the data send/receive functionality out of the build operator, giving merge + build operators, in preparation for later refactoring.
  Approved by: @m-schen, @ouyuanning, @aunjgr
* fileservice: add caching dns resolver (matrixorigin#16702)
  fileservice: longer timeout for http client
  Approved by: @zhangxu19830126
* Fix-16620 (matrixorigin#16681)
  1. Reuse latest partition state.
  Approved by: @badboynt1, @m-schen, @XuPeng-SH
* rmTag16601_16597 (matrixorigin#16700)
  rm tag 16601 and 16597
  Approved by: @heni02
* optimize top operator in pipeline for tp query (matrixorigin#16704)
  optimize top operator in pipeline for tp query, don't need mergetop
  Approved by: @m-schen
* optimize limit operator in pipeline for tp query (matrixorigin#16705)
  optimize limit operator in pipeline for tp query, don't need toplimit
  Approved by: @m-schen
* add global system variable and session variable account isolation cases (matrixorigin#16694)
  add global system variable and session variable account isolation cases
  Approved by: @aressu1985
* Add issue 16613 cases (matrixorigin#16719)
  Add issue 16613 cases
  Approved by: @aressu1985
* add case for function hex() and unhex() (matrixorigin#16711)
  add case for hex() and unhex().
  Approved by: @heni02
* optimize group operator in pipeline for tp query (matrixorigin#16717)
  optimize group operator in pipeline for tp query, don't need mergegroup
  Approved by: @m-schen
* add debug info for panic (matrixorigin#16634)
  The problem reported in the issue is an abnormal transaction state. Add transaction-state checks along the call stack where the problem occurs. txnIsValid determines whether the transaction state is abnormal.
  Approved by: @badboynt1, @m-schen, @ouyuanning, @triump2020, @qingxinhome, @aunjgr
* add optimizer hint exectype to force query to be ap or tp (matrixorigin#16722)
  add optimizer hint exectype to force query to be ap or tp
  Approved by: @ouyuanning
* update bloom filter for the new prefix bf (matrixorigin#16684)
  support prefix bloom filter for object reader and writer
  Approved by: @XuPeng-SH
* memorycache: code clean-ups (matrixorigin#16313)
  fileservice: remove IOEntry.ReadFromOSFile
  memorycache: remove RCBytes
  Approved by: @zhangxu19830126
* optimize offset operator in pipeline for tp query (matrixorigin#16706)
  optimize limit operator in pipeline for tp query, don't need mergeoffset
  Approved by: @m-schen
* fix merge

---------

Co-authored-by: YANGGMM <www.yangzhao123@gmail.com>
Co-authored-by: aptend <49832303+aptend@users.noreply.github.com>
Co-authored-by: nitao <badboynt@126.com>
Co-authored-by: Wei Ziran <weiziran125@gmail.com>
Co-authored-by: Kai Cao <ck89119@users.noreply.github.com>
Co-authored-by: fagongzi <zhangxu19830126@gmail.com>
Co-authored-by: reusee <reusee@gmail.com>
Co-authored-by: triump2020 <63033222+triump2020@users.noreply.github.com>
Co-authored-by: Ariznawlll <ariznawl@163.com>
Co-authored-by: heni02 <113406637+heni02@users.noreply.github.com>
Co-authored-by: davis zhen <daviszhen007@gmail.com>
Co-authored-by: GreatRiver <2552853833@qq.com>
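The prefix_eq fix above (matrixorigin#16666) can be illustrated with a minimal sketch. It is not the code in pkg/sql/plan/stats.go: the `columnStats` type, its fields, and both function names are hypothetical, and the uniform-distribution estimate `1/NDV` is an assumption chosen only to show why an index column that embeds the primary key produces a misleadingly small selectivity.

```go
package main

import "fmt"

// columnStats is a hypothetical container for per-column statistics.
// It is not a type from the MatrixOne code base.
type columnStats struct {
	tableRows float64 // number of rows in the table
	ndv       float64 // number of distinct values in the column
}

// eqSelectivity estimates the fraction of rows matched by `col = const`,
// assuming values are uniformly distributed (selectivity = 1/NDV).
func eqSelectivity(s columnStats) float64 {
	if s.tableRows <= 0 || s.ndv <= 0 {
		return 1.0 // no usable statistics: assume nothing is filtered out
	}
	sel := 1.0 / s.ndv
	if sel > 1.0 {
		sel = 1.0
	}
	return sel
}

// prefixEqSelectivity follows the idea described in the commit message:
// estimate prefix_eq on the index table from the statistics of the original
// filter column instead of the serialized index column.
func prefixEqSelectivity(originalFilterCol columnStats) float64 {
	return eqSelectivity(originalFilterCol)
}

func main() {
	// Illustrative numbers only: the base column has 1,000 distinct values,
	// while the index column (value + primary key) is unique per row.
	base := columnStats{tableRows: 1_000_000, ndv: 1_000}
	index := columnStats{tableRows: 1_000_000, ndv: 1_000_000}

	fmt.Printf("estimate from index-column NDV:   %.0f rows\n",
		eqSelectivity(index)*index.tableRows)
	fmt.Printf("estimate from original filter:    %.0f rows\n",
		prefixEqSelectivity(base)*index.tableRows)
}
```

With these made-up statistics the index-column estimate is far smaller than the original-filter estimate, which is the kind of gap that could push a query into the wrong TP/AP class.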
* add issue 16139 cases (matrixorigin#16733)
  add issue 16139 cases
  Approved by: @aressu1985
* handle Restore Duplicate Entry (matrixorigin#16567)
  Bind the transaction's WriteOffset to the current statement during SQL execution, to solve the Halloween problem when reading data.
  MO Checkin Regression test success:
  https://github.com/matrixorigin/ci-test/actions/runs/9362961560
  https://github.com/matrixorigin/ci-test/actions/runs/9379340928
  Approved by: @daviszhen, @badboynt1, @m-schen, @reusee, @zhangxu19830126, @XuPeng-SH, @aunjgr, @triump2020
* Handle Cancel Restore Statement Fail (matrixorigin#16735)
  handle `ctrl+c` failed to cancel during restore data
  Approved by: @daviszhen
* fix a bug that cause ap performance regression on multi cn (matrixorigin#16737)
  fix a bug that cause ap performance regression on multi cn
  Approved by: @ouyuanning
* optimize order operator in pipeline for tp query (matrixorigin#16742)
  Split the mergeorder operator into merge + mergeorder, separating out the data-receiving functionality.
  For a tp query, the pipeline becomes scan->order->mergeorder directly, with no need to connect through connector-merge.
  Approved by: @ouyuanning, @m-schen
* dashboard: refactor runtime dashboard (matrixorigin#16746)
  refine go runtime metrics dashboard
  Approved by: @zhangxu19830126
* malloc: add profiler (matrixorigin#16699)
  malloc: refactor config
  malloc: add chainDeallocator, FuncDeallocator; optimize metrics allocator
  malloc: optimize metrics allocator
  malloc: enable metrics default
  Approved by: @zhangxu19830126
* skip stats for create view (matrixorigin#16728)
  skip stats for create view
  Approved by: @daviszhen, @aunjgr
* fileservice: add disk-based object storage (matrixorigin#16610)
  add local disk s3 fs object storage for testing purposes
  Approved by: @zhangxu19830126
* block reader supports between filter. (matrixorigin#16674)
  block reader supports between filters.
  Approved by: @XuPeng-SH, @heni02, @aunjgr
* change LIMIT,OFFSET's data type from int64 to uint64 (matrixorigin#16697)
  Change the data type of limit and offset to uint64.
  Approved by: @m-schen, @reusee, @ouyuanning, @badboynt1, @aunjgr, @aressu1985
* fix shard service panic (matrixorigin#16759)
  fix shard service panic
  Approved by: @reusee
* make add txn error trace async (matrixorigin#16757)
  Fix hang when adding txn error trace
  Approved by: @iamlinjunhong
* Revert "resubmit pipeline client max connections is too large (matrixorigin#16209)" (matrixorigin#16754)
  Revert "resubmit pipeline client max connections is too large (matrixorigin#16209)"
  Approved by: @badboynt1, @daviszhen
* [bug] launch: fix timeout mechanism when TN service startup (matrixorigin#16760)
  fix timeout mechanism when TN service startup: 5m timeout for total wait and 5s timeout for each request.
  Approved by: @zhangxu19830126
* add a code owner for vector (matrixorigin#16762)
  add XuPeng-SH as vector owner
  Approved by: @fengttt
* refactor the block reader filter to support more expressions (matrixorigin#16756)
  1. `>, >=, <, <=`
  2. `between and, prefix in, prefix between, prefix eq`
  3. `in, eq`
  Approved by: @XuPeng-SH, @aressu1985
* update readme(1.2.0) (matrixorigin#16243)
* Remove list checkpoint meta file (matrixorigin#16723)
  Remove list checkpoint meta file
  Approved by: @XuPeng-SH
* fileservice: fix fd leak in LocalFS.read (matrixorigin#16748)
  fix fd leak in LocalFS.read
  Approved by: @fengttt
* [BugFix]: Remove unnecessary projections in master index (matrixorigin#16766)
  Remove unnecessary projects from Master Index.
```sql
mysql> explain analyze SELECT tbl.a100 FROM tbl WHERE tbl.a75 = 'I2nJ0RqIQu';
+------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core) |
+------------------------------------------------------------------------------------------------+
| Project |
|   Analyze: timeConsumed=0ms waitTime=6ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|   -> Join |
|       Analyze: timeConsumed=2ms waitTime=44ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|       Join Type: INDEX |
|       Join Cond: (tbl.a100 = #[1,0]) |
|       Runtime Filter Build: #[-1,0] |
|       -> Table Scan on a.tbl [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|           Filter Cond: (tbl.a75 = 'I2nJ0RqIQu') |
|           Block Filter Cond: (tbl.a75 = 'I2nJ0RqIQu') |
|           Runtime Filter Probe: tbl.a100 |
|       -> Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN] |
|           Analyze: timeConsumed=2ms waitTime=0ms inputBlocks=6 inputRows=49152 outputRows=1 InputSize=3mb OutputSize=24bytes MemorySize=696320bytes |
|           Filter Cond: prefix_eq(#[0,0], 'F74 FI2nJ0RqIQu ') |
|           Block Filter Cond: prefix_eq(#[0,0], 'F74 FI2nJ0RqIQu ') |
+------------------------------------------------------------------------------------------------+
16 rows in set (0.01 sec)

mysql> explain analyze SELECT tbl.a100 FROM tbl WHERE tbl.a37 = '3Tfm6CEXy5' AND tbl.a94 = '6PRBdXpsVB';
+------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core) |
+------------------------------------------------------------------------------------------------+
| Project |
|   Analyze: timeConsumed=0ms waitTime=15ms inputRows=3 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=0bytes |
|   -> Join |
|       Analyze: timeConsumed=4ms waitTime=72ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|       Join Type: INDEX |
|       Join Cond: (tbl.a100 = #[1,0]) |
|       Runtime Filter Build: #[-1,0] |
|       -> Table Scan on a.tbl [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes |
|           Filter Cond: (tbl.a94 = '6PRBdXpsVB'), (tbl.a37 = '3Tfm6CEXy5') |
|           Block Filter Cond: (tbl.a94 = '6PRBdXpsVB'), (tbl.a37 = '3Tfm6CEXy5') |
|           Runtime Filter Probe: tbl.a100 |
|       -> Join |
|           Analyze: timeConsumed=4ms probe_time=[total=0ms,min=0ms,max=0ms,dop=10] build_time=[4ms] waitTime=57ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=361187bytes |
|           Join Type: INNER |
|           Join Cond: (#[0,0] = #[1,0]) |
|           -> Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN] |
|               Analyze: timeConsumed=3ms waitTime=0ms inputBlocks=4 inputRows=32768 outputRows=1 InputSize=2mb OutputSize=24bytes MemorySize=679936bytes |
|               Filter Cond: prefix_eq(#[0,0], 'F36 F3Tfm6CEXy5 ') |
|               Block Filter Cond: prefix_eq(#[0,0], 'F36 F3Tfm6CEXy5 ') |
|           -> Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN] |
|               Analyze: timeConsumed=3ms waitTime=0ms inputBlocks=6 inputRows=49152 outputRows=1 InputSize=3mb OutputSize=24bytes MemorySize=696320bytes |
|               Filter Cond: prefix_eq(#[0,0], 'F93 F6PRBdXpsVB ') |
|               Block Filter Cond: prefix_eq(#[0,0], 'F93 F6PRBdXpsVB ') |
+------------------------------------------------------------------------------------------------+
24 rows in set (0.00 sec)
```

  Approved by: @badboynt1
* fix
* fix
* fix
* fix

---------

Co-authored-by: YANGGMM <www.yangzhao123@gmail.com>
Co-authored-by: aptend <49832303+aptend@users.noreply.github.com>
Co-authored-by: nitao <badboynt@126.com>
Co-authored-by: Wei Ziran <weiziran125@gmail.com>
Co-authored-by: Kai Cao <ck89119@users.noreply.github.com>
Co-authored-by: fagongzi <zhangxu19830126@gmail.com>
Co-authored-by: reusee <reusee@gmail.com>
Co-authored-by: triump2020 <63033222+triump2020@users.noreply.github.com>
Co-authored-by: Ariznawlll <ariznawl@163.com>
Co-authored-by: heni02 <113406637+heni02@users.noreply.github.com>
Co-authored-by: davis zhen <daviszhen007@gmail.com>
Co-authored-by: GreatRiver <2552853833@qq.com>
Co-authored-by: qingxinhome <70939751+qingxinhome@users.noreply.github.com>
Co-authored-by: gouhongshen <gouhongshen@hotmail.com>
Co-authored-by: zengyan1 <93656539+zengyan1@users.noreply.github.com>
Co-authored-by: LiuBo <g.user.lb@gmail.com>
Co-authored-by: yangj1211 <153493538+yangj1211@users.noreply.github.com>
Co-authored-by: Arjun Sunil Kumar <arjunsk@users.noreply.github.com>
* add SERVER_MORE_RESULTS_EXISTS judgment (matrixorigin#16712)
  Complete the setting of SERVER_MORE_RESULTS_EXISTS when responding to the client.
  Approved by: @daviszhen
* fileservice: tune metrics dashboard (matrixorigin#16770)
  add metrics for io.ReadAll
  Approved by: @zhangxu19830126
* [opt] retain IN expression in prepared stmt (matrixorigin#16744)
  don't convert IN expression to OR list in prepared statements
  Approved by: @ouyuanning
* Refactor view scope execute (matrixorigin#15984)
  Reduce the coupling between the create-view and create-table operations, and improve code readability.
  Approved by: @badboynt1, @ouyuanning, @m-schen, @aunjgr
* optimize join pipeline for tp query (matrixorigin#16773)
  For a tp query, eliminate the build-side pipeline and attach the build operator directly to the right subtree, with no need to connect through connector->merge.
  This is done at compile time, so the pipeline does not need to be modified again at runtime.
  Approved by: @ouyuanning
* add optimizer hints (matrixorigin#16782)
  Add an optimizer hints option that forces all right joins to be changed to left joins.
  Change how the query ap/tp hint is implemented.
  Approved by: @aunjgr
* [BugFix]: Add ColName to MasterIndexScan for Filter PushDown (matrixorigin#16778)
  - Adding ColName to MasterIndex Optimizer Plan for Filter Pushdown.
  - With this change, master index performance is now reasonable. Hence removing the Experimental Flag.
1 Filter Query QPS
- No index: 500
- 1 Master: 2820
- 100 Secondary: 2958

<details>
<summary> Query Plan </summary>

```sql
-- Master Index
mysql> explain analyze SELECT tbl.a100 FROM tbl WHERE tbl.a48 = 'b92k7dWP5t';
+------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core) |
+------------------------------------------------------------------------------------------------+
| Project |
|   Analyze: timeConsumed=0ms waitTime=4ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|   -> Join |
|       Analyze: timeConsumed=1ms waitTime=29ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|       Join Type: INDEX |
|       Join Cond: (tbl.a100 = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col) |
|       Runtime Filter Build: #[-1,0] |
|       -> Table Scan on a.tbl [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|           Filter Cond: (tbl.a48 = 'b92k7dWP5t') |
|           Block Filter Cond: (tbl.a48 = 'b92k7dWP5t') |
|           Runtime Filter Probe: tbl.a100 |
|       -> Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN] |
|           Analyze: timeConsumed=1ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes |
|           Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F47 Fb92k7dWP5t ') |
|           Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F47 Fb92k7dWP5t ') |
+------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)

-- Secondary Index
mysql> explain analyze SELECT tbl.a100 FROM tbl WHERE tbl.a48 = 'b92k7dWP5t';
+------------------------------------------------------------------------------------------------+
| TP QURERY PLAN |
+------------------------------------------------------------------------------------------------+
| Project |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|   -> Join |
|       Analyze: timeConsumed=0ms waitTime=1ms inputRows=1 outputRows=1 InputSize=24bytes OutputSize=24bytes MemorySize=0bytes |
|       Join Type: INDEX |
|       Join Cond: (tbl.a100 = __mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_pri_col) |
|       Runtime Filter Build: #[-1,0] |
|       -> Table Scan on a.tbl [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|           Filter Cond: (tbl.a48 = 'b92k7dWP5t') |
|           Block Filter Cond: (tbl.a48 = 'b92k7dWP5t') |
|           Runtime Filter Probe: tbl.a100 |
|       -> Table Scan on a.__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067 [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=74bytes OutputSize=24bytes MemorySize=75bytes |
|           Filter Cond: prefix_eq(__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_idx_col, 'Fb92k7dWP5t ') |
|           Block Filter Cond: prefix_eq(__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_idx_col, 'Fb92k7dWP5t ') |
+------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)
```

</details>

2 Filter Query QPS
- No Index:
- 1 Master: 1335 (right now we use both the filters using inner join)
- 100 Secondary: 1725 (we only make use of one secondary index table)

<details>
<summary> Query Plan </summary>

```sql
-- master index
mysql> explain analyze SELECT tbl.a100 FROM tbl WHERE tbl.a89 = '40u4JSeGvz' AND tbl.a31 = '3X5ZOcJbol';
+------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core) |
+------------------------------------------------------------------------------------------------+
| Project |
|   Analyze: timeConsumed=0ms waitTime=36ms inputRows=3 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=0bytes |
|   -> Join |
|       Analyze: timeConsumed=11ms waitTime=167ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|       Join Type: INDEX |
|       Join Cond: (tbl.a100 = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col) |
|       Runtime Filter Build: #[-1,0] |
|       -> Table Scan on a.tbl [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes |
|           Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol') |
|           Block Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol') |
|           Runtime Filter Probe: tbl.a100 |
|       -> Join |
|           Analyze: timeConsumed=11ms probe_time=[total=0ms,min=0ms,max=0ms,dop=10] build_time=[11ms] waitTime=145ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=361187bytes |
|           Join Type: INNER |
|           Join Cond: (__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col) |
|           -> Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN] |
|               Analyze: timeConsumed=5ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F30 F3X5ZOcJbol ') |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F30 F3X5ZOcJbol ') |
|           -> Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN] |
|               Analyze: timeConsumed=11ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F88 F40u4JSeGvz ') |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F88 F40u4JSeGvz ') |
+------------------------------------------------------------------------------------------------+
24 rows in set (0.03 sec)

-- secondary index
mysql> explain analyze SELECT tbl.a100 FROM tbl WHERE tbl.a89 = '40u4JSeGvz' AND tbl.a31 = '3X5ZOcJbol';
+------------------------------------------------------------------------------------------------+
| TP QURERY PLAN |
+------------------------------------------------------------------------------------------------+
| Project |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes |
|   -> Join |
|       Analyze: timeConsumed=0ms waitTime=1ms inputRows=1 outputRows=1 InputSize=24bytes OutputSize=24bytes MemorySize=0bytes |
|       Join Type: INDEX |
|       Join Cond: (tbl.a100 = __mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_pri_col) |
|       Runtime Filter Build: #[-1,0] |
|       -> Table Scan on a.tbl [ForceOneCN] |
|           Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1
InputSize=72bytes OutputSize=24bytes MemorySize=75bytes | | Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol') | | Block Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol') | | Runtime Filter Probe: tbl.a100 | | -> Table Scan on a.__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6 [ForceOneCN] | | Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=74bytes OutputSize=24bytes MemorySize=75bytes | | Filter Cond: prefix_eq(__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_idx_col, 'F3X5ZOcJbol ') | | Block Filter Cond: prefix_eq(__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_idx_col, 'F3X5ZOcJbol ') | +-----------------------------------------------------------------------------------------------------------------------------------------------------+ 16 rows in set (0.00 sec) ``` </details> Approved by: @daviszhen, @badboynt1, @heni02 * dashboard: fix runtime dashboard (matrixorigin#16771) refine runtime metrics dashboard Approved by: @zhangxu19830126 * remove 1PC commands (matrixorigin#16786) remove 1PC subcommands Approved by: @XuPeng-SH * do not apply delete when flush (matrixorigin#16731) do not apply delete when flush Approved by: @XuPeng-SH * fix merge * rm 1pc --------- Co-authored-by: YANGGMM <www.yangzhao123@gmail.com> Co-authored-by: aptend <49832303+aptend@users.noreply.github.com> Co-authored-by: nitao <badboynt@126.com> Co-authored-by: Wei Ziran <weiziran125@gmail.com> Co-authored-by: Kai Cao <ck89119@users.noreply.github.com> Co-authored-by: fagongzi <zhangxu19830126@gmail.com> Co-authored-by: reusee <reusee@gmail.com> Co-authored-by: triump2020 <63033222+triump2020@users.noreply.github.com> Co-authored-by: Ariznawlll <ariznawl@163.com> Co-authored-by: heni02 <113406637+heni02@users.noreply.github.com> Co-authored-by: davis zhen <daviszhen007@gmail.com> Co-authored-by: GreatRiver <2552853833@qq.com> Co-authored-by: qingxinhome 
<70939751+qingxinhome@users.noreply.github.com> Co-authored-by: gouhongshen <gouhongshen@hotmail.com> Co-authored-by: zengyan1 <93656539+zengyan1@users.noreply.github.com> Co-authored-by: LiuBo <g.user.lb@gmail.com> Co-authored-by: XuPeng-SH <xupeng3112@163.com> Co-authored-by: yangj1211 <153493538+yangj1211@users.noreply.github.com> Co-authored-by: Arjun Sunil Kumar <arjunsk@users.noreply.github.com> Co-authored-by: CJKkkk_ <66134511+CJKkkk-315@users.noreply.github.com> Co-authored-by: bRong Njam <longran1989@gmail.com>
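Because the index table always serializes the primary key into the index column, its NDV (number of distinct values) is very high. As a result, the selectivity estimate for the index table was badly wrong, which could cause the optimizer to misclassify statements as TP or AP. The selectivity of the prefix_eq function is now computed from the selectivity of the original filter condition. Approved by: @aunjgr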
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #16542
What this PR does / why we need it:
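Because the index table always serializes the primary key into the index column, its NDV (number of distinct values) is very high. As a result, the selectivity estimate for the index table was badly wrong, which could cause the optimizer to misclassify statements as TP or AP.
The selectivity of the prefix_eq function is now computed from the selectivity of the original filter condition.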
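The effect described above can be illustrated with a small Go sketch. It is not the code from this PR: `colStats`, `eqSelectivity`, and `estimatedRows` are hypothetical names, the row counts and NDVs are made-up numbers, and the 1/NDV rule is the textbook equality estimate rather than necessarily what MatrixOne uses. It only shows how far apart the two estimates can be when the index column is nearly unique.

```go
package main

import "fmt"

// colStats is a hypothetical stand-in for per-column optimizer statistics.
type colStats struct {
	rows float64 // table row count
	ndv  float64 // number of distinct values in the column
}

// eqSelectivity applies the textbook 1/NDV estimate for an equality predicate.
func eqSelectivity(s colStats) float64 {
	if s.ndv <= 0 {
		return 1 // no statistics: assume nothing is filtered out
	}
	return 1 / s.ndv
}

// estimatedRows is the row count expected to survive a filter.
func estimatedRows(s colStats, selectivity float64) float64 {
	return s.rows * selectivity
}

func main() {
	// Assumed numbers: 1M rows, 1,000 distinct values in the filtered column.
	base := colStats{rows: 1_000_000, ndv: 1_000}
	// The index column holds serialize(value, primary key), so it is nearly unique.
	index := colStats{rows: 1_000_000, ndv: 1_000_000}

	// Estimate derived from the index column's own NDV.
	naive := estimatedRows(index, eqSelectivity(index))
	// Estimate derived from the original filter on the base column.
	fixed := estimatedRows(index, eqSelectivity(base))

	fmt.Printf("estimate from index-column NDV: %.0f rows\n", naive)
	fmt.Printf("estimate from original filter:  %.0f rows\n", fixed)
}
```

With these assumed statistics the two estimates differ by about three orders of magnitude, which is the kind of gap that can flip a TP/AP decision.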
PR Type
Bug fix, Enhancement
Description
- Enhanced selectivity estimation for filters in the `applyIndicesForFiltersRegularIndex` and `doMergeFiltersOnCompositeKey` functions.
- Refined the `estimateExprSelectivity` function to cache selectivity values and handle various expressions more accurately.

Changes walkthrough 📝
apply_indices.go
Enhance filter selectivity calculation in index application
pkg/sql/plan/apply_indices.go
`applyIndicesForFiltersRegularIndex` function to include selectivity estimation.
filters.
expr_opt.go
Improve selectivity estimation for composite key filters
pkg/sql/plan/expr_opt.go
`doMergeFiltersOnCompositeKey` to include selectivity estimation.
filters.
stats.go
Refine expression selectivity estimation and caching
pkg/sql/plan/stats.go
`estimateExprSelectivity` to cache selectivity values.
utils.go
Include selectivity in expression formatting
pkg/sql/plan/utils.go
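The caching mentioned for `estimateExprSelectivity` in stats.go could look roughly like the following sketch. Everything here is an assumption for illustration: the `expr` type is a simplified, hypothetical node (not MatrixOne's `plan.Expr`), and the rules for `and`/`or` assume independent predicates, which may differ from what the PR implements. The point is only that a computed selectivity is stored on the node, so later passes such as index application can reuse it instead of recomputing it.

```go
package main

import "fmt"

// expr is a hypothetical, simplified expression node.
type expr struct {
	op   string  // "=", "and", or "or"
	ndv  float64 // distinct-value count, used by "=" leaves
	args []*expr // children of "and" / "or"

	selectivity float64 // cached estimate
	cached      bool    // true once selectivity has been computed
}

// calls counts how many nodes were actually evaluated (cache misses).
var calls int

// estimate returns the node's selectivity, computing it at most once.
func estimate(e *expr) float64 {
	if e.cached {
		return e.selectivity
	}
	calls++
	sel := 1.0 // unknown operators are assumed to filter nothing
	switch e.op {
	case "=":
		if e.ndv > 0 {
			sel = 1 / e.ndv
		}
	case "and":
		// Independence assumption: multiply child selectivities.
		for _, a := range e.args {
			sel *= estimate(a)
		}
	case "or":
		// Independence assumption: 1 - P(no child matches).
		miss := 1.0
		for _, a := range e.args {
			miss *= 1 - estimate(a)
		}
		sel = 1 - miss
	}
	e.selectivity, e.cached = sel, true
	return sel
}

func main() {
	a := &expr{op: "=", ndv: 100}
	b := &expr{op: "=", ndv: 10}
	both := &expr{op: "and", args: []*expr{a, b}}

	fmt.Println(estimate(both), "evaluations:", calls)
	// The second call is served from the cache; calls does not grow.
	fmt.Println(estimate(both), "evaluations:", calls)
}
```

A separate `cached` flag is used instead of treating zero as "not computed", since a selectivity of zero is a legitimate estimate.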