[Bug](insert) fix insert wrong data on mv when stmt have multiple values #27297

Merged: BiteTheDDDDt merged 3 commits into apache:master on Nov 21, 2023

@BiteTheDDDDt (Contributor) commented Nov 20, 2023

Proposed changes

#27277
Fix wrong data being inserted into a materialized view (mv) when the insert statement has multiple values.


@BiteTheDDDDt
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.05 seconds
stream load tsv: 575 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17100689314 Bytes
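
The throughput figures in the report above can be reproduced from the byte counts and durations. The sketch below is a reconstruction from the reported numbers, not the robot's actual script: it assumes "MB" means 2**20 bytes and that results are rounded down, because that is the only combination that matches all of the lines. The function names are hypothetical.

```python
import math

MIB = 1024 * 1024  # assumption: the report's "MB" is 2**20 bytes


def load_throughput_mb_s(loaded_bytes: int, seconds: float) -> int:
    """Whole MB/s (rounded down) for a load of `loaded_bytes` taking `seconds`."""
    return math.floor(loaded_bytes / seconds / MIB)


def insert_throughput_k_ops(rows: int, seconds: float) -> int:
    """Thousands of rows per second (rounded down)."""
    return math.floor(rows / seconds / 1000)


# (bytes loaded, seconds) copied from the report above.
loads = {
    "tsv": (74807831229, 575),
    "json": (2358488459, 18),
    "orc": (1101869774, 65),
    "parquet": (861443392, 32),
}

for name, (loaded_bytes, seconds) in loads.items():
    print(f"stream load {name}: about {load_throughput_mb_s(loaded_bytes, seconds)} MB/s")

print(f"insert into select: about {insert_throughput_k_ops(10_000_000, 28.8)}K ops/s")
```

With decimal megabytes (10**6 bytes) the tsv load would come out near 130 MB/s, which does not match the report, hence the 2**20 assumption.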

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit f8ae91119178719ebee7731e14decbb2e2e47743, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4925	4691	4662	4662
q2	359	155	171	155
q3	2021	1913	1958	1913
q4	1385	1290	1266	1266
q5	3988	3966	4026	3966
q6	243	133	133	133
q7	1408	880	898	880
q8	2751	2782	2778	2778
q9	9780	9661	9603	9603
q10	3623	3555	3551	3551
q11	380	234	251	234
q12	434	297	296	296
q13	4592	3807	3822	3807
q14	325	280	298	280
q15	587	554	519	519
q16	660	582	581	581
q17	1134	960	923	923
q18	7771	7354	7376	7354
q19	1658	1718	1677	1677
q20	575	291	282	282
q21	4435	3909	3968	3909
q22	466	375	384	375
Total cold run time: 53500 ms
Total hot run time: 49144 ms
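
The robot does not label the table columns. Judging from the numbers, each row appears to be: query, cold run, two hot runs, and the better (smaller) of the two hot runs, all in milliseconds; the two totals are then the sums of the cold column and of the last column. The sketch below checks that reading on three rows copied from the table above. The column names and helper functions are assumptions made for illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    query: str
    cold: int   # assumed: first (cold) run, ms
    hot1: int   # assumed: first hot run, ms
    hot2: int   # assumed: second hot run, ms
    best: int   # assumed: reported best hot time, ms


# Three rows copied from the first table above (whitespace separated).
SAMPLE = """
q1 4925 4691 4662 4662
q2 359 155 171 155
q8 2751 2782 2778 2778
"""


def parse_rows(text: str) -> list[Row]:
    """Parse non-empty lines of the form: query cold hot1 hot2 best."""
    rows = []
    for line in text.strip().splitlines():
        query, *times = line.split()
        rows.append(Row(query, *map(int, times)))
    return rows


def totals(rows: list[Row]) -> tuple[int, int]:
    """Return (total cold time, total hot time) in ms.

    The hot total takes the smaller of the two hot runs per query.
    """
    cold_total = sum(r.cold for r in rows)
    hot_total = sum(min(r.hot1, r.hot2) for r in rows)
    return cold_total, hot_total


rows = parse_rows(SAMPLE)
print(totals(rows))
```

Note that q8's cold run (2751) is faster than both hot runs, yet the last column reports 2778, which is why the last column is read here as the minimum of the hot runs only.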

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4610	4617	4568	4568
q2	348	229	240	229
q3	4050	4007	4002	4002
q4	2705	2695	2678	2678
q5	9691	9690	9809	9690
q6	241	121	129	121
q7	3020	2484	2430	2430
q8	4447	4478	4481	4478
q9	13291	13041	13042	13041
q10	4119	4196	4228	4196
q11	820	632	668	632
q12	973	805	793	793
q13	4280	3566	3569	3566
q14	384	344	344	344
q15	581	532	516	516
q16	735	671	661	661
q17	3913	3913	3929	3913
q18	9417	9183	8971	8971
q19	1807	1790	1784	1784
q20	2410	2055	2060	2055
q21	8851	8647	8596	8596
q22	875	821	836	821
Total cold run time: 81568 ms
Total hot run time: 78085 ms

@BiteTheDDDDt
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit f8ae91119178719ebee7731e14decbb2e2e47743, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4911	4672	4692	4672
q2	354	162	156	156
q3	2032	1871	1888	1871
q4	1377	1287	1247	1247
q5	3982	3972	3999	3972
q6	249	128	126	126
q7	1408	880	892	880
q8	2739	2788	2768	2768
q9	9756	9878	9613	9613
q10	3460	3562	3532	3532
q11	381	250	251	250
q12	441	295	290	290
q13	4591	3837	3841	3837
q14	315	281	285	281
q15	591	543	528	528
q16	662	583	587	583
q17	1141	980	978	978
q18	7797	7325	7338	7325
q19	1655	1712	1660	1660
q20	527	341	297	297
q21	4352	3971	3949	3949
q22	470	362	382	362
Total cold run time: 53191 ms
Total hot run time: 49177 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4593	4595	4581	4581
q2	344	242	276	242
q3	4017	4020	3990	3990
q4	2705	2683	2688	2683
q5	9934	9819	9810	9810
q6	242	121	120	120
q7	3055	2494	2506	2494
q8	4449	4440	4445	4440
q9	13279	13151	13164	13151
q10	4102	4163	4197	4163
q11	792	666	682	666
q12	982	816	797	797
q13	4287	3564	3574	3564
q14	389	349	344	344
q15	580	517	525	517
q16	749	683	690	683
q17	3976	3958	3876	3876
q18	9450	8953	9019	8953
q19	1817	1777	1763	1763
q20	2388	2039	2066	2039
q21	8802	8444	8467	8444
q22	866	839	778	778
Total cold run time: 81798 ms
Total hot run time: 78098 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.82 seconds
stream load tsv: 581 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17100473954 Bytes

@BiteTheDDDDt
Contributor Author

run buildall

@wm1581066 added the dev/2.0.3 and usercase ("Important user case type label") labels Nov 21, 2023
starocean999 previously approved these changes Nov 21, 2023
@github-actions bot added the approved ("Indicates a PR has been approved by one committer.") label Nov 21, 2023
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit cab69a265bd1bddb5bca0a7b7fda4a55b116f297, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4933	4664	4650	4650
q2	357	173	161	161
q3	2042	1882	1899	1882
q4	1393	1274	1225	1225
q5	4011	3981	4061	3981
q6	253	133	133	133
q7	1457	884	885	884
q8	2763	2809	2788	2788
q9	9768	9617	9706	9617
q10	3493	3538	3562	3538
q11	390	241	251	241
q12	447	295	299	295
q13	4606	3823	3825	3823
q14	321	286	282	282
q15	594	537	527	527
q16	659	586	579	579
q17	1152	987	905	905
q18	7826	7289	7400	7289
q19	1692	1721	1709	1709
q20	535	313	301	301
q21	4417	3976	3973	3973
q22	483	379	376	376
Total cold run time: 53592 ms
Total hot run time: 49159 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4590	4581	4576	4576
q2	335	235	267	235
q3	4013	3997	4019	3997
q4	2715	2706	2705	2705
q5	9615	9672	9611	9611
q6	244	125	125	125
q7	3023	2458	2454	2454
q8	4468	4477	4430	4430
q9	13213	13072	13201	13072
q10	4129	4157	4216	4157
q11	794	703	658	658
q12	967	831	829	829
q13	4319	3592	3620	3592
q14	404	352	358	352
q15	574	525	529	525
q16	733	670	679	670
q17	3880	3864	3872	3864
q18	9460	9049	8936	8936
q19	1835	1804	1802	1802
q20	2360	2057	2041	2041
q21	8965	8541	8417	8417
q22	902	766	790	766
Total cold run time: 81538 ms
Total hot run time: 77814 ms

@BiteTheDDDDt
Contributor Author

run buildall

@github-actions bot removed the approved ("Indicates a PR has been approved by one committer.") label Nov 21, 2023

@dataroaring (Contributor) left a comment

LGTM

@github-actions bot added the approved ("Indicates a PR has been approved by one committer.") label Nov 21, 2023
Contributor

PR approved by at least one committer and no changes requested.

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.53 seconds
stream load tsv: 568 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17099469900 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 7fe4b58fc36388fc3846dabb85cc0d87d69894b5, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4916	4652	4646	4646
q2	372	165	162	162
q3	2034	1933	1906	1906
q4	1399	1286	1287	1286
q5	3977	3935	4052	3935
q6	272	133	134	133
q7	1429	874	899	874
q8	2775	2801	2784	2784
q9	9818	9822	9676	9676
q10	3491	3555	3553	3553
q11	372	257	258	257
q12	448	291	301	291
q13	4615	3823	3768	3768
q14	317	291	286	286
q15	590	536	537	536
q16	660	586	580	580
q17	1140	975	953	953
q18	7939	7442	7434	7434
q19	1665	1688	1674	1674
q20	554	328	307	307
q21	4439	3954	3947	3947
q22	474	378	375	375
Total cold run time: 53696 ms
Total hot run time: 49363 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4618	4575	4569	4569
q2	347	218	259	218
q3	4022	4006	4000	4000
q4	2717	2693	2710	2693
q5	9583	9624	9599	9599
q6	243	126	126	126
q7	3010	2459	2480	2459
q8	4442	4458	4436	4436
q9	13160	13004	13173	13004
q10	4074	4227	4232	4227
q11	779	654	678	654
q12	972	822	821	821
q13	4278	3585	3562	3562
q14	389	386	347	347
q15	575	516	510	510
q16	730	673	692	673
q17	3929	3831	3899	3831
q18	9469	8899	8962	8899
q19	1832	1830	1777	1777
q20	2417	2062	2039	2039
q21	8813	8652	8455	8455
q22	906	784	786	784
Total cold run time: 81305 ms
Total hot run time: 77683 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.26 seconds
stream load tsv: 571 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17099786862 Bytes

@BiteTheDDDDt BiteTheDDDDt merged commit cee8cc4 into apache:master Nov 21, 2023
27 of 29 checks passed
superdiaodiao pushed a commit to superdiaodiao/doris that referenced this pull request Nov 21, 2023
…es (apache#27297)

fix insert wrong data on mv when stmt have multiple values
BiteTheDDDDt added a commit that referenced this pull request Nov 22, 2023
…es (#27297)

fix insert wrong data on mv when stmt have multiple values
xiaokang pushed a commit that referenced this pull request Nov 22, 2023
…es (#27297) (#27382)

fix insert wrong data on mv when stmt have multiple values
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Nov 27, 2023
…es (apache#27297) (apache#27382)

fix insert wrong data on mv when stmt have multiple values
eldenmoon added a commit that referenced this pull request Nov 27, 2023
* [fix](stats) Fix update rows for unique table didn't get updated properly #26968 (#27337)

* [FIX](jsonb) fix jsonb in predict column #27325 (#27424)

* [fix](fe) slots in having clause should be set to need materialized(#27412) (#27429)

* [Bug](insert)fix insert wrong data on mv when stmt have multiple values (#27297) (#27382)

fix insert wrong data on mv when stmt have multiple values

* [fix](fe ut) Fix OlapQueryCacheTest failed (#27305) (#27406)

1.
```
java.lang.NullPointerException: null
        at org.apache.doris.catalog.Env.getCurrentSystemInfo(Env.java:793) ~[classes/:?]
        at org.apache.doris.qe.SimpleScheduler$UpdateBlacklistThread.run(SimpleScheduler.java:206) ~[classes/:?]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]

java.lang.NullPointerException
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:226)
```

2.
```
[ERROR] testSqlCacheKeyWithNestedViewForNereids  Time elapsed: 1.962 s  <<< FAILURE!
java.lang.AssertionError: SELECT command denied to user 'testCluster:testUser'@'192.168.1.1' for table 'internal: testCluster:testDb: appevent'
	at org.apache.doris.qe.OlapQueryCacheTest.parseSqlByNereids(OlapQueryCacheTest.java:579)
	at org.apache.doris.qe.OlapQueryCacheTest.testSqlCacheKeyWithNestedViewForNereids(OlapQueryCacheTest.java:1338)
```

3.
```
[ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.63 s <<< FAILURE! - in org.apache.doris.qe.OlapQueryCacheTest
[ERROR] testCacheModeTable  Time elapsed: 1.657 s  <<< ERROR!
java.lang.IllegalArgumentException: Value of type org.apache.doris.qe.QueryState incompatible with return type org.apache.doris.system.SystemInfoService of org.apache.doris.catalog.Env#getCurrentSystemInfo()
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:156)
```

* [regression test](schema change) add some schema change regression cases (#27112) (#27418)

* [fix](Nereids) result type of add precision is 1 more than expected (#27136) (#27426)

* [fix](Nereids): fill miss slot in having subquery (#27177) (#27394)

* [fix](memory) Fix make_top_consumption_snapshots heap-use-after-free #27434 (#27465)

* [fix](function) make TIMESTAMP function DEPEND_ON_ARGUMENT (#27343) (#27458)

* [fix](test) order by clause in test_map(#27390) (#27391)

pick #27390

* [performance](Planner): optimize getStringValue() in DateLiteral (#27363) (#27470)

- reduce the cost of `getStringValue()`
- the original code doesn't consider the `microsecond` part in `getStringValue()`

(cherry picked from commit 044a295)

* [Chore](pick) do not push down agg on aggregate column (#27356) (#27498)

* [fix](stats) table not exists error msg not print objects name #27074 (#27463)

* [improve](nereids) support agg function of count(const value) pushdown #26677 (#27499)

Support SQL such as `select count(1)-count(not null) from table`, so that the count aggregate can be pushed down.

* [test](fe-ut) fix unstable MysqlServerTest (#27459)

Need to find an unbound port for MysqlServerTest

* [opt](MergedIO) no need to merge large columns (#27315) (#27497)

1. Fix a profile bug of `MergeRangeFileReader`, and add a profile `ApplyBytes` to show the total bytes of ranges.
2. There's no need to merge large columns, because `MergeRangeFileReader` will increase the copy time.

* [improvement](drop tablet)  impr gc shutdown tablet lock (#26151) (#27478)

* [doc](stats) SQL manual for stats (#27461)

* [chore](merge-on-write) disable rowid conversion check for mow table by default (#27482) (#27508)

* [fix](regression)Fix hive p2 case (#27466) (#27511)

* [fix](statistics)Fix auto analyze remove finished job bug #27486 (#27510)

* [Bug](bitmap) Fix heap-use-after-free in the bitmap functions #27411 (#27521)

* [Pick](nereids) Pick: partition prune fails in case of NOT expression (#27047) (#27507)

* [fix](clone) Fix engine_clone file exist (#27361) (#27536)

* [chore](case) adjust timeout of broker load case #27540

* Fix auto analyze doesn't filter unsupported type bug. (#27547)

Fix a bug where auto analyze doesn't filter unsupported types.
Catch Throwable in the auto analyze thread for each database; otherwise the thread quits when one database fails to create jobs, and none of the other databases get analyzed.
Change FE config item full_auto_analyze_simultaneously_running_task_num to auto_analyze_simultaneously_running_task_num.
Backport #27559.

* [chore](fe plugin) Upgrade dependency to doris 2.0-SNAPSHOT #27522 (#27558)

* [Bug](materialized-view) add limitation for duplicate expr on materialized view (#27523) (#27562)

* [fix](planner)join node should output required slot from parent node #27526 (#27551)

* [branch-2.0](hive) enable hive view by default (#27550)

* [pick](nereids) adjust bc join and shuffle join #27113 (#27566)

* [Fix](hive-transactional-table) Fix NPE when query empty hive transactional table. (#27567)

---------

Co-authored-by: AKIRA <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: 谢健 <jianxie0@gmail.com>
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
Co-authored-by: yujun <yu.jun.reach@gmail.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: Jack Drogon <jack.xsuperman@gmail.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
seawinde pushed a commit to seawinde/doris that referenced this pull request Nov 28, 2023
…es (apache#27297)

fix insert wrong data on mv when stmt have multiple values
gnehil pushed a commit to gnehil/doris that referenced this pull request Dec 4, 2023
…es (apache#27297) (apache#27382)

fix insert wrong data on mv when stmt have multiple values
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
…es (apache#27297)

fix insert wrong data on mv when stmt have multiple values
Labels: approved, dev/2.0.3-merged, reviewed, usercase
6 participants