Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix](show-load)Show load npe(userinfo is null) #27698

Merged
merged 1 commit into from
Nov 28, 2023

Conversation

CalvinKirs
Copy link
Member

What happend

In some cases, the user information of the job may be lost, so this will cause NPE to occur.
version 2.0.2
execute sql : show load

java.lang.NullPointerException: null
    at org.apache.doris.load.loadv2.LoadJob.getShowInfo(LoadJob.java:823) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.load.loadv2.LoadManager.getLoadJobInfosByDb(LoadManager.java:565) ~[doris-fe.jar:1.2-SNAPSHOT]

changes

Check whether the user information is null, and add peripheral exception capture and logs to avoid exceptions in a single piece of data, causing the entire show load to fail.

@CalvinKirs
Copy link
Member Author

run buildall

@wm1581066 wm1581066 added dev/2.0.3 usercase Important user case type label labels Nov 28, 2023
@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.82 seconds
stream load tsv: 565 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 31 seconds loaded 2358488459 Bytes, about 72 MB/s
stream load orc: 68 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17100770713 Bytes

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 44d6ea18bdf31989020e28b76aacb141c0de63eb, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4855	4639	4679	4639
q2	370	145	160	145
q3	2053	1919	1884	1884
q4	1403	1258	1228	1228
q5	3981	3950	4004	3950
q6	255	135	135	135
q7	1362	887	825	825
q8	2812	2822	2785	2785
q9	9961	9495	9386	9386
q10	3458	3523	3510	3510
q11	382	252	248	248
q12	439	294	291	291
q13	4566	3813	3799	3799
q14	318	303	286	286
q15	588	536	519	519
q16	489	453	454	453
q17	1152	958	888	888
q18	7903	7572	7415	7415
q19	1673	1696	1672	1672
q20	589	311	301	301
q21	4451	4060	4016	4016
q22	463	370	385	370
Total cold run time: 53523 ms
Total hot run time: 48745 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4589	4598	4592	4592
q2	335	234	285	234
q3	4033	4026	4015	4015
q4	2713	2712	2695	2695
q5	9611	9635	9580	9580
q6	244	122	124	122
q7	2970	2491	2532	2491
q8	4465	4473	4448	4448
q9	12979	12847	12800	12800
q10	4033	4146	4174	4146
q11	776	675	645	645
q12	964	823	800	800
q13	4277	3595	3555	3555
q14	376	339	340	339
q15	577	522	517	517
q16	588	562	615	562
q17	3934	3925	3879	3879
q18	9567	8875	9062	8875
q19	1770	1789	1789	1789
q20	2420	2058	2109	2058
q21	8794	8572	8341	8341
q22	940	800	804	800
Total cold run time: 80955 ms
Total hot run time: 77283 ms

Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 28, 2023
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

Copy link
Collaborator

@wm1581066 wm1581066 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xiaokang xiaokang merged commit c6f43e4 into apache:master Nov 28, 2023
27 of 30 checks passed
CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Nov 28, 2023
@CalvinKirs CalvinKirs deleted the master-showload-npe branch November 29, 2023 01:19
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Dec 3, 2023
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Dec 3, 2023
eldenmoon added a commit that referenced this pull request Dec 4, 2023
* [chore](case) Use correct insert stmt for cold heat separation case #27546 (#27585)

Co-authored-by: AlexYue <yj976240184@gmail.com>

* [enhance](S3) Print the error detail for every s3 operation (#27572) (#27615)

* [nereids] fix stats error when using dateTime type filter #27571 (#27577)

* [fix](planner)sort node should materialized required slots for itself #27605 (#27620)

* [fix](Nereids) non-deterministic expression should not be constant (#27606) (#27631)

* [enhancement](stats) Add process for aggstate type #27640 (#27642)

* [Fix](statistics)Fix bug and improve auto analyze. (#27626) (#27657)

1. Implement needReAnalyzeTable for ExternalTable. For now, external table will not be reanalyzed in 10 days.
2. For HiveMetastoreCache.loadPartitions, handle the empty iterator case to avoid Index out of boundary exception.
3. Wrap handle show analyze loop with try catch, so that when one table failed (for example, catalog dropped so the table couldn't be found anymore), we can still show the other tables.
4. For now, only OlapTable and Hive HMSExternalTable support sample analyze, throw exception for other types of table.
5. In StatisticsCollector, call constructJob after createTableLevelTaskForExternalTable to avoid NPE.

* [profile](bugfix) should not cache profile content because the profile may not be a full profile (#27635)

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>

* [Enhance](fe) Support setting initial root password when FE firstly launch (#27438) (#27603)

* [opt](plan) only lock olap table when query plan #27639 (#27656)

bp #27639

* select coordinator node from user's tag when exec streaming load (#27106) (#27677)

* [fix](statistics)Need to recalculate health value when table row count become 0  #27673 (#27674)

backport #27673

* [fix](statistics)Fix sample min max npe bug  #27702 (#27707)

backport #27702

* [Bug](join) try fix wrong _has_null_in_build_side setted (#27684) (#27710)

* [Fix](show-load)Show load npe(userinfo is null) (#27698) (#27719)

* [pick](nereids)temporary partition is always pruned #27636 (#27722)

* [enhancement](stats) limit bq cap size for analyze task #27685 (#27687)

* [improvement](statistics) Add config for the threshold of column count for auto analyze #27713 (#27723)

* [doc](fix) k8s operator docs fix to 2.0 (#27476)

* [Improvement](planner)support select tablets with nereids optimize #23164 #23365 (#27740)

#23164
#23365

* [FIX](complextype)fix complex type hash equals (#27743)

* [fix](statistics) Fix show auto analyze missing jobs bug (#27761)

* [bugfix](topn) fix coredump in copy_column_data_to_block when nullable mismatch

return RuntimeError if copy_column_data_to_block nullable mismatch to avoid coredump in input_col_ptr->filter_by_selector(sel_rowid_idx, select_size, raw_res_ptr) .

The problem is reported by a doris user but I can not reproduce it, so there is no testcase added currently.

* [opt](stats) Use escape rather than base64 for min/max value #27746 (#27748)

* [refactor](http) disable snapshot and get_log_file api (#27724) (#27770)

* [branch-2.0](pick 27738) Warning log to trace send fragment #27738 (#27760)

* [branch-2.0](pick #27771) Add more detail msg for waitRPC exception (#27773)

* [Bug](pipeline) prevent PipelineFragmentContext destruct early (#27790)

* [deps](compression) Opt gzip decompress by libdeflate on X86 and X86_64 platforms: 1. Add libdeflate lib.  (#27542) (#27711)

Backport from #27542.

* [FIX](case)fix case truncate table first #27792

* [doc](stats) add auto_analyze_table_width_threshold description. (#27818) (#27832)

* [fix](bdbje) Fix bdbje logging level not work (#27597) (#27788)

* `EnvironmentConfig.FILE_LOGGING_LEVEL` only set FileHandlerLevel, we should
   set logger level firstly, otherwise it will not take effect.

* [Opt](compression) Opt gzip decompress by libdeflate on X86 and X86_64 platforms: 2. Opt gzip decompression by libdeflate lib. (#27669) (#27801)

Backport from #27669.

* [branch-2.0](fix) Fix broken exception message #27836

* [Bug](func) coredump in equal for null in function (#27843)

* [minor](stats) Update olap table row count after analyze (#27858)

pick from master #27814

* [fix](stats)min and max return NaN when table is empty (#27863)

fix analyze empty table and min/max null value bug:
1. Skip empty analyze task for sample analyze task. (Full analyze task already skipped).
2. Check sample rows is not 0 before calculate the scale factor.
3. Remove ' in sql template after remove base64 encoding for min/max value.

backport #27862

* [minor](stats) Throw error when sync analyze failed (#27846)

pick from master #27845

* [fix](stats) Don't save colToPartitions anymore to save mem (#27880)

pick from master #27879

* [fix](nereids) set operation's result type is wrong if decimal overflows (#27872)

pick from master #27870

* [Config] Modify the default value of tablet_schema_cache_recycle_interval (#27877)

* [fix](like_func) incorrect result of like with 'NO_BACKSLASH_ESCAPES' mode(#27842) (#27851)

* [fix](fe) Fix show frontends npt in some situations (#27295) (#27789)

```
java.lang.NullPointerException: null
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getMasterSocket(ReplicationGroupAdmin.java:191)
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:607)
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getGroup(ReplicationGroupAdmin.java:406)
    at org.apache.doris.ha.BDBHA.getElectableNodes(BDBHA.java:132)
    at org.apache.doris.common.proc.FrontendsProcNode.getFrontendsInfo(FrontendsProcNode.java:84)
    at org.apache.doris.qe.ShowExecutor.handleShowFrontends(ShowExecutor.java:1923)
    at org.apache.doris.qe.ShowExecutor.execute(ShowExecutor.java:355)
    at org.apache.doris.qe.StmtExecutor.handleShow(StmtExecutor.java:2113)
    ...
```

* [branch-2.0](fix) Fix extremely high CPU usage caused by rf merge #27894 (#27895)

* [fix](stacktrace) ignore stacktrace for error code INVALID_ARGUMENT INVERTED_INDEX_NOT_IMPLEMENTED (#27898)

* ignore stacktrace for error INVALID_ARGUMENT INVERTED_INDEX_NOT_IMPLEMENTED

* AndBlockColumnPredicate::evaluate

* [opt](nereids) Branch-2.0: remove partition & histogram from col stats to reduce memory usage #27885 (#27896)

* [pick](Nereids) temporary partition is selected only if user manually specified: Branch-2.0 #27893 (#27905)

* [fix](multi-catalog)support the max compute partition prune (#27154) (#27902)

backport #27154

* [fix](Nereids) should not push down project to the nullable side of outer join #27912 (#27913)

* fix compile

---------

Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: xzj7019 <131111794+xzj7019@users.noreply.github.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: AKIRA <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: DuRipeng <453243496@qq.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: Calvin Kirs <acm_master@163.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: catpineapple <42031973+catpineapple@users.noreply.github.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
gnehil pushed a commit to gnehil/doris that referenced this pull request Dec 4, 2023
@xiaokang xiaokang mentioned this pull request Dec 4, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/2.0.3-merged reviewed usercase Important user case type label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants