@deardeng deardeng commented Aug 12, 2025

…and replay failure

Fix

2025-08-04 01:00:20,626 ERROR (replayer|119) [EditLog.loadJournal():1439] replay Operation Type 10, log id: 62731
java.lang.NullPointerException: Cannot invoke "org.apache.doris.catalog.Database.createTableWithLock(org.apache.doris.catalog.Table, boolean, boolean)" because "db" is null
        at org.apache.doris.datasource.InternalCatalog.replayCreateTable(InternalCatalog.java:1359) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.catalog.Env.replayCreateTable(Env.java:4767) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:351) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.catalog.Env.replayJournal(Env.java:3103) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.catalog.Env$4.runOneCycle(Env.java:2865) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:119) ~[doris-fe.jar:1.2-SNAPSHOT]

The cause, as seen in the observer's bdbje log, is the following sequence of journal entries:

  1. Key:10, rename db from dbA to dbB
  2. Key:11, rename db from dbB to dbA
  3. Key:12, the edit log for create view (table) records the db name dbB from step 1.

During replay, the db is already named dbA again by the time Key:12 is applied, so looking it up by the recorded name dbB returns null and replayCreateTable throws the NullPointerException (NPE) shown above. The follower crashed.
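
The sequence above can be reproduced in isolation. What follows is only a minimal sketch of the failure mode, not the code changed by this PR: `RenameReplaySketch`, `Db`, `Catalog`, and every method on them are hypothetical stand-ins, and the id-based lookup is just one possible way to make replay independent of renames. The actual fix is not shown in this conversation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the replay problem; not Doris code. */
public class RenameReplaySketch {

    /** Stand-in for a database: a stable id plus a mutable name. */
    static final class Db {
        final long id;
        String name;
        final Set<String> tables = new HashSet<>();

        Db(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Stand-in for the catalog, indexed both by name and by id. */
    static final class Catalog {
        private final Map<String, Db> byName = new HashMap<>();
        private final Map<Long, Db> byId = new HashMap<>();

        void addDb(Db db) {
            byName.put(db.name, db);
            byId.put(db.id, db);
        }

        /** Replays a rename: only the name index changes, the id stays stable. */
        void replayRename(String oldName, String newName) {
            Db db = byName.remove(oldName);
            db.name = newName;
            byName.put(newName, db);
        }

        /** Name-based replay; returns false when the recorded name no longer resolves. */
        boolean replayCreateTableByName(String dbName, String table) {
            Db db = byName.get(dbName);
            if (db == null) {
                // Without this guard, dereferencing db is where an NPE would be thrown.
                return false;
            }
            return db.tables.add(table);
        }

        /** Id-based replay; unaffected by any renames replayed earlier. */
        boolean replayCreateTableById(long dbId, String table) {
            Db db = byId.get(dbId);
            if (db == null) {
                return false;
            }
            return db.tables.add(table);
        }
    }

    /** Replays the two rename entries from the description, in key order. */
    static Catalog replayRenames() {
        Catalog catalog = new Catalog();
        catalog.addDb(new Db(1L, "dbA"));
        catalog.replayRename("dbA", "dbB"); // Key:10
        catalog.replayRename("dbB", "dbA"); // Key:11
        return catalog;
    }

    public static void main(String[] args) {
        Catalog catalog = replayRenames();
        // Key:12 recorded the name dbB, which no longer exists at replay time.
        System.out.println("by name: " + catalog.replayCreateTableByName("dbB", "v1"));
        System.out.println("by id:   " + catalog.replayCreateTableById(1L, "v1"));
    }
}
```

In this sketch the name-based replay of Key:12 fails because `dbB` no longer resolves, while the id-based replay succeeds. Whether the real journal entry carries a database id is an assumption here.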

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

@dataroaring dataroaring left a comment

LGTM

@github-actions

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Aug 12, 2025
@github-actions

PR approved by anyone and no changes requested.

@dataroaring

run buildall

@doris-robot

TPC-H: Total hot run time: 34246 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit afda448f44e09c39c8c7029be0bd7649619a1a3d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17577	5354	5085	5085
q2	1915	286	193	193
q3	10301	1303	716	716
q4	10220	1042	523	523
q5	7504	2410	2368	2368
q6	186	160	135	135
q7	916	748	636	636
q8	9326	1310	1113	1113
q9	7036	5155	5168	5155
q10	6881	2375	1950	1950
q11	497	292	272	272
q12	339	356	228	228
q13	17790	3676	3059	3059
q14	247	249	220	220
q15	553	491	496	491
q16	447	428	373	373
q17	596	834	384	384
q18	7346	7133	7177	7133
q19	1084	990	604	604
q20	357	349	223	223
q21	3960	3274	2393	2393
q22	1061	1028	992	992
Total cold run time: 106139 ms
Total hot run time: 34246 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5223	5159	5190	5159
q2	256	319	224	224
q3	2162	2755	2331	2331
q4	1389	1761	1333	1333
q5	4238	4684	4602	4602
q6	209	170	130	130
q7	2051	1976	1836	1836
q8	2686	2675	2692	2675
q9	7343	7141	7375	7141
q10	3079	3351	2846	2846
q11	585	523	492	492
q12	714	842	655	655
q13	3571	3930	3302	3302
q14	291	338	298	298
q15	517	495	486	486
q16	452	507	458	458
q17	1239	1570	1388	1388
q18	8085	7730	7633	7633
q19	903	854	1037	854
q20	2059	2116	1857	1857
q21	4818	4411	4298	4298
q22	1082	1004	1026	1004
Total cold run time: 52952 ms
Total hot run time: 51002 ms

@doris-robot

TPC-DS: Total hot run time: 184397 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit afda448f44e09c39c8c7029be0bd7649619a1a3d, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	979	406	409	406
query2	6514	1725	1802	1725
query3	6734	219	218	218
query4	26310	23659	22908	22908
query5	4364	616	481	481
query6	310	224	203	203
query7	4630	494	292	292
query8	274	241	217	217
query9	8602	2873	2859	2859
query10	483	336	291	291
query11	15954	15298	14779	14779
query12	175	114	113	113
query13	1660	543	399	399
query14	9500	5811	5810	5810
query15	212	183	159	159
query16	7625	656	459	459
query17	1218	730	584	584
query18	2033	416	319	319
query19	195	191	158	158
query20	132	115	112	112
query21	209	124	111	111
query22	4172	4274	3962	3962
query23	34367	33235	33215	33215
query24	8162	2389	2337	2337
query25	532	470	396	396
query26	1228	274	153	153
query27	2728	507	346	346
query28	4348	2234	2219	2219
query29	757	556	450	450
query30	281	225	189	189
query31	924	795	714	714
query32	83	76	75	75
query33	572	360	325	325
query34	809	848	505	505
query35	788	841	769	769
query36	992	1031	926	926
query37	123	105	87	87
query38	4096	3975	3958	3958
query39	1444	1416	1400	1400
query40	223	123	109	109
query41	58	55	58	55
query42	118	109	106	106
query43	493	496	487	487
query44	1320	847	853	847
query45	173	174	159	159
query46	837	1000	641	641
query47	1749	1799	1734	1734
query48	386	410	310	310
query49	727	482	390	390
query50	639	687	401	401
query51	4207	4154	4093	4093
query52	115	113	98	98
query53	228	264	194	194
query54	592	590	514	514
query55	94	86	85	85
query56	299	304	294	294
query57	1167	1205	1151	1151
query58	281	269	270	269
query59	2622	2807	2638	2638
query60	343	334	314	314
query61	151	124	120	120
query62	802	735	667	667
query63	228	188	193	188
query64	4325	1025	676	676
query65	4311	4201	4203	4201
query66	1089	477	323	323
query67	15280	15256	15049	15049
query68	8316	905	575	575
query69	465	324	286	286
query70	1220	1175	1113	1113
query71	465	322	300	300
query72	5574	4502	4799	4502
query73	739	626	355	355
query74	8939	9124	8878	8878
query75	3861	3057	2632	2632
query76	3695	1136	729	729
query77	786	394	323	323
query78	9599	9740	8911	8911
query79	2433	839	586	586
query80	603	532	476	476
query81	469	257	226	226
query82	429	136	107	107
query83	282	254	234	234
query84	284	96	80	80
query85	868	359	334	334
query86	334	313	270	270
query87	4292	4304	4162	4162
query88	3104	2207	2242	2207
query89	390	306	287	287
query90	1957	231	219	219
query91	141	141	111	111
query92	89	69	65	65
query93	1236	986	649	649
query94	677	385	306	306
query95	384	313	308	308
query96	491	577	278	278
query97	2638	2691	2587	2587
query98	238	224	209	209
query99	1453	1383	1350	1350
Total cold run time: 273827 ms
Total hot run time: 184397 ms

@doris-robot

ClickBench: Total hot run time: 32.35 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit afda448f44e09c39c8c7029be0bd7649619a1a3d, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.08	0.05	0.04
query3	0.25	0.08	0.07
query4	1.65	0.11	0.10
query5	0.41	0.42	0.41
query6	1.16	0.64	0.65
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.63	0.51	0.50
query10	0.59	0.57	0.57
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.63	0.62	0.59
query14	0.81	0.82	0.84
query15	0.89	0.86	0.85
query16	0.39	0.40	0.40
query17	1.05	1.02	1.02
query18	0.21	0.19	0.20
query19	1.95	1.82	1.79
query20	0.01	0.01	0.01
query21	15.40	0.90	0.55
query22	0.78	1.20	0.66
query23	14.89	1.38	0.61
query24	6.97	0.63	0.90
query25	0.47	0.22	0.10
query26	0.62	0.16	0.13
query27	0.05	0.06	0.05
query28	9.45	0.93	0.44
query29	12.54	3.99	3.27
query30	3.15	2.97	3.02
query31	2.84	0.59	0.38
query32	3.23	0.56	0.48
query33	3.13	3.18	3.15
query34	15.85	5.44	4.88
query35	4.86	4.93	4.92
query36	0.70	0.51	0.50
query37	0.10	0.07	0.06
query38	0.06	0.05	0.04
query39	0.03	0.02	0.02
query40	0.17	0.14	0.14
query41	0.08	0.03	0.02
query42	0.03	0.02	0.03
query43	0.04	0.03	0.02
Total cold run time: 106.56 s
Total hot run time: 32.35 s

@dataroaring dataroaring merged commit 3869d9c into apache:master Aug 13, 2025
32 of 34 checks passed
deardeng added a commit to deardeng/incubator-doris that referenced this pull request Sep 14, 2025
apache#54614)

morrySnow pushed a commit that referenced this pull request Sep 17, 2025
deardeng added a commit to deardeng/incubator-doris that referenced this pull request Sep 18, 2025
apache#54614)

deardeng added a commit to deardeng/incubator-doris that referenced this pull request Sep 18, 2025
apache#54614)

@morrySnow morrySnow mentioned this pull request Sep 22, 2025