[Fix](cloud) Should consider tablet state change whether to skip sync_rowsets in publish phase
#48400
run buildall
run buildall
TPC-H: Total hot run time: 32013 ms
TPC-DS: Total hot run time: 184351 ms
ClickBench: Total hot run time: 31.27 s
run buildall
TeamCity cloud ut coverage result:
TPC-H: Total hot run time: 31554 ms
PR approved by anyone and no changes requested.
run buildall
TeamCity cloud ut coverage result:
TPC-H: Total hot run time: 32038 ms
TPC-DS: Total hot run time: 191000 ms
ClickBench: Total hot run time: 31.29 s
BE UT Coverage Report: Increment line coverage, Increment coverage report
dataroaring
left a comment
LGTM
run p0
run p0
…c_rowsets` in publish phase (apache#48400)

Consider the following situation:

1. A heavy SC (schema change) begins.
2. The alter task on tablet X (to tablet Y) is sent to be1.
3. be1 shuts down for some reason.
4. New loads on the new tablet Y are routed to be2, which skips calculating delete bitmaps in the commit phase and the publish phase because the tablet's state is `NOT_READY`.
5. be1 restarts and resumes the alter task.
6. The alter task on be1 finishes and changes the tablet's state to `RUNNING` in MS.
7. Some loads on tablet Y on be2 still skip calculating the delete bitmap because be2 does not know the tablet's state has changed, which causes a duplicate key problem.

Like apache#37670, this PR lets the meta service return the tablet states along with the getDeleteBitmapUpdateLockResponse to FE, and FE sends them to BE so that the BE knows whether it should sync_rowsets() due to a tablet state change on other BEs.
…t tablet states for `GetDeleteBitmapUpdateLockResponse` (#49165)

### What problem does this PR solve?

Fix for #48400. When FE sends the `GetDeleteBitmapUpdateLock` RPC to a lower-version MS, that MS does not set the tablet states field in its response, and FE then fails with `IndexOutOfBoundsException`:

```
2025-03-17 18:05:35,224 WARN (thrift-server-pool-77|200) [FrontendServiceImpl.loadTxnCommit():1676] catch unknown result.
java.lang.IndexOutOfBoundsException: Index:0, Size:0
    at com.google.protobuf.LongArrayList.ensureIndexInRange(LongArrayList.java:288) ~[protobuf-java-3.24.3.jar:?]
    at com.google.protobuf.LongArrayList.getLong(LongArrayList.java:136) ~[protobuf-java-3.24.3.jar:?]
    at com.google.protobuf.LongArrayList.get(LongArrayList.java:131) ~[protobuf-java-3.24.3.jar:?]
    at com.google.protobuf.LongArrayList.get(LongArrayList.java:45) ~[protobuf-java-3.24.3.jar:?]
    at org.apache.doris.cloud.transaction.CloudGlobalTransactionMgr.getDeleteBitmapUpdateLock(CloudGlobalTransactionMgr.java:949) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.cloud.transaction.CloudGlobalTransactionMgr.commitTransaction(CloudGlobalTransactionMgr.java:361) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.cloud.transaction.CloudGlobalTransactionMgr.commitAndPublishTransaction(CloudGlobalTransactionMgr.java:1203) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.service.FrontendServiceImpl.loadTxnCommitImpl(FrontendServiceImpl.java:1730) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.service.FrontendServiceImpl.loadTxnCommit(FrontendServiceImpl.java:1660) ~[doris-fe.jar:1.2-SNAPSHOT]
    at jdk.internal.reflect.GeneratedMethodAccessor121.invoke(Unknown Source) ~[?:?]
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
    at org.apache.doris.service.FeServer.lambda$start$0(FeServer.java:60) ~[doris-fe.jar:1.2-SNAPSHOT]
    at jdk.proxy2.$Proxy45.loadTxnCommit(Unknown Source) ~[?:?]
    at org.apache.doris.thrift.FrontendService$Processor$loadTxnCommit.getResult(FrontendService.java:4282) ~[fe-common-1.2-SNAPSHOT.jar:1.2-SNAPSHOT]
    at org.apache.doris.thrift.FrontendService$Processor$loadTxnCommit.getResult(FrontendService.java:4262) ~[fe-common-1.2-SNAPSHOT.jar:1.2-SNAPSHOT]
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) ~[libthrift-0.16.0.jar:0.16.0]
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) ~[libthrift-0.16.0.jar:0.16.0]
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250) ~[libthrift-0.16.0.jar:0.16.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:833) ~[?:?]
```
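The failure mode above (indexing into a repeated field that an older MS left empty) can be illustrated with a minimal sketch. It is not the code from #49165: the class `TabletStateCompat`, the method `alignTabletStates`, the sentinel `UNKNOWN_STATE`, and the numeric state values are all hypothetical, and plain `java.util.List` stands in for the protobuf repeated field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Doris.
final class TabletStateCompat {
    // Assumed sentinel meaning "the meta service did not report a state".
    static final long UNKNOWN_STATE = -1L;

    // Returns one state per tablet id. If the response carries no states
    // (for example, it came from an older MS) or the sizes disagree, every
    // tablet gets UNKNOWN_STATE instead of the caller indexing out of range.
    static List<Long> alignTabletStates(List<Long> tabletIds, List<Long> tabletStates) {
        if (tabletStates.size() != tabletIds.size()) {
            return new ArrayList<>(
                    Collections.nCopies(tabletIds.size(), Long.valueOf(UNKNOWN_STATE)));
        }
        return new ArrayList<>(tabletStates);
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(10001L, 10002L);
        // Older MS: the states field is empty.
        System.out.println(alignTabletStates(ids, Collections.<Long>emptyList()));
        // Newer MS: one state per tablet (the values here are placeholders).
        System.out.println(alignTabletStates(ids, Arrays.asList(1L, 1L)));
    }
}
```

Checking the list size before reading keeps a mixed-version deployment working; how the receiver should treat an unknown state is a separate decision that this sketch does not settle.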
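What problem does this PR solve?
Consider the following situation:
1. A heavy SC (schema change) begins.
2. The alter task on tablet X (to tablet Y) is sent to be1.
3. be1 shuts down for some reason.
4. New loads on the new tablet Y are routed to be2, which skips calculating delete bitmaps in the commit phase and the publish phase because the tablet's state is NOT_READY.
5. be1 restarts and resumes the alter task.
6. The alter task on be1 finishes and changes the tablet's state to RUNNING in MS.
7. Some loads on tablet Y on be2 still skip calculating the delete bitmap because be2 does not know the tablet's state has changed, which causes a duplicate key problem.
Like #37670, this PR lets the meta service return the tablet states along with the getDeleteBitmapUpdateLockResponse to FE, and FE sends them to BE so that the BE knows whether it should sync_rowsets() due to a tablet state change on other BEs.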
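The decision the BE has to make can be sketched as a comparison between the tablet state it has cached locally and the state the meta service reported. The BE itself is not written in Java, and none of the names below (`PublishSyncDecision`, `shouldSyncRowsets`, the two-value `TabletState` enum) come from the PR. Treating a missing state as "sync" is also an assumption, chosen here as the conservative option.

```java
// Hypothetical sketch of the skip-or-sync decision; not Doris code.
final class PublishSyncDecision {
    // Only the two states named in the description are modeled here.
    enum TabletState { NOT_READY, RUNNING }

    // Returns true when the local view may be stale and the BE should
    // sync rowsets before deciding whether to calculate the delete bitmap.
    static boolean shouldSyncRowsets(TabletState localState, TabletState stateFromMetaService) {
        if (stateFromMetaService == null) {
            // Assumption: no state reported (e.g. older MS), so do not skip.
            return true;
        }
        // A mismatch means another BE changed the tablet state in MS.
        return localState != stateFromMetaService;
    }

    public static void main(String[] args) {
        // Step 7 of the scenario: be2 still sees NOT_READY, MS says RUNNING.
        System.out.println(shouldSyncRowsets(TabletState.NOT_READY, TabletState.RUNNING));
        // States agree, so the sync can be skipped.
        System.out.println(shouldSyncRowsets(TabletState.RUNNING, TabletState.RUNNING));
    }
}
```

In the scenario above, the first call corresponds to be2 after the alter task finished on be1: the mismatch forces a sync, so be2 learns the tablet is RUNNING and no longer skips the delete bitmap calculation.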
Release note
None