[Bug] Concurrent access to _tablets_under_clone/restore should also be sharded #5001

@acelyc111

Describe the bug
I found a core dump whose backtrace looks like this:

Program terminated with signal 11, Segmentation fault.
#0  _M_lower_bound (this=<optimized out>, __k=<optimized out>, __y=0x1abc07c80, __x=0x20) at /usr/include/c++/7.3.0/bits/stl_tree.h:1872
1872	/usr/include/c++/7.3.0/bits/stl_tree.h: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.1.x86_64 libgcc-4.8.5-28.el7_5.1.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  _M_lower_bound (this=<optimized out>, __k=<optimized out>, __y=0x1abc07c80, __x=0x20) at /usr/include/c++/7.3.0/bits/stl_tree.h:1872
#1  equal_range (__k=<synthetic pointer>, this=0x53d8130) at /usr/include/c++/7.3.0/bits/stl_tree.h:1951
#2  erase (__x=<synthetic pointer>, this=0x53d8130) at /usr/include/c++/7.3.0/bits/stl_tree.h:2500
#3  erase (__x=<synthetic pointer>, this=0x53d8130) at /usr/include/c++/7.3.0/bits/stl_set.h:675
#4  doris::TabletManager::unregister_clone_tablet (this=0x53d8000, tablet_id=3120130) at /builds/olap/doris/be/src/olap/tablet_manager.cpp:1105
#5  0x0000000001637bcb in doris::EngineCloneTask::execute (this=0x7f2f08b0abd0) at /builds/olap/doris/be/src/olap/task/engine_clone_task.cpp:69
#6  0x0000000000e11565 in doris::StorageEngine::execute_task (this=0x4df9c00, task=task@entry=0x7f2f08b0abd0) at /builds/olap/doris/be/src/olap/storage_engine.cpp:996
#7  0x0000000001464318 in doris::TaskWorkerPool::_clone_worker_thread_callback (this=0x2bb858fc0) at /builds/olap/doris/be/src/agent/task_worker_pool.cpp:872
#8  0x0000000001176e42 in operator() (this=0x2c23f2ad8) at /usr/include/c++/7.3.0/bits/std_function.h:706
#9  run (this=0x2c23f2ad0) at /builds/olap/doris/be/src/util/threadpool.cpp:42
#10 doris::ThreadPool::dispatch_thread (this=0x239eb90e0) at /builds/olap/doris/be/src/util/threadpool.cpp:551
#11 0x000000000116eca8 in operator() (this=0x2c23eab58) at /usr/include/c++/7.3.0/bits/std_function.h:706
#12 doris::Thread::supervise_thread (arg=0x2c23eab40) at /builds/olap/doris/be/src/util/thread.cpp:385
#13 0x00007f2f81fdedc5 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f2f822ea73d in clone () from /lib64/libc.so.6
(gdb)

Note that one of my config settings is:

tablet_map_shard_size=256
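
To illustrate what the title asks for, here is a minimal sketch of a tablet-id set split into shards, each with its own lock, so that an erase like the one in frame #4 only ever touches a set guarded by the lock it holds. This is not the Doris code: `ShardedTabletIdSet`, its methods, and the modulo shard selection are hypothetical names and choices made for illustration, and it assumes the crash comes from unsynchronized access to one shared `std::set`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical sketch, not Doris code: tablet ids are spread over
// independently locked shards instead of one set shared by all shards.
class ShardedTabletIdSet {
public:
    explicit ShardedTabletIdSet(std::size_t shard_count) : _shards(shard_count) {
        assert(shard_count > 0);
    }

    // Returns true if tablet_id was newly registered.
    bool insert(std::int64_t tablet_id) {
        Shard& shard = _shard_for(tablet_id);
        std::lock_guard<std::mutex> guard(shard.lock);
        return shard.ids.insert(tablet_id).second;
    }

    // Returns true if tablet_id was present and has been removed.
    bool erase(std::int64_t tablet_id) {
        Shard& shard = _shard_for(tablet_id);
        std::lock_guard<std::mutex> guard(shard.lock);
        return shard.ids.erase(tablet_id) > 0;
    }

    // Returns true if tablet_id is currently registered.
    bool contains(std::int64_t tablet_id) {
        Shard& shard = _shard_for(tablet_id);
        std::lock_guard<std::mutex> guard(shard.lock);
        return shard.ids.count(tablet_id) > 0;
    }

private:
    struct Shard {
        std::mutex lock;            // guards only this shard's set
        std::set<std::int64_t> ids; // ids that map to this shard
    };

    // Assumption: shards are chosen by tablet_id modulo the shard count.
    Shard& _shard_for(std::int64_t tablet_id) {
        return _shards[static_cast<std::uint64_t>(tablet_id) % _shards.size()];
    }

    std::vector<Shard> _shards;
};
```

With a layout like this, two clone workers that erase ids mapping to different shards never modify the same `std::set`, and workers that hit the same shard are serialized by that shard's mutex. Constructing it as `ShardedTabletIdSet ids(256);` would mirror the shard count from the config above.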
