[Bug]: IndexNode coredump when DeleteIndex and then rebuild it ( cannot be reproduced stably) #25297
Comments
How large is the IndexNode, i.e. how many cores and how much memory? This seems like a similar issue.
According to the other issue, this is Windows only.
@xige-16 please help revert the AWS SDK to 1.8.186.
Related to #25264 (comment).
I'm sure. CentOS 7.6. IndexNode runs on 4 servers, each 8c16G.
@xiaofan-luan Would it be a good idea to consolidate the network calls into Go for uniformity? I have had so many stories with AWS. 😂
We used to use Go for S3 access, but that introduced an actual copy of the data, so we moved to the C++ SDK.
aws-c-io fix PR: awslabs/aws-c-io#515
Please use Milvus 2.2.12 or later to avoid encountering this issue.
Not reproduced recently; closing for now.
Is there an existing issue for this?
Environment
Current Behavior
Create a new collection and insert 20k new rows from Attu 2.2.6,
or drop a vector index and then rebuild it. Nothing happens when the index is dropped,
but IndexNode core dumps when the index is rebuilt.
Expected Behavior
The index builds stably.
Steps To Reproduce
Milvus Log
2023-07-03 20:12:13,180 INFO [default] [KNOWHERE][SetBlasThreshold][milvus] Set faiss::distance_compute_blas_threshold to 16384
2023-07-03 20:12:13,181 INFO [default] [KNOWHERE][SetEarlyStopThreshold][milvus] Set faiss::early_stop_threshold to 0
2023-07-03 20:12:13,181 INFO [default] [KNOWHERE][SetStatisticsLevel][milvus] Set knowhere::STATISTICS_LEVEL to 0
2023-07-03 20:12:13,181 | DEBUG | default | [SERVER][operator()][milvus] Config easylogging with yaml file: /opt/apps/milvus/configs/easylogging.yaml
2023-07-03 20:12:13,181 | INFO | default | [KNOWHERE][SetSimdType][milvus] FAISS expect simdType::AUTO
2023-07-03 20:12:13,181 | INFO | default | [KNOWHERE][SetSimdType][milvus] FAISS hook AVX2
2023-07-03 20:12:13,181 | DEBUG | default | [SEGCORE][SetIndexSliceSize][milvus] set config index slice size(byte): 16777216
2023-07-03 20:12:13,181 | DEBUG | default | [SEGCORE][SetThreadCoreCoefficient][milvus] set thread pool core coefficient: 10
2023-07-03 20:12:17,279 | INFO | default | [SEGCORE][N6milvus7storage17MinioChunkManagerE::MinioChunkManager][milvus] init MinioChunkManager with parameter[endpoint: 's3plus-bj02.vip.sankuai.com:443', default_bucket_name:'milvus-prod', use_secure:'true']
2023-07-03 20:12:17,279 | WARNING | default | [KNOWHERE][GetGlobalThreadPool][milvus] Global ThreadPool has not been inialized yet, init it now with threads num: 8
2023-07-03 20:12:17,280 | INFO | default | [SEGCORE][N6milvus10ThreadPoolE::ThreadPool][milvus] Thread pool's worker num:80
2023-07-03 20:12:17,647 | WARNING | default | [KNOWHERE][MatchNlist][milvus] Row num 10000 match nlist 256
Fatal error condition occurred in /opt/data/milvus_compile/milvus-2.2.11/cmake_build/3rdparty_download/aws-sdk-subbuild/src/aws_sdk_s3_ep/crt/aws-crt-cpp/crt/aws-c-io/source/event_loop.c:74: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(aws_backtrace_print+0x46) [0x7f05618a4706]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(aws_fatal_assert+0x43) [0x7f056189c003]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(+0x1b3d07) [0x7f05617b0d07]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(aws_ref_count_release+0x1d) [0x7f05618a51ed]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(+0x1b1958) [0x7f05617ae958]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(aws_ref_count_release+0x1d) [0x7f05618a51ed]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws3Crt2Io15ClientBootstrapD2Ev+0x26) [0x7f05617640a6]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws25SetDefaultClientBootstrapERKSt10shared_ptrINS_3Crt2Io15ClientBootstrapEE+0xc2) [0x7f05616bb7d2]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws10CleanupCrtEv+0x22) [0x7f05616bb932]
/opt/apps/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManager14ShutdownSDKAPIEv+0x3e) [0x7f056435261e]
/opt/apps/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManagerD1Ev+0x1e) [0x7f056435269e]
/opt/apps/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManagerD0Ev+0x9) [0x7f0564352939]
bin/milvus(_ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE10_M_releaseEv+0x56) [0x33aaa96]
/opt/apps/milvus/lib/libmilvus_storage.so(_ZNSt23_Sp_counted_ptr_inplaceIN6milvus7storage18MemFileManagerImplESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv+0x86) [0x7f056431a8b6]
/opt/apps/milvus/lib/libmilvus_index.so(_ZN6milvus5index16VectorMemNMIndexD0Ev+0x18a) [0x7f055f9b3eba]
/opt/apps/milvus/lib/libmilvus_indexbuilder.so(_ZN6milvus12indexbuilder15VecIndexCreatorD0Ev+0x2f) [0x7f05638ccdff]
/opt/apps/milvus/lib/libmilvus_indexbuilder.so(DeleteIndex+0x13) [0x7f05638cdc53]
bin/milvus(_cgo_182415b04a2d_Cfunc_DeleteIndex+0x1d) [0x32f582d]
bin/milvus(runtime.asmcgocall.abi0+0x64) [0x1516c04]
SIGABRT: abort
PC=0x7f0561e2c387 m=3 sigcode=18446744073709551610
signal arrived during cgo execution
goroutine 500 [syscall]:
runtime.cgocall(0x32f5810, 0xc000639828)
/usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc000639800 sp=0xc0006397c8 pc=0x14a98fc
github.com/milvus-io/milvus/internal/util/indexcgowrapper._Cfunc_DeleteIndex(0x7f052f306eb0)
_cgo_gotypes.go:344 +0x51 fp=0xc000639828 sp=0xc000639800 pc=0x2c50271
github.com/milvus-io/milvus/internal/util/indexcgowrapper.(*CgoIndex).Delete.func1(0x1d33a97?)
/opt/data/milvus_compile/milvus-2.2.11/internal/util/indexcgowrapper/index.go:320 +0x3a fp=0xc000639860 sp=0xc000639828 pc=0x2c55eba
github.com/milvus-io/milvus/internal/util/indexcgowrapper.(*CgoIndex).Delete(0xc00056a460)
/opt/data/milvus_compile/milvus-2.2.11/internal/util/indexcgowrapper/index.go:320 +0x32 fp=0xc000639898 sp=0xc000639860 pc=0x2c55e32
github.com/milvus-io/milvus/internal/indexnode.(*indexBuildTask).SaveIndexFiles(0xc000f9ad80, {0x46307e8, 0xc001273a40})
/opt/data/milvus_compile/milvus-2.2.11/internal/indexnode/task.go:361 +0x1c8 fp=0xc000639d10 sp=0xc000639898 pc=0x2c628a8
github.com/milvus-io/milvus/internal/indexnode.task.SaveIndexFiles-fm({0x46307e8?, 0xc001273a40?})
:1 +0x3e fp=0xc000639d38 sp=0xc000639d10 pc=0x2c6b0de
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask.func1(0xc0020b3e08)
/opt/data/milvus_compile/milvus-2.2.11/internal/indexnode/task_scheduler.go:207 +0x82 fp=0xc000639d68 sp=0xc000639d38 pc=0x2c65e82
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask(0xc000e9f000, {0x4645ff8, 0xc000f9ad80}, {0x229fba0?, 0xc001f927e0?})
/opt/data/milvus_compile/milvus-2.2.11/internal/indexnode/task_scheduler.go:220 +0x3c9 fp=0xc000639f60 sp=0xc000639d68 pc=0x2c657e9
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func1(0xc001f92840?, {0x4645ff8?, 0xc000f9ad80?})
/opt/data/milvus_compile/milvus-2.2.11/internal/indexnode/task_scheduler.go:253 +0x6c fp=0xc000639fb8 sp=0xc000639f60 pc=0x2c6626c
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func3()
/opt/data/milvus_compile/milvus-2.2.11/internal/indexnode/task_scheduler.go:254 +0x32 fp=0xc000639fe0 sp=0xc000639fb8 pc=0x2c661d2
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000639fe8 sp=0xc000639fe0 pc=0x1516f41
created by github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop
/opt/data/milvus_compile/milvus-2.2.11/internal/indexnode/task_scheduler.go:251 +0x186
goroutine 1 [chan receive]:
runtime.gopark(0xc0006db6b0?, 0xc0006db708?, 0xb3?, 0xdd?, 0xc0006db708?)
/usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc0013816d8 sp=0xc0013816b8 pc=0x14e2216
runtime.chanrecv(0xc0002a4960, 0xc0006dbc00, 0x1)
/usr/local/go/src/runtime/chan.go:583 +0x49d fp=0xc001381768 sp=0xc0013816d8 pc=0x14ac73d
runtime.chanrecv1(0xc0002a4960?, 0xc0006dbda8?)
/usr/local/go/src/runtime/chan.go:442 +0x18 fp=0xc001381790 sp=0xc001381768 pc=0x14ac238
github.com/milvus-io/milvus/cmd/roles.(*MilvusRoles).Run(0xc0007a5e48, 0x0, {0x0, 0x0})
/opt/data/milvus_compile/milvus-2.2.11/cmd/roles/roles.go:346 +0xaf1 fp=0xc001381df8 sp=0xc001381790 pc=0x32e3e71
github.com/milvus-io/milvus/cmd/milvus.(*run).execute(0xc000e3a000, {0xc0000520a0?, 0x5, 0x5}, 0xc000528240)
/opt/data/milvus_compile/milvus-2.2.11/cmd/milvus/run.go:117 +0x68e fp=0xc001381ee0 sp=0xc001381df8 pc=0x32f014e
github.com/milvus-io/milvus/cmd/milvus.RunMilvus({0xc0000520a0?, 0x5, 0x5})
/opt/data/milvus_compile/milvus-2.2.11/cmd/milvus/milvus.go:60 +0x21e fp=0xc001381f58 sp=0xc001381ee0 pc=0x32ef9be
main.main()
/opt/data/milvus_compile/milvus-2.2.11/cmd/main.go:26 +0x2e fp=0xc001381f80 sp=0xc001381f58 pc=0x32f302e
runtime.main()
/usr/local/go/src/runtime/proc.go:250 +0x207 fp=0xc001381fe0 sp=0xc001381f80 pc=0x14e1de7
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc001381fe8 sp=0xc001381fe0 pc=0x1516f41
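To make the native part of the trace above easier to read, here is a minimal Python sketch that splits each `module(symbol+offset) [address]` line into its parts. The `Frame` type and the `parse_frame` / `parse_trace` helpers are hypothetical names introduced only for this illustration; they are not part of Milvus or the AWS SDK, and the regular expression assumes the exact line shape seen in this log.

```python
import os
import re
from typing import List, NamedTuple, Optional

# Matches lines such as:
#   /opt/apps/milvus/lib/libaws-cpp-sdk-core.so(aws_fatal_assert+0x43) [0x7f056189c003]
#   /opt/apps/milvus/lib/libaws-cpp-sdk-core.so(+0x1b3d07) [0x7f05617b0d07]
FRAME_RE = re.compile(
    r"^(?P<module>\S+)"
    r"\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\s+\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    module: str   # file name of the shared object or binary
    symbol: str   # empty string when the frame has no symbol name
    offset: int   # offset inside the symbol (or module, if unnamed)
    address: int  # absolute address printed in brackets


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one native stack-trace line; return None for any other line."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return Frame(
        module=os.path.basename(match["module"]),
        symbol=match["symbol"],
        offset=int(match["offset"], 16),
        address=int(match["addr"], 16),
    )


def parse_trace(text: str) -> List[Frame]:
    """Return the native frames found in a log, innermost frame first."""
    frames = []
    for line in text.splitlines():
        frame = parse_frame(line)
        if frame is not None:
            frames.append(frame)
    return frames


if __name__ == "__main__":
    sample = """\
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(aws_fatal_assert+0x43) [0x7f056189c003]
/opt/apps/milvus/lib/libaws-cpp-sdk-core.so(+0x1b3d07) [0x7f05617b0d07]
/opt/apps/milvus/lib/libmilvus_indexbuilder.so(DeleteIndex+0x13) [0x7f05638cdc53]
SIGABRT: abort
"""
    for frame in parse_trace(sample):
        print(frame.module, frame.symbol or "<no symbol>", hex(frame.offset))
```

Reading the parsed frames from the bottom up gives the call chain from `DeleteIndex` through the `MinioChunkManager` destructor to `aws_fatal_assert`. The mangled C++ names (those starting with `_ZN`) can be passed through a demangler such as `c++filt` for readability.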
Anything else?
No response