[Bug]: index node keeps rebooting crazily #25725
Comments
It seems that after the index is built, a fatal error occurs while it is being saved to S3:
Fatal error condition occurred in /go/src/github.com/milvus-io/milvus/cmake_build/3rdparty_download/aws-sdk-subbuild/src/aws_sdk_s3_ep/crt/aws-crt-cpp/crt/aws-c-io/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
2023-07-18 11:14:27,039 | INFO | default | [SEGCORE][N6milvus7storage17MinioChunkManagerE::MinioChunkManager][milvus] init MinioChunkManager with parameter[endpoint: 'xxxxxxxxxxxxxx', default_bucket_name:'xxxxxxxx', use_secure:'false']
I see that you have rolled back the aws-sdk version. Can I work around this by simply lowering threadCoreCoefficient? It looks like aws-sdk is failing to create a thread.
/assign @jiaoew1991 /unassign
@xiaocai2333
Related to #25297
@xige-16
This is actually another bug, which is fixed in 2.2.12.
Do we have a simple way to handle this situation before 2.2.12 is released?
2.2.12 will be released next Monday.
Wonderful! Thanks so much.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I believe this was fixed in v2.3.0.
Is there an existing issue for this?
Environment
Current Behavior
The index node keeps restarting in a loop.
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
Fatal error condition occurred in /go/src/github.com/milvus-io/milvus/cmake_build/3rdparty_download/aws-sdk-subbuild/src/aws_sdk_s3_ep/crt/aws-crt-cpp/crt/aws-c-io/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
/milvus/lib/libaws-cpp-sdk-core.so(aws_backtrace_print+0x61) [0x7fdb89d9d051]
/milvus/lib/libaws-cpp-sdk-core.so(aws_fatal_assert+0x4d) [0x7fdb89d93f6d]
/milvus/lib/libaws-cpp-sdk-core.so(+0x1c3fb2) [0x7fdb89caefb2]
/milvus/lib/libaws-cpp-sdk-core.so(aws_ref_count_release+0x21) [0x7fdb89d9dc91]
/milvus/lib/libaws-cpp-sdk-core.so(+0x1c190c) [0x7fdb89cac90c]
/milvus/lib/libaws-cpp-sdk-core.so(aws_ref_count_release+0x21) [0x7fdb89d9dc91]
/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws3Crt2Io15ClientBootstrapD2Ev+0x3d) [0x7fdb89c6353d]
/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws25SetDefaultClientBootstrapERKSt10shared_ptrINS_3Crt2Io15ClientBootstrapEE+0x8a) [0x7fdb89baf0ba]
/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws10CleanupCrtEv+0x2f) [0x7fdb89baf25f]
/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManager14ShutdownSDKAPIEv+0x59) [0x7fdb8ad623a9]
/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManagerD1Ev+0x1e) [0x7fdb8ad623ee]
/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManagerD0Ev+0xd) [0x7fdb8ad6275d]
/milvus/lib/libmilvus_storage.so(_ZNSt23_Sp_counted_ptr_inplaceIN6milvus7storage18MemFileManagerImplESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv+0xf0) [0x7fdb8ad25c00]
/milvus/lib/libmilvus_index.so(_ZN6milvus5index16VectorMemNMIndexD0Ev+0x140) [0x7fdb88d59bc0]
/milvus/lib/libmilvus_indexbuilder.so(_ZN6milvus12indexbuilder15VecIndexCreatorD0Ev+0x33) [0x7fdb8a803443]
/milvus/lib/libmilvus_indexbuilder.so(DeleteIndex+0x17) [0x7fdb8a804987]
milvus(_cgo_67e17cc21f76_Cfunc_DeleteIndex+0x21) [0x3292a51]
milvus(runtime.asmcgocall.abi0+0x64) [0x1567a64]
SIGABRT: abort
PC=0x7fdb89f3300b m=14 sigcode=18446744073709551610
signal arrived during cgo execution
goroutine 309 [syscall]:
runtime.cgocall(0x3292a30, 0xc0002d3828)
/usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc0002d3800 sp=0xc0002d37c8 pc=0x14fe57c
github.com/milvus-io/milvus/internal/util/indexcgowrapper._Cfunc_DeleteIndex(0x7fda14306610)
_cgo_gotypes.go:338 +0x51 fp=0xc0002d3828 sp=0xc0002d3800 pc=0x2c36711
github.com/milvus-io/milvus/internal/util/indexcgowrapper.(*CgoIndex).Delete.func1(0x1d55437?)
/go/src/github.com/milvus-io/milvus/internal/util/indexcgowrapper/index.go:320 +0x3a fp=0xc0002d3860 sp=0xc0002d3828 pc=0x2c3c3fa
github.com/milvus-io/milvus/internal/util/indexcgowrapper.(*CgoIndex).Delete(0xc00153a4b0)
/go/src/github.com/milvus-io/milvus/internal/util/indexcgowrapper/index.go:320 +0x32 fp=0xc0002d3898 sp=0xc0002d3860 pc=0x2c3c372
github.com/milvus-io/milvus/internal/indexnode.(*indexBuildTask).SaveIndexFiles(0xc000824600, {0x46612b0, 0xc0016c4180})
/go/src/github.com/milvus-io/milvus/internal/indexnode/task.go:361 +0x1c8 fp=0xc0002d3d10 sp=0xc0002d3898 pc=0x2c48e08
github.com/milvus-io/milvus/internal/indexnode.task.SaveIndexFiles-fm({0x46612b0?, 0xc0016c4180?})
:1 +0x3e fp=0xc0002d3d38 sp=0xc0002d3d10 pc=0x2c5157e
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask.func1(0xc0002d3e08)
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:207 +0x82 fp=0xc0002d3d68 sp=0xc0002d3d38 pc=0x2c4c3a2
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask(0xc000177400, {0x4674fb8, 0xc000824600}, {0xc001492100?, 0xc0003d0120?})
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:220 +0x3c9 fp=0xc0002d3f60 sp=0xc0002d3d68 pc=0x2c4bd09
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func1(0xc0006e17d0?, {0x4674fb8?, 0xc000824600?})
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:253 +0x6c fp=0xc0002d3fb8 sp=0xc0002d3f60 pc=0x2c4c7ac
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func3()
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:254 +0x32 fp=0xc0002d3fe0 sp=0xc0002d3fb8 pc=0x2c4c712
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0002d3fe8 sp=0xc0002d3fe0 pc=0x1567da1
created by github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:251 +0x298
goroutine 1 [chan receive]:
github.com/milvus-io/milvus/cmd/roles.(*MilvusRoles).Run(0xc00061fe48, 0x0, {0x0, 0x0})
/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:346 +0xaf1
github.com/milvus-io/milvus/cmd/milvus.(*run).execute(0xc000f8c060, {0xc000050090?, 0x3, 0x3}, 0xc0009344e0)
/go/src/github.com/milvus-io/milvus/cmd/milvus/run.go:117 +0x6ae
github.com/milvus-io/milvus/cmd/milvus.RunMilvus({0xc000050090?, 0x3, 0x3})
/go/src/github.com/milvus-io/milvus/cmd/milvus/milvus.go:60 +0x21e
main.main()
/go/src/github.com/milvus-io/milvus/cmd/main.go:26 +0x2e
goroutine 166 [select]:
github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc0007da080)
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:147 +0x9f
created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:98 +0x5a
goroutine 163 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc000890b80)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a
goroutine 184 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc001068b80)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a
goroutine 164 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5eec8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc000244100?, 0xc00074c000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc000244100, {0xc00074c000, 0x8000, 0x8000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc000244100, {0xc00074c000?, 0x3ff1880?, 0x1?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000049480, {0xc00074c000?, 0x154ae00?, 0x800010601?})
/usr/local/go/src/net/net.go:183 +0x45
bufio.(*Reader).Read(0xc00050d2c0, {0xc00004a200, 0x9, 0x18?})
/usr/local/go/src/bufio/bufio.go:236 +0x1b4
io.ReadAtLeast({0x4643300, 0xc00050d2c0}, {0xc00004a200, 0x9, 0x9}, 0x9)
/usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
/usr/local/go/src/io/io.go:350
golang.org/x/net/http2.readFrameHeader({0xc00004a200?, 0x9?, 0x2ff8c40?}, {0x4643300?, 0xc00050d2c0?})
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc00004a1c0)
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498 +0x95
google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000021e0)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:1498 +0x414
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:365 +0x193f
goroutine 165 [select]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0001e2370, 0x1)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:407 +0x115
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc00050d320)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:534 +0x85
google.golang.org/grpc/internal/transport.newHTTP2Client.func3()
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:415 +0x65
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:413 +0x1f91
goroutine 167 [chan receive]:
github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc0002c87e0)
/go/pkg/mod/github.com/panjf2000/ants/v2@v2.4.8/pool.go:69 +0x8b
created by github.com/panjf2000/ants/v2.NewPool
/go/pkg/mod/github.com/panjf2000/ants/v2@v2.4.8/pool.go:137 +0x34a
goroutine 185 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5ece8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0010c2080?, 0xc0003f8000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc0010c2080, {0xc0003f8000, 0x8000, 0x8000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0010c2080, {0xc0003f8000?, 0x3ff1880?, 0x1?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000fbdb10, {0xc0003f8000?, 0x154ae00?, 0x800010601?})
/usr/local/go/src/net/net.go:183 +0x45
bufio.(*Reader).Read(0xc000934d20, {0xc00004a040, 0x9, 0x18?})
/usr/local/go/src/bufio/bufio.go:236 +0x1b4
io.ReadAtLeast({0x4643300, 0xc000934d20}, {0xc00004a040, 0x9, 0x9}, 0x9)
/usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
/usr/local/go/src/io/io.go:350
golang.org/x/net/http2.readFrameHeader({0xc00004a040?, 0x9?, 0x4b0093d?}, {0x4643300?, 0xc000934d20?})
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc00004a000)
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498 +0x95
google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc000791a40)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:1498 +0x414
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:365 +0x193f
goroutine 154 [syscall]:
os/signal.signal_recv()
/usr/local/go/src/runtime/sigqueue.go:151 +0x2f
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:23 +0x19
created by os/signal.Notify.func1.1
/usr/local/go/src/os/signal/signal.go:151 +0x2a
goroutine 172 [select]:
github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc000220c80)
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:147 +0x9f
created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:98 +0x5a
goroutine 183 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5edd8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc000220a80?, 0x0?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Accept(0xc000220a80)
/usr/local/go/src/internal/poll/fd_unix.go:614 +0x22c
net.(*netFD).accept(0xc000220a80)
/usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc00014e138)
/usr/local/go/src/net/tcpsock_posix.go:139 +0x28
net.(*TCPListener).Accept(0xc00014e138)
/usr/local/go/src/net/tcpsock.go:288 +0x3d
net/http.(*Server).Serve(0xc0004e2380, {0x465f350, 0xc00014e138})
/usr/local/go/src/net/http/server.go:3039 +0x385
net/http.(*Server).ListenAndServe(0xc0004e2380)
/usr/local/go/src/net/http/server.go:2968 +0x7d
net/http.ListenAndServe(...)
/usr/local/go/src/net/http/server.go:3222
github.com/milvus-io/milvus/internal/management.ServeHTTP.func1()
/go/src/github.com/milvus-io/milvus/internal/management/server.go:69 +0x151
created by github.com/milvus-io/milvus/internal/management.ServeHTTP
/go/src/github.com/milvus-io/milvus/internal/management/server.go:66 +0x25
goroutine 186 [select]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc000fb4960, 0x1)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:407 +0x115
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000934d80)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:534 +0x85
google.golang.org/grpc/internal/transport.newHTTP2Client.func3()
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:415 +0x65
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:413 +0x1f91
goroutine 174 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc0005f7740)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a
goroutine 262 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc001493180)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a
goroutine 258 [select]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc001190a50, 0x1)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:407 +0x115
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc00118f020)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:534 +0x85
google.golang.org/grpc/internal/transport.newHTTP2Client.func3()
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:415 +0x65
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:413 +0x1f91
goroutine 177 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5ebf8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0008d2080?, 0xc000874000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc0008d2080, {0xc000874000, 0x8000, 0x8000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0008d2080, {0xc000874000?, 0x3ff1880?, 0x1?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000545dd0, {0xc000874000?, 0x154ae00?, 0x800010601?})
/usr/local/go/src/net/net.go:183 +0x45
bufio.(*Reader).Read(0xc00118efc0, {0xc0004e24a0, 0x9, 0x18?})
/usr/local/go/src/bufio/bufio.go:236 +0x1b4
io.ReadAtLeast({0x4643300, 0xc00118efc0}, {0xc0004e24a0, 0x9, 0x9}, 0x9)
/usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
/usr/local/go/src/io/io.go:350
golang.org/x/net/http2.readFrameHeader({0xc0004e24a0?, 0x9?, 0x566d165?}, {0x4643300?, 0xc00118efc0?})
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc0004e2460)
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498 +0x95
google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc00087c000)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:1498 +0x414
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:365 +0x193f
goroutine 274 [select]:
github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc000754300)
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:147 +0x9f
created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:98 +0x5a
Anything else?
No response