[Bug]: index node keeps rebooting crazily #25725

Closed
1 task done
cyber-llm-agent opened this issue Jul 18, 2023 · 13 comments
Labels: kind/bug (issues or changes related to a bug), stale (no updates for 30 days), triage/accepted (an issue or PR is ready to be actively worked on)

Comments

@cyber-llm-agent

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:2.2.11
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others: 

data node 4c8g
index node 4c8g
query node 4c8g

Current Behavior

The index node keeps crashing and restarting in a loop.

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

Fatal error condition occurred in /go/src/github.com/milvus-io/milvus/cmake_build/3rdparty_download/aws-sdk-subbuild/src/aws_sdk_s3_ep/crt/aws-crt-cpp/crt/aws-c-io/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
/milvus/lib/libaws-cpp-sdk-core.so(aws_backtrace_print+0x61) [0x7fdb89d9d051]
/milvus/lib/libaws-cpp-sdk-core.so(aws_fatal_assert+0x4d) [0x7fdb89d93f6d]
/milvus/lib/libaws-cpp-sdk-core.so(+0x1c3fb2) [0x7fdb89caefb2]
/milvus/lib/libaws-cpp-sdk-core.so(aws_ref_count_release+0x21) [0x7fdb89d9dc91]
/milvus/lib/libaws-cpp-sdk-core.so(+0x1c190c) [0x7fdb89cac90c]
/milvus/lib/libaws-cpp-sdk-core.so(aws_ref_count_release+0x21) [0x7fdb89d9dc91]
/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws3Crt2Io15ClientBootstrapD2Ev+0x3d) [0x7fdb89c6353d]
/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws25SetDefaultClientBootstrapERKSt10shared_ptrINS_3Crt2Io15ClientBootstrapEE+0x8a) [0x7fdb89baf0ba]
/milvus/lib/libaws-cpp-sdk-core.so(_ZN3Aws10CleanupCrtEv+0x2f) [0x7fdb89baf25f]
/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManager14ShutdownSDKAPIEv+0x59) [0x7fdb8ad623a9]
/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManagerD1Ev+0x1e) [0x7fdb8ad623ee]
/milvus/lib/libmilvus_storage.so(_ZN6milvus7storage17MinioChunkManagerD0Ev+0xd) [0x7fdb8ad6275d]
/milvus/lib/libmilvus_storage.so(_ZNSt23_Sp_counted_ptr_inplaceIN6milvus7storage18MemFileManagerImplESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv+0xf0) [0x7fdb8ad25c00]
/milvus/lib/libmilvus_index.so(_ZN6milvus5index16VectorMemNMIndexD0Ev+0x140) [0x7fdb88d59bc0]
/milvus/lib/libmilvus_indexbuilder.so(_ZN6milvus12indexbuilder15VecIndexCreatorD0Ev+0x33) [0x7fdb8a803443]
/milvus/lib/libmilvus_indexbuilder.so(DeleteIndex+0x17) [0x7fdb8a804987]
milvus(_cgo_67e17cc21f76_Cfunc_DeleteIndex+0x21) [0x3292a51]
milvus(runtime.asmcgocall.abi0+0x64) [0x1567a64]
SIGABRT: abort
PC=0x7fdb89f3300b m=14 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 309 [syscall]:
runtime.cgocall(0x3292a30, 0xc0002d3828)
/usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc0002d3800 sp=0xc0002d37c8 pc=0x14fe57c
github.com/milvus-io/milvus/internal/util/indexcgowrapper._Cfunc_DeleteIndex(0x7fda14306610)
_cgo_gotypes.go:338 +0x51 fp=0xc0002d3828 sp=0xc0002d3800 pc=0x2c36711
github.com/milvus-io/milvus/internal/util/indexcgowrapper.(*CgoIndex).Delete.func1(0x1d55437?)
/go/src/github.com/milvus-io/milvus/internal/util/indexcgowrapper/index.go:320 +0x3a fp=0xc0002d3860 sp=0xc0002d3828 pc=0x2c3c3fa
github.com/milvus-io/milvus/internal/util/indexcgowrapper.(*CgoIndex).Delete(0xc00153a4b0)
/go/src/github.com/milvus-io/milvus/internal/util/indexcgowrapper/index.go:320 +0x32 fp=0xc0002d3898 sp=0xc0002d3860 pc=0x2c3c372
github.com/milvus-io/milvus/internal/indexnode.(*indexBuildTask).SaveIndexFiles(0xc000824600, {0x46612b0, 0xc0016c4180})
/go/src/github.com/milvus-io/milvus/internal/indexnode/task.go:361 +0x1c8 fp=0xc0002d3d10 sp=0xc0002d3898 pc=0x2c48e08
github.com/milvus-io/milvus/internal/indexnode.task.SaveIndexFiles-fm({0x46612b0?, 0xc0016c4180?})
:1 +0x3e fp=0xc0002d3d38 sp=0xc0002d3d10 pc=0x2c5157e
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask.func1(0xc0002d3e08)
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:207 +0x82 fp=0xc0002d3d68 sp=0xc0002d3d38 pc=0x2c4c3a2
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask(0xc000177400, {0x4674fb8, 0xc000824600}, {0xc001492100?, 0xc0003d0120?})
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:220 +0x3c9 fp=0xc0002d3f60 sp=0xc0002d3d68 pc=0x2c4bd09
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func1(0xc0006e17d0?, {0x4674fb8?, 0xc000824600?})
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:253 +0x6c fp=0xc0002d3fb8 sp=0xc0002d3f60 pc=0x2c4c7ac
github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func3()
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:254 +0x32 fp=0xc0002d3fe0 sp=0xc0002d3fb8 pc=0x2c4c712
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0002d3fe8 sp=0xc0002d3fe0 pc=0x1567da1
created by github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop
/go/src/github.com/milvus-io/milvus/internal/indexnode/task_scheduler.go:251 +0x298

goroutine 1 [chan receive]:
github.com/milvus-io/milvus/cmd/roles.(*MilvusRoles).Run(0xc00061fe48, 0x0, {0x0, 0x0})
/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:346 +0xaf1
github.com/milvus-io/milvus/cmd/milvus.(*run).execute(0xc000f8c060, {0xc000050090?, 0x3, 0x3}, 0xc0009344e0)
/go/src/github.com/milvus-io/milvus/cmd/milvus/run.go:117 +0x6ae
github.com/milvus-io/milvus/cmd/milvus.RunMilvus({0xc000050090?, 0x3, 0x3})
/go/src/github.com/milvus-io/milvus/cmd/milvus/milvus.go:60 +0x21e
main.main()
/go/src/github.com/milvus-io/milvus/cmd/main.go:26 +0x2e

goroutine 166 [select]:
github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc0007da080)
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:147 +0x9f
created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:98 +0x5a

goroutine 163 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc000890b80)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a

goroutine 184 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc001068b80)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a

goroutine 164 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5eec8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc000244100?, 0xc00074c000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc000244100, {0xc00074c000, 0x8000, 0x8000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc000244100, {0xc00074c000?, 0x3ff1880?, 0x1?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000049480, {0xc00074c000?, 0x154ae00?, 0x800010601?})
/usr/local/go/src/net/net.go:183 +0x45
bufio.(*Reader).Read(0xc00050d2c0, {0xc00004a200, 0x9, 0x18?})
/usr/local/go/src/bufio/bufio.go:236 +0x1b4
io.ReadAtLeast({0x4643300, 0xc00050d2c0}, {0xc00004a200, 0x9, 0x9}, 0x9)
/usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
/usr/local/go/src/io/io.go:350
golang.org/x/net/http2.readFrameHeader({0xc00004a200?, 0x9?, 0x2ff8c40?}, {0x4643300?, 0xc00050d2c0?})
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc00004a1c0)
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498 +0x95
google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000021e0)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:1498 +0x414
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:365 +0x193f

goroutine 165 [select]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0001e2370, 0x1)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:407 +0x115
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc00050d320)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:534 +0x85
google.golang.org/grpc/internal/transport.newHTTP2Client.func3()
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:415 +0x65
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:413 +0x1f91

goroutine 167 [chan receive]:
github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc0002c87e0)
/go/pkg/mod/github.com/panjf2000/ants/v2@v2.4.8/pool.go:69 +0x8b
created by github.com/panjf2000/ants/v2.NewPool
/go/pkg/mod/github.com/panjf2000/ants/v2@v2.4.8/pool.go:137 +0x34a

goroutine 185 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5ece8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0010c2080?, 0xc0003f8000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc0010c2080, {0xc0003f8000, 0x8000, 0x8000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0010c2080, {0xc0003f8000?, 0x3ff1880?, 0x1?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000fbdb10, {0xc0003f8000?, 0x154ae00?, 0x800010601?})
/usr/local/go/src/net/net.go:183 +0x45
bufio.(*Reader).Read(0xc000934d20, {0xc00004a040, 0x9, 0x18?})
/usr/local/go/src/bufio/bufio.go:236 +0x1b4
io.ReadAtLeast({0x4643300, 0xc000934d20}, {0xc00004a040, 0x9, 0x9}, 0x9)
/usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
/usr/local/go/src/io/io.go:350
golang.org/x/net/http2.readFrameHeader({0xc00004a040?, 0x9?, 0x4b0093d?}, {0x4643300?, 0xc000934d20?})
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc00004a000)
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498 +0x95
google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc000791a40)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:1498 +0x414
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:365 +0x193f

goroutine 154 [syscall]:
os/signal.signal_recv()
/usr/local/go/src/runtime/sigqueue.go:151 +0x2f
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:23 +0x19
created by os/signal.Notify.func1.1
/usr/local/go/src/os/signal/signal.go:151 +0x2a

goroutine 172 [select]:
github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc000220c80)
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:147 +0x9f
created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:98 +0x5a

goroutine 183 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5edd8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc000220a80?, 0x0?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Accept(0xc000220a80)
/usr/local/go/src/internal/poll/fd_unix.go:614 +0x22c
net.(*netFD).accept(0xc000220a80)
/usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc00014e138)
/usr/local/go/src/net/tcpsock_posix.go:139 +0x28
net.(*TCPListener).Accept(0xc00014e138)
/usr/local/go/src/net/tcpsock.go:288 +0x3d
net/http.(*Server).Serve(0xc0004e2380, {0x465f350, 0xc00014e138})
/usr/local/go/src/net/http/server.go:3039 +0x385
net/http.(*Server).ListenAndServe(0xc0004e2380)
/usr/local/go/src/net/http/server.go:2968 +0x7d
net/http.ListenAndServe(...)
/usr/local/go/src/net/http/server.go:3222
github.com/milvus-io/milvus/internal/management.ServeHTTP.func1()
/go/src/github.com/milvus-io/milvus/internal/management/server.go:69 +0x151
created by github.com/milvus-io/milvus/internal/management.ServeHTTP
/go/src/github.com/milvus-io/milvus/internal/management/server.go:66 +0x25

goroutine 186 [select]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc000fb4960, 0x1)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:407 +0x115
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000934d80)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:534 +0x85
google.golang.org/grpc/internal/transport.newHTTP2Client.func3()
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:415 +0x65
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:413 +0x1f91

goroutine 174 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc0005f7740)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a

goroutine 262 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc001493180)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:112 +0x73
created by google.golang.org/grpc.newCCBalancerWrapper
/go/pkg/mod/google.golang.org/grpc@v1.46.0/balancer_conn_wrappers.go:73 +0x22a

goroutine 258 [select]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc001190a50, 0x1)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:407 +0x115
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc00118f020)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/controlbuf.go:534 +0x85
google.golang.org/grpc/internal/transport.newHTTP2Client.func3()
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:415 +0x65
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:413 +0x1f91

goroutine 177 [IO wait]:
internal/poll.runtime_pollWait(0x7fdb85a5ebf8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0008d2080?, 0xc000874000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc0008d2080, {0xc000874000, 0x8000, 0x8000})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc0008d2080, {0xc000874000?, 0x3ff1880?, 0x1?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000545dd0, {0xc000874000?, 0x154ae00?, 0x800010601?})
/usr/local/go/src/net/net.go:183 +0x45
bufio.(*Reader).Read(0xc00118efc0, {0xc0004e24a0, 0x9, 0x18?})
/usr/local/go/src/bufio/bufio.go:236 +0x1b4
io.ReadAtLeast({0x4643300, 0xc00118efc0}, {0xc0004e24a0, 0x9, 0x9}, 0x9)
/usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
/usr/local/go/src/io/io.go:350
golang.org/x/net/http2.readFrameHeader({0xc0004e24a0?, 0x9?, 0x566d165?}, {0x4643300?, 0xc00118efc0?})
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237 +0x6e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc0004e2460)
/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498 +0x95
google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc00087c000)
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:1498 +0x414
created by google.golang.org/grpc/internal/transport.newHTTP2Client
/go/pkg/mod/google.golang.org/grpc@v1.46.0/internal/transport/http2_client.go:365 +0x193f

goroutine 274 [select]:
github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc000754300)
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:147 +0x9f
created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1
/go/src/github.com/milvus-io/milvus/internal/config/etcd_source.go:98 +0x5a

Anything else?

No response

@cyber-llm-agent added the kind/bug and needs-triage labels Jul 18, 2023
@cyber-llm-agent
Author

It seems that a fatal error occurs after the index has been built, while it is being saved to S3.

Fatal error condition occurred in /go/src/github.com/milvus-io/milvus/cmake_build/3rdparty_download/aws-sdk-subbuild/src/aws_sdk_s3_ep/crt/aws-crt-cpp/crt/aws-c-io/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
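
For readers hitting the same assertion: it fails when `aws_thread_launch` does not return `AWS_OP_SUCCESS`, that is, when the SDK could not start its cleanup thread. One possible cause, not confirmed in this thread, is that the process has reached a thread or PID limit. The following is a minimal Python sketch for inspecting those limits from inside the index node container. It assumes Linux with `/proc` mounted; the cgroup paths and all helper names are assumptions for illustration and are not part of Milvus.

```python
import resource


def read_first_line(path):
    """Return the stripped first line of a file, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return fh.readline().strip()
    except OSError:
        return None


def thread_count(pid="self"):
    """Return the thread count listed in /proc/<pid>/status, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as fh:
            for line in fh:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def thread_limits(pid="self"):
    """Collect the thread count of <pid> plus limits that can block thread creation.

    The rlimit values describe the process running this script, not <pid>.
    """
    nproc = getattr(resource, "RLIMIT_NPROC", None)
    soft, hard = resource.getrlimit(nproc) if nproc is not None else (None, None)
    return {
        "threads": thread_count(pid),
        "rlimit_nproc_soft": soft,
        "rlimit_nproc_hard": hard,
        "kernel_threads_max": read_first_line("/proc/sys/kernel/threads-max"),
        # Assumed cgroup v2 and v1 locations; either may be absent on a given host.
        "cgroup_v2_pids_max": read_first_line("/sys/fs/cgroup/pids.max"),
        "cgroup_v1_pids_max": read_first_line("/sys/fs/cgroup/pids/pids.max"),
    }


if __name__ == "__main__":
    for key, value in thread_limits().items():
        print(f"{key}: {value}")
```

Passing the Milvus process ID to `thread_count` shows how many threads the index node currently holds, which can then be compared with the reported limits.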

@cyber-llm-agent
Author

2023-07-18 11:14:27,039 | INFO | default | [SEGCORE][N6milvus7storage17MinioChunkManagerE::MinioChunkManager][milvus] init MinioChunkManager with parameter[endpoint: 'xxxxxxxxxxxxxx', default_bucket_name:'xxxxxxxx', use_secure:'false']
2023-07-18 11:14:27,040 | WARNING | default | [KNOWHERE][GetGlobalThreadPool][milvus] Global ThreadPool has not been inialized yet, init it now with threads num: 4
2023-07-18 11:14:27,041 | INFO | default | [SEGCORE][N6milvus10ThreadPoolE::ThreadPool][milvus] Thread pool's worker num:40
[2023/07/18 11:14:27.713 +00:00] [DEBUG] [indexnode/indexnode_service.go:155] ["querying index build task"] [traceID=68115c633982b008] [ClusterID=zhike-milvus-up-m29d369896787c81f] [IndexBuildID=442936039136550943] [state=InProgress] ["fail reason"=]
2023-07-18 11:14:28,244 | WARNING | default | [KNOWHERE][MatchNlist][milvus] Row num 1029 match nlist 26
[2023/07/18 11:14:28.735 +00:00] [DEBUG] [indexnode/indexnode_service.go:155] ["querying index build task"] [traceID=50a594bac2ab362a] [ClusterID=zhike-milvus-up-m29d369896787c81f] [IndexBuildID=442936039136550943] [state=InProgress] ["fail reason"=]
[2023/07/18 11:14:29.563 +00:00] [INFO] [indexnode/task.go:346] ["Successfully build index"] [buildID=442936039136550943] [Collection=442936039136529002] [SegmentID=442936039136529290]
[2023/07/18 11:14:29.713 +00:00] [DEBUG] [indexnode/indexnode_service.go:155] ["querying index build task"] [traceID=43d8481bbbd1e3e1] [ClusterID=zhike-milvus-up-m29d369896787c81f] [IndexBuildID=442936039136550943] [state=InProgress] ["fail reason"=]
Fatal error condition occurred in /go/src/github.com/milvus-io/milvus/cmake_build/3rdparty_download/aws-sdk-subbuild/src/aws_sdk_s3_ep/crt/aws-crt-cpp/crt/aws-c-io/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
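
To spot this crash signature when scanning index node logs, the fatal line can be split into source file, line number, and failed condition. This is a minimal Python sketch that assumes the line format shown in the log above; the name `parse_fatal_line` is hypothetical.

```python
import re

# Matches lines of the form:
#   Fatal error condition occurred in <file>:<line>: <condition>
FATAL_RE = re.compile(
    r"^Fatal error condition occurred in (?P<file>\S+):(?P<line>\d+): (?P<condition>.+)$"
)


def parse_fatal_line(text):
    """Return file, line, and condition from a fatal-assert log line, or None."""
    match = FATAL_RE.match(text.strip())
    if match is None:
        return None
    return {
        "file": match["file"],
        "line": int(match["line"]),
        "condition": match["condition"],
    }
```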

@cyber-llm-agent
Author

I see that you have rolled back the aws-sdk version. Would simply lowering threadCoreCoefficient solve this? It looks like the aws-sdk is failing to create a thread.
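
As context for this question: the log above reports `Thread pool's worker num:40` on a 4-core index node (`4c8g`). That is consistent with a pool sized as CPU cores multiplied by a coefficient of 10, but the sizing rule below is an assumption, and nothing in this thread confirms that lowering `threadCoreCoefficient` avoids the crash. A small Python sketch of the arithmetic, with hypothetical function names:

```python
def worker_count(cpu_cores, coefficient):
    """Assumed sizing rule: workers = CPU cores * threadCoreCoefficient."""
    if cpu_cores < 1 or coefficient < 1:
        raise ValueError("cpu_cores and coefficient must be positive")
    return cpu_cores * coefficient


def implied_coefficient(cpu_cores, workers):
    """Coefficient implied by an observed worker count under the same assumed rule."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be positive")
    return workers / cpu_cores


if __name__ == "__main__":
    # 4 cores comes from "index node 4c8g"; 40 workers comes from the log line.
    print(implied_coefficient(4, 40))
    # Under the assumed rule, halving the coefficient halves the pool size.
    print(worker_count(4, 5))
```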

@yanliang567
Contributor

/assign @jiaoew1991
I think you are working on a similar issue.

/unassign

@yanliang567 added the triage/accepted label and removed the needs-triage label Jul 19, 2023
@yanliang567 added this to the 2.2.12 milestone Jul 19, 2023
@xiaofan-luan
Collaborator

@xiaocai2333
is this the same issue you are fixing?

@xige-16
Contributor

xige-16 commented Jul 19, 2023

Related to #25297

@cyber-llm-agent
Author

@xige-16
Under what circumstances is this fatal error raised? How can I avoid it without rebuilding Milvus?

@xiaofan-luan
Collaborator

@xige-16 Under what circumstances is this fatal error raised? How can I avoid it without rebuilding Milvus?

This is actually another bug, fixed in 2.2.12.
IndexNode declares a function as raising no error, so an S3 error ends up triggering an index panic.

@cyber-llm-agent
Author

Is there a simple way to handle this situation before 2.2.12 is released,
such as changing a parameter, killing a thread, or performing some action before building an index?
Currently we can only wait for the container to stop and restart, which is not suitable for an online service.

@xiaofan-luan
Collaborator

Is there a simple way to handle this situation before 2.2.12 is released, such as changing a parameter, killing a thread, or performing some action before building an index? Currently we can only wait for the container to stop and restart, which is not suitable for an online service.

2.2.12 will be released next Monday.
@yanliang567 please take care of it

@cyber-llm-agent
Author

Wonderful! Thanks so much.

@yanliang567 modified the milestones: 2.2.12, 2.2.13 Aug 4, 2023
@stale

stale bot commented Sep 6, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale label Sep 6, 2023
@yanliang567
Contributor

I believe this was fixed in v2.3.0.
