[Bug]: Flush hangs when running hello_milvus.py after pulsar recovered from pod kill chaos #17508
Comments
@XuanYang-cn one more flush hang issue... |
May be related to #17335; working on it. |
/assign @zhuwenxing |
It still happened in version failed job: https://github.com/zhuwenxing/milvus/runs/6857375279?check_suite_focus=true I will try it again |
failed job: https://github.com/zhuwenxing/milvus/runs/6858068872?check_suite_focus=true In GitHub Actions, this issue seems to reproduce reliably. |
It also failed in the Jenkins pipeline. |
Same cause as #14577: this time the DataNode's consumer didn't reconnect to Pulsar. Related issues: |
/assign @xige-16 |
/unassign @xige-16 @XuanYang-cn Please help to fix this |
Flush hangs when running failed job:https://github.com/zhuwenxing/milvus/runs/6886845770?check_suite_focus=true |
/assign @zhuwenxing |
failed job: https://github.com/zhuwenxing/milvus/runs/6936878605?check_suite_focus=true /assign @sunby |
Fixed in #17642 |
@zhuwenxing please help verify it |
/assign @sunby |
There is a bug in datacoord. Consider this situation: a segment is allocated by datacoord and the proxy tries to insert some records into it, but Pulsar is killed at that moment, so the segment stays empty. After Flush is called on the proxy, datacoord returns a list of the segments that are waiting to be flushed. But datacoord won't flush an empty segment, so this segment's state stays Sealed forever and the flush hangs. I will fix it by setting the segment's state to Dropped if datacoord finds that the segment is empty, and GetFlushState will check whether the segment is empty. Thanks for the great job, @zhuwenxing. |
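The failure mode described in the comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Milvus code: `ToyDataCoord`, `Segment`, `run_flush_pass`, `flush_completes`, and the `drop_empty` switch are hypothetical names invented for illustration, and the real datacoord is written in Go with a different structure. Only the state names Sealed and Dropped and the idea of a GetFlushState check come from the comment.

```python
from dataclasses import dataclass
from enum import Enum


class SegmentState(Enum):
    GROWING = "Growing"
    SEALED = "Sealed"
    FLUSHED = "Flushed"
    DROPPED = "Dropped"


@dataclass
class Segment:
    segment_id: int
    num_rows: int
    state: SegmentState = SegmentState.GROWING


class ToyDataCoord:
    """Hypothetical stand-in for datacoord; not the real implementation."""

    def __init__(self, segments, drop_empty):
        self.segments = {s.segment_id: s for s in segments}
        # drop_empty=False models the buggy behavior, True models the proposed fix.
        self.drop_empty = drop_empty

    def flush(self):
        """Seal every growing segment and return the IDs the caller waits on."""
        pending = []
        for seg in self.segments.values():
            if seg.state is SegmentState.GROWING:
                seg.state = SegmentState.SEALED
                pending.append(seg.segment_id)
        return pending

    def run_flush_pass(self):
        """One background pass over sealed segments."""
        for seg in self.segments.values():
            if seg.state is not SegmentState.SEALED:
                continue
            if seg.num_rows > 0:
                seg.state = SegmentState.FLUSHED
            elif self.drop_empty:
                # Proposed fix: an empty sealed segment is marked Dropped.
                seg.state = SegmentState.DROPPED
            # Otherwise the empty segment is skipped and stays Sealed.

    def get_flush_state(self, segment_ids):
        """Report True once every requested segment is Flushed or Dropped."""
        done = (SegmentState.FLUSHED, SegmentState.DROPPED)
        return all(self.segments[i].state in done for i in segment_ids)


def flush_completes(drop_empty, max_polls=5):
    """Poll a bounded number of times instead of hanging like the real client."""
    # Segment 2 was allocated but never received rows (Pulsar was killed).
    coord = ToyDataCoord(
        [Segment(1, num_rows=100), Segment(2, num_rows=0)], drop_empty
    )
    pending = coord.flush()
    for _ in range(max_polls):
        coord.run_flush_pass()
        if coord.get_flush_state(pending):
            return True
    return False
```

In this toy model, `flush_completes(drop_empty=False)` never reports completion because the empty segment stays Sealed, while `flush_completes(drop_empty=True)` completes on the first poll. |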
/unassign |
/assign @zhuwenxing please verify |
Not reproduced yet, remove the critical label. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I have one question about this issue.
Is there an existing issue for this?
Environment
Current Behavior
The action timed out when running hello_milvus.py.
The proxy pod outputs duplicate logs.
Expected Behavior
All test cases pass.
Steps To Reproduce
see https://github.com/milvus-io/milvus/runs/6851761481?check_suite_focus=true
Milvus Log
failed job: https://github.com/milvus-io/milvus/runs/6851761481?check_suite_focus=true
log: https://github.com/milvus-io/milvus/suites/6898913792/artifacts/267696164
Anything else?
No response