[Improve][Manager] Add a heartbeat timeout status to the source #7987

Closed
2 tasks done
fuweng11 opened this issue May 9, 2023 · 0 comments · Fixed by #7989

Comments

@fuweng11
Contributor

fuweng11 commented May 9, 2023

Description

Add a heartBeatTimeout status to the source.

Currently, StreamSource has the following problems:
1. A StreamSource whose status is "to be delivered" or "to be modified" cannot be deleted.
2. If the Agent does not report data for a long time because the StreamSource is in an abnormal state, the StreamSource cannot be deleted, or it remains in the "to be issued" status for a long time.

Solution:
Currently, the Agent supports heartbeat reporting and automatic registration. After receiving the heartbeat information from the Agent, the Manager registers the corresponding cluster and node information and sets the node status to NORMAL.

To solve the above two problems, an Agent HEARTBEAT_TIMEOUT status is added to the StreamSource. A StreamSource in this status can be deleted but cannot be modified.
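The rule above can be sketched as a pair of permission checks. This is only an illustration in Python, not the InLong Manager code: the enum members other than HEARTBEAT_TIMEOUT, and the helpers `can_delete` and `can_modify`, are hypothetical names, and treating a NORMAL source as both deletable and modifiable is an assumption.

```python
from enum import Enum, auto


class SourceStatus(Enum):
    # Hypothetical names; the real InLong status codes may differ.
    NORMAL = auto()             # assumed steady state
    TO_BE_ISSUED = auto()       # waiting to be delivered to the Agent
    TO_BE_MODIFIED = auto()     # waiting for a modification to be delivered
    HEARTBEAT_TIMEOUT = auto()  # the new status proposed in this issue


# Pending statuses stay non-deletable; HEARTBEAT_TIMEOUT becomes deletable.
DELETABLE = {SourceStatus.NORMAL, SourceStatus.HEARTBEAT_TIMEOUT}
# HEARTBEAT_TIMEOUT is deliberately absent: it can be deleted, not modified.
MODIFIABLE = {SourceStatus.NORMAL}


def can_delete(status: SourceStatus) -> bool:
    """Return True if a source in this status may be deleted."""
    return status in DELETABLE


def can_modify(status: SourceStatus) -> bool:
    """Return True if a source in this status may be modified."""
    return status in MODIFIABLE
```

With these checks, a source stuck behind an unresponsive Agent can leave the pending status through HEARTBEAT_TIMEOUT and then be deleted.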

In this case, if the Agent cluster that serves the data source to be delivered fails to report heartbeats for any reason, the Manager receives no heartbeat from the Agent for a long time. The Manager then changes the node status to HEARTBEAT_TIMEOUT and changes the status of each StreamSource on that node to HEARTBEAT_TIMEOUT as well.
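A minimal sketch of this detection step follows, written in Python for brevity rather than taken from the Manager. The 300-second threshold, the `Node` and `Source` classes, and the functions `on_heartbeat` and `sweep` are all assumptions for illustration; the issue only says "a long time" and does not describe how the check is scheduled.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

TIMEOUT_SECONDS = 300.0  # assumed threshold; the issue does not give a value


class Status(Enum):
    NORMAL = auto()
    TO_BE_ISSUED = auto()       # hypothetical name for "to be delivered"
    HEARTBEAT_TIMEOUT = auto()


@dataclass
class Source:
    name: str
    status: Status = Status.TO_BE_ISSUED


@dataclass
class Node:
    last_heartbeat: float       # timestamp of the last heartbeat, in seconds
    status: Status = Status.NORMAL
    sources: list = field(default_factory=list)


def on_heartbeat(node: Node, now: float) -> None:
    """Record a heartbeat and set the node status to NORMAL."""
    node.last_heartbeat = now
    node.status = Status.NORMAL


def sweep(nodes: list, now: float, timeout: float = TIMEOUT_SECONDS) -> list:
    """Mark silent nodes and their sources as HEARTBEAT_TIMEOUT."""
    timed_out = []
    for node in nodes:
        if now - node.last_heartbeat > timeout:
            node.status = Status.HEARTBEAT_TIMEOUT
            for source in node.sources:
                source.status = Status.HEARTBEAT_TIMEOUT
            timed_out.append(node)
    return timed_out
```

Passing `now` explicitly instead of reading the clock inside the functions keeps the sketch deterministic and easy to test.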

In this case, you can either delete the data source or wait for the Agent node to recover and continue the task. If you choose to delete the data source, then after the Agent node recovers, the Manager uses the is_deleted property of the current data source to determine whether the previous task needs to continue: if is_deleted != 0, the current data source needs to be deleted, so its status is changed to "to be deleted".
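The recovery decision can be sketched as below. This is an illustrative Python sketch, not the actual Manager logic: the `previous_status` field, the status names other than HEARTBEAT_TIMEOUT, and the function `on_agent_recovered` are hypothetical, and restoring the earlier status when is_deleted is 0 is an assumption about what "continue the task" means.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SourceStatus(Enum):
    TO_BE_ISSUED = auto()       # hypothetical name for "to be delivered"
    TO_BE_DELETED = auto()      # hypothetical name for "to be deleted"
    HEARTBEAT_TIMEOUT = auto()


@dataclass
class Source:
    status: SourceStatus
    is_deleted: int = 0         # non-zero means the user deleted the source
    # Assumed field: the status held before the heartbeat timed out.
    previous_status: SourceStatus = SourceStatus.TO_BE_ISSUED


def on_agent_recovered(source: Source) -> SourceStatus:
    """Decide the next status once the Agent node reports heartbeats again."""
    if source.status is not SourceStatus.HEARTBEAT_TIMEOUT:
        return source.status    # the timeout never affected this source
    if source.is_deleted != 0:
        # Deleted while the Agent was down: schedule the deletion.
        source.status = SourceStatus.TO_BE_DELETED
    else:
        # Not deleted: resume the task that was pending before the timeout.
        source.status = source.previous_status
    return source.status
```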

This addresses both problems listed above.

InLong Component

InLong Manager

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@dockerzhang dockerzhang added this to the 1.7.0 milestone May 10, 2023
@healchow healchow changed the title [Improve][Manager] Add a heartBeatTimeout status to the source [Improve][Manager] Add a heartbeat timeout status to the source May 10, 2023