AMQP auto reconnect feature #207

Closed

ekini
Contributor

@ekini ekini commented Sep 17, 2015

I can't find the code in telegraf that detects when the servers behind output plugins go away. It runs Connect() only once.

So if the RabbitMQ server goes down, telegraf repeatedly logs "can't flush points".
This commit fixes that by using func (*Connection) NotifyClose and reconnecting after a close event.

I'm open to suggestions if there's a better way to do it.
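
Editor's sketch: the reconnect-on-close idea described above could look roughly like the following. To stay self-contained it uses only the standard library, so `conn`, `output`, `dial`, `retryDelay`, `reconnected` and `newDemoOutput` are hypothetical stand-ins, not the code from this commit. `conn.NotifyClose` only mimics the shape of the AMQP client's `(*Connection).NotifyClose`, which delivers an error on a channel when the connection is lost.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// conn is a hypothetical stand-in for a broker connection.
type conn struct {
	closed chan error
}

// NotifyClose returns a channel that yields a value once the connection is lost.
func (c *conn) NotifyClose() <-chan error { return c.closed }

// output is a hypothetical output plugin that owns a single connection.
type output struct {
	dial        func() (*conn, error) // assumed dial function
	retryDelay  time.Duration
	reconnected chan struct{} // signalled after each successful reconnect

	mu    sync.Mutex
	cur   *conn
	dials int
}

// Connect dials once and starts a watcher goroutine for the new connection.
func (o *output) Connect() error {
	c, err := o.dial()
	if err != nil {
		return err
	}
	o.mu.Lock()
	o.cur = c
	o.dials++
	o.mu.Unlock()
	go o.watch(c)
	return nil
}

// watch blocks until the connection reports a close event, then retries
// Connect until it succeeds. While the connection is healthy it only waits.
func (o *output) watch(c *conn) {
	err := <-c.NotifyClose()
	fmt.Println("connection closed:", err)
	for o.Connect() != nil {
		time.Sleep(o.retryDelay)
	}
	select {
	case o.reconnected <- struct{}{}:
	default: // nobody is listening; drop the signal
	}
}

// Dials reports how many successful dials have happened so far.
func (o *output) Dials() int {
	o.mu.Lock()
	defer o.mu.Unlock()
	return o.dials
}

// newDemoOutput builds an output whose dial function hands out fake
// connections; it also returns the first connection so a caller can break it.
func newDemoOutput() (*output, *conn) {
	first := &conn{closed: make(chan error, 1)}
	next := first
	o := &output{retryDelay: 10 * time.Millisecond, reconnected: make(chan struct{}, 1)}
	o.dial = func() (*conn, error) {
		c := next
		next = &conn{closed: make(chan error, 1)}
		return c, nil
	}
	return o, first
}

func main() {
	o, first := newDemoOutput()
	if err := o.Connect(); err != nil {
		fmt.Println("connect failed:", err)
		return
	}
	first.closed <- errors.New("broker went away") // simulate the server going down
	<-o.reconnected
	fmt.Println("dials:", o.Dials())
}
```

The watcher goroutine spends its life blocked on a channel receive, so it costs nothing while the connection is up. A real plugin would also need to re-open channels and guard writes that race with a reconnect, which this sketch leaves out.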

@sparrc
Contributor

sparrc commented Sep 18, 2015

I'm not sure I like having a thread running just to keep this output sink's connection open. Instead, could you modify the flush function (https://github.com/influxdb/telegraf/blob/master/agent.go#L300-L318) to retry making a connection if Write fails once?
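
Editor's sketch: the alternative suggested here, reconnecting from the flush path when a write fails, might look something like this. The `Output` interface is a simplified assumption for illustration (it is not telegraf's actual interface), and `writeWithReconnect` and `flakyOutput` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Output is a hypothetical, simplified output interface for this sketch.
type Output interface {
	Connect() error
	Write(points []string) error
}

// writeWithReconnect tries Write once. If that fails, it reconnects and
// retries the write exactly one more time.
func writeWithReconnect(o Output, points []string) error {
	err := o.Write(points)
	if err == nil {
		return nil
	}
	if cerr := o.Connect(); cerr != nil {
		return fmt.Errorf("write failed (%v); reconnect failed: %w", err, cerr)
	}
	return o.Write(points)
}

var errNotConnected = errors.New("not connected")

// flakyOutput is a fake output that rejects writes until Connect is called.
type flakyOutput struct {
	connected bool
	connects  int
	written   []string
}

func (f *flakyOutput) Connect() error {
	f.connected = true
	f.connects++
	return nil
}

func (f *flakyOutput) Write(points []string) error {
	if !f.connected {
		return errNotConnected
	}
	f.written = append(f.written, points...)
	return nil
}

func main() {
	out := &flakyOutput{} // starts disconnected, as if the server had gone away
	err := writeWithReconnect(out, []string{"cpu usage=0.5"})
	fmt.Println(err, out.connects, out.written)
}
```

This keeps everything on the flush path and needs no extra goroutine, but the output only notices a dead connection on the next write, and the retry logic lives in the core rather than in the plugin.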

@ekini
Contributor Author

ekini commented Sep 18, 2015

But that would add unnecessary logic to the core. I'd rather leave it in the output plugin.
The thread doesn't run constantly; most of the time it just waits for a close event from amqp.

@ekini
Contributor Author

ekini commented Sep 18, 2015

Although, if you/we are going to fix #187, it might be worth implementing a connect/reconnect feature for all outputs at once.

@sparrc
Contributor

sparrc commented Sep 18, 2015

@ekini I don't think it's unnecessary logic. I actually think it's a necessary improvement that we'll need to make: most long-running telegraf instances will lose their connection to one or more output sinks at some point.

@ekini
Contributor Author

ekini commented Sep 23, 2015

Ok, I'm going to close this one to prepare a more general solution, suitable for all outputs.

@ekini ekini closed this Sep 23, 2015
@sparrc
Contributor

sparrc commented Sep 24, 2015

@ekini sorry for the delay getting to this.

After thinking more about it, I think that your solution may in fact be the best option. Many of the outputs (such as InfluxDB) do not manage a persistent connection like AMQP does. In the case of AMQP, a persistent connection makes sense, and it also makes sense to manage those connections on a per-output basis.

I'm going to re-open the PR. Do you mind rebasing, so I can get this merged tomorrow during the USA/Pacific workday?

@ekini
Contributor Author

ekini commented Sep 24, 2015

Ok, rebased to current master.

@sparrc sparrc closed this in c6283d1 Sep 24, 2015