RabbitMQ output #581

Closed
Semyazz opened this issue Dec 22, 2015 · 60 comments

@Semyazz

Semyazz commented Dec 22, 2015

Any plans to implement a RabbitMQ (RMQ) output? I have isolated environments and I want to send everything through RMQ. I use cert-based auth and so on, and it'd be awesome to utilize the same BUS here.

@monicasarbu
Contributor

Not yet, but it would be great if you took up the challenge and added support for it in Packetbeat. We always encourage our community to help us by adding support for the protocols they know best.

@monicasarbu
Contributor

@Semyazz Ah, sorry, I just noticed that you are referring to a RabbitMQ output. Beats does not support RabbitMQ as an output, but you can send data to Logstash, which has a RabbitMQ output plugin: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-rabbitmq.html. We don't plan to add more outputs to Beats; see for example the discussion here: https://github.com/elastic/filebeat/issues/132

@Semyazz
Author

Semyazz commented Dec 23, 2015

Yeah, I've seen that discussion and I kinda disagree. To me any queue, especially RabbitMQ, which I guess is the most standard solution and supports things like cert-based auth (to me the most important), gives you much more freedom to plan your architecture than Logstash. Yeah, you can implement clustering, routing and so on in Logstash, but why would you if you already have a working solution?

Basically I need something light and fast that will send all collected data to my data BUS (RMQ). From there I can transfer it wherever I want, using whatever fancy topology I come up with. At some point I already have Logstash processing that data and pushing it back to RMQ to send it on to another place. After reading many posts about logstash/logstash-forwarder and the ELK deployments people run, I believe many people have the same or at least a similar use case.

Generally I like the Linux tools philosophy, where each tool does its job reliably and doesn't try to be a huge multipurpose piece of software. In this case RMQ is, to me, a great data BUS, Logstash is a perfect data processor, and Beats, just like logstash-forwarder, is the perfect data collector.

@rlwmmw

rlwmmw commented Jan 14, 2016

+1.
There are obviously multiple ways to approach the problem of moving data around, but having the ability to introduce message queuing at every stage would greatly improve the accuracy of log collection, and alleviate back pressure on LS and ES!

@geekpete
Member

geekpete commented Feb 2, 2016

I'm in the same boat, as I currently use rabbitmq to queue inbound messages then have them consumed by a remote logstash. So the only currently supported method involving rabbit is to have beats send to a logstash that outputs to rabbit? I have to insert an additional logstash service in between my log shipper (beats) and my queue (rabbitmq)?

Back to python-beaver forwarder I guess. https://github.com/python-beaver/python-beaver

Add to that, the latest stable version of rabbitmq (3.6.0) has a new feature called lazy queues which will allow you to easily store/backlog hundreds of millions of messages using only a couple of hundred megabytes of ram, as long as the network and disk can handle the inbound.
https://www.rabbitmq.com/lazy-queues.html
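
As a rough illustration of the lazy queue feature mentioned above: per the linked page, a queue can be declared lazy through the optional `x-queue-mode` argument. The Go sketch below only builds that argument table. The helper name `lazyQueueArgs` is made up for this example, and how the table is handed to a queue declaration depends on the AMQP client library in use, which is not shown here.

```go
package main

import "fmt"

// lazyQueueArgs is a hypothetical helper. It returns the optional arguments
// a client would pass when declaring a queue so that RabbitMQ (3.6.0+)
// treats it as a lazy queue and keeps messages on disk instead of in RAM.
func lazyQueueArgs() map[string]interface{} {
	return map[string]interface{}{
		"x-queue-mode": "lazy", // argument name taken from the lazy queues page
	}
}

func main() {
	// A real program would pass this table to its client's queue-declare call.
	fmt.Println(lazyQueueArgs())
}
```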

@monicasarbu
Contributor

@geekpete Yes, currently you need a logstash instance in between to transfer the data from the beats shipper to rabbitmq.

@fatmcgav

It would be great if RabbitMQ would be considered for #943.

As above, I've got an existing RMQ BUS in place, and currently use Beaver to stick the logs into RMQ for Logstash to then consume...

@geekpete
Member

@fatmcgav What version of beaver are you using?

@fatmcgav

@geekpete Apologies for the delay in responding... I'm currently running Beaver 34.1.0...

Cheers
Gav

@monicasarbu
Contributor

RabbitMQ output was requested also in the old libbeat repository: https://github.com/elastic/libbeat/issues/313

@timstoop

timstoop commented Mar 2, 2016

Reiterating my comment from the old ticket:

We use RabbitMQ as a buffer and as a way to easily distribute messages to the processing logstashes. Logstash itself tends to require a lot more resources to be able to handle the datastream than a RabbitMQ, even when it's only configured to push messages towards a queue. RabbitMQ is in our opinion far better at handling sudden increases in events than logstash with the added benefit of being a buffer in case the logstashes can't handle the traffic by themselves. Scaling in another logstash or two to add processing power would then quickly empty the queues.

Another reason for us is so we can better manage resources between customers. We maintain servers for several customers, each with their own queue in RabbitMQ. The logstashes are set up to treat each queue equally, so a sudden increase in traffic for customer X does not necessarily cause a slowdown of the processing for customer Y as well. In logstash, we would have to open additional ports for doing this (as far as I know, at least). That's why we would like to have AMQP support for libbeat.

Someone in this thread mentioned certificate-based authentication. We need that as well, but as I wasn't sure whether logstash supported it (and there are ways around it if need be), I didn't originally mention it. We get a lot of data from other datacenters and we do not always have a VPN between them, so having certificate-based authentication and encryption is very nice. We currently use Beaver as the client as well.
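
To make the "treat each queue equally" idea above concrete, here is a minimal Go sketch of round-robin draining across per-customer backlogs. It is only an illustration under stated assumptions: `Event`, `drainFair` and the customer names are hypothetical, and this is not how Logstash or RabbitMQ actually schedule consumers.

```go
package main

import "fmt"

// Event is a hypothetical log event tagged with the customer it belongs to.
type Event struct {
	Customer string
	Message  string
}

// drainFair takes at most one event per customer queue on each pass, so a
// burst from one customer cannot starve the others. order fixes the visiting
// order; budget caps how many events are taken in total.
func drainFair(order []string, queues map[string][]Event, budget int) []Event {
	out := make([]Event, 0, budget)
	for len(out) < budget {
		progressed := false
		for _, name := range order {
			q := queues[name]
			if len(q) == 0 {
				continue
			}
			out = append(out, q[0])
			queues[name] = q[1:]
			progressed = true
			if len(out) == budget {
				return out
			}
		}
		if !progressed {
			break // every queue is empty
		}
	}
	return out
}

func main() {
	queues := map[string][]Event{
		"x": {{"x", "x1"}, {"x", "x2"}, {"x", "x3"}, {"x", "x4"}}, // burst from customer X
		"y": {{"y", "y1"}},
	}
	for _, ev := range drainFair([]string{"x", "y"}, queues, 3) {
		fmt.Println(ev.Customer, ev.Message)
	}
}
```

With a budget of three events, customer Y's single message is picked up second rather than waiting behind the whole burst from customer X.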

@geekpete
Member

geekpete commented Mar 2, 2016

Unless there can be some low memory mode for logstash, then it won't compete with the lazy queue option in the latest rabbitmq 3.6.0. Lazy queues allow backlogs of hundreds of millions of messages in a single rabbit server (given there is enough disk space and fast enough disk to cater for it) but only uses a few hundred megabytes of ram. I'd call that super lightweight. (oops I mentioned this rabbit feature in an above comment, sorry.)

If logstash could be rigged in the same way, with some new special queue/buffer mode, then you wouldn't need the larger footprint just to buffer messages.

Either that or some kind of "proxy beat" written in Go that only acts as a message queue/proxy.

@timstoop

timstoop commented Mar 2, 2016

And I personally wouldn't want logstash to do that. Focus on processing, leaving the queuing to the sw projects that focus on that, imho. That's one of the nice things about open source, chaining services that you have experience with to get the best end result.

@geekpete
Member

geekpete commented Mar 2, 2016

Do one thing well.

@geekpete
Member

geekpete commented Mar 2, 2016

So I actually took the time to go and read the original reasoning as mentioned here https://github.com/elastic/filebeat/issues/132 and I can see why it makes sense from a maintenance point of view. The work is done in logstash to support all the output plugins and anything missing between beats and logstash will be coming soon. But it adds another box into the stack. I'd probably consider removing rabbit if logstash performs ok and just going from logstash to logstash.

I wonder how much of a burst a logstash vs a rabbitmq could take on the same resources, I suppose it should be similar if written well.

@ranleyos

I also have a corporate-wide Rabbit solution already in place. It is not only wise to continue to use that as my stream buffer, it is a necessity. Our AMQP highway is already paved and heavily used. Filebeat (and/or the entire beats library) should be able to send directly to the AMQP stream and THEN Logstash can get involved. I'd really like to see this happen. I also think that this would GREATLY help the ELK stack in general by adding flexibility.

@ziporah

ziporah commented May 10, 2016

+1
We use rabbitmq as a redundant failover buffer in between systems. It is our main AMQP system, as redis was not yet easy to configure for redundancy when the design was made.
Our entire stack is now built upon rabbitmq and we are not planning to change the entire design just to be able to push logs with beats. I also think it is stupid to first run beats and then logstash on the producer side, only to have it push to rabbitmq and the dynamic logstash pool in the backend. You can just as well run only logstash and push directly to rabbitmq, making beats obsolete. No processing is done on the producer side anyway, simply input {file *} output{rabbitmq}

@ranleyos

Quite right! Adding to that, I have several teams, and requiring them to install Logstash on the producer side is not an option and doesn't make sense either. I cannot see how adding a rabbitmq output would be that much extra work. Perhaps an option would be for Elastic to push the initial source and then let the community take care of it?

@johntdyer

I could not agree more !

@pietervogelaar

+1

An output for a message queue seems very logical. As RabbitMQ is a very popular message queue, I would really like an output for RabbitMQ!

@ankopainting

+1 we use rabbitmq w/ beaver currently and it would be good to replace beaver with all the beats

@lucasreed

+1 this would help greatly!

@froztbyte

I don't see anything mentioning a branch in this thread, so: has anyone taken a stab at implementing this?

If not, which code in beats should I look at to get an idea for starting on this?

@andrewkroh
Member

andrewkroh commented Sep 14, 2016

@froztbyte I'm not aware of anyone working on this. The relevant interfaces to look at are in https://github.com/elastic/beats/blob/master/libbeat/outputs/outputs.go and you can use the existing outputs as examples.

I recommend following the guidance in this comment with regard to how to do the development outside of the main project. This will enable you to develop and maintain the output without the overhead of maintaining a fork.

To be clear, we are not interested in maintaining additional outputs at the current time. There is a lot of work involved for us to support additional outputs. We are a small team and there are a bunch of other enhancement requests that we are focused on. We are happy to help by answering questions you have about the code or by reviewing code you develop.

@froztbyte

@andrewkroh Thanks for the pointers, I'll dig into them.

At the risk of sounding nagging, is it possible that Elastic might reconsider its position on other outputs? Is there a possible middle ground of external contribution for the feature/support?

I understand the effort cost involved in developing and supporting additional outputs, but on balance it seems that there is both a large amount of community desire for this feature and the benefit of this feature adding support for a mode that would otherwise require a trampoline logstash instance (at this stage, at least).

@selfieblue

+1

@cdemi

cdemi commented Nov 29, 2016

+1

@gplesz

gplesz commented Feb 16, 2017

👍

@chhuang0123

+1

@warbaugh

We'll be releasing an RMQ plugin for libbeat in about a month. We need to clean it up and do some more testing, but it has been working reliably for a few months now.

@clyons42

clyons42 commented Apr 7, 2017

@warbaugh I am very interested in your RMQ plugin and would love to help test it if you need that.

@viniiciusconceicao

@warbaugh I am also very interested in your RMQ plugin, let us know when you are ready to release it :)

@ranleyos

ranleyos commented Apr 10, 2017 via email

@warbaugh

We've written this for a specific use case, and therefore it isn't a fully featured implementation. It has a fair amount of run time behind it now, but lots of RMQ features are missing.

If people are ok with that, we can make the github repository public. I just don't want people's expectations to be too high.

@timstoop

We're ok with that, as long as you have something that works, we'll create the PRs to expand the functionality ;-)

@pikatoste

@warbaugh I'm also interested in the RMQ plugin.

@Wernervdmerwe

+1

We use Graylog for a multitude of reasons. Graylog supports configuration management of both filebeat and NXLog, as well as supporting RabbitMQ as an intermediary.
Graylog also supports pulling messages off a RabbitMQ queue. This is especially useful for us, as we have multiple servers in geographically different locations: not every machine needs access to the internet, we only have to poke one hole in a firewall, and there is the obvious reliability and security this provides.

@relgames

hey @warbaugh, how is it going with RMQ plugin? Has it been released?

@nasirus

nasirus commented Sep 18, 2017

+1

@hbos

hbos commented Sep 28, 2017

+1

@ebuildy

ebuildy commented Oct 5, 2017

Released on v6 => https://www.elastic.co/guide/en/beats/metricbeat/6.0/metricbeat-metricset-rabbitmq-node.html

Not for v5.

@hbos

hbos commented Oct 5, 2017

@ebuildy I think we want to use rabbitmq for transport; it's not about the metrics.

@sidleal

sidleal commented Oct 5, 2017

Hi all, I managed to send events to my RabbitMQ cluster using a custom output and the MQTT plugin. More tests are needed, but it seems to work for me. Note that you need to clone and build your own version of the desired beat, so it's not so simple to use.

https://github.com/sidleal/mqttout

@ebuildy

ebuildy commented Oct 5, 2017

Doh @hbos :/ I -1 on me ^^

@monicasarbu
Contributor

Thank you, everyone, for the input. I am closing this issue because leaving it open was interpreted as a sign that we plan to implement a RabbitMQ output. Our position didn't change, but I'd like to clarify it.

Currently, we are not planning to add support for other outputs in Beats. We support Elasticsearch, Logstash, Redis and Kafka, and we spend a considerable amount of time maintaining and supporting those outputs, as we try to offer the same features in all of them, features like backpressure and guaranteed delivery. I think Logstash is doing an excellent job here, supporting a lot of different inputs/outputs, and we would rather not duplicate the effort to achieve the same thing.

I understand the concern in this thread that Beats requires the installation of another component to send to outputs that we don't support, but we need to balance that against the scope of the project. We would like to concentrate our efforts on building features that make it easier to monitor your infrastructure, to name a few: auto-discovery, monitoring your Docker containers, your Kubernetes pods, or just your RabbitMQ cluster.
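
For readers wondering what "backpressure and guaranteed delivery" involve for an output, the Go sketch below shows the general shape under simplifying assumptions. It is not libbeat code: `Sender`, `flaky` and `publish` are hypothetical names, the retry limit is arbitrary, and a real output would also need reconnects, timeouts and acknowledgement handling towards the broker.

```go
package main

import (
	"errors"
	"fmt"
)

// Sender is a hypothetical stand-in for an output client (not libbeat's API).
type Sender interface {
	Send(batch []string) error
}

// flaky fails its first `failures` sends, then succeeds; demo only.
type flaky struct {
	failures int
	sent     [][]string
}

func (f *flaky) Send(batch []string) error {
	if f.failures > 0 {
		f.failures--
		return errors.New("broker unavailable")
	}
	f.sent = append(f.sent, batch)
	return nil
}

// publish reads events from a bounded channel and resends each batch until
// the sender accepts it or maxRetries is exhausted (at-least-once delivery).
// Because the channel is bounded, producers block while the output is busy
// retrying, which is the backpressure part.
func publish(events <-chan string, s Sender, batchSize, maxRetries int) error {
	batch := make([]string, 0, batchSize)
	flush := func() error {
		if len(batch) == 0 {
			return nil
		}
		var err error
		for attempt := 0; attempt <= maxRetries; attempt++ {
			if err = s.Send(batch); err == nil {
				batch = make([]string, 0, batchSize) // fresh slice; the sender keeps the old one
				return nil
			}
		}
		return fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
	}
	for ev := range events {
		batch = append(batch, ev)
		if len(batch) == batchSize {
			if err := flush(); err != nil {
				return err
			}
		}
	}
	return flush() // deliver the final partial batch
}

func main() {
	events := make(chan string, 2) // small buffer: the producer blocks when it is full
	go func() {
		for i := 0; i < 5; i++ {
			events <- fmt.Sprintf("event-%d", i)
		}
		close(events)
	}()
	out := &flaky{failures: 2}
	if err := publish(events, out, 2, 3); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(out.sent), "batches delivered")
}
```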
