
New syslog output #1525

Closed
wants to merge 7 commits into from

Conversation

avleen

@avleen avleen commented Apr 28, 2016

We had a need to have Filebeat send logs to a syslog server, so we wrote this output for libbeat :-)

If there are any changes/improvements that I can make to help get this accepted, I'd be happy to do them.

There is possibly one suboptimal part: we're using fields as a way to let users configure three syslog parameters.
This could be moved somewhere separate, but it didn't seem that bad because the output sends the original log line as read, rather than converting it to JSON, so fields don't have any meaning in this context.
And if we made the output format (plain text vs. JSON) configurable, it might make sense to keep this in fields anyway.

Thanks!

@elasticsearch-release

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.

@tsg added the discuss ("Issue needs further discussion.") label Apr 29, 2016
@tsg
Contributor

tsg commented Apr 29, 2016

Thanks @avleen for opening the PR. In general, we're very conservative when it comes to adding more outputs, see for example this (old) discussion. In the meantime, we added Redis and Kafka because they were very often requested, but we don't plan to have lots of output modules like Logstash has.

We don't see the request for a syslog output nearly as much as for Redis/Kafka, but it was requested once more relatively recently. May I ask what the use case is, and if using Logstash as an intermediary would be an option?

@avleen
Author

avleen commented Apr 29, 2016

Hi @tsg!

I read over that discussion, and the reasons are similar, with some distinctions:

  1. In addition to using LSF to forward logs through our ELK stack, we use syslog to ship logs to a separate, secure environment. This pipeline has basically zero processing, and the intention is to have two copies of logs, through two separate pipelines. The reasons for this touch on security, compliance, etc., and also on having the ability to replay logs into Elasticsearch if the cluster breaks. Raw text logs are our long-term backup :-) I don't think this is an uncommon design - in AWS, people back logs up to S3, etc., for long-term storage. We're on bare metal and have a central syslog server.
  2. syslog-ng/rsyslog each have their bugs, and all we really want is a super-lightweight, simple forwarder.
  3. It makes sense to have one forwarder do all the forwarding, rather than multiple applications with very different configurations.
  4. We thought about putting logstash on all our servers to ship the logs, but that's a much heavier-weight solution than we'd like.

We could use Logstash as an intermediary, but that adds an additional moving piece to the infrastructure, which is less desirable.

If this additional output makes folks uncomfortable, we could keep a local fork. The code is fairly well isolated from the rest of libbeat and I don't think that would be too expensive for us to do.
But of course, the ideal situation would be to have it upstream :)

@karmi

karmi commented May 3, 2016

Hi @avleen, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails to your GitHub profile (they can be hidden), so we can match your e-mails to your GitHub profile?

@avleen
Author

avleen commented May 3, 2016

Thanks @karmi ! Just done now :-)

@kimchy
Member

kimchy commented May 4, 2016

heya @avleen, looking at this change, and I have concerns. Initially, we wanted to only support ES and LS outputs for libbeat, in order to simplify the scope of libbeat, with the assumption that if other outputs are needed, it will be done through LS.

We revisited this plan as people made good points around the architectural importance of supporting queuing, so we decided to take the additional overhead of also supporting certain queue implementations in libbeat.

This is new territory that I am very reluctant for us to get into. First, syslog is, well, hard..., and officially supporting syslog also means supporting the many different variations and endpoints that might speak it. Also, it gets us into non-queue-based outputs, and here again it will be hard to say no to other formats, which means additional burden on Beats (instead of focusing on other improvements) and duplication of effort between LS and Beats.

Is there a reason why you can't have an LS in the middle, with Beats shipping to it and LS outputting to syslog, to support this use case? It might be a good solution for now, and we can see how it goes from there. This is the first time we have heard a request for a syslog output, and I would hate to add support for syslog in libbeat based on just one request.

@avleen
Author

avleen commented May 4, 2016

Hi @kimchy! Thanks for the response 😄
I absolutely understand both the change in direction this takes, and the extra burden it places on Elastic to maintain this.

We thought about adding Logstash in the middle, but there were a couple of reasons we didn't go down that route:

  1. Our syslog server does a fair amount of work: 500k EPS, filtering thousands of streams into different files. Adding Logstash would require significant CPU, likely needing multiple Logstash servers to feed things into syslog. That's much more architectural complexity than we hoped for, on a pipeline which is essentially a fallback in case our main ELK stack has problems.
  2. The alternative would be to replace syslog-ng on the syslog server completely with Logstash. Initial testing showed Logstash required much more CPU than was available to do this, even though it would be the better approach of the two.

One of the wonderful advantages of filebeat and syslog-ng is their incredible performance.
The main reason we're shifting away from the existing syslog daemons for log shipping is partly their complexity, and partly design flaws (e.g., when syslog-ng is following a file that gets rotated, anything unread at the end of the file is forgotten; we know LJ/LSF/Filebeat don't suffer from such things).

If this causes too much of a departure of the Beats design, we completely understand and we're happy to put the work in to maintain this just for ourselves and we can close this PR.
Mostly, we hoped that it might be useful to others in the community 😄

We had also considered adding other outputs too (eg HTTP), at which point we start having even more overlap with Logstash features. However, I do think there's value in this overlap. Pushing work out to the edges before events hit Logstash is a decent approach to scaling and has many benefits - I figure this is why Filebeat can do filtering and multi-line joining.

Thanks again for your time and insights :-)

@avleen
Author

avleen commented May 5, 2016

@kimchy, @tsg
A question that came up when we were discussing this internally this morning:
Would you be open to a patch, which makes the outputs more pluggable?
This would enable others to build custom outputs on top of libbeat, without needing to add any extra burden to yourselves.
This is also something we would be open to doing the work for, as it could be a huge win for the community.

Thanks!

@urso

urso commented May 7, 2016

@avleen How exactly do you want to make outputs more pluggable? There is currently some boilerplate required in order to use modeutil.NewConnectionMode and others. But this boilerplate is on purpose, as the ConnectionMode interface is not used by all outputs. Its usage is optional and mostly tailored for network-based outputs; e.g., it's only used for the kafka output, because the library does not provide support for "guaranteed" publisher mode.

Check out filebeat/main.go; it's already very minimal. One can have a custom filebeat with a custom output plugin like this:

package main

import (
    "os"

    _ "github.com/myuser/customfilebeat/outputs/syslog"

    "github.com/elastic/beats/filebeat/beater"
    "github.com/elastic/beats/libbeat/beat"
)

func main() {
    if err := beat.Run("customfilebeat", "", beater.New()); err != nil {
        os.Exit(1)
    }
}

Advantages:

  • no need to fork filebeat; you get all upstream fixes to filebeat just by recompiling.
  • you can still share your custom output (it's just one import to have it available).

Disadvantages:

  • need to recompile customfilebeat every so often
  • you have to maintain the output plugin in case of interface changes

I think this is already a good compromise.
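For illustration, the blank-import pattern described above relies on Go `init` functions registering a factory with a central registry. Here is a minimal, self-contained sketch of that pattern; it mirrors the idea only, and the real libbeat registration API, names, and signatures differ:

```go
package main

import "fmt"

// Factory builds an output. The real libbeat factory also takes
// configuration; this sketch keeps it trivial.
type Factory func() string

// registry mimics libbeat's internal output-plugin table.
var registry = map[string]Factory{}

// Register is what an output package would call from its init().
func Register(name string, f Factory) {
	registry[name] = f
}

// In a real plugin this init() would live in the imported package, so a
// blank import like `_ "github.com/myuser/customfilebeat/outputs/syslog"`
// is enough to make the output available by name.
func init() {
	Register("syslog", func() string { return "syslog output" })
}

func main() {
	if f, ok := registry["syslog"]; ok {
		fmt.Println(f()) // prints "syslog output"
	}
}
```

The key design point is that importing the package for side effects is the whole plugin mechanism: no dynamic loading is involved, which is why a recompile is needed when plugins change.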

@kimchy
Member

kimchy commented May 12, 2016

@avleen heya, as @urso mentioned, it is problematic to make a pluggable architecture in Go and allow pluggable outputs. What was your idea around it?

@McStork
Contributor

McStork commented May 15, 2016

you can still share your custom output (it's just one import to have it available).

It would be nice if that step could be made easier. For example, by importing plugins at compile time:

./configure --plugin-output https://github.com/avleen/beats-output-syslog.git
make

Better, if community-plugins could be registered with the help of Elastic:

./configure --plugin beats-output-syslog
make

There could be a version to go along with the specified plugin name.
Plugins would then be imported, registered in the source code, and compiled along with Libbeat.

@avleen, would that answer your needs? It would probably encourage contributions like yours to Beats.

@avleen
Author

avleen commented May 20, 2016

@kimchy @urso Thanks folks! We actually had a very similar idea in the end, and we're going to do it this way. I think that makes the most sense. It does mean we need to have our own main.go, but that's a trivial matter really.

I love @McStork's approach, and having a way to support community plugins like this would be a big advantage I think. That way, Elastic engineers can focus on the core product and the community can be responsible for the other stuff.

In the meantime, I'm going to close this pull request. I very much appreciate the time you've all taken on this :-)

@avleen avleen closed this May 20, 2016
@andrewkroh andrewkroh mentioned this pull request Sep 14, 2016
@ruflin ruflin mentioned this pull request Sep 16, 2016
@trixpan

trixpan commented Oct 4, 2016

@urso sorry to necrobump this, but I think it would be a great idea to publish a sample output plugin, end to end, in a git repo. It could be something as simple as syslog, although perhaps RELP (rsyslog's reliable protocol) could be a better idea, as it introduces delivery guarantees.

A basic RELP library exists here:

https://github.com/stith/gorelp

@ruflin
Contributor

ruflin commented Oct 4, 2016

@trixpan Agree that we probably should provide an example GitHub repo; it would make it easier for people to get started on this and reduce the input needed from our side. I would suggest not creating a new output on our side, but taking an existing one and putting it in a separate repo (like an elasticsearch-example output). Otherwise we have a maintenance burden again.

@ruflin
Contributor

ruflin commented Oct 4, 2016

@trixpan The best option would be if there were a community output that follows all our suggestions, which we could just point to as a reference ;-)

@urso

urso commented Oct 4, 2016

I don't think I would use RELP/syslog for this though. Maybe some plain TCP/UDP will do.
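For reference, a minimal sketch of the kind of plain TCP forwarding suggested above. The function names and the RFC 3164-style priority framing are illustrative assumptions, not code from this PR:

```go
package main

import (
	"fmt"
	"net"
)

// formatSyslogLine builds a minimal RFC 3164-style line:
// "<priority>hostname tag: message". Real syslog lines also carry a
// timestamp; it is omitted here to keep the sketch small.
func formatSyslogLine(priority int, hostname, tag, msg string) string {
	return fmt.Sprintf("<%d>%s %s: %s", priority, hostname, tag, msg)
}

// send writes one line to a syslog server over plain TCP. Note this
// offers no delivery guarantee beyond what TCP itself provides, which
// is the gap RELP is meant to close.
func send(addr, line string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = fmt.Fprintln(conn, line)
	return err
}

func main() {
	line := formatSyslogLine(13, "web01", "filebeat", "hello syslog")
	fmt.Println(line) // prints "<13>web01 filebeat: hello syslog"
	// send("syslog.example.com:514", line) // hypothetical server address
}
```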
