S3 Output #2551

Closed
wants to merge 2 commits into from

Conversation

@fritzhardy commented Sep 14, 2016

Implements an S3 output. It marries well with the Logstash S3 input, using S3 as a queuing/archival mechanism.

Lines are staged to a local file and uploaded once a configurable number of bytes has accumulated (based on the fileout output). The uploaded object path/name currently resembles CloudTrail logs: /somebucket/YYYY/MM/DD/hostname_ISO8601DATE.gz, in UTC.

Example configuration:

output:
  s3:
    enabled: true
    path: "/var/log/s3"
    filename: s3
    upload_every_kb: 3
    #number_of_files: 2
    bucket: somebucket
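
For illustration only (this snippet is not part of the PR), a minimal Go sketch of how an object key of the form YYYY/MM/DD/hostname_ISO8601DATE.gz could be derived; the helper name and exact timestamp layout are assumptions:

package main

import (
	"fmt"
	"os"
	"time"
)

// buildObjectKey is a hypothetical helper (not from the PR) showing how an
// object key of the form YYYY/MM/DD/hostname_ISO8601DATE.gz could be derived,
// with the timestamp taken in UTC.
func buildObjectKey(now time.Time) (string, error) {
	hostname, err := os.Hostname()
	if err != nil {
		return "", err
	}
	now = now.UTC()
	return fmt.Sprintf("%04d/%02d/%02d/%s_%s.gz",
		now.Year(), int(now.Month()), now.Day(),
		hostname, now.Format("2006-01-02T15:04:05Z")), nil
}

func main() {
	key, err := buildObjectKey(time.Now())
	if err != nil {
		panic(err)
	}
	fmt.Println(key) // e.g. 2016/09/14/myhost_2016-09-14T12:34:56Z.gz
}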

@karmi commented Sep 14, 2016

Hi @fritzhardy, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails to your GitHub profile (they can be hidden), so we can match your e-mails to your GitHub profile?

@elasticsearch-release

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

@ruflin (Contributor) commented Sep 16, 2016

@fritzhardy Thanks a lot for all your work on the S3 output and for providing a PR directly. Unfortunately, we do not currently plan to add further outputs, as each output adds additional work on the maintenance and support side. We are a small team and are currently focused on other enhancements on the Beats side.

For additional outputs we recommend keeping them in a separate repository, without maintaining a full fork, and adding them at compile time; see the sketch below. For further details see the comment here: #1525 (comment). This way, outputs work almost like plugins. Perhaps one day Golang will support plugins natively, which would make adding external outputs even simpler.
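
For illustration (not part of this PR), a rough sketch of that compile-time approach: the output lives in its own repository and is pulled into a custom Filebeat build via a blank import, so its init() can register the output with libbeat. The import paths and entry point below are assumptions and vary across Beats versions.

// Hypothetical custom Filebeat main with an externally maintained S3 output
// compiled in. Both the external package path and the stock entry-point
// import are assumptions; the real paths vary across Beats versions.
package main

import (
	"os"

	"github.com/elastic/beats/filebeat/cmd" // assumed stock Filebeat entry point

	// Blank import: building this binary runs the package's init(), which is
	// where the external output would register its factory with libbeat.
	_ "github.com/example/filebeat-output-s3" // hypothetical external output
)

func main() {
	if err := cmd.RootCmd.Execute(); err != nil {
		os.Exit(1)
	}
}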

We are always happy to help with questions or reviewing code. Feel free to ping us on discuss at any time: https://discuss.elastic.co/c/beats

@ruflin (Contributor) commented Sep 16, 2016

Based on the comment above, I'm closing this PR. @fritzhardy Feel free to use this PR for further discussions.

@ruflin closed this Sep 16, 2016
@fritzhardy (Author)

I think it would be a boon to the project to have this as an official addition at some point. We find this approach gives us "networkless" log forwarding at little expense, and on the logstash side, all the configurability necessary to deal with a bucket of logs. However, I am aware that additional output plugins have been a source of discussion. In the meantime, I will review #1525 (comment). I was looking at some way to do that externally in the first place, akin to the external beats. Thanks for your time and feedback.

@trixpan commented Oct 4, 2016

Got to this while reading #1525...

@fritzhardy the straight-to-persistence output is indeed a clever approach, and one that is not so widely used.

However, from what I gather (@ruflin can correct me if I am wrong), the Elastic community is trying to avoid including too many additional transport features, as they end up resulting in a code base that is hard to manage in the long run. To be honest, I think their approach is a fair call.

Having said that, projects like Apache MiNiFi allow you to do this by simply adding existing processors (PutS3 in this case) to the MiNiFi install, but at the cost of running a lightweight JVM at the edge.

Disclaimer: I am a NiFi committer.

@ruflin (Contributor) commented Oct 4, 2016

@trixpan We are actually hoping that Golang plugins become a reality in the near future, which would directly solve the above issue.

About the approach you mentioned: Does that mean in the end 2 processes are running? MiNiFi + PutS3?

@trixpan commented Oct 4, 2016

@ruflin yes, Golang plugins will be handy. I will be happy to have a go at adding NiFi site-to-site support to Filebeat once that feature is stable.

Regarding MiNiFi + PutS3: just a single JVM is executed, MiNiFi (the framework). Within MiNiFi we run the PutS3 processor (think of a processor as an input/codec/filter/output in Logstash terms).

To a certain extent, MiNiFi is a stripped-down version of the overall framework (NiFi) and as such is able to reuse processors from the main code base, as illustrated here:

https://github.com/apache/nifi-minifi/blob/master/minifi-nar-bundles/minifi-standard-nar/pom.xml#L42

To cater for the S3 use case, a user would have to add the NiFi NARs (nifi-aws-bundle) to the MiNiFi install package (i.e. copy them into the appropriate folder) and then use the installed processors (PutS3) as they normally would on the main NiFi platform. Ironically enough, if I recall correctly, outputting to a central NiFi would be optional.

In ELK terms, this would be akin to running a minimalist version of Logstash at the producer level, reusing some of the JRuby code that powers Logstash, instead of developing a Golang producer (i.e. lumberjack-forwarder / *beats) from scratch. There are pros and cons to both approaches, and it is no wonder we are also building minifi-cpp. 😃

@ktham commented Oct 22, 2018

@ruflin We're looking at dumping our logs into S3 and would like to see S3 as an output for Filebeat. It looks like there was a previous effort in this PR to add it. What is your recommendation here? Is it possible to support an S3 output?

@fritzhardy (Author)

We have been using this fork in production for two years. It needs some polish, but gets the job done.

@ktham commented Oct 23, 2018

I see. Given that it's been working without issue for the past two years, I'd like to see if it's possible to include this output in the upstream Filebeat project, or if there's a way to factor it into a plug-in so as to avoid using a fork of Filebeat.

@ktham commented Nov 19, 2019

@ruflin/@fritzhardy Can we revisit the topic of adding an S3 output? We are willing to help with adding the code if necessary.
