S3 Output #2551
Hi @fritzhardy, we have found your signature in our records, but it seems you signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails to your GitHub profile (they can be hidden), so we can match your e-mails to your GitHub profile? |
Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run. |
@fritzhardy Thanks a lot for all your work on the S3 output and for providing a PR directly. Unfortunately, we currently do not plan to add further outputs, as each output adds work on the maintenance and support side. We are a small team and currently focused on other enhancements on the Beats side. For additional outputs, we recommend keeping them in a separate repository, without maintaining a full fork, and adding them at compile time. For further details see the comment here: #1525 (comment) This way, outputs work almost like plugins. Perhaps one day Golang will support plugins, which will make adding external outputs even simpler. We are always happy to help with questions or with reviewing code. Feel free to ping us on discuss at any time: https://discuss.elastic.co/c/beats |
Based on the comment above, I'm closing this PR. @fritzhardy Feel free to use this PR for further discussions. |
I think it would be a boon to the project to have this as an official addition at some point. We find this approach gives us "networkless" log forwarding at little expense, and on the logstash side, all the configurability necessary to deal with a bucket of logs. However, I am aware that additional output plugins have been a source of discussion. In the meantime, I will review #1525 (comment). I was looking at some way to do that externally in the first place, akin to the external beats. Thanks for your time and feedback. |
Got to this while reading #1525... @fritzhardy the straight-to-persistence output is indeed a clever approach that is not so widely used. However, from what I gather (@ruflin can correct me if I am wrong), the Elastic community is trying to avoid including too many additional transport features, as they end up producing a code base that is hard to manage in the long run. To be honest, I think their approach is a fair call. Having said that, projects like Apache MiNiFi allow you to do this by simply adding existing processors (PutS3 in this case) to the MiNiFi install, but at the cost of running a lightweight JVM at the edge. Disclaimer: I am a NiFi committer. |
@trixpan We are actually hoping that Golang Plugins become a reality in the near future and this would directly solve the above issue. About the approach you mentioned: Does that mean in the end 2 processes are running? MiNiFi + PutS3? |
@ruflin yes. GoLang plugins will be handy. I will be happy to have a go at adding NiFi site-to-site support to filebeat when your feature is stable. Regarding MiNiFi + PutS3: just a single JVM is executed, MiNiFi (the framework). Within MiNiFi we run the PutS3 processor (think of a processor as an Input/Codec/Filter/Output in logstash terms). To a certain extent, MiNiFi is a stripped-down version of the overall framework (NiFi) and as such is able to reuse processors from the main code base as illustrated here: To cater for the S3 use case, the user would have to add NiFi NARs (nifi-aws-bundle) to the minifi install package (i.e. copy them into the appropriate folder) and use the installed processors (PutS3) as (s)he would normally do when using the main NiFi platform. Ironically enough, if I recall correctly, outputting to a central NiFi would be optional. In ELK terms, this would be akin to running a minimalist version of Logstash at the producer level, reusing some of the jRuby code that powers Logstash, instead of developing a golang producer (i.e. lumberjack-forwarder / *beats) from scratch. There are pros and cons to both approaches, and no wonder we are also building minifi-cpp. 😃 |
@ruflin We're looking at dumping our logs into S3 and would like to see S3 as an output for Filebeat. It looks like there was a previous effort in this PR to add this. What is your recommendation here? Is it possible to support an S3 output? |
We have been using this fork in production for two years. It needs some polish, but gets the job done. |
I see. Given that it's been working without issue for the past 2 years, I'd like to see whether it's possible to include this output in the upstream Filebeat project, or whether there's a way to factor it into a plug-in so as to avoid using a fork of Filebeat. |
@ruflin/@fritzhardy Can we revisit the topic of adding an S3 output? We are willing to help with adding the code if necessary. |
Implements S3 output. Marries well with logstash S3 input, using S3 as a queuing/archival mechanism.
Stages lines to a local file and uploads it once it reaches a configurable number of bytes (based on fileout). The uploaded object path/name currently somewhat resembles CloudTrail logs: /somebucket/YYYY/MM/DD/hostname_ISO8601DATE.gz, in UTC.
Example configuration: