An EEL sink for Flume #168
Comments
You can assign this one to me.
Do we still want a Flume connector? I think Flume is more suited to streaming data, running on a continuous basis, whereas EEL is very much batch based.
I am fine with this. Yes, I guess its main purpose is streaming, but I have used it for ingesting large batches of events (rows) into HDFS. My initial thought is that you could have a scenario like this: JdbcSource -> FlumeSink
With Flume there's no need for Hadoop to be installed on the client machine, and there are other features I haven't touched upon. Kite has written an interceptor (Morphlines) which is invoked before events hit the sink; it comes with a bunch of modules and a DSL for transforming events.
OK, let's do a Flume sink.
An experimental Kite Dataset Sink exists for Flume 1.6.0 which, on the face of it, is capable of ingesting directly into Hive tables.
Headers are usually used for content-based routing and for multiplexing an event to different sinks.
See https://flume.apache.org/FlumeUserGuide.html#kite-dataset-sink
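To make the header-based routing idea concrete, here is a minimal sketch in plain Java. It does not use the Flume API: SimpleEvent, route, the "table" header and the channel names are all hypothetical stand-ins, chosen only to illustrate how a header value could select where an event goes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HeaderRouting {

    // Hypothetical stand-in for an event: string headers plus an opaque byte body.
    public static final class SimpleEvent {
        final Map<String, String> headers;
        final byte[] body;

        public SimpleEvent(Map<String, String> headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    // Chooses target channel names from a single header value.
    // Events with a missing or unmapped header value go to the default channels.
    public static List<String> route(SimpleEvent event,
                                     String headerName,
                                     Map<String, List<String>> mapping,
                                     List<String> defaultChannels) {
        String value = event.headers.get(headerName);
        if (value == null) {
            return defaultChannels;
        }
        return mapping.getOrDefault(value, defaultChannels);
    }

    public static void main(String[] args) {
        // Assumed mapping: rows from "orders" go to one channel, "audit" rows fan out to two.
        Map<String, List<String>> mapping = new HashMap<>();
        mapping.put("orders", Collections.singletonList("hiveChannel"));
        mapping.put("audit", Arrays.asList("hdfsChannel", "kafkaChannel"));

        Map<String, String> headers = new HashMap<>();
        headers.put("table", "audit");
        SimpleEvent event = new SimpleEvent(
                headers, "row payload".getBytes(StandardCharsets.UTF_8));

        // prints [hdfsChannel, kafkaChannel]
        System.out.println(route(event, "table", mapping,
                Collections.singletonList("defaultChannel")));
    }
}
```

For an EEL sink, this suggests that each row could carry headers such as the source table name, leaving the routing decision to the Flume agent's configuration rather than to EEL itself.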
I think for EEL we should do something similar:
There are various options for batching up events and sending them securely over SSL - you could even send via Kafka to a Flume Kafka Source.
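As a rough illustration of the batching side, the sketch below buffers rows and hands them to a transport in fixed-size batches. BatchingSink and its sender callback are hypothetical names, not EEL or Flume classes; in a real sink the callback would presumably wrap something like the Flume SDK's RpcClient and its appendBatch call, but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchingSink {

    private final int batchSize;
    // Hypothetical transport: receives one complete batch at a time.
    private final Consumer<List<String>> sender;
    private final List<String> buffer = new ArrayList<>();

    public BatchingSink(int batchSize, Consumer<List<String>> sender) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sender = sender;
    }

    // Buffers a row and sends the batch once it is full.
    public void write(String row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is buffered; does nothing when the buffer is empty.
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(buffer)); // copy so the receiver owns its batch
        buffer.clear();
    }

    // Flushes the final partial batch.
    public void close() {
        flush();
    }

    public static void main(String[] args) {
        List<List<String>> sent = new ArrayList<>();
        BatchingSink sink = new BatchingSink(2, sent::add);
        sink.write("row-1");
        sink.write("row-2");
        sink.write("row-3");
        sink.close();
        // prints [[row-1, row-2], [row-3]]
        System.out.println(sent);
    }
}
```

Because EEL is batch oriented, flushing the last partial batch on close matters: without it, the tail of a finite dataset would never reach Flume.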