Python implementation of Kafka Streams? #38

Open
dalejin2014 opened this issue Aug 29, 2016 · 43 comments
Labels
FAQ (not a bug, not an enhancement, good to know), question

Comments

@dalejin2014

We are interested in using Kafka Streams.
Is it on the roadmap for the Confluent Kafka Python library?

@ewencp
Contributor

ewencp commented Aug 30, 2016

@dalejin2014 We'd love to have native stream processing libraries in different languages and having really good Kafka clients is the basis for that. That said, we don't have a timeline for adding this yet.

@miguno

miguno commented Aug 30, 2016

@dalejin2014: As @ewencp mentioned we don't have a timeline yet. The reason for this is that we first want to ensure we have a strong foundation in the form of the Java implementation of Kafka Streams before venturing into non-JVM languages.

That said, of course I took a note of your request. :-)

Do you mind sharing some information about your use case where you'd use Kafka Streams from Python?

@miguno changed the title from "kafka streaming" to "Python implementation of Kafka Streams?" on Aug 30, 2016
@dalejin2014
Author

We are interested in developing a commenting feature similar to the one in Google Docs.
The use case is as follows:

  • users in a thread should be notified of events at a frequency of their choosing (realtime, hourly, daily, etc)

So we are thinking about using Kafka Streams, since it provides:

  • windowing
  • group-by
  • accumulation
  • etc

Is there an easy way to port these features from the Java client?
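
For readers unfamiliar with the terms, the windowing, group-by and accumulation steps listed above can be sketched in plain Python. This is only an illustration of the idea, not Kafka Streams and not part of any Confluent library; the event shape, the `tumbling_window_digest` helper and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (timestamp_seconds, thread_id, message)
Event = Tuple[int, str, str]


def tumbling_window_digest(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[int, str], List[str]]:
    """Group events by (window start, thread) and accumulate their messages."""
    digest: Dict[Tuple[int, str], List[str]] = defaultdict(list)
    for timestamp, thread_id, message in events:
        # Windowing: align the timestamp to the start of its fixed-size window.
        window_start = timestamp - (timestamp % window_seconds)
        # Group-by and accumulation: collect messages per (window, thread) key.
        digest[(window_start, thread_id)].append(message)
    return dict(digest)


# Made-up sample events for an hourly notification digest.
events = [
    (10, "thread-1", "alice commented"),
    (1800, "thread-1", "bob replied"),
    (3700, "thread-1", "carol resolved"),
    (3800, "thread-2", "dave commented"),
]
hourly = tumbling_window_digest(events, window_seconds=3600)
```

A real stream processor additionally has to handle unbounded input, late-arriving events, and fault-tolerant state, which is where most of the porting effort would go.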

@miguno

miguno commented Aug 31, 2016

Thanks for sharing the background info @dalejin2014.

Is there an easy way to port the features from Java client?

It's not super-hard but also not trivial. Also, one would need to continuously maintain any such Kafka Streams libraries for other languages with the same commitment and high quality as the current Kafka Streams library for Java, so "porting" is not a one-off effort but an ongoing time investment. Hence our current decision to focus our efforts first on the Java implementation of Kafka Streams.

@jacqvdm

jacqvdm commented Oct 11, 2016

+100 :)

@zzbennett

Kafka Streams for Python would be so amazing. I'm currently evaluating stream processing frameworks and I like what I've been reading about Kafka Streams. My use case is essentially this: I'm laying down the infrastructure to enable realtime analytics and processing of log/event data. The primary users of this data are data scientists who would be standing up their own Kafka Streams apps, mostly for doing transformations, joins, partitioning and windowed analytics. I think Kafka Streams fits this use case nicely, since the Streams library eliminates a lot of the boilerplate code involved in configuring Kafka consumers and producers but leaves developers the freedom and flexibility to do lots of cool stuff with the data in each Kafka topic. The only catch is that not many of the data scientists are well versed in Java; our language of choice is Python for almost everything. As much as I like Kafka and as excited as I am about Kafka Streams, getting the data scientists on board with writing Java will be an uphill battle.

With that said, have there been any developments with regard to supporting a Python-based Kafka Streams library?
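
As a rough illustration of the "joins" mentioned above, the following plain-Python sketch enriches a stream of events with rows from a lookup table, in the spirit of a stream-table inner join. It does not use Kafka or any streaming library; the `enrich` function, the event shape and the sample data are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical shapes: a stream of (user_id, action) events and a lookup
# table mapping user_id to a team name.
LogEvent = Tuple[str, str]


def enrich(
    events: Iterable[LogEvent], teams: Dict[str, str]
) -> Iterator[Tuple[str, str, str]]:
    """Inner-join each event with the table row that shares its key."""
    for user_id, action in events:
        team = teams.get(user_id)
        if team is None:
            continue  # no matching row: the event is dropped (inner join)
        yield user_id, action, team


# Made-up sample data.
teams = {"u1": "search", "u2": "billing"}
events = [("u1", "login"), ("u3", "login"), ("u2", "purchase")]
joined = list(enrich(events, teams))
```

Because `enrich` is a generator, it processes one event at a time and works on an unbounded iterator as well as on a list.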

@miguno

miguno commented Dec 5, 2016

@zzbennett I hear you, Elizabeth. :-)

Unfortunately our short-term roadmap does not include work on a Python library of Kafka Streams. (We'd definitely welcome contributors though!) Same situation for e.g. kafka-python, a community project.

I'm kinda hesitant to suggest this, but perhaps it would be worth a try to experiment with Jython? IIRC some Ruby users have been experimenting with Kafka Streams' Java library via JRuby. FWIW, there are a few community/external projects already working on various "wrappers" (in a broad sense) for Kafka's Streams and Connect APIs, but they haven't been released yet; I don't remember off the top of my head whether a Python-based one was among them.

@zzbennett

Thanks for your reply @miguno and thanks for the suggestions. Jython might be a good option for prototyping. I may actually be able to drum up support for Scala-based Streams apps, which would work a bit better with the Java libraries.

As far as contributing, I may even end up putting together a Python port of Kafka Streams for our use cases. Eventually, with the help of some collaborators in the Kafka Python community, we'd hopefully be able to contribute something upstream. But I suppose we can cross that bridge when we get there. At any rate, thanks again for the help!

@murphyke

@zzbennett Somebody in my group was talking about working on this also. If you create a repo with issues laying out the work and then solicit help, you may find yourself with some contributors reasonably soon.

@zzbennett

@murphyke that would be super. I actually just created a repo last weekend to start working on it (https://github.com/python-kafka-streams/python-kafka-streams). I haven't committed any work or created any tickets yet, but hopefully I'll get a chance to do that in the next couple of days. Feel free to send people over there if they are itching to work on it. Once a little momentum gets built up I'll post to some user groups to solicit help.

@supertramp01

@zzbennett I'd love to contribute to the python-kafka-streams repo.

@ayanguha

I would love to work on this, as well as love the idea itself :)

Wondering if someone has some initial design which I can start working with?

@ghost

ghost commented Jul 8, 2017

so... what's best practice? use Jython?

@miguno

miguno commented Jul 10, 2017

Jython is one option, yes. And some users are actually running Jython-based Kafka Streams applications in production.

Also: There's an upcoming, community-driven Python implementation of Kafka Streams (a first MVP = not all features are already implemented) that will be presented at EuroPython later this month.

@llawall

llawall commented Jul 12, 2017

The code @miguno is referring to is now on GitHub: https://github.com/wintoncode/winton-kafka-streams

Check it out and get involved with the project!

@ghost

ghost commented Aug 14, 2017

no updates for a month on winton, I hope they continue their good project

@ghost

ghost commented Sep 16, 2017

seems dead unfortunately

@ghost

ghost commented Oct 10, 2017

Would be great to have a bit of help from Confluent on this, given python is the most wanted language in 2017 according to Stack Overflow

@miguno

miguno commented Oct 10, 2017

@pouledodue: I'd suggest bringing this up at https://github.com/wintoncode/winton-kafka-streams -- the last commit in that project was actually 5 days ago.

@rdehouss

+1 on this.
Question for the community about renaming the project to a more "standard name": wintoncode/winton-kafka-streams#8

@ghost

ghost commented Feb 21, 2018

at this point I decided to learn the Java ecosystem instead of using a half-baked Python solution

@g-rd

g-rd commented Jun 22, 2018

Are there any developments on this request? I was so excited about Kafka, but with no Streams API implementation in Python I am unsure now.

@rnpridgeon
Contributor

@g-rd, as of today we are still tracking interest but it doesn't currently have a place on the roadmap.

@ghost

ghost commented Jun 23, 2018

@g-rd you may look into Apache Pulsar

@edenhill
Contributor

@g-rd

g-rd commented Jun 24, 2018

@edenhill I have looked at it already, but it looks to me like this project is either perfect with no development needed or just not being developed. I'm going with not being actively developed.
I am now looking at Apache Pulsar and I think Pulsar is a better fit for me.

@vineetgoel

vineetgoel commented Jul 31, 2018

Check out a Kafka Streams inspired Python Stream Processing library we just open sourced: https://robinhood.engineering/faust-stream-processing-for-python-a66d3a51212d

dtheodor pushed a commit to dtheodor/confluent-kafka-python that referenced this issue Sep 4, 2018
@ghost

ghost commented Aug 28, 2019

It's been over a year. Any further comment on whether Kafka Streams will be available?

@edenhill
Contributor

We do not have any immediate plans to create a non-Java Kafka Streams implementation.
Either look into using KSQL or https://github.com/wintoncode/winton-kafka-streams

@ZisisFl

ZisisFl commented Jan 15, 2020

There is Faust, a Python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic: https://github.com/robinhood/faust

@callamd

callamd commented Feb 7, 2020

There seem to be viable alternatives to an officially supported implementation.

@rnpridgeon reopened this Feb 10, 2020
@rnpridgeon added the FAQ (not a bug, not an enhancement, good to know) label Feb 10, 2020
@federicofontana

federicofontana commented Feb 13, 2020

There seems to be viable alternatives to an officially supported implementation.

This is true. However, with non-officially supported APIs there is always the risk that they will stop being maintained. The last commit in the popular winton-kafka-streams was 1.5 years ago.

We do not have any immediate plans to create a non-java Kafka Streams implementation.

Has there been any change in this regard? @edenhill

@gvdmarck

There is Faust a python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic https://github.com/robinhood/faust

SASL authentication (which you will certainly use with a Confluent Kafka cluster) has been broken since 1.9.
I have been trying for 4 months now to get at least a comment from a developer at Robinhood, without any success.

@ghost

ghost commented May 13, 2020

Why not use Python on GraalVM? It's getting better :)

For almost a year, I have been playing with the idea of developing a functional abstraction that makes it possible to use the Kafka Client API, including Kafka Streams, from Python/JS/R via GraalVM. Then you wouldn't be dependent on separate solutions like Faust, which most probably will not always be able to keep up with the latest developments (and will probably not offer optimal performance and feature-richness).

@ghost

ghost commented May 13, 2020

BTW if anybody would like to join me to start developing this abstraction layer on top of the Java-based Kafka Streams API which enables the use of it via GraalVM in Python/JS/R/C etc. - I'd be happy :)

@pratapagiri

Any news on adding a Kafka Streams library?

@waydegilliam

There is Faust a python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic https://github.com/robinhood/faust

Unfortunately the project looks to be unmaintained for months now :(

@austinnichols101

It's alive - check out the fork:
https://github.com/faust-streaming/faust


@waydegilliam

waydegilliam commented Dec 17, 2020

It's alive - check out the fork:
https://github.com/faust-streaming/faust

Yeah I saw the fork that's being updated which is nice to see, I meant the official project isn't being maintained anymore by the folks at Robinhood. My worry is the same will happen to that fork (or any future forks of Faust for that matter) as now it's a huge project being maintained only by a few people from the open-source community.

@ScaryAardvark

Has Kafka Streams for Python made it onto the roadmap yet?

@edenhill
Contributor

We have no plans to implement Kafka Streams for Python.

@nathan-audette

This is a big shame. My team had recently started discussing the possibility of migrating from our microservices architecture to an event-sourced one. We've all been pretty excited about the idea as we have a lot of microservices and analysis services that work together - bringing all the data together could simplify a lot. But we're a Python shop and the more I've been looking into the libraries available, the less appealing this idea has become. We've floated the idea of adopting Java just for this reason, so we could make use of Kafka Streaming, but so much of our system would have to be translated to Java to make this feasible. It just doesn't seem worth it.

@g-rd

g-rd commented Sep 16, 2022 via email
