Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide an alternative channel layer implementation that uses Redis Pub/Sub #251

Merged
merged 1 commit into from
Jun 25, 2021

Conversation

acu192
Copy link
Collaborator

@acu192 acu192 commented May 13, 2021

Fixes Issue django/channels#245

This is a cleaned-up version of my first PR (#247). The only new thing in this PR is that it includes tests! See tests/test_pubsub.py.

At this point, this is a very solid first-version of this pub/sub implementation, and we can continue improving it over time as needed.

@acu192
Copy link
Collaborator Author

acu192 commented May 19, 2021

Maybe run the tests again? The failure seems spurious and unrelated.

They pass on the fork: https://github.com/acu192/channels_redis/actions/runs/839642627

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @acu192 — Thanks for this.

TBH it looks good.

I need a little more time to go over it fully, and I'm lining up for DjangoCon Europe at the moment, so I'm going to say that I will continue to look at this, and then process it in the period after the conference, so middle/end of June is realistic for a release. I hope that makes sense.

@acu192
Copy link
Collaborator Author

acu192 commented May 23, 2021

No rush; it's your project. Just trying to give back where I can. Thanks!

@carltongibson
Copy link
Member

Thanks @acu192. It's a nice addition.

@jalaziz
Copy link
Contributor

jalaziz commented May 23, 2021

If this fixes django/channels#1683, it would be a miracle. Hoping to deploy it on our tests environments and give it a shot soon.

We've noticed issues with messages being lost on occasion, so really hoping this fixes the issue!

@acu192
Copy link
Collaborator Author

acu192 commented May 23, 2021

@jalaziz Cool, yeah it fixes it for us at least. Hope it fixes for you as well! Please post back here after you try it out. 🤓

@WorkShoft
Copy link

Hi there, are there any TODOs left tor this feature? Just like @jalaziz I would benefit from fixing django/channels#1683, I think it would be a great improvement for Channels

@acu192
Copy link
Collaborator Author

acu192 commented Jun 22, 2021

@WorkShoft Nothing important remains. We could improve it by supporting more of the configuration parameters (from the README), but honestly I want to get it merged as-is before I commit more time to it. I've had it in production on my site for ~2 months now, and it is working splendidly (no more bug!). IMO it just needs a review from the maintainers then can be merged.

@carltongibson
Copy link
Member

... so middle/end of June is realistic for a release

@WorkShoft
Copy link

@WorkShoft Nothing important remains. We could improve it by supporting more of the configuration parameters (from the README), but honestly I want to get it merged as-is before I commit more time to it. I've had it in production on my site for ~2 months now, and it is working splendidly (no more bug!). IMO it just needs a review from the maintainers then can be merged.

@acu192 @carltongibson Great, thanks for your work guys

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @acu192 — thanks for this. It looks great. Let's take it now.

I will add some release notes etc and put out a release in the next week.

I'll mark it beta and ask folks to give it a try, but assuming no blockers turn up I would imagine we'd promote this to the preferred option. (It's significantly clearer than the original layer, and I'm quite keen on offloading responsibilities to Redis if we can.)

If you don't mind I'd like to invite you as a maintainer on the repo so that you can help address issues. I assume from your comments you'd be up for that?

Thanks again for the work you've put into this! 🥇

@carltongibson carltongibson merged commit 084ddc4 into django:main Jun 25, 2021
@acu192
Copy link
Collaborator Author

acu192 commented Jun 25, 2021

@carltongibson I'd be honored to be one of the maintainers of this repo! Woohoo!

I'd like to add more to this impl in the future (like some of the missing configuration parameters from the README).

I agree, let's put it in beta for now, then we can promote it after some time passes assuming no issues come up.

Indeed, it offloads to Redis really well. After deploying this we've seen a drop in CPU usage on Redis and on our Django app servers as well.

@carltongibson
Copy link
Member

Indeed, it offloads to Redis really well. After deploying this we've seen a drop in CPU usage on Redis and on our Django app server as well.

Nice. 😄

Welcome aboard! ⛵

@qeternity
Copy link
Contributor

Hi all - just one thing to note - this doesn't use the core ConnectionPool which means that we lose the recently introduced sentinel functionality.

@carltongibson
Copy link
Member

@qeternity — happy to take that as an addition. We can do that quickly or when it's ready?

@qeternity
Copy link
Contributor

@carltongibson looking at the new layer now - I don't think it's much work.

@acu192 howdy - what is your redis setup like in prod? I think might also be nice to pull together some collective knowledge about running channels_redis in prod/at scale.

@carltongibson
Copy link
Member

Happy to add general points to the docs to help people. (I think there's a lot if unknown art.)

@qeternity
Copy link
Contributor

qeternity commented Jun 25, 2021

Maybe best to take this to a new issue, but I have a few concerns around the architecture for pubsub. While the reconnect logic is admirable, I'm not sure that it's appropriate in pubsub because it allows for silent missed messages.

Let's say between our _do_keepalive() loops the connection is lost, and a publisher elsewhere sends a message. RedisSingleShardConnection will silently reconnect having missed the message. In the current blocking list architecture, this would be fine because we could reconnect to the same key and continue popping items...they would simply queue up like a log (redis streams would work well). But in pubsub, those messages will be lost.

This is very relevant to our usage. In the event of a network blip or sentinel failover, the websocket consumers disconnect and the frontend attempts to gacefully recover by reconnecting websockets and by performing a full state refresh to ensure no data has been missed.

@carltongibson
Copy link
Member

Maybe best to take this to a new issue...

Yep, do that. We'll put it out as beta to begin, so feedback welcome.

@LiteWait
Copy link

LiteWait commented Jun 25, 2021

@acu192 @carltongibson initial results are promising...again not using groups just massive number of sockets. The issues I noticed in #313 no longer happens. I'll keep giving feedback as QA starts to bang on it.

@acu192
Copy link
Collaborator Author

acu192 commented Oct 24, 2021

@jalaziz @WorkShoft - In case you haven't been following the progress of the RedisPubSubChannelLayer ... with the 3.3.1 release (which fixed a few bugs) I'd say you should now give it a try. I expect it will fix the issue you both mentioned above (it did for me, at least).

@LiteWait
Copy link

@acu192 Ryan, preliminary tests look good, I'll stage this for a week or so before moving to production. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants