
Flag changes are not reflected if ldclient is initialized in a uWSGI pre-fork worker #174

Closed
pb-dod opened this issue Jun 14, 2022 · 7 comments

Comments


pb-dod commented Jun 14, 2022

Describe the bug
Flag changes will not be reflected in forked web workers if the LaunchDarkly client is initialized before the process is forked by uWSGI.

I think this is happening because the LaunchDarkly client spawns a thread to monitor flag updates, and that thread does not survive in the worker processes after the fork.

This can happen if client initialization is triggered at module import time when the app starts up (rather than lazily, the first time the client is used inside an endpoint).

The workaround is to avoid initializing the LaunchDarkly client before forking occurs, but this can be a frustrating issue to debug if you're not aware of what is happening.
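
Here's a minimal sketch of that workaround under uWSGI (my addition, not from the repro repo below): defer initialization to a post-fork hook so each worker builds its own client. It assumes the launchdarkly-server-sdk package and uWSGI's uwsgidecorators module; the SDK key is a placeholder.

```python
import ldclient
from ldclient.config import Config

try:
    # uwsgidecorators is only importable when running under uWSGI.
    from uwsgidecorators import postfork
except ImportError:
    postfork = None

def init_launchdarkly():
    # Each worker process gets its own client, with its own streaming
    # connection and flag-update thread.
    ldclient.set_config(Config(sdk_key="YOUR_SDK_KEY"))  # placeholder key

if postfork is not None:
    # Runs in every worker right after the uWSGI master forks it.
    postfork(init_launchdarkly)
else:
    # Fallback for non-forking environments (e.g. a local dev server).
    init_launchdarkly()
```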

To reproduce
See this repo with a minimal example and instructions to reproduce the issue: https://github.com/pb-dod/launch_darkly_python_prefork#issue-reproduction-steps

Expected behavior
Maybe it's possible to detect forking and re-initialize the thread that monitors flag updates?

Or maybe it should throw an exception if you try to initialize the LaunchDarkly client inside a pre-fork uWSGI worker? For example: https://gist.github.com/pb-dod/231b1b5dac9ea296918346a9288a598a
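
As a rough sketch of the fork-detection idea (my own illustration using only the standard library, not what the gist above does): record the PID when the client is initialized and warn if flags are later evaluated from a different process.

```python
import os
import warnings

_init_pid = None

def remember_init_pid():
    # Call this right after initializing the LaunchDarkly client.
    global _init_pid
    _init_pid = os.getpid()

def warn_if_forked():
    # Call this before evaluating flags. If the current PID differs from
    # the one recorded at initialization, we are in a forked child and the
    # client's update thread is not running here.
    if _init_pid is not None and _init_pid != os.getpid():
        warnings.warn(
            "LaunchDarkly client was initialized in a different process; "
            "flag updates will not be received in this worker."
        )
```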

A warning might need to be added to the docs too.

Logs
n/a

SDK version
See example

Language version, developer tools
See example

OS/platform
See example

Additional context
n/a

eli-darkly (Contributor) commented

Hi. Besides the underlying issue, it sounds like there may be a documentation gap - the fact that the SDK won't work if you initialize it prior to forking is a long-standing known issue and I thought we had documented it, so I'll look into that.

But back to the main problem: it's more fundamental than you might think, because it's not just about the threads; it's about how streaming updates basically work. The SDK opens a long-lived streaming HTTP connection to receive updates, and there is no way for that one socket to be inherited by multiple forked processes, regardless of which thread might be reading it.


pb-dod commented Jun 15, 2022

This probably isn't a surprise, but I just tested this on gunicorn with preload_app enabled and it's an issue there too: https://github.com/pb-dod/launch_darkly_python_prefork/tree/test_gunicorn
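
The equivalent workaround for gunicorn would be a post_fork server hook (my sketch, assuming gunicorn's standard hook signature and the launchdarkly-server-sdk package; the SDK key is a placeholder):

```python
# gunicorn.conf.py
import ldclient
from ldclient.config import Config

def post_fork(server, worker):
    # Runs in each worker process right after gunicorn forks it, so the
    # streaming connection and update thread belong to that worker even
    # when preload_app is enabled.
    ldclient.set_config(Config(sdk_key="YOUR_SDK_KEY"))  # placeholder key
```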


pb-dod commented Jun 15, 2022

@eli-darkly

There is no way for that one socket to be inherited by multiple forked processes, regardless of what thread might be reading it.

Agreed, and the New Relic Python client doesn't handle this either (it has a similar background collector thread setup).

New Relic's Python agent addresses this in their docs, though:

Maybe the change to the LaunchDarkly client docs should be similar to that section?


pscelzamelo commented Jun 15, 2022

I'm running into this issue as well; do you have any recommendations on how to fix it?

Where should I initialize the SDK for flag refreshing to work properly? i.e., when should I call set_config?

Should I consider disabling streaming to rely on polling?
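
For what it's worth, the Python SDK's Config has a stream flag, so polling mode can be enabled as sketched below (my addition, not an official answer). Note that a polling thread started before the fork would not survive the fork either, so set_config still needs to run in each worker, e.g. in a post-fork hook as in the earlier sketches.

```python
import ldclient
from ldclient.config import Config

def init_launchdarkly_polling():
    # Call this in each worker, after the fork. stream=False makes the SDK
    # poll periodically for flag updates instead of holding a streaming
    # HTTP connection.
    ldclient.set_config(Config(sdk_key="YOUR_SDK_KEY", stream=False))  # placeholder key
```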

LaunchDarklyReleaseBot pushed a commit that referenced this issue Jun 15, 2022
update CONTRIBUTING.md and provide make targets

mblayman commented Jun 15, 2022

@eli-darkly, I work with @pb-dod at Included Health. One of the things we observed in this process is that the worker process would stop working if we tried to initialize a new client. We would initialize a new client in the worker process with ldclient.set_config followed by ldclient.get. The last message we would see is "Closing LaunchDarkly client.." from the client's close method.

My best theory is that a thread pool connected to the client can't be stopped after the fork, so the close method (invoked by old_client.close() in set_config) just hangs. If the close method were modified to fail more gracefully, then perhaps a new client started in a worker process could replace the old, broken client from the pre-fork environment. This isn't a perfect solution, but I think it would give a forking server a chance to recover in a way that it currently cannot.

eli-darkly (Contributor) commented

@mblayman I can't say off the top of my head whether it is feasible to make close "fail in a more graceful manner", or (as @pb-dod suggested) to have the SDK "detect forking" and reinitialize itself. We'll take those comments into consideration, and we do want to keep thinking about ways to make the product easier to use in this kind of environment, since this has been a known issue for quite a while in Python and other languages where forking-based web frameworks are popular. But the fact that New Relic has not found a better way to handle it suggests that it's a non-trivial problem.

In the nearer term, we will focus on making the documentation clearer. It looks like we did some work to address that in Ruby, but not in Python.

keelerm84 (Member) commented

We have updated the Python documentation to more clearly detail the issues with pre-forking.
