
Seldon does not work with Gunicorn async workers #2499

Closed
pdstrnadJC opened this issue Sep 29, 2020 · 3 comments
@pdstrnadJC

Describe the bug

There are actually two closely related issues, so I'm posting them in one bug report.

  1. Seldon does not provide a way to use Gunicorn's async workers. At first glance it seemed that setting the -k or --worker-class flag in the GUNICORN_CMD_ARGS environment variable would do the trick, but Seldon doesn't consider that variable when deriving the config it passes to Gunicorn (see load_config() in StandaloneApplication; for comparison, this is how Gunicorn's BaseApplication does it).

  2. Since I wanted to test with async workers locally regardless of the above, I cloned the Seldon repo and hard-coded the worker class in microservice.py.

        def rest_prediction_server():
            options = {
                "bind": "%s:%s" % ("0.0.0.0", port),
                # ...
                "worker_class": "eventlet",  # or "gevent"
            }

When I run Seldon locally it starts fine, but once it handles a prediction request (the POST /api/v1.0/predictions endpoint) it returns a 500 and I see this error in the logs:

2020-09-28 22:43:54,870 - seldon_core.wrapper:log_exception:1892 - ERROR:  Exception on /api/v1.0/predictions [POST]
Traceback (most recent call last):
  File "/opt/conda/envs/mlflow/lib/python3.7/multiprocessing/managers.py", line 811, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

This actually seems to be a Python issue. Many workarounds I found (one example) suggested calling join() on the mp.Process object, but you're already doing that. I'm not a Python expert or power user, so I'm not sure what the options are!
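On point 1, here is a minimal sketch of how the flags in GUNICORN_CMD_ARGS could be parsed and merged over defaults before handing them to Gunicorn. The parser is illustrative only (it handles just `-k value` / `--key value` pairs, not Gunicorn's full CLI grammar), and the function and variable names are my own, not Seldon's:

```python
import os
import shlex

def parse_gunicorn_cmd_args(env_value):
    """Parse GUNICORN_CMD_ARGS-style flags into a Gunicorn options dict.

    Illustrative only: handles `-k value` and `--key value` pairs,
    not Gunicorn's full CLI grammar.
    """
    aliases = {"-k": "worker_class", "-w": "workers", "-b": "bind"}
    tokens = shlex.split(env_value)
    options = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in aliases and i + 1 < len(tokens):
            options[aliases[token]] = tokens[i + 1]
            i += 2
        elif token.startswith("--") and i + 1 < len(tokens):
            options[token[2:].replace("-", "_")] = tokens[i + 1]
            i += 2
        else:
            i += 1
    return options

# Merge env-provided flags over defaults (defaults here are illustrative)
defaults = {"bind": "0.0.0.0:9000", "workers": 1}
env_opts = parse_gunicorn_cmd_args(os.environ.get("GUNICORN_CMD_ARGS", ""))
merged = {**defaults, **env_opts}
```

Gunicorn's own BaseApplication handles this properly through its config machinery; the point is only that StandaloneApplication would need an equivalent step before GUNICORN_CMD_ARGS could take effect.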

To reproduce

  1. Edit microservice.py as described above.
  2. Update requirements.txt to install either gunicorn[eventlet] or gunicorn[gevent] - depending on which worker class you decide to use.
  3. Start Seldon and send a request to the prediction endpoint.

Expected behaviour

I would expect the prediction request to return a 200.

Environment

I've only run this locally so far; it hasn't made it to our k8s cluster yet. Locally I ran on macOS or in a Docker container based on python:3.7-slim.

@pdstrnadJC pdstrnadJC added bug triage Needs to be triaged and prioritised accordingly labels Sep 29, 2020
@axsaucedo axsaucedo added this to the 1.4 milestone Sep 29, 2020
@adriangonz adriangonz self-assigned this Sep 29, 2020
@RafalSkolasinski
Contributor

Hi @pdstrnadJC, thanks for the bug report.

I am just curious: what exact advantages would you see from using Gunicorn's async workers? Most prediction tasks are CPU-bound.

@pdstrnadJC
Author

Hi @RafalSkolasinski, before calling the model's predict function I'm making an HTTP request to get some data that will be passed to the model. I was hoping that by using the async workers, Seldon could do other work while it waits for the response to come back. Initially I was using a Seldon TRANSFORMER to make the request, and I had set up the inference graph so that the TRANSFORMER is called before the MODEL; that could be an option again, but I preferred to keep the graph as simple as possible.
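To illustrate the kind of win I'm after: when the pre-predict fetch is blocking I/O, what matters is overlapping the waits, not CPU. The sketch below uses threads as a stand-in for greenlets, and time.sleep as a stand-in for the HTTP call; all names are illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_features(request_id):
    # Stand-in for the blocking HTTP call made before predict()
    time.sleep(0.1)
    return {"request_id": request_id, "features": [1.0, 2.0]}

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_features, range(4)))
elapsed = time.monotonic() - start
# The four 0.1 s waits overlap, so this takes ~0.1 s rather than ~0.4 s
```

Gunicorn's gevent/eventlet worker classes get the same overlap inside a single worker process by switching greenlets whenever a request blocks on I/O.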

@ukclivecox ukclivecox removed the triage Needs to be triaged and prioritised accordingly label Oct 1, 2020
@adriangonz
Contributor

Hey @pdstrnadJC, in general both gevent and eventlet are a bit fiddly to use. Both require you to monkeypatch your code. This gets even fiddlier when you consider that we leverage multiprocessing to run a couple of separate servers at once within the same Python container.

After running a couple of small experiments, it seems that gevent just doesn't play well with multiprocessing at all. You can check this issue, where the conclusion is plainly that you can't use both at the same time.

On the other hand, it seems that eventlet works a bit better alongside multiprocessing. The only extra thing I had to do was to explicitly add the monkeypatch call to the top of python/seldon_core/microservice.py:

import eventlet
eventlet.monkey_patch()

After doing that, I was able to set worker_class: eventlet and it seemed to work: I was able to send a couple of requests and get the responses back. Note that this doesn't mean there couldn't be other issues downstream. For example, I did notice that the process seems to hang whenever you try to exit it. There is an open issue keeping track of all incompatibilities between eventlet and multiprocessing: eventlet/eventlet#210

It's also worth mentioning that our current plan is to eventually introduce our new language wrapper, MLServer, which has been built with asyncio in mind. We've recently introduced early support for the SKLearn and XGBoost pre-packaged servers (we'll be adding some examples soon). You can also check an example of how to add custom inference logic with MLServer. Keep in mind that this is still considered an incubating project, although it could still be worth evaluating for your use case.

Feel free to re-open if you've got any further questions!
