
Running server locally errors #61

Open · alxiang opened this issue Dec 3, 2024 · 0 comments

I'm trying to set up the cog-flux server locally (on a VM with an H100). After server startup, I get this error when trying to make a prediction:

{"logger": "cog.server.runner", "timestamp": "2024-12-03T01:42:41.166186Z", "severity": "INFO", "message": "setup succeeded"}
{"logger": "cog.server.probes", "timestamp": "2024-12-03T01:42:41.166524Z", "severity": "INFO", "message": "Not running in Kubernetes: disabling probe helpers."}
{"prediction_id": null, "logger": "cog.server.runner", "timestamp": "2024-12-03T01:43:01.446053Z", "severity": "INFO", "message": "starting prediction"}
{"prediction_id": null, "logger": "cog.server.runner", "timestamp": "2024-12-03T01:43:01.446524Z", "severity": "INFO", "message": "started prediction"}
running quantized prediction
Using seed: 123
Process _ChildWorker-1:
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py", line 348, in run
    self._loop(
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py", line 420, in _loop
    self._predict(e.tag, e.event.payload, predict, redirector)
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py", line 461, in _predict
    with self._handle_predict_error(redirector, tag=tag):
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/contextlib.py", line 144, in __exit__
    next(self.gen)
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/worker.py", line 597, in _handle_predict_error
    redirector.drain(timeout=10)
  File "/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/cog/server/helpers.py", line 259, in drain
    raise CogTimeoutError("output streams failed to drain")
cog.server.errors.CogTimeoutError: Cog: output streams failed to drain
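
If I'm reading the traceback right, the underlying prediction error is being masked: _handle_predict_error catches the original exception, then tries to flush the child's captured stdout/stderr via redirector.drain(timeout=10), and it's that drain that times out and surfaces instead. A rough sketch of the pattern I mean (my reading of the traceback, not cog's actual implementation):

import threading

class ToyRedirector:
    """Toy model of the drain step the traceback points at in
    cog/server/helpers.py."""

    def __init__(self):
        self._drained = threading.Event()

    def mark_drained(self):
        # In the real server this would fire once all buffered child
        # stdout/stderr has been written through.
        self._drained.set()

    def drain(self, timeout=10):
        # If the worker never flushes (e.g. it crashed or hung
        # mid-prediction), this raises and hides the original error.
        if not self._drained.wait(timeout):
            raise TimeoutError("output streams failed to drain")

try:
    ToyRedirector().drain(timeout=1)
except TimeoutError as e:
    print(f"drain failed: {e}")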

Here are commands to reproduce this error:

script/select.sh dev-lora
cog run -p 5000 python -m cog.server.http
python3 samples.py
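
As I understand it, samples.py just issues prediction requests against the local endpoint; a minimal equivalent (the "prompt"/"seed" input names here are placeholders, not necessarily the repo's exact schema) would be:

import requests

# Stand-in for samples.py: POST one prediction to the cog HTTP server
# started above. The input names are illustrative; the real schema
# comes from the model's predict() signature.
resp = requests.post(
    "http://localhost:5000/predictions",
    json={"input": {"prompt": "a photo of an astronaut", "seed": 123}},
)
resp.raise_for_status()
print(resp.json())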

This error occurs on cog versions 0.13.3 (latest) and 0.13.2. I'm on Ubuntu 22.04 with NVIDIA driver 550.90.07, CUDA 12.4, and an H100 80GB HBM3. I was also able to set up a toy model and make predictions successfully following the cog docs, so I'm wondering if there's a setup or configuration step specific to this repo that I'm missing.
