propagate Exception from inference workers to main process #141

Closed
aniketmaurya opened this issue Jun 17, 2024 · 0 comments · Fixed by #143
Assignees: aniketmaurya
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@aniketmaurya (Collaborator) commented Jun 17, 2024

🐛 Bug

Exceptions are not propagated from the inference workers to the main process when using OpenAISpec, which results in a silent failure.

Code sample

import litserve as ls

class SimpleLitAPI(ls.LitAPI):
    def setup(self, device):
        # No real model is needed to reproduce the issue.
        self.model = None

    def decode_request(self, request):
        content = request.messages[-1].content
        if "BAD WORD" in content:
            # Raised inside the inference worker; it never reaches the main
            # process, so the client still receives HTTP 200.
            raise Exception("Guardrail detected inappropriate content.")
        return [{"role": "user", "content": content}]

    def predict(self, prompt):
        yield "This is a sample generated text"

if __name__ == '__main__':
    api = SimpleLitAPI()
    server = ls.LitServer(api, spec=ls.OpenAISpec())
    server.run(port=8000)

This server always returns an HTTP 200 response, because the FastAPI StreamingResponse is sent before any actual computation is performed, so the exception raised in decode_request is never surfaced to the client.
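For reference, a minimal client-side reproduction sketch, assuming the server above is running locally and that OpenAISpec exposes the OpenAI-compatible /v1/chat/completions route:

```python
# Reproduction sketch (assumes the server above is running on port 8000 and
# that OpenAISpec serves the OpenAI-compatible /v1/chat/completions endpoint).
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "simple",
        "messages": [{"role": "user", "content": "BAD WORD"}],
    },
)

# Expected: an error status derived from the guardrail exception.
# Actual: HTTP 200 with no useful body, i.e. a silent failure.
print(resp.status_code, resp.text)
```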

Expected behavior

Exceptions raised in the inference workers (for example in decode_request or predict) should be propagated to the main process and surfaced to the client as an error response instead of failing silently.
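A rough sketch of one possible mechanism, not the actual LitServe internals (the queue layout, method calls outside decode_request/predict/encode_response, and the 500 status are assumptions): the worker catches the exception and puts it on the response queue, and the main process converts any propagated exception into an HTTP error.

```python
# Sketch only: an illustrative multiprocessing pattern, not LitServe's real code.
from fastapi import HTTPException

def inference_worker(request_queue, response_queue, lit_api):
    while True:
        uid, request = request_queue.get()
        try:
            x = lit_api.decode_request(request)
            y = lit_api.predict(x)
            response_queue.put((uid, lit_api.encode_response(y)))
        except Exception as exc:
            # Ship the exception object back instead of dropping it.
            response_queue.put((uid, exc))

def handle_response(payload):
    # In the main process, turn a propagated exception into an error response
    # before (or instead of) starting the StreamingResponse.
    if isinstance(payload, Exception):
        raise HTTPException(status_code=500, detail=str(payload))
    return payload
```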

Environment

If you published a Studio with your bug report, we can automatically get this information. Otherwise, please describe:

  • PyTorch/Jax/Tensorflow Version (e.g., 1.0):
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

@aniketmaurya added the bug and help wanted labels on Jun 17, 2024
@aniketmaurya changed the title from "propagate Exception from inference workers during streaming to main process" to "propagate Exception from inference workers to main process" on Jun 17, 2024
@aniketmaurya self-assigned this on Jun 17, 2024