fix(llm): handle subscription Responses streaming#1760

Closed

enyst wants to merge 1 commit intoOpenHands:feat/openai-subscription-authfrom

enyst:fix/1682-subscription-responses-streaming

Collaborator

enyst commented Jan 20, 2026

Follow-up to #1682.

Subscription mode requires for Responses; LiteLLM returns a streaming iterator.
Drain the iterator and use the completed response so Agent/Conversation can proceed.
Emit a one-time info log when subscription mode is active (helps confirm runtime path).


          fix(llm): support subscription responses streaming

7ce7e39

Subscription auth requires Responses streaming; LiteLLM returns a streaming iterator instead of a ResponsesAPIResponse. Drain the iterator and use the completed response, and log once when subscription mode is active.

enyst closed this

all-hands-bot reviewed

View reviewed changes

Collaborator

all-hands-bot left a comment

This PR successfully implements subscription mode streaming support for the Responses API. However, there are several issues that should be addressed, particularly around error handling, type validation, and exception usage. See inline comments for details.

openhands-sdk/openhands/sdk/llm/llm.py

Comment on lines +808 to +810

+                                      assert isinstance(completed_resp, ResponsesAPIResponse), (
+                                          f"Expected ResponsesAPIResponse, got {type(completed_resp)}"
+                                      )

Collaborator

all-hands-bot Jan 20, 2026

🔴 Critical: Using assert for runtime type validation is problematic. Assertions can be disabled with Python's -O (optimize) flag, which would skip this check in production and could lead to silent failures.

Replace with an explicit check and proper exception:

Suggested change

      
                                    assert isinstance(completed_resp, ResponsesAPIResponse), (
          
                                        f"Expected ResponsesAPIResponse, got {type(completed_resp)}"
          
                                    )
          
                                    if not isinstance(completed_resp, ResponsesAPIResponse):
          
                                        raise TypeError(
          
                                            f"Expected ResponsesAPIResponse, got {type(completed_resp)}"
          
                                        )

openhands-sdk/openhands/sdk/llm/llm.py

Comment on lines +787 to +790

+                                      if not isinstance(ret, SyncResponsesAPIStreamingIterator):
+                                          raise AssertionError(
+                                              f"Expected Responses stream iterator, got {type(ret)}"
+                                          )

Collaborator

all-hands-bot Jan 20, 2026

🟠 Important: Raising AssertionError for validation failures is unconventional. AssertionError should typically be reserved for programmer errors and invariants, not runtime type validation. Consider using TypeError or a custom exception like LLMResponseError instead.

Suggested change

      
                                    if not isinstance(ret, SyncResponsesAPIStreamingIterator):
          
                                        raise AssertionError(
          
                                            f"Expected Responses stream iterator, got {type(ret)}"
          
                                        )
          
                                    if not isinstance(ret, SyncResponsesAPIStreamingIterator):
          
                                        raise TypeError(
          
                                            f"Expected Responses stream iterator, got {type(ret)}"
          
                                        )

openhands-sdk/openhands/sdk/llm/llm.py

Comment on lines +792 to +794

+                                      # Drain stream until a completed response is captured.
+                                      for _ in ret:
+                                          pass

Collaborator

all-hands-bot Jan 20, 2026

🟠 Important: The stream draining loop has no timeout or error handling. If the stream:

Hangs or yields indefinitely
Raises an exception during iteration
Takes an unusually long time

This could cause the entire request to hang or fail unexpectedly. Consider adding a timeout mechanism or wrapping in a try-except to provide better error messages if streaming fails.

openhands-sdk/openhands/sdk/llm/llm.py

Comment on lines +814 to 816

+                                  raise AssertionError(
                                       f"Expected ResponsesAPIResponse, got {type(ret)}"
                                   )

Collaborator

all-hands-bot Jan 20, 2026

🟠 Important: Same issue as above - using AssertionError for runtime validation. This should be a TypeError or similar exception.

Suggested change

      
                                raise AssertionError(
          
                                    f"Expected ResponsesAPIResponse, got {type(ret)}"
          
                                )
          
                                raise TypeError(
          
                                    f"Expected ResponsesAPIResponse, got {type(ret)}"
          
                                )

openhands-sdk/openhands/sdk/llm/llm.py

Comment on lines +783 to +785

+                                      from litellm.responses.streaming_iterator import (
+                                          SyncResponsesAPIStreamingIterator,
+                                      )

Collaborator

all-hands-bot Jan 20, 2026

🟡 Suggestion: This import is performed inside the execution path. If the import fails (e.g., due to litellm version mismatch or API changes), the error message might not be immediately clear to developers.

Consider either:

Importing at module level if this is a stable API
Wrapping in try-except with a more descriptive error:

try:
    from litellm.responses.streaming_iterator import (
        SyncResponsesAPIStreamingIterator,
    )
except ImportError as e:
    raise ImportError(
        f"Failed to import SyncResponsesAPIStreamingIterator. "
        f"This may indicate an incompatible litellm version: {e}"
    ) from e

openhands-sdk/openhands/sdk/llm/llm.py

+                              "Streaming callbacks are not supported for Responses API yet"
+                          )
+                      streaming_requested = bool(call_kwargs.get("stream", False) or self.stream)

Collaborator

all-hands-bot Jan 20, 2026

🟢 Nit: Consider adding parentheses for clarity, since the or operation happens before the bool() conversion:

Suggested change

      
                    streaming_requested = bool(call_kwargs.get("stream", False) or self.stream)
          
                    streaming_requested = bool(call_kwargs.get("stream", False)) or self.stream

Actually, self.stream is already a bool, so this could be simplified to:

streaming_requested = call_kwargs.get("stream", False) or self.stream

enyst reopened this

enyst closed this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet