@hanouticelina

AsyncInferenceClient creates an aiohttp.ClientSession each time it makes a request. when streaming, the session is only closed after the async generator that yields the data finishes iterating:

async for chunk in response.content:  
    yield chunk
await client.close()                   

if the user stops iteration as soon as they see the [DONE] event (see _async_stream_chat_completion_response and _format_chat_completion_stream_output), the await client.close() line is never executed. later, when the client object is garbage collected, we get this warning:

UserWarning: Deleting 'AsyncInferenceClient' client but some sessions are still open. This can happen if you've stopped streaming data from the server before the stream was complete. To close the client properly, you must call `await client.close()` or use an async context (e.g. `async with AsyncInferenceClient(): ...`.
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x1026d6120>
Unclosed connector
connections: ['deque([(<aiohttp.client_proto.ResponseHandler object at 0x10a8a6ed0>, 91165.093372916)])']
connector: <aiohttp.connector.TCPConnector object at 0x1026d5e80>

the fix wraps the iteration in a try/finally block: regardless of how or when the caller stops consuming the stream, the finally clause guarantees the underlying session is closed.
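
a minimal sketch of the pattern, not the actual huggingface_hub code: `FakeSession`, `stream_chunks`, `consume_until_done` and `run_demo` are hypothetical names made up for illustration, and `FakeSession` only stands in for aiohttp.ClientSession so the example runs without a network. the consumer stops at [DONE] and then closes the generator explicitly, which makes the cleanup deterministic; with a bare `break`, the finally block would presumably only run once the event loop finalizes the generator.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession; only tracks closed state."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def stream_chunks(session, chunks):
    """Yield chunks and close the session however iteration ends."""
    try:
        for chunk in chunks:
            yield chunk
    finally:
        # Runs on normal exhaustion, on an exception, and when the
        # generator is closed early via aclose().
        await session.close()


async def consume_until_done(session):
    stream = stream_chunks(
        session, [b"data: hello", b"data: [DONE]", b"data: never read"]
    )
    seen = []
    try:
        async for chunk in stream:
            seen.append(chunk)
            if chunk.endswith(b"[DONE]"):
                break  # stop early, like a consumer that has seen [DONE]
    finally:
        # Explicitly closing the generator triggers its finally block now.
        await stream.aclose()
    return seen


def run_demo():
    session = FakeSession()
    seen = asyncio.run(consume_until_done(session))
    return seen, session.closed
```

calling `run_demo()` returns the chunks read before the early exit together with the session's closed flag, which should be set even though the third chunk was never consumed.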


repro:

import asyncio
import os

from huggingface_hub import AsyncInferenceClient

client = AsyncInferenceClient(
    provider="novita",
    api_key=os.environ["HF_TOKEN"],
)


async def main():
    stream = await client.chat.completions.create(
        model="Qwen/Qwen3-Coder-480B-A35B-Instruct",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        stream=True,
    )

    async for chunk in stream:
        print(chunk.choices[0].delta.content, end="")


if __name__ == "__main__":
    asyncio.run(main())

@hanouticelina hanouticelina requested a review from Wauplin July 24, 2025 09:47
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Wauplin

Wauplin commented Jul 24, 2025

Seems like the session is now correctly closed, but when running your repro code on your branch, I now get:

UserWarning: Deleting 'AsyncInferenceClient' client but some sessions are still open. This can happen if you've stopped streaming data from the server before the stream was complete. To close the client properly, you must call `await client.close()` or use an async context (e.g. `async with AsyncInferenceClient(): ...`.

This warning comes from the AsyncInferenceClient directly, so I guess we need to update the check there.


@Wauplin Wauplin left a comment

All good, the warning I reported only appears because the repro script needs an await client.close(), but that's it. So all good to merge!

@hanouticelina hanouticelina merged commit 99b2325 into main Jul 24, 2025
26 checks passed
@hanouticelina hanouticelina deleted the fix-async-streaming-session-closing branch July 24, 2025 10:17