-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support asyncio with AsyncInferenceClient #1524
Conversation
The documentation is not available anymore as the PR was closed or merged. |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #1524 +/- ##
==========================================
- Coverage 82.71% 82.00% -0.71%
==========================================
Files 58 60 +2
Lines 6357 6591 +234
==========================================
+ Hits 5258 5405 +147
- Misses 1099 1186 +87
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I understand why you chose to go this way with your implementation. I don't have any problems with the generation of the async file from the sync file.
I have a small issue with the way that it is generated in that it seems to be very specific to the existing sync client. If you add a method to that client, then you'll likely need to get your hands dirty with the script to adapt the file generation. This seems harder than just editing the async client file to ensure that the two files have the same methods. I understand that this is not the case for tasks methods (_make_tasks_methods_async
), so maybe I misunderstand and you already set it up this way.
In any case, happy to have this for now and up to you whether you want to update it down the line to have something different after maintaining it for a bit.
As said in the comments, I'd be clearer about the async file not being editable by contributors as it's being generated.
# WARNING | ||
# This entire file has been generated automatically based on `src/huggingface_hub/inference/_client.py`. | ||
# To re-generate it, run `make style` or `python ./utils/generate_async_inference_client.py --update`. | ||
# WARNING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd either add a pretty visible disclaimer that any code written in this file will be overwritten by make style
, or I would move this file entirely to a _generated
folder to ensure that it's not being modified by someone (with the correct links for imports as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of a _generated/
folder! Far better than just a tiny disclaimer that no-one will read. Thanks for the suggestion :)
@@ -0,0 +1,138 @@ | |||
"""Contains tests for AsyncInferenceClient. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have the copyright as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we test that both the async and sync clients output the exact same things, but with one being synchronous and the other being asynchronous? (maybe that's already the case).
If you intend on the two clients having the same methods, with the same signatures, would it make sense to include a test for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we test that both the async and sync clients output the exact same things, but with one being synchronous and the other being asynchronous?
Yes that's the case for text_generation
and sentence_similarity
tests. I chose to test those two since text_generation is by far the "most complicated" code. I don't think it's worth testing every single task as long as we know that they are generated via the exact same code.
About exact values, that's kinda what we are doing. But since the tests are using VCR (i.e. not calling the production endpoint), it's not really a complete test anyway. I tried making VCR load the same yaml file for both a sync and a async test but I didn't manage to make it work with VCR.py. In the end I don't think it's worth investigating to much in this.
If you intend on the two clients having the same methods, with the same signatures, would it make sense to include a test for that?
I can add such a test!
Thanks for the review and the feedback @LysandreJik. As discussed on Slack, the plan is to be pragmatic with the generation script:
Since we are aligned on this, let's move forward and merge this PR :) |
Yay! 🎉 |
Discussed in slack (internal link). Adds a AsyncInferenceClient for async calls to the inference endpoints. File is generated automatically by a script. The goal is to start with this and reassess in the future (maybe drop the generation-script at some point if it's too much work to maintain vs too little benefit).
TODO: