
Fix InferenceClient for HF Nvidia NIM API #2482

Merged
merged 2 commits into main on Aug 26, 2024

Conversation

Wauplin
Contributor

@Wauplin Wauplin commented Aug 22, 2024

Fix #2480.

Two things in this PR (for chat_completion):

  • the model value was not set correctly in the payload. This is now fixed.
  • None values are no longer sent to the server; only parameters explicitly set by the user are forwarded.
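The second point can be illustrated with a small sketch. The helper below is hypothetical (not the actual huggingface_hub implementation); it shows the idea of dropping unset (None) parameters before building the request payload, so server-side defaults are not overridden:

```python
# Hypothetical sketch of the None-filtering fix; not the library's actual code.
def build_chat_payload(model, messages, **kwargs):
    """Build a chat-completion payload, dropping parameters left as None."""
    payload = {"model": model, "messages": messages, **kwargs}
    return {key: value for key, value in payload.items() if value is not None}

payload = build_chat_payload(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    [{"role": "user", "content": "Count to 10"}],
    max_tokens=1024,
    temperature=None,  # left unset by the user => excluded from the payload
)
# "temperature" does not appear in the payload sent to the server
```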

With this, it is now possible to use the HF Nvidia NIM API using InferenceClient:

from huggingface_hub import InferenceClient


# instead of `client = OpenAI(...)`
client = InferenceClient(
    base_url="https://huggingface.co/api/integrations/dgx/v1",
    api_key="MY_FINEGRAINED_TOKEN"
)

output = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ],
    stream=True,
    max_tokens=1024,
)

for chunk in output:
    # delta.content can be None on the final chunk; guard with `or ""`
    print(chunk.choices[0].delta.content or "", end="")

Here it goes:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10!
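For reference, the streamed deltas can also be accumulated into a single string instead of printed as they arrive. This self-contained sketch uses stub objects that mimic the shape of the real chunks (chunk.choices[0].delta.content, which may be None on the final chunk); the stub names are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

# Stubs mirroring the shape of streamed chat-completion chunks.
@dataclass
class Delta:
    content: Optional[str]

@dataclass
class Choice:
    delta: Delta

@dataclass
class Chunk:
    choices: list

def accumulate(chunks):
    """Join the delta contents of a chunk stream, skipping None deltas."""
    return "".join(c.choices[0].delta.content or "" for c in chunks)

stream = [
    Chunk([Choice(Delta("1, 2"))]),
    Chunk([Choice(Delta(", 3"))]),
    Chunk([Choice(Delta(None))]),  # final chunk often carries no content
]
accumulate(stream)  # -> "1, 2, 3"
```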

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@MoritzLaurer MoritzLaurer left a comment


LGTM! Installed from the branch and confirmed that it works.

Also confirmed that it works directly with client.chat_completion, just like client.chat.completions.create.
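The OpenAI-style client.chat.completions.create path is a thin delegation to chat_completion. A minimal, illustrative sketch of how such an alias can be wired (this is not huggingface_hub's actual code; MiniClient and the helper classes are hypothetical):

```python
# Hypothetical sketch of wiring an OpenAI-style alias onto a client method.
class _Completions:
    def __init__(self, client):
        self._client = client

    def create(self, **kwargs):
        # Delegate to the underlying chat_completion method.
        return self._client.chat_completion(**kwargs)

class _Chat:
    def __init__(self, client):
        self.completions = _Completions(client)

class MiniClient:
    def __init__(self):
        self.chat = _Chat(self)

    def chat_completion(self, **kwargs):
        return ("chat_completion called", kwargs)

client = MiniClient()
result = client.chat.completions.create(model="meta-llama/Meta-Llama-3-8B-Instruct")
# identical to calling client.chat_completion(model=...) directly
```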

@Wauplin
Contributor Author

Wauplin commented Aug 26, 2024

Let's get this merged! Thanks for the reviews :)

@Wauplin Wauplin merged commit 6e9e4e4 into main Aug 26, 2024
15 of 17 checks passed
@Wauplin Wauplin deleted the 2480-fix-inference-client-for-nim branch August 26, 2024 12:45
Successfully merging this pull request may close these issues.

How to use the HF Nvidia NIM API with the HF inference client?