
Conversation

@jhpiedrahitao (Contributor) commented Mar 12, 2025

What does this PR do?

Switch the sambanova inference adapter to LiteLLM to simplify integration and solve issues with the current adapter when streaming and tool calling; models and templates updated.
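For context, the OpenAI-compatible LiteLLM call the adapter now delegates to looks roughly like this (a sketch, not the adapter's verbatim code; the model id is taken from the test plan below):

    import litellm

    # Sketch: route a chat completion through LiteLLM's OpenAI-compatible
    # entry point instead of a hand-rolled SambaNova HTTP client.
    response = litellm.completion(
        model="sambanova/Meta-Llama-3.3-70B-Instruct",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=False,
    )
    print(response.choices[0].message.content)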

Test Plan

pytest -s -v tests/integration/inference/test_text_inference.py --stack-config=sambanova --text-model=sambanova/Meta-Llama-3.3-70B-Instruct

pytest -s -v tests/integration/inference/test_vision_inference.py --stack-config=sambanova --vision-model=sambanova/Llama-3.2-11B-Vision-Instruct

@facebook-github-bot added the CLA Signed label Mar 12, 2025
@jhpiedrahitao changed the title from "feat(providers): sambanova updated to use LiteLLM openai-compat, also models and templates updated" to "feat(providers): sambanova updated to use LiteLLM openai-compat" Mar 12, 2025
@ashwinb (Contributor) left a comment:

lgtm

@ashwinb (Contributor) left a comment:

You have listed vision models in the supported models for sambanova also. Can your test plan show the outputs for executing the entire integration suite (pytest -s -v tests/integration) please?

@jhpiedrahitao (Author) commented Mar 13, 2025

Hi @ashwinb. I'll leave one comment for each test plan.

  • for test_text_inference:
    [screenshot: test_text_inference results]

here, 3 tests are failing:
1- chat_completion:ttft: error when empty tool_call in content
2- chat_completion:structured_output: SambaNova does not yet support structured output with json_schema
3- chat_completion:tool_calling_tools_absent-False]: error when empty tool_call in content

> The error when the tool_call in content is empty (1 and 3) is because the SambaNova API does not admit an empty tool call; this will be fixed at the API level soon
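For illustration, my reading of the failing shape in OpenAI-compatible terms (the values here are hypothetical; only the empty content string mirrors the failure):

    # Hypothetical assistant turn: the model answered only with a tool call,
    # so the accompanying content is empty, which the SambaNova API rejects.
    assistant_turn = {
        "role": "assistant",
        "content": "",  # empty content alongside a tool call
        "tool_calls": [
            {
                "id": "call_0",  # illustrative id
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{}"},
            }
        ],
    }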

@jhpiedrahitao (Author) commented

  • for test_vision_inference:
    [screenshot: test_vision_inference results]

2 tests are failing; reasons:
1- the SambaNova API does not support sending a direct URL; it is required to always send the image in b64
2- when a turn in the conversation has more than one element in the content:

    message = {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": {
                    "url": {
                        "uri": "https://raw.githubusercontent.com/meta-llama/llama-stack/main/tests/api/inference/dog.png"
                    },
                },
            },
            {
                "type": "text",
                "text": "Describe what is in this image.",
            },
        ],
    }

instead of multiple messages each with its own content, llama-stack is failing; however, this works directly with LiteLLM:
[screenshot: multi-element content working directly with LiteLLM]

tested manually with separate messages in llama-stack, working:
[screenshot: separate messages working in llama-stack]
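For reference, the manual workaround splits the multi-element content into one message per element, roughly like this (a sketch; the exact shape llama-stack accepts may differ):

    # Sketch of the workaround: send one user message per content element
    # instead of a single message whose content is a list of mixed elements.
    messages = [
        {
            "role": "user",
            "content": {
                "type": "image",
                "image": {
                    "url": {
                        "uri": "https://raw.githubusercontent.com/meta-llama/llama-stack/main/tests/api/inference/dog.png"
                    },
                },
            },
        },
        {
            "role": "user",
            "content": "Describe what is in this image.",
        },
    ]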

in this one, I'm guessing it can be something odd in how the LiteLLM util is getting/sending the messages when there are multiple elements in the same content.

@jhpiedrahitao (Author) commented

Finally, remote::sambanova doesn't support embedding models yet.

@jhpiedrahitao requested a review from ashwinb March 13, 2025 15:24
@jhpiedrahitao (Author) commented Mar 18, 2025

@ashwinb I found the reason for some of the test failures (multimodal with multiple content-type elements inside content).

It looks like there is an issue when using the LiteLLM litellm_openai_mixin util for inference adapters in some cases (when the content of a message has multiple content items: image_urls, texts): each content item ends up wrapped inside its own list. This structure is problematic for LiteLLM, given that their implementation tries to look up the type with a .get("type"), but because the actual content-item dicts are inside lists we get this error:

[screenshot: LiteLLM error on .get("type")]

This happens because in providers.utils.inference.openai_compat::_convert_message_content, if the content is not a string or a list it is put into a list, and in the case of content items those are dicts, so as a result the content becomes a list of lists (each with only one dict: the content item). Do you know the reason for putting the content dicts into a list? Could we simply change the return to also return the element without putting it into a list if its type is dict?
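A minimal sketch reconstructing the described behavior (simplified; not the verbatim llama-stack code):

    # Simplified reconstruction of _convert_message_content as described above.
    def _convert_message_content(content):
        if isinstance(content, (str, list)):
            return content
        # A single content item is a dict, so it gets wrapped here; a message
        # whose content is a list of items therefore becomes a list of
        # single-element lists once each item passes through this function.
        return [content]

    items = [
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
        {"type": "text", "text": "Describe what is in this image."},
    ]
    converted = [_convert_message_content(item) for item in items]
    # converted == [[{...}], [{...}]], so LiteLLM's item.get("type") fails
    # because each element is a list, not a dict.

    # The change suggested above: also return dicts as-is.
    def _convert_message_content_fixed(content):
        if isinstance(content, (str, list, dict)):
            return content
        return [content]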

The issue was solved in PR 1150.

@jhpiedrahitao (Author) commented

cc @snova-luiss

@ashwinb (Contributor) commented Mar 18, 2025

> 2- chat_completion:structured_output: SambaNova does not yet support structured output with json_schema

Does it support structured outputs with other schemas? When do you expect this feature to be available?
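For context, a structured-output request in the OpenAI-compatible json_schema style looks roughly like this (a sketch; the schema and names are illustrative):

    # Illustrative OpenAI-compatible request body using a json_schema
    # response format (the shape the failing test exercises).
    request = {
        "model": "sambanova/Meta-Llama-3.3-70B-Instruct",
        "messages": [{"role": "user", "content": "Name a city and its country."}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "city_info",
                "schema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "country": {"type": "string"},
                    },
                    "required": ["city", "country"],
                },
            },
        },
    }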

@ashwinb (Contributor) commented Mar 18, 2025

> 1- the SambaNova API does not support sending a direct URL; it is required to always send the image in b64

in that case, we should download the image in the sambanova adapter, encode it, and send it as base64 downstream. we do this in a couple of places (I believe maybe in the ollama adapter). we don't want the clients to be aware of this restriction unnecessarily when the Stack can hide it transparently.
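A minimal sketch of that approach, assuming httpx (the helper name is hypothetical, not llama-stack's actual utility):

    import base64

    import httpx

    async def localize_image_url(url: str) -> str:
        # Download the image and re-emit it as a base64 data URL so the
        # downstream SambaNova API never sees a plain http(s) URL.
        async with httpx.AsyncClient() as client:
            response = await client.get(url)
            response.raise_for_status()
        content_type = response.headers.get("content-type", "image/png")
        encoded = base64.b64encode(response.content).decode("utf-8")
        return f"data:{content_type};base64,{encoded}"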

@jhpiedrahitao (Author) commented

Hi @ashwinb, could you please take a look?

@jhpiedrahitao (Author) commented Apr 8, 2025

@ehhuang this PR was already modifying sambanova to use LiteLLM, plus some extra modifications to add support for things like api_key and base-url settings, support for URL images, etc. This was done because the current version was broken for several functionalities like streaming or tool calling. Should I move these changes into the new folder you have created in remote providers (openai-compat), or can this be merged in parallel, given that the template is still pointing to the previous folder?

@ashwinb (Contributor) commented Apr 12, 2025

@jhpiedrahitao Hey sorry about keeping this so stale. Could you resolve conflicts one last time? I will merge it after that.

@jhpiedrahitao (Author) commented Apr 12, 2025

> @jhpiedrahitao Hey sorry about keeping this so stale. Could you resolve conflicts one last time? I will merge it after that.

Hi @ashwinb thanks, no problem, conflicts solved 👍🏻

@jhpiedrahitao (Author) commented

Hi @ashwinb, were you able to take a look at this one?

@jhpiedrahitao (Author) commented

Hi @ashwinb, tagging you again to check if this can be merged.

@ashwinb merged commit b2b00a2 into llamastack:main May 6, 2025
2 checks passed
franciscojavierarceo pushed a commit to franciscojavierarceo/llama-stack that referenced this pull request May 9, 2025
…astack#1596)

@jhpiedrahitao deleted the feat/litellm_sambanova_usage branch May 20, 2025 19:48