
Conversation

@darshankparmar
Contributor

darshankparmar commented Dec 10, 2025

No description provided.

@CLAassistant

CLAassistant commented Dec 10, 2025

CLA assistant check
All committers have signed the CLA.

"command-light-nightly",
]

EmbeddingModels = Literal[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that embedding models can be used in voice calls

@Hormold
Contributor

Hormold commented Dec 11, 2025

Hello, and thank you for your PR!
In most cases, if the API is quite similar to OpenAI's, we add new providers as an additional function in the OpenAI plugin (e.g., the recent PRs for OpenRouter or OVH). Could you update it to be part of the openai plugin?

@darshankparmar
Contributor Author

> Hello, and thank you for your PR! In most cases, if the API is quite similar to OpenAI's, we add new providers as an additional function in the OpenAI plugin (e.g., the recent PRs for OpenRouter or OVH). Could you update it to be part of the openai plugin?

Hi @Hormold, thanks for your message! I will. 😄

darshankparmar marked this pull request as draft December 11, 2025 05:13
darshankparmar marked this pull request as ready for review December 11, 2025 17:33
Hormold self-assigned this Dec 11, 2025
@Hormold
Contributor

Hormold commented Dec 11, 2025

Hey! Tested the Cohere integration and found two issues that need fixing before this can work properly with voice agents.

First, the Cohere API returns `400: message must be at least 1 token long` when there's no user message in the chat context. This happens in voice agents when generate_reply(instructions="...") is called without user input (for example in on_enter() to greet the user). OpenAI handles this fine, but Cohere doesn't; it looks like Cohere requires at least one user message to generate a response.

Second, tool calling breaks with `400: schema 'type' must be a string. Array 'type' is unsupported for this model`. Cohere's OpenAI-compatible API seems to have stricter JSON schema requirements than OpenAI: it doesn't accept the union/array types that LiveKit generates for function tools. We need to figure out what schema format Cohere actually expects and adapt the tool serialization.
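
For reference, a minimal sketch of the schema shape that seems to be involved, assuming the strict tool schemas follow OpenAI's structured-output convention of encoding optional fields as type unions (this is an assumption based on the error text, not confirmed against the plugin code):

```python
# Assumed shape of a strict-mode tool schema for an optional parameter.
# The union "type" is what Cohere's compat API appears to reject with
# "schema 'type' must be a string".
strict_style_schema = {
    "type": "object",
    "properties": {
        "location": {"type": ["string", "null"]},  # union type -> rejected
    },
    "required": ["location"],
    "additionalProperties": False,
}

# A shape Cohere should accept: a single string "type", with optionality
# expressed by leaving the field out of "required" instead.
plain_schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
    },
    "required": [],
}
```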

Hormold marked this pull request as draft December 11, 2025 21:27
@darshankparmar
Contributor Author

Thanks for the feedback!

@darshankparmar
Contributor Author

I took a look at the first issue. One idea: we could add a check in the chat method (Cohere-only) to see if there’s at least one user message in the context. If not, we either auto-inject a small placeholder user message (e.g. “Hello”) so Cohere is happy, or just throw an exception instead. Not sure which direction makes more sense here, but this would at least guarantee we don’t hit that 400 on empty contexts.
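
Roughly what I have in mind (a sketch only; the exact ChatContext API and where this hook lives may differ):

```python
from livekit.agents import llm


def _ensure_user_message(chat_ctx: llm.ChatContext) -> None:
    """Cohere-only guard: the compat API rejects requests with no user message,
    so inject a tiny placeholder (or raise) before sending the request.
    Sketch only -- attribute/method names are assumptions."""
    has_user_message = any(
        isinstance(item, llm.ChatMessage) and item.role == "user"
        for item in chat_ctx.items
    )
    if not has_user_message:
        # Alternative: raise an error here instead of injecting a placeholder.
        chat_ctx.add_message(role="user", content="Hello")
```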

@darshankparmar
Contributor Author

I dug into the second issue as well. Ended up fixing it by setting `_strict_tool_schema=False` for Cohere.
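
In code, the gist is roughly this (simplified sketch; the factory name, base URL, and wiring shown here are illustrative, not the actual PR code):

```python
import os

from livekit.plugins import openai

# Assumed Cohere OpenAI-compatibility endpoint; illustrative only.
COHERE_BASE_URL = "https://api.cohere.ai/compatibility/v1"


def cohere_llm(model: str = "command-r-08-2024") -> openai.LLM:
    llm_instance = openai.LLM(
        model=model,
        api_key=os.environ["COHERE_API_KEY"],
        base_url=COHERE_BASE_URL,
    )
    # Cohere rejects strict-mode tool schemas (union "type" values),
    # so fall back to plain JSON schema serialization for tools.
    llm_instance._strict_tool_schema = False
    return llm_instance
```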

@darshankparmar
Contributor Author

Re the first issue: @Hormold, can you confirm which direction you'd prefer, auto-injecting a small placeholder user message (e.g. "Hello") when the context has no user message, or raising an exception instead?

@chenghao-mou
Member

Adding a placeholder message should be fine. This is what we did with Gemini Realtime:
[screenshot of the Gemini Realtime handling]

darshankparmar marked this pull request as ready for review December 21, 2025 03:11
@darshankparmar
Contributor Author

@Hormold I've pushed the fixes for the Cohere issues; the PR is ready for review.

@Hormold
Contributor

Hormold commented Dec 22, 2025

Hey, I tested, and the PR looks good. One minor thing: there's a merge conflict here. Also, I encountered a couple of timeouts on Cohere responses.

@Hormold
Contributor

Hormold commented Dec 22, 2025

The merge broke the imports. Could you please add ChatMessage back to the imports?

@darshankparmar
Contributor Author

> Hey, I tested, and the PR looks good. One minor thing: there's a merge conflict here. Also, I encountered a couple of timeouts on Cohere responses.

Thanks for testing! Regarding the timeouts: the Cohere API can have high latency (25+ seconds TTFT), which may cause timeout errors in real-time applications. Consider:

  1. For a small, fast model: command-r7b-12-2024
  2. For general-purpose use: command-r-08-2024
  3. For more advanced capabilities: command-a-03-2025
  4. Increasing timeout values for production use (see the sketch below)
  5. Setting an appropriate max_completion_tokens to reduce response time

I also updated the model list to the latest Cohere Command (text generation) models.
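
For items 4 and 5, a rough sketch of what that could look like, assuming the plugin ends up exposing a with_cohere() factory like the other OpenAI-compatible providers and that it forwards timeout and max_completion_tokens (the factory name and parameters here are assumptions, not the confirmed API):

```python
import httpx

from livekit.plugins import openai

# Sketch only: `with_cohere` and its parameters are assumed, not confirmed.
cohere_llm = openai.LLM.with_cohere(
    model="command-r7b-12-2024",   # small, fast model to keep TTFT down
    max_completion_tokens=256,     # cap response length to reduce latency
    timeout=httpx.Timeout(60.0),   # leave headroom for slow Cohere responses
)
```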

@Hormold
Contributor

Hormold commented Dec 23, 2025

Thanks! Works great!

@darshankparmar
Contributor Author

> Thanks! Works great!

Thanks for the feedback! Glad it's working well.
Any final changes needed, or is it ready to merge? 🚀
