|
| 1 | += Cohere Chat |
| 2 | + |
| 3 | +Spring AI supports Cohere's AI language models. You can use them to build multilingual conversational assistants. |
| 4 | + |
| 5 | +== Prerequisites |
| 6 | + |
| 7 | +You will need to create an API key with Cohere to access Cohere language models. |
| 8 | + |
| 9 | +Create an account on the https://dashboard.cohere.com/welcome/register[Cohere registration page] and generate an API key on the https://dashboard.cohere.com/api-keys[API Keys page]. |
| 10 | + |
| 11 | +The Spring AI project defines a configuration property named `spring.ai.cohere.api-key` that you should set to the value of the API Key obtained from dashboard.cohere.com. |
| 12 | + |
| 13 | +You can set this configuration property in your `application.properties` file: |
| 14 | + |
| 15 | +[source,properties] |
| 16 | +---- |
| 17 | +spring.ai.cohere.api-key=<your-cohere-api-key> |
| 18 | +---- |
| 19 | + |
| 20 | +Alternatively, export the key as an environment variable and reference it from the property, for example `+spring.ai.cohere.api-key=${COHERE_API_KEY}+`: |
| 21 | + |
| 22 | +[source,bash] |
| 23 | +---- |
| 24 | +export COHERE_API_KEY=<your-cohere-api-key> |
| 25 | +---- |
| 26 | + |
| 27 | +=== Add Repositories and BOM |
| 28 | + |
| 29 | +Spring AI artifacts are published in Maven Central and Spring Snapshot repositories. |
| 30 | +Refer to the xref:getting-started.adoc#artifact-repositories[Artifact Repositories] section to add these repositories to your build system. |
| 31 | + |
| 32 | +To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build system. |
| 33 | + |
| 34 | +== Auto-configuration |
| 35 | + |
| 36 | +[NOTE] |
| 37 | +==== |
| 38 | +The artifact names of the Spring AI auto-configuration and starter modules have changed significantly. |
| 39 | +Please refer to the https://docs.spring.io/spring-ai/reference/upgrade-notes.html[upgrade notes] for more information. |
| 40 | +==== |
| 41 | + |
| 42 | +Spring AI provides Spring Boot auto-configuration for the Cohere Chat Client. |
| 43 | +To enable it, add the following dependency to your project's Maven `pom.xml` file: |
| 44 | + |
| 45 | +[source, xml] |
| 46 | +---- |
| 47 | +<dependency> |
| 48 | + <groupId>org.springframework.ai</groupId> |
| 49 | + <artifactId>spring-ai-starter-model-cohere</artifactId> |
| 50 | +</dependency> |
| 51 | +---- |
| 52 | + |
| 53 | +or to your Gradle `build.gradle` build file: |
| 54 | + |
| 55 | +[source,groovy] |
| 56 | +---- |
| 57 | +dependencies { |
| 58 | + implementation 'org.springframework.ai:spring-ai-starter-model-cohere' |
| 59 | +} |
| 60 | +---- |
| 61 | + |
| 62 | +TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file. |
| 63 | + |
| 64 | +=== Chat Properties |
| 65 | + |
| 66 | +==== Retry Properties |
| 67 | + |
| 68 | +The prefix `spring.ai.retry` is used as the property prefix that lets you configure the retry mechanism for the Cohere chat model. |
| 69 | + |
| 70 | +[cols="3,5,1", stripes=even] |
| 71 | +|==== |
| 72 | +| Property | Description | Default |
| 73 | + |
| 74 | +| spring.ai.retry.max-attempts | Maximum number of retry attempts. | 10 |
| 75 | +| spring.ai.retry.backoff.initial-interval | Initial sleep duration for the exponential backoff policy. | 2 sec. |
| 76 | +| spring.ai.retry.backoff.multiplier | Backoff interval multiplier. | 5 |
| 77 | +| spring.ai.retry.backoff.max-interval | Maximum backoff duration. | 3 min. |
| 78 | +| spring.ai.retry.on-client-errors | If false, throw a NonTransientAiException, and do not attempt retry for `4xx` client error codes | false |
| 79 | +| spring.ai.retry.exclude-on-http-codes | List of HTTP status codes that should not trigger a retry (e.g. to throw NonTransientAiException). | empty |
| 80 | +| spring.ai.retry.on-http-codes | List of HTTP status codes that should trigger a retry (e.g. to throw TransientAiException). | empty |
| 81 | +|==== |
| 82 | + |
| 83 | +==== Connection Properties |
| 84 | + |
| 85 | +The prefix `spring.ai.cohere` is used as the property prefix that lets you connect to Cohere. |
| 86 | + |
| 87 | +[cols="3,5,1", stripes=even] |
| 88 | +|==== |
| 89 | +| Property | Description | Default |
| 90 | + |
| 91 | +| spring.ai.cohere.base-url | The URL to connect to | https://api.cohere.com |
| 92 | +| spring.ai.cohere.api-key | The API Key | - |
| 93 | +|==== |
| 94 | + |
| 95 | +==== Configuration Properties |
| 96 | + |
| 97 | +[NOTE] |
| 98 | +==== |
| 99 | +Enabling and disabling the chat auto-configuration is now done via top-level properties with the prefix `spring.ai.model.chat`. |
| 100 | +
|
| 101 | +To enable, set `spring.ai.model.chat=cohere` (enabled by default). |
| 102 | +
|
| 103 | +To disable, set `spring.ai.model.chat=none` (or any value that doesn't match `cohere`). |
| 104 | +
|
| 105 | +This change allows multiple models to be configured. |
| 106 | +==== |
| 107 | + |
| 108 | +The prefix `spring.ai.cohere.chat` is the property prefix that lets you configure the chat model implementation for Cohere. |
| 109 | + |
| 110 | +[cols="3,5,1", stripes=even] |
| 111 | +|==== |
| 112 | +| Property | Description | Default |
| 113 | + |
| 114 | +| spring.ai.cohere.chat.enabled (Removed and no longer valid) | Enable Cohere chat model. | true |
| 115 | +| spring.ai.model.chat | Enable Cohere chat model. | cohere |
| 116 | +| spring.ai.cohere.chat.base-url | Optional override for the `spring.ai.cohere.base-url` property to provide chat-specific URL. | - |
| 117 | +| spring.ai.cohere.chat.api-key | Optional override for the `spring.ai.cohere.api-key` to provide chat-specific API Key. | - |
| 118 | +| spring.ai.cohere.chat.options.model | This is the Cohere Chat model to use | `command-r7b-12-2024` (see available models below) |
| 119 | +| spring.ai.cohere.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify `temperature` and `p` for the same completions request as the interaction of these two settings is difficult to predict. | 0.3 |
| 120 | +| spring.ai.cohere.chat.options.max-tokens | The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | - |
| 121 | +| spring.ai.cohere.chat.options.p | Ensures that only the most likely tokens, with total probability mass of `p`, are considered for generation at each step. If both `k` and `p` are enabled, `p` acts after `k`. Min value of 0.01, max value of 0.99. | 1.0 |
| 122 | +| spring.ai.cohere.chat.options.k | Ensures that only the top `k` most likely tokens are considered for generation at each step. When `k` is set to 0, k-sampling is disabled. Min value of 0, max value of 500. | 0 |
| 123 | +| spring.ai.cohere.chat.options.frequency-penalty | Used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation. Min value of 0.0, max value of 1.0. | 0.0 |
| 124 | +| spring.ai.cohere.chat.options.presence-penalty | Used to reduce repetitiveness of generated tokens. Similar to frequency_penalty, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies. Min value of 0.0, max value of 1.0. | 0.0 |
| 125 | +| spring.ai.cohere.chat.options.seed | If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. | - |
| 126 | +| spring.ai.cohere.chat.options.stop-sequences | A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any of the strings in the list, it will stop generating tokens. | - |
| 127 | +| spring.ai.cohere.chat.options.response-format | An object specifying the format that the model must output. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON.| - |
| 128 | +| spring.ai.cohere.chat.options.safety-mode | Used to select the safety instruction inserted into the prompt. Can be OFF, CONTEXTUAL, or STRICT. When OFF is specified, the safety instruction will be omitted. | CONTEXTUAL |
| 129 | +| spring.ai.cohere.chat.options.logprobs | When set to true, the log probabilities of the generated tokens will be included in the response. | false |
| 130 | +| spring.ai.cohere.chat.options.strict-tools | When enabled, tool calls are validated against the tool JSON schemas. | - |
| 131 | +| spring.ai.cohere.chat.options.tools | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. | - |
| 132 | +| spring.ai.cohere.chat.options.tool-choice | Controls which (if any) function is called by the model. `none` means the model will not call a function and instead generates a message. `required` means the model must call one or more functions. Specifying a particular function via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that function. When unset, the model decides whether to call a function. | - |
| 133 | +| spring.ai.cohere.chat.options.tool-names | List of tools, identified by their names, to enable for function calling in a single prompt request. Tools with those names must exist in the ToolCallback registry. | - |
| 134 | +| spring.ai.cohere.chat.options.tool-callbacks | Tool Callbacks to register with the ChatModel. | - |
| 135 | +| spring.ai.cohere.chat.options.internal-tool-execution-enabled | If false, Spring AI will not handle the tool calls internally, but will proxy them to the client. It is then the client's responsibility to handle the tool calls, dispatch them to the appropriate function, and return the results. If true (the default), Spring AI will handle the function calls internally. Applicable only to chat models with function calling support. | true |
| 136 | +|==== |
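The interaction between `k` and `p` described in the table ("`p` acts after `k`") can be pictured with a small sketch. This is a conceptual illustration, not Cohere's sampling implementation: the class `TopKTopPFilter` is hypothetical, and renormalizing the probabilities that survive the `k` cut before applying `p` is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "k first, then p"; not Cohere's actual sampler.
final class TopKTopPFilter {

    // Returns the tokens that remain eligible for sampling, most likely first.
    static List<String> candidates(Map<String, Double> probs, int k, double p) {
        // Sort tokens by descending probability.
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(probs.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        // Step 1: keep only the k most likely tokens; k == 0 disables this step.
        if (k > 0 && k < sorted.size()) {
            sorted = sorted.subList(0, k);
        }

        // Step 2: keep the smallest prefix whose cumulative mass reaches p.
        // Assumption: the mass is renormalized over the tokens that survived step 1.
        double total = sorted.stream().mapToDouble(Map.Entry::getValue).sum();
        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> entry : sorted) {
            kept.add(entry.getKey());
            cumulative += entry.getValue() / total;
            if (cumulative >= p) {
                break;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> probs = Map.of("a", 0.5, "b", 0.3, "c", 0.15, "d", 0.05);
        System.out.println(candidates(probs, 3, 0.8));
        System.out.println(candidates(probs, 0, 0.75));
    }
}
```

Because both settings narrow the candidate set, and `temperature` reshapes the distribution they operate on, changing one of them at a time makes the effect on the output easier to reason about.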
| 137 | + |
| 138 | +NOTE: You can override the common `spring.ai.cohere.base-url` and `spring.ai.cohere.api-key` for the `ChatModel` and `EmbeddingModel` implementations. |
| 139 | +The `spring.ai.cohere.chat.base-url` and `spring.ai.cohere.chat.api-key` properties, if set, take precedence over the common properties. |
| 140 | +This is useful if you want to use different Cohere accounts for different models and different model endpoints. |
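The precedence rule in the note above amounts to a simple fallback. The sketch below only illustrates that rule with a plain map of properties; it is not the auto-configuration's actual code, and the class `CoherePropertyResolution` and its `effective` method are hypothetical names.

```java
import java.util.Map;

// Hypothetical illustration of the override rule; not Spring AI's implementation.
final class CoherePropertyResolution {

    // Returns the chat-specific value when it is set and non-blank, else the common value.
    static String effective(Map<String, String> props, String name) {
        String specific = props.get("spring.ai.cohere.chat." + name);
        if (specific != null && !specific.isBlank()) {
            return specific;
        }
        return props.get("spring.ai.cohere." + name);
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(
                "spring.ai.cohere.base-url", "https://api.cohere.com",
                "spring.ai.cohere.api-key", "common-key",
                "spring.ai.cohere.chat.api-key", "chat-key");
        // No chat-specific base URL is set, so the common one is used.
        System.out.println(effective(props, "base-url"));
        // The chat-specific API key takes precedence over the common key.
        System.out.println(effective(props, "api-key"));
    }
}
```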
| 141 | + |
| 142 | +TIP: All properties prefixed with `spring.ai.cohere.chat.options` can be overridden at runtime by adding request-specific <<chat-options>> to the `Prompt` call. |
| 143 | + |
| 144 | +== Available Models |
| 145 | + |
| 146 | +Cohere provides several chat models, each optimized for different use cases: |
| 147 | + |
| 148 | +[cols="2,1,4", stripes=even] |
| 149 | +|==== |
| 150 | +| Model | Context Length | Description |
| 151 | + |
| 152 | +| `command-a-03-2025` |
| 153 | +| 128K tokens |
| 154 | +| Latest flagship model with enhanced reasoning capabilities. Best overall performance for complex tasks. |
| 155 | + |
| 156 | +| `command-a-reasoning-08-2025` |
| 157 | +| 128K tokens |
| 158 | +| Specialized model optimized for reasoning tasks, mathematical problem-solving, and logical deduction. |
| 159 | + |
| 160 | +| `command-a-translate-08-2025` |
| 161 | +| 128K tokens |
| 162 | +| Optimized for translation tasks across multiple languages. Provides high-quality translations. |
| 163 | + |
| 164 | +| `command-a-vision-07-2025` |
| 165 | +| 128K tokens |
| 166 | +| Multimodal model with vision capabilities. Can process and understand images along with text. |
| 167 | + |
| 168 | +| `command-r7b-12-2024` |
| 169 | +| 128K tokens |
| 170 | +| Lightweight 7 billion parameter model. Faster and more cost-effective while maintaining good quality. Default model. |
| 171 | + |
| 172 | +| `command-r-plus-08-2024` |
| 173 | +| 128K tokens |
| 174 | +| Enhanced version of Command R with improved performance and multilingual capabilities. |
| 175 | + |
| 176 | +| `command-r-08-2024` |
| 177 | +| 128K tokens |
| 178 | +| General-purpose model with strong multilingual support and retrieval-augmented generation capabilities. |
| 179 | +|==== |
| 180 | + |
| 181 | + |
| 182 | +== Runtime Options [[chat-options]] |
| 183 | + |
| 184 | +The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/chat/CohereChatOptions.java[CohereChatOptions.java] class provides model configuration options, such as the model to use, the temperature, and the frequency penalty. |
| 185 | + |
| 186 | +On start-up, the default options can be configured with the `CohereChatModel(api, options)` constructor or the `spring.ai.cohere.chat.options.*` properties. |
| 187 | + |
| 188 | +At run-time, you can override the default options by adding new, request-specific options to the `Prompt` call. |
| 189 | +For example, to override the default model and temperature for a specific request: |
| 190 | + |
| 191 | +[source,java] |
| 192 | +---- |
| 193 | +ChatResponse response = chatModel.call( |
| 194 | + new Prompt( |
| 195 | + "Generate the names of 5 famous pirates.", |
| 196 | + CohereChatOptions.builder() |
| 197 | + .model(CohereApi.ChatModel.COMMAND_A.getName()) |
| 198 | + .temperature(0.5) |
| 199 | + .build() |
| 200 | + )); |
| 201 | +---- |
| 202 | + |
| 203 | +TIP: In addition to the model specific link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/chat/CohereChatOptions.java[CohereChatOptions] you can use a portable link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-model/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-model/src/main/java/org/springframework/ai/chat/prompt/DefaultChatOptionsBuilder.java[ChatOptions#builder()]. |
| 204 | + |
| 205 | +== Function Calling |
| 206 | + |
| 207 | +You can register custom Java functions with the `CohereChatModel` and have the Cohere model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. |
| 208 | +This is a powerful technique to connect the LLM capabilities with external tools and APIs. |
| 209 | +Read more about xref:api/tools.adoc[Tool Calling]. |
| 210 | + |
| 211 | +== Multimodal |
| 212 | + |
| 213 | +Multimodality refers to a model's ability to simultaneously understand and process information from various sources, including text, images, audio, and other data formats. |
| 214 | +Cohere supports text and vision modalities. |
| 215 | + |
| 216 | +=== Vision |
| 217 | + |
| 218 | +Cohere models that offer vision multimodal support include `command-a-vision-07-2025`. |
| 219 | +Refer to the link:https://docs.cohere.com/docs/vision[Vision] guide for more information. |
| 220 | + |
| 221 | +The Cohere link:https://docs.cohere.com/reference/chat[Chat API] can incorporate a list of base64-encoded images or image URLs with the message. |
| 222 | +Spring AI's link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-client-chat/src/main/java/org/springframework/ai/chat/messages/Message.java[Message] interface facilitates multimodal AI models by introducing the link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-commons/src/main/java/org/springframework/ai/content/Media.java[Media] type. |
| 223 | +This type encompasses data and details regarding media attachments in messages, utilizing Spring's `org.springframework.util.MimeType` and a `org.springframework.core.io.Resource` for the raw media data. |
| 224 | + |
| 225 | +Below is a code example illustrating the fusion of user text with an image: |
| 226 | + |
| 227 | +[source,java] |
| 228 | +---- |
| 229 | +var imageResource = new ClassPathResource("/multimodal.test.png"); |
| 230 | +
|
| 231 | +var userMessage = new UserMessage("Explain what you see in this picture.", |
| 232 | + new Media(MimeTypeUtils.IMAGE_PNG, imageResource)); |
| 233 | +
|
| 234 | +ChatResponse response = chatModel.call(new Prompt(userMessage, |
| 235 | + ChatOptions.builder().model(CohereApi.ChatModel.COMMAND_A_VISION.getName()).build())); |
| 236 | +---- |
| 237 | + |
| 238 | +TIP: You can pass multiple images as well. |
| 239 | + |
| 240 | +== Sample Controller (Auto-configuration) |
| 241 | + |
| 242 | +https://start.spring.io/[Create] a new Spring Boot project and add the `spring-ai-starter-model-cohere` to your Maven (or Gradle) dependencies. |
| 243 | + |
| 244 | +Add an `application.properties` file under the `src/main/resources` directory to enable and configure the Cohere chat model: |
| 245 | + |
| 246 | +[source,properties] |
| 247 | +---- |
| 248 | +spring.ai.cohere.api-key=YOUR_API_KEY |
| 249 | +spring.ai.cohere.chat.options.model=command-r7b-12-2024 |
| 250 | +spring.ai.cohere.chat.options.temperature=0.7 |
| 251 | +---- |
| 252 | + |
| 253 | +TIP: Replace the `api-key` with your Cohere credentials. |
| 254 | + |
| 255 | +This will create a `CohereChatModel` implementation that you can inject into your classes. |
| 256 | +Here is an example of a simple `@RestController` class that uses the chat model for text generation: |
| 257 | + |
| 258 | +[source,java] |
| 259 | +---- |
| 260 | +@RestController |
| 261 | +public class ChatController { |
| 262 | +
|
| 263 | + private final CohereChatModel chatModel; |
| 264 | +
|
| 265 | + @Autowired |
| 266 | + public ChatController(CohereChatModel chatModel) { |
| 267 | + this.chatModel = chatModel; |
| 268 | + } |
| 269 | +
|
| 270 | + @GetMapping("/ai/generate") |
| 271 | + public Map<String,String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { |
| 272 | + return Map.of("generation", this.chatModel.call(message)); |
| 273 | + } |
| 274 | +
|
| 275 | + @GetMapping("/ai/generateStream") |
| 276 | + public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { |
| 277 | + var prompt = new Prompt(new UserMessage(message)); |
| 278 | + return this.chatModel.stream(prompt); |
| 279 | + } |
| 280 | +} |
| 281 | +---- |
| 282 | + |
| 283 | +== Manual Configuration |
| 284 | + |
| 285 | +The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/chat/CohereChatModel.java[CohereChatModel] implements the `ChatModel` and `StreamingChatModel` interfaces and uses the <<low-level-api>> to connect to the Cohere service. |
| 286 | + |
| 287 | +Add the `spring-ai-cohere` dependency to your project's Maven `pom.xml` file: |
| 288 | + |
| 289 | +[source, xml] |
| 290 | +---- |
| 291 | +<dependency> |
| 292 | + <groupId>org.springframework.ai</groupId> |
| 293 | + <artifactId>spring-ai-cohere</artifactId> |
| 294 | +</dependency> |
| 295 | +---- |
| 296 | + |
| 297 | +or to your Gradle `build.gradle` build file: |
| 298 | + |
| 299 | +[source,groovy] |
| 300 | +---- |
| 301 | +dependencies { |
| 302 | + implementation 'org.springframework.ai:spring-ai-cohere' |
| 303 | +} |
| 304 | +---- |
| 305 | + |
| 306 | +TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file. |
| 307 | + |
| 308 | +Next, create a `CohereChatModel` and use it for text generation: |
| 309 | + |
| 310 | +[source,java] |
| 311 | +---- |
| 312 | +var cohereApi = new CohereApi(System.getenv("COHERE_API_KEY")); |
| 313 | +var chatModel = new CohereChatModel(cohereApi, CohereChatOptions.builder() |
| 314 | + .model(CohereApi.ChatModel.COMMAND_A.getName()) |
| 315 | + .temperature(0.4) |
| 316 | + .build()); |
| 317 | +
|
| 318 | +ChatResponse response = chatModel.call(new Prompt("Generate the names of 5 famous pirates.")); |
| 319 | +---- |
| 320 | + |
| 321 | +=== Low-level CohereApi Client [[low-level-api]] |
| 322 | + |
| 323 | +The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/api/CohereApi.java[CohereApi] provides a lightweight Java client for the link:https://docs.cohere.com/reference/chat[Cohere API]. |
| 324 | + |
| 325 | +Here is a simple snippet showing how to use the API programmatically: |
| 326 | + |
| 327 | +[source,java] |
| 328 | +---- |
| 329 | +CohereApi cohereApi = new CohereApi(System.getenv("COHERE_API_KEY")); |
| 330 | +ChatCompletionMessage message = new ChatCompletionMessage("Hello world", Role.USER); |
| 331 | +ResponseEntity<ChatCompletion> response = cohereApi.chatCompletionEntity( |
| 332 | + new ChatCompletionRequest(List.of(message), CohereApi.ChatModel.COMMAND_A.getName(), 0.8, false)); |
| 333 | +---- |
| 334 | + |
| 335 | +==== CohereApi Samples |
| 336 | + |
| 337 | +* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/test/java/org/springframework/ai/cohere/api/CohereApiIT.java[CohereApiIT.java] tests provide some general examples of how to use the lightweight library. |
| 338 | + |
| 339 | +* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/test/java/org/springframework/ai/cohere/chat/CohereChatModelIT.java[CohereChatModelIT.java] tests show examples of using function calling and streaming. |