Commit 8ed38a2

added cohere support :: documentation
Signed-off-by: Ricken Bazolo <ricken.bazolo@gmail.com>
1 parent 9d9e2ee commit 8ed38a2

3 files changed

+651
-0
lines changed

spring-ai-docs/src/main/antora/modules/ROOT/nav.adoc

Lines changed: 2 additions & 0 deletions
@@ -17,6 +17,7 @@
1717
**** xref:api/chat/bedrock-converse.adoc[Amazon Bedrock Converse]
1818
**** xref:api/chat/anthropic-chat.adoc[Anthropic]
1919
**** xref:api/chat/azure-openai-chat.adoc[Azure OpenAI]
20+
**** xref:api/chat/cohere-chat.adoc[Cohere]
2021
**** xref:api/chat/deepseek-chat.adoc[DeepSeek]
2122
**** xref:api/chat/dmr-chat.adoc[Docker Model Runner]
2223
**** Google
@@ -56,6 +57,7 @@
5657
***** xref:api/embeddings/vertexai-embeddings-text.adoc[Text Embedding]
5758
***** xref:api/embeddings/vertexai-embeddings-multimodal.adoc[Multimodal Embedding]
5859
**** xref:api/embeddings/zhipuai-embeddings.adoc[ZhiPu AI]
60+
**** xref:api/embeddings/cohere-embeddings.adoc[Cohere]
5961

6062
*** xref:api/imageclient.adoc[Image Models]
6163
**** xref:api/image/azure-openai-image.adoc[Azure OpenAI]
Lines changed: 339 additions & 0 deletions
@@ -0,0 +1,339 @@
1+
= Cohere Chat
2+
3+
Spring AI supports the various AI language models from Cohere. You can interact with Cohere language models and create multilingual conversational assistants based on Cohere's powerful models.
4+
5+
== Prerequisites
6+
7+
You will need to create an API key with Cohere to access Cohere language models.
8+
9+
Create an account at https://dashboard.cohere.com/welcome/register[Cohere registration page] and generate the token on the https://dashboard.cohere.com/api-keys[API Keys page].
10+
11+
The Spring AI project defines a configuration property named `spring.ai.cohere.api-key` that you should set to the value of the API Key obtained from dashboard.cohere.com.
12+
13+
You can set this configuration property in your `application.properties` file:
14+
15+
[source,properties]
16+
----
17+
spring.ai.cohere.api-key=<your-cohere-api-key>
18+
----
19+
20+
Alternatively, export the key as an environment variable and reference it from `application.properties` with `spring.ai.cohere.api-key=${COHERE_API_KEY}`:
21+
22+
[source,bash]
23+
----
24+
export COHERE_API_KEY=<your-cohere-api-key>
25+
----
26+
27+
=== Add Repositories and BOM
28+
29+
Spring AI artifacts are published in Maven Central and Spring Snapshot repositories.
30+
Refer to the xref:getting-started.adoc#artifact-repositories[Artifact Repositories] section to add these repositories to your build system.
31+
32+
To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build system.
33+
34+
== Auto-configuration
35+
36+
[NOTE]
37+
====
38+
The artifact names of the Spring AI auto-configuration and starter modules have changed significantly.
39+
Please refer to the https://docs.spring.io/spring-ai/reference/upgrade-notes.html[upgrade notes] for more information.
40+
====
41+
42+
Spring AI provides Spring Boot auto-configuration for the Cohere Chat Client.
43+
To enable it, add the following dependency to your project's Maven `pom.xml` file:
44+
45+
[source, xml]
46+
----
47+
<dependency>
48+
<groupId>org.springframework.ai</groupId>
49+
<artifactId>spring-ai-starter-model-cohere</artifactId>
50+
</dependency>
51+
----
52+
53+
or to your Gradle `build.gradle` file:
54+
55+
[source,groovy]
56+
----
57+
dependencies {
58+
implementation 'org.springframework.ai:spring-ai-starter-model-cohere'
59+
}
60+
----
61+
62+
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
63+
64+
=== Chat Properties
65+
66+
==== Retry Properties
67+
68+
Properties with the `spring.ai.retry` prefix configure the retry mechanism for the Cohere chat model.
69+
70+
[cols="3,5,1", stripes=even]
71+
|====
72+
| Property | Description | Default
73+
74+
| spring.ai.retry.max-attempts | Maximum number of retry attempts. | 10
75+
| spring.ai.retry.backoff.initial-interval | Initial sleep duration for the exponential backoff policy. | 2 sec.
76+
| spring.ai.retry.backoff.multiplier | Backoff interval multiplier. | 5
77+
| spring.ai.retry.backoff.max-interval | Maximum backoff duration. | 3 min.
78+
| spring.ai.retry.on-client-errors | If false, throw a NonTransientAiException, and do not attempt retry for `4xx` client error codes | false
79+
| spring.ai.retry.exclude-on-http-codes | List of HTTP status codes that should not trigger a retry (e.g. to throw NonTransientAiException). | empty
80+
| spring.ai.retry.on-http-codes | List of HTTP status codes that should trigger a retry (e.g. to throw TransientAiException). | empty
81+
|====
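
To make the default backoff values above concrete, the following sketch computes the wait before each retry for a capped exponential backoff. It is an illustration only: `BackoffSchedule` is a hypothetical helper, not part of Spring AI, and it assumes that `max-attempts` counts the initial call, so 10 attempts are separated by at most 9 waits.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring AI: lists the waits between attempts
// for an exponential backoff whose interval is capped at a maximum.
class BackoffSchedule {

    static List<Duration> delays(Duration initial, double multiplier, Duration max, int maxAttempts) {
        List<Duration> result = new ArrayList<>();
        long capMillis = max.toMillis();
        long current = Math.min(initial.toMillis(), capMillis);
        // Assumption: N attempts are separated by N - 1 waits.
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            result.add(Duration.ofMillis(current));
            // Grow the interval, but never beyond the configured maximum.
            current = (long) Math.min(current * multiplier, (double) capMillis);
        }
        return result;
    }

    public static void main(String[] args) {
        // Defaults from the table: 10 attempts, 2 sec initial interval, multiplier 5, 3 min cap.
        for (Duration d : delays(Duration.ofSeconds(2), 5.0, Duration.ofMinutes(3), 10)) {
            System.out.println(d.toSeconds() + "s");
        }
    }
}
```

With the defaults, the interval reaches the 3 minute cap on the fourth wait, so lowering `spring.ai.retry.max-attempts` is usually the more effective way to bound the total retry time.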
82+
83+
==== Connection Properties
84+
85+
Properties with the `spring.ai.cohere` prefix configure the connection to Cohere.
86+
87+
[cols="3,5,1", stripes=even]
88+
|====
89+
| Property | Description | Default
90+
91+
| spring.ai.cohere.base-url | The URL to connect to | https://api.cohere.com
92+
| spring.ai.cohere.api-key | The API Key | -
93+
|====
94+
95+
==== Configuration Properties
96+
97+
[NOTE]
98+
====
99+
Enabling and disabling of the chat auto-configurations are now configured via top level properties with the prefix `spring.ai.model.chat`.
100+
101+
To enable, set `spring.ai.model.chat=cohere` (enabled by default).
102+
103+
To disable, set `spring.ai.model.chat=none` (or any value that doesn't match `cohere`).
104+
105+
This change allows multiple models to be configured.
106+
====
107+
108+
The prefix `spring.ai.cohere.chat` is the property prefix that lets you configure the chat model implementation for Cohere.
109+
110+
[cols="3,5,1", stripes=even]
111+
|====
112+
| Property | Description | Default
113+
114+
| spring.ai.cohere.chat.enabled (Removed and no longer valid) | Enable Cohere chat model. | true
115+
| spring.ai.model.chat | Enable Cohere chat model. | cohere
116+
| spring.ai.cohere.chat.base-url | Optional override for the `spring.ai.cohere.base-url` property to provide chat-specific URL. | -
117+
| spring.ai.cohere.chat.api-key | Optional override for the `spring.ai.cohere.api-key` to provide chat-specific API Key. | -
118+
| spring.ai.cohere.chat.options.model | This is the Cohere Chat model to use | `command-r7b-12-2024` (see available models below)
119+
| spring.ai.cohere.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify `temperature` and `p` for the same completions request as the interaction of these two settings is difficult to predict. | 0.3
120+
| spring.ai.cohere.chat.options.max-tokens | The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | -
121+
| spring.ai.cohere.chat.options.p | Ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. If both k and p are enabled, p acts after k. min value of 0.01, max value of 0.99. | 1.0
122+
| spring.ai.cohere.chat.options.k | Ensures that only the top k most likely tokens are considered for generation at each step. When k is set to 0, k-sampling is disabled. min value of 0, max value of 500. | 0
123+
| spring.ai.cohere.chat.options.frequency-penalty | Used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation. Min value of 0.0, max value of 1.0. | 0.0
124+
| spring.ai.cohere.chat.options.presence-penalty | Used to reduce repetitiveness of generated tokens. Similar to frequency_penalty, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies. Min value of 0.0, max value of 1.0. | 0.0
125+
| spring.ai.cohere.chat.options.seed | If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. | -
126+
| spring.ai.cohere.chat.options.stop-sequences | A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any of the strings in the list, it will stop generating tokens. | -
127+
| spring.ai.cohere.chat.options.response-format | An object specifying the format that the model must output. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. | -
128+
| spring.ai.cohere.chat.options.safety-mode | Used to select the safety instruction inserted into the prompt. Can be OFF, CONTEXTUAL, or STRICT. When OFF is specified, the safety instruction will be omitted. | CONTEXTUAL
129+
| spring.ai.cohere.chat.options.logprobs | When set to true, the log probabilities of the generated tokens will be included in the response. | false
130+
| spring.ai.cohere.chat.options.strict-tools | When enabled, tool calls are validated against the tool JSON schemas. | -
131+
| spring.ai.cohere.chat.options.tools | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. | -
132+
| spring.ai.cohere.chat.options.tool-choice | Controls whether the model is forced to call a tool. `NONE` means the model will not call a tool and instead generates a message. `REQUIRED` means the model must call at least one of the provided tools, so `tools` must be set. If not specified, the model decides whether to generate a message or call a tool. | -
133+
| spring.ai.cohere.chat.options.tool-names | List of tools, identified by their names, to enable for function calling in a single prompt request. Tools with those names must exist in the ToolCallback registry. | -
134+
| spring.ai.cohere.chat.options.tool-callbacks | Tool Callbacks to register with the ChatModel. | -
135+
| spring.ai.cohere.chat.options.internal-tool-execution-enabled | If false, Spring AI will not handle the tool calls internally, but will proxy them to the client. It is then the client's responsibility to handle the tool calls, dispatch them to the appropriate function, and return the results. If true (the default), Spring AI will handle the function calls internally. Applicable only for chat models with function calling support. | true
136+
|====
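
The `k` and `p` options above can be pictured with a small sketch that applies the top-k cut first and the top-p cut second, as the table describes ("p acts after k"). This is a conceptual illustration only, not Cohere's implementation: `SamplingFilter` and `Candidate` are hypothetical names, and the probabilities are used as given without renormalizing after the k cut.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of combining top-k and top-p filtering.
class SamplingFilter {

    record Candidate(String token, double probability) {}

    static List<Candidate> filter(List<Candidate> candidates, int k, double p) {
        // Sort a copy so the most likely tokens come first.
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(Candidate::probability).reversed());

        // Top-k: keep only the k most likely tokens; k == 0 disables this step.
        if (k > 0 && k < sorted.size()) {
            sorted = new ArrayList<>(sorted.subList(0, k));
        }

        // Top-p: keep the shortest prefix whose cumulative probability reaches p.
        List<Candidate> kept = new ArrayList<>();
        double mass = 0.0;
        for (Candidate c : sorted) {
            kept.add(c);
            mass += c.probability();
            if (mass >= p) {
                break;
            }
        }
        return kept;
    }
}
```

For candidates with probabilities 0.5, 0.3, 0.15, and 0.05, `k=3` and `p=0.75` leave only the two most likely tokens: the k cut drops the last token, and the cumulative mass reaches 0.75 after the second.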
137+
138+
NOTE: You can override the common `spring.ai.cohere.base-url` and `spring.ai.cohere.api-key` for the `ChatModel` and `EmbeddingModel` implementations.
139+
The `spring.ai.cohere.chat.base-url` and `spring.ai.cohere.chat.api-key` properties, if set, take precedence over the common properties.
140+
This is useful if you want to use different Cohere accounts for different models and different model endpoints.
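
As a sketch of the override described above, the following `application.properties` fragment sets a common API key and a chat-specific one. The environment variable name `COHERE_CHAT_API_KEY` is a hypothetical placeholder chosen for this example.

```properties
# Common key, used by every Cohere model unless overridden
spring.ai.cohere.api-key=${COHERE_API_KEY}
# Chat-specific key; takes precedence over the common key for the chat model
spring.ai.cohere.chat.api-key=${COHERE_CHAT_API_KEY}
```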
141+
142+
TIP: All properties prefixed with `spring.ai.cohere.chat.options` can be overridden at runtime by adding request-specific <<chat-options>> to the `Prompt` call.
143+
144+
== Available Models
145+
146+
Cohere provides several chat models, each optimized for different use cases:
147+
148+
[cols="2,1,4", stripes=even]
149+
|====
150+
| Model | Context Length | Description
151+
152+
| `command-a-03-2025`
153+
| 128K tokens
154+
| Latest flagship model with enhanced reasoning capabilities. Best overall performance for complex tasks.
155+
156+
| `command-a-reasoning-08-2025`
157+
| 128K tokens
158+
| Specialized model optimized for reasoning tasks, mathematical problem-solving, and logical deduction.
159+
160+
| `command-a-translate-08-2025`
161+
| 128K tokens
162+
| Optimized for translation tasks across multiple languages. Provides high-quality translations.
163+
164+
| `command-a-vision-07-2025`
165+
| 128K tokens
166+
| Multimodal model with vision capabilities. Can process and understand images along with text.
167+
168+
| `command-r7b-12-2024`
169+
| 128K tokens
170+
| Lightweight 7 billion parameter model. Faster and more cost-effective while maintaining good quality. Default model.
171+
172+
| `command-r-plus-08-2024`
173+
| 128K tokens
174+
| Enhanced version of Command R with improved performance and multilingual capabilities.
175+
176+
| `command-r-08-2024`
177+
| 128K tokens
178+
| General-purpose model with strong multilingual support and retrieval-augmented generation capabilities.
179+
|====
180+
181+
182+
== Runtime Options [[chat-options]]
183+
184+
The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/chat/CohereChatOptions.java[CohereChatOptions.java] provides model configurations, such as the model to use, the temperature, the frequency penalty, etc.
185+
186+
On start-up, the default options can be configured with the `CohereChatModel(api, options)` constructor or the `spring.ai.cohere.chat.options.*` properties.
187+
188+
At run-time, you can override the default options by adding new, request-specific options to the `Prompt` call.
189+
For example, to override the default model and temperature for a specific request:
190+
191+
[source,java]
192+
----
193+
ChatResponse response = chatModel.call(
194+
new Prompt(
195+
"Generate the names of 5 famous pirates.",
196+
CohereChatOptions.builder()
197+
.model(CohereApi.ChatModel.COMMAND_A.getName())
198+
.temperature(0.5)
199+
.build()
200+
));
201+
----
202+
203+
TIP: In addition to the model specific link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/chat/CohereChatOptions.java[CohereChatOptions] you can use a portable link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-model/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-model/src/main/java/org/springframework/ai/chat/prompt/DefaultChatOptionsBuilder.java[ChatOptions#builder()].
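
One way to picture the override behavior is as a field-by-field merge in which a value set on the request wins and an unset value falls back to the start-up default. The sketch below is a simplified illustration of that idea, not Spring AI's merge code: `OptionMerge` and `Options` are hypothetical, and `null` stands for "not set".

```java
// Hypothetical illustration of request options taking precedence over defaults.
class OptionMerge {

    // Simplified stand-in for an options object; a null field means "not set".
    record Options(String model, Double temperature, Integer maxTokens) {}

    static Options merge(Options defaults, Options request) {
        if (request == null) {
            return defaults;
        }
        return new Options(
                request.model() != null ? request.model() : defaults.model(),
                request.temperature() != null ? request.temperature() : defaults.temperature(),
                request.maxTokens() != null ? request.maxTokens() : defaults.maxTokens());
    }

    public static void main(String[] args) {
        Options defaults = new Options("command-r7b-12-2024", 0.3, null);
        Options request = new Options(null, 0.5, 200);
        // The model comes from the defaults; temperature and maxTokens come from the request.
        System.out.println(merge(defaults, request));
    }
}
```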
204+
205+
== Function Calling
206+
207+
You can register custom Java functions with the `CohereChatModel` and have the Cohere model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions.
208+
This is a powerful technique to connect the LLM capabilities with external tools and APIs.
209+
Read more about xref:api/tools.adoc[Tool Calling].
210+
211+
== Multimodal
212+
213+
Multimodality refers to a model's ability to simultaneously understand and process information from various sources, including text, images, audio, and other data formats.
214+
Cohere supports text and vision modalities.
215+
216+
=== Vision
217+
218+
Cohere models that offer vision multimodal support include `command-a-vision-07-2025`.
219+
Refer to the link:https://docs.cohere.com/docs/vision[Vision] guide for more information.
220+
221+
The Cohere link:https://docs.cohere.com/reference/chat[Chat API] can incorporate a list of base64-encoded images or image URLs with the message.
222+
Spring AI's link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-client-chat/src/main/java/org/springframework/ai/chat/messages/Message.java[Message] interface facilitates multimodal AI models by introducing the link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-commons/src/main/java/org/springframework/ai/content/Media.java[Media] type.
223+
This type encompasses data and details regarding media attachments in messages, utilizing Spring's `org.springframework.util.MimeType` and a `org.springframework.core.io.Resource` for the raw media data.
224+
225+
The following example combines user text with an image:
226+
227+
[source,java]
228+
----
229+
var imageResource = new ClassPathResource("/multimodal.test.png");
230+
231+
var userMessage = new UserMessage("Explain what you see in this picture.",
232+
new Media(MimeTypeUtils.IMAGE_PNG, imageResource));
233+
234+
ChatResponse response = chatModel.call(new Prompt(userMessage,
235+
ChatOptions.builder().model(CohereApi.ChatModel.COMMAND_A_VISION.getName()).build()));
236+
----
237+
238+
TIP: You can pass multiple images as well.
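
For readers unfamiliar with the base64 form mentioned in this section, the sketch below shows how raw image bytes can be turned into a `data:` URL using only the JDK. It is an illustration under assumptions: `ImageDataUrl` is a hypothetical helper, and whether Spring AI's Cohere module builds exactly this string internally is not confirmed by this document.

```java
import java.util.Base64;

// Hypothetical helper: encodes raw bytes as a base64 data URL (RFC 2397 style).
class ImageDataUrl {

    static String toDataUrl(String mimeType, byte[] data) {
        // Standard (non-URL-safe) base64 with padding, as commonly used in data URLs.
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // Three sample bytes stand in for real image content.
        System.out.println(toDataUrl("image/png", new byte[] {1, 2, 3}));
    }
}
```

In application code the `Media` type shown above handles the raw data for you, so a helper like this is only needed when calling the HTTP API directly.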
239+
240+
== Sample Controller (Auto-configuration)
241+
242+
https://start.spring.io/[Create] a new Spring Boot project and add the `spring-ai-starter-model-cohere` dependency to your `pom.xml` (or Gradle build file).
243+
244+
Add an `application.properties` file under the `src/main/resources` directory to enable and configure the Cohere chat model:
245+
246+
[source,properties]
247+
----
248+
spring.ai.cohere.api-key=YOUR_API_KEY
249+
spring.ai.cohere.chat.options.model=command-r7b-12-2024
250+
spring.ai.cohere.chat.options.temperature=0.7
251+
----
252+
253+
TIP: Replace `YOUR_API_KEY` with your Cohere API key.
254+
255+
This will create a `CohereChatModel` implementation that you can inject into your classes.
256+
Here is an example of a simple `@RestController` class that uses the chat model for text generation.
257+
258+
[source,java]
259+
----
260+
@RestController
261+
public class ChatController {
262+
263+
private final CohereChatModel chatModel;
264+
265+
@Autowired
266+
public ChatController(CohereChatModel chatModel) {
267+
this.chatModel = chatModel;
268+
}
269+
270+
@GetMapping("/ai/generate")
271+
public Map<String,String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
272+
return Map.of("generation", this.chatModel.call(message));
273+
}
274+
275+
@GetMapping("/ai/generateStream")
276+
public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
277+
var prompt = new Prompt(new UserMessage(message));
278+
return this.chatModel.stream(prompt);
279+
}
280+
}
281+
----
282+
283+
== Manual Configuration
284+
285+
The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/chat/CohereChatModel.java[CohereChatModel] implements the `ChatModel` and `StreamingChatModel` interfaces and uses the <<low-level-api>> to connect to the Cohere service.
286+
287+
Add the `spring-ai-cohere` dependency to your project's Maven `pom.xml` file:
288+
289+
[source, xml]
290+
----
291+
<dependency>
292+
<groupId>org.springframework.ai</groupId>
293+
<artifactId>spring-ai-cohere</artifactId>
294+
</dependency>
295+
----
296+
297+
or to your Gradle `build.gradle` file:
298+
299+
[source,groovy]
300+
----
301+
dependencies {
302+
implementation 'org.springframework.ai:spring-ai-cohere'
303+
}
304+
----
305+
306+
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
307+
308+
Next, create a `CohereChatModel` and use it for text generation:
309+
310+
[source,java]
311+
----
312+
var cohereApi = new CohereApi(System.getenv("COHERE_API_KEY"));
313+
var chatModel = new CohereChatModel(cohereApi, CohereChatOptions.builder()
314+
.model(CohereApi.ChatModel.COMMAND_A.getName())
315+
.temperature(0.4)
316+
.build());
317+
318+
ChatResponse response = chatModel.call(new Prompt("Generate the names of 5 famous pirates."));
319+
----
320+
321+
=== Low-level CohereApi Client [[low-level-api]]
322+
323+
The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/main/java/org/springframework/ai/cohere/api/CohereApi.java[CohereApi] provides a lightweight Java client for the link:https://docs.cohere.com/reference/chat[Cohere API].
324+
325+
Here is a simple snippet showing how to use the API programmatically:
326+
327+
[source,java]
328+
----
329+
CohereApi cohereApi = new CohereApi(System.getenv("COHERE_API_KEY"));
330+
ChatCompletionMessage message = new ChatCompletionMessage("Hello world", Role.USER);
331+
ResponseEntity<ChatCompletion> response = cohereApi.chatCompletionEntity(
332+
new ChatCompletionRequest(List.of(message), CohereApi.ChatModel.COMMAND_A.getName(), 0.8, false));
333+
----
334+
335+
==== CohereApi Samples
336+
337+
* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/test/java/org/springframework/ai/cohere/api/CohereApiIT.java[CohereApiIT.java] tests provide some general examples of how to use the lightweight library.
338+
339+
* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-cohere/src/test/java/org/springframework/ai/cohere/chat/CohereChatModelIT.java[CohereChatModelIT.java] tests show examples of using function calling and streaming.
