TanStack · jherr · Dec 13, 2025 · Dec 12, 2025 · Dec 12, 2025 · Dec 13, 2025
diff --git a/docs/adapters/gemini.md b/docs/adapters/gemini.md
@@ -3,7 +3,7 @@ title: Gemini Adapter
 id: gemini-adapter
 ---
 
-The Google Gemini adapter provides access to Google's Gemini models, including text generation, embeddings, and image generation with Imagen.
+The Google Gemini adapter provides access to Google's Gemini models, including text generation, embeddings, image generation with Imagen, and experimental text-to-speech.
 
 ## Installation
 
@@ -75,6 +75,10 @@ const adapter = createGeminiText(process.env.GEMINI_API_KEY!, config);
 - `imagen-3.0-generate-002` - Imagen 3.0
 - `gemini-2.0-flash-preview-image-generation` - Gemini with image generation
 
+### Text-to-Speech Models (Experimental)
+
+- `gemini-2.5-flash-preview-tts` - Gemini TTS
+
 ## Example: Chat Completion
 
 ```typescript
@@ -269,6 +273,27 @@ const result = await ai({
 });
 ```
 
+## Text-to-Speech (Experimental)
+
+> **Note:** Gemini TTS is experimental and may require the Live API for full functionality.
+
+Generate speech from text:
+
+```typescript
+import { ai } from "@tanstack/ai";
+import { geminiTTS } from "@tanstack/ai-gemini";
+
+const adapter = geminiTTS();
+
+const result = await ai({
+  adapter,
+  model: "gemini-2.5-flash-preview-tts",
+  text: "Hello from Gemini TTS!",
+});
+
+console.log(result.audio); // Base64-encoded audio
+```
+
 ## Environment Variables
 
 Set your API key in environment variables:
@@ -340,6 +365,18 @@ Creates a Gemini image generation adapter with an explicit API key.
 
 **Returns:** A Gemini image adapter instance.
 
+### `geminiTTS(config?)`
+
+Creates a Gemini TTS adapter using environment variables.
+
+**Returns:** A Gemini TTS adapter instance.
+
+### `createGeminiTTS(apiKey, config?)`
+
+Creates a Gemini TTS adapter with an explicit API key.
+
+**Returns:** A Gemini TTS adapter instance.
+
 ## Next Steps
 
 - [Getting Started](../getting-started/quick-start) - Learn the basics

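The Gemini TTS example in the diff above logs `result.audio`, which the docs describe as base64-encoded audio. A minimal sketch of turning such a string into raw bytes follows. `decodeBase64Audio` is a hypothetical helper, not part of `@tanstack/ai`, and the sample string is a stand-in rather than real TTS output. It relies only on the global `atob`, which is available in browsers and in Node.js 16 and later.

```typescript
// Hypothetical helper (not part of @tanstack/ai): decode a base64 string,
// such as the `result.audio` value in the TTS example, into raw bytes.
function decodeBase64Audio(base64: string): Uint8Array {
  const binary = atob(base64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Stand-in payload: "UklGRg==" is base64 for the ASCII bytes "RIFF".
const sampleBytes = decodeBase64Audio("UklGRg==");
console.log(sampleBytes.length);
```

The resulting bytes can be written to disk or wrapped in a `Blob` for playback. The audio container format depends on the adapter and is not specified in this diff.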
diff --git a/docs/adapters/openai.md b/docs/adapters/openai.md
@@ -3,7 +3,7 @@ title: OpenAI Adapter
 id: openai-adapter
 ---
 
-The OpenAI adapter provides access to OpenAI's models, including GPT-4o, GPT-5, embeddings, and image generation (DALL-E).
+The OpenAI adapter provides access to OpenAI's models, including GPT-4o, GPT-5, embeddings, image generation (DALL-E), text-to-speech (TTS), and audio transcription (Whisper).
 
 ## Installation
 
@@ -77,6 +77,18 @@ const adapter = createOpenaiText(process.env.OPENAI_API_KEY!, config);
 - `gpt-image-1` - Latest image generation model
 - `dall-e-3` - DALL-E 3
 
+### Text-to-Speech Models
+
+- `tts-1` - Standard TTS (fast)
+- `tts-1-hd` - High-definition TTS
+- `gpt-4o-audio-preview` - GPT-4o with audio output
+
+### Transcription Models
+
+- `whisper-1` - Whisper large-v2
+- `gpt-4o-transcribe` - GPT-4o transcription
+- `gpt-4o-mini-transcribe` - GPT-4o Mini transcription
+
 ## Example: Chat Completion
 
 ```typescript
@@ -267,6 +279,83 @@ const result = await ai({
 });
 ```
 
+## Text-to-Speech
+
+Generate speech from text:
+
+```typescript
+import { ai } from "@tanstack/ai";
+import { openaiTTS } from "@tanstack/ai-openai";
+
+const adapter = openaiTTS();
+
+const result = await ai({
+  adapter,
+  model: "tts-1",
+  text: "Hello, welcome to TanStack AI!",
+  voice: "alloy",
+  format: "mp3",
+});
+
+// result.audio contains base64-encoded audio
+console.log(result.format); // "mp3"
+```
+
+### TTS Voices
+
+Available voices: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`, `ash`, `ballad`, `coral`, `sage`, `verse`
+
+### TTS Provider Options
+
+```typescript
+const result = await ai({
+  adapter: openaiTTS(),
+  model: "tts-1-hd",
+  text: "High quality speech",
+  providerOptions: {
+    speed: 1.0, // Playback speed multiplier; valid range is 0.25 to 4.0
+  },
+});
+```
+
+## Transcription
+
+Transcribe audio to text:
+
+```typescript
+import { ai } from "@tanstack/ai";
+import { openaiTranscription } from "@tanstack/ai-openai";
+
+const adapter = openaiTranscription();
+
+const result = await ai({
+  adapter,
+  model: "whisper-1",
+  audio: audioFile, // A File object or base64 string that you provide
+  language: "en",
+});
+
+console.log(result.text); // Transcribed text
+```
+
+### Transcription Provider Options
+
+```typescript
+const result = await ai({
+  adapter: openaiTranscription(),
+  model: "whisper-1",
+  audio: audioFile,
+  providerOptions: {
+    response_format: "verbose_json", // Return segments with timestamps
+    temperature: 0,
+    prompt: "Technical terms: API, SDK",
+  },
+});
+
+// Access segments with timestamps
+console.log(result.segments);
+```
+
 ## Environment Variables
 
 Set your API key in environment variables:
@@ -331,6 +420,30 @@ Creates an OpenAI image generation adapter with an explicit API key.
 
 **Returns:** An OpenAI image adapter instance.
 
+### `openaiTTS(config?)`
+
+Creates an OpenAI TTS adapter using environment variables.
+
+**Returns:** An OpenAI TTS adapter instance.
+
+### `createOpenaiTTS(apiKey, config?)`
+
+Creates an OpenAI TTS adapter with an explicit API key.
+
+**Returns:** An OpenAI TTS adapter instance.
+
+### `openaiTranscription(config?)`
+
+Creates an OpenAI transcription adapter using environment variables.
+
+**Returns:** An OpenAI transcription adapter instance.
+
+### `createOpenaiTranscription(apiKey, config?)`
+
+Creates an OpenAI transcription adapter with an explicit API key.
+
+**Returns:** An OpenAI transcription adapter instance.
+
 ## Next Steps
 
 - [Getting Started](../getting-started/quick-start) - Learn the basics

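The OpenAI examples in the diff above note a `speed` range of 0.25 to 4.0 and log `result.segments` when `response_format` is `verbose_json`. The sketch below shows one way to keep a speed value inside that range and to print segments as timestamped lines. `clampSpeed`, `formatTimestamp`, `formatSegments`, and the `Segment` shape (`start` and `end` in seconds plus `text`) are assumptions for illustration; the diff does not document the shape of `result.segments`.

```typescript
// Assumed segment shape; the actual type of `result.segments` is not shown in the diff.
interface Segment {
  start: number; // seconds from the beginning of the audio
  end: number; // seconds from the beginning of the audio
  text: string;
}

// Keep a TTS speed within the 0.25 to 4.0 range noted in the provider options example.
function clampSpeed(speed: number): number {
  return Math.min(4.0, Math.max(0.25, speed));
}

// Left-pad a non-negative integer with zeros to the given width.
function padNumber(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format a duration in seconds as mm:ss.mmm.
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const secs = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${padNumber(minutes, 2)}:${padNumber(secs, 2)}.${padNumber(ms, 3)}`;
}

// Render each segment as "[start - end] text".
function formatSegments(segments: Segment[]): string[] {
  return segments.map(
    (s) => `[${formatTimestamp(s.start)} - ${formatTimestamp(s.end)}] ${s.text.trim()}`
  );
}

// Made-up sample data, not real transcription output.
const sampleSegments: Segment[] = [
  { start: 0, end: 1.5, text: " Hello" },
  { start: 61.25, end: 63, text: " world" },
];
console.log(formatSegments(sampleSegments).join("\n"));
console.log(clampSpeed(10));
```

Clamping on the client is a design choice: it silently corrects out-of-range values, whereas throwing an error would surface the mistake earlier. Which is preferable depends on where the speed value comes from.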
diff --git a/docs/config.json b/docs/config.json
@@ -69,6 +69,14 @@
         {
           "label": "Per-Model Type Safety",
           "to": "guides/per-model-type-safety"
+        },
+        {
+          "label": "Text-to-Speech",
+          "to": "guides/text-to-speech"
+        },
+        {
+          "label": "Transcription",
+          "to": "guides/transcription"
         }
       ]
     },