Skip to content

Conversation

@ms-feizhao
Copy link

@ms-feizhao ms-feizhao commented Oct 21, 2025

What does this PR do?

Add Azure.Mcp.Tools.Speech tool azmcp_speech_tts_synthesize

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech

GitHub issue number?

#852

Pre-merge Checklist

  • Required for All PRs
    • Read contribution guidelines
    • PR title clearly describes the change
    • Commit history is clean with descriptive messages (cleanup guide)
    • Added comprehensive tests for new/modified functionality
    • Updated servers/Azure.Mcp.Server/CHANGELOG.md and/or servers/Fabric.Mcp.Server/CHANGELOG.md for product changes (features, bug fixes, UI/UX, updated dependencies)
  • For MCP tool changes:
    • One tool per PR: This PR adds or modifies only one MCP tool for faster review cycles
    • Updated servers/Azure.Mcp.Server/README.md and/or servers/Fabric.Mcp.Server/README.md documentation
    • Validate README.md changes using script at eng/scripts/Process-PackageReadMe.ps1. See Package README
    • Updated command list in /servers/Azure.Mcp.Server/docs/azmcp-commands.md and/or /docs/fabric-commands.md
    • For new or modified tool descriptions, ran ToolDescriptionEvaluator and obtained a score of 0.4 or more and a top 3 ranking for all related test prompts
    • For new tools associated with Azure services or publicly available tools/APIs/products, add URL to documentation in the PR description
  • Extra steps for Azure MCP Server tool changes:
    • Updated test prompts in /servers/Azure.Mcp.Server/docs/e2eTestPrompts.md
    • 👉 For Community (non-Microsoft team member) PRs:
      • Security review: Reviewed code for security vulnerabilities, malicious code, or suspicious activities before running tests (crypto mining, spam, data exfiltration, etc.)
      • Manual tests run: added comment /azp run mcp - pullrequest - live to run Live Test Pipeline

@ms-feizhao ms-feizhao requested a review from a team as a code owner October 21, 2025 10:18
Copilot AI review requested due to automatic review settings October 21, 2025 10:18
@ms-feizhao ms-feizhao requested a review from a team as a code owner October 21, 2025 10:18
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new Azure MCP tool azmcp_speech_tts_synthesize that enables text-to-speech synthesis using Azure AI Services Speech. The tool converts text to audio files with configurable language, voice, format, and custom voice model support.

Key Changes

  • Added TTS synthesis command with comprehensive parameter validation
  • Implemented streaming-based audio synthesis for efficient memory management
  • Added extensive unit and live tests for various synthesis scenarios

Reviewed Changes

Copilot reviewed 14 out of 16 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tools/Azure.Mcp.Tools.Speech/src/Commands/Tts/TtsSynthesizeCommand.cs New command implementation for TTS synthesis with validation and error handling
tools/Azure.Mcp.Tools.Speech/src/Services/SpeechService.cs Core TTS synthesis logic with streaming support and error handling
tools/Azure.Mcp.Tools.Speech/src/Services/ISpeechService.cs Interface extension for TTS synthesis method
tools/Azure.Mcp.Tools.Speech/src/Options/Tts/TtsSynthesizeOptions.cs Options class for TTS synthesis parameters
tools/Azure.Mcp.Tools.Speech/src/Options/SpeechOptionDefinitions.cs Command-line option definitions for TTS parameters
tools/Azure.Mcp.Tools.Speech/src/Models/SynthesisResult.cs Result model for TTS synthesis output
tools/Azure.Mcp.Tools.Speech/src/Commands/SpeechJsonContext.cs JSON serialization context updates
tools/Azure.Mcp.Tools.Speech/src/SpeechSetup.cs Registration of TTS command group
tools/Azure.Mcp.Tools.Speech/tests/Azure.Mcp.Tools.Speech.UnitTests/Tts/TtsSynthesizeCommandTests.cs Comprehensive unit tests for TTS command
tools/Azure.Mcp.Tools.Speech/tests/Azure.Mcp.Tools.Speech.LiveTests/SpeechCommandTests.cs Live integration tests for TTS functionality
servers/Azure.Mcp.Server/docs/azmcp-commands.md Documentation for TTS command usage
servers/Azure.Mcp.Server/docs/e2eTestPrompts.md E2E test prompts for TTS scenarios
servers/Azure.Mcp.Server/README.md README updates describing TTS capabilities
eng/tools/ToolDescriptionEvaluator/prompts.json Test prompts for tool description evaluation

@joshfree joshfree added server-Azure.Mcp Azure.Mcp.Server tools-Speech Do Not Merge Do Not Merge / WIP PRs labels Oct 21, 2025
@joshfree joshfree moved this from Untriaged to In Progress in Azure MCP Server Oct 21, 2025
@joshfree joshfree added this to the 2025-11 milestone Oct 21, 2025
@joshfree
Copy link
Member

Do Not Merge - we're holding new features until after October 28th when the branch opens for new 2.0-beta.x work.

@dilin-MS2
Copy link
Contributor

Hi @joshfree , this PR is for Ignite. We can hold it until October 28th to merge it. But it would be the best if you can start reviewing the PR so we can merge it after Oct 28th. Thanks!

@ms-feizhao
Copy link
Author

Hi @joshfree @alzimmermsft , could you help review the PR? we'd like to publish before Ignite if possible. Thanks!

@joshfree joshfree removed the Do Not Merge Do Not Merge / WIP PRs label Oct 30, 2025
@ms-feizhao ms-feizhao requested a review from a team as a code owner November 4, 2025 07:48
@ms-feizhao
Copy link
Author

Hi @joshfree , the PR is updated to fix conflicts, could you help review?

Thanks.

@ms-feizhao ms-feizhao enabled auto-merge (squash) November 5, 2025 02:58
@ms-feizhao
Copy link
Author

Hi @joshfree @alzimmermsft , could you help review the PR? Thanks.

@joshfree
Copy link
Member

joshfree commented Nov 7, 2025

@dilin-MS2 please review your team mates PR. Thanks

@joshfree
Copy link
Member

joshfree commented Nov 7, 2025

This PR doesn't look like it is rebased from main correctly. It shows many unrelated edits for AI Foundry tools

Copy link
Member

@joshfree joshfree left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments.

Please double check you're not including unrelated edits

]
},
{
"name": "list",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this included in the PR if you're adding 1 new speech tool?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is changed by the ToolDescriptionEvaluator tool after my local running.

```bash
# Synthesize speech from text and save to an audio file using Azure AI Services Speech
# ❌ Destructive | ✅ Idempotent | ❌ OpenWorld | ❌ ReadOnly | ❌ Secret | ✅ LocalRequired
azmcp speech tts synthesize --endpoint <endpoint> \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangyan99 please review. This doesn't look like the file was generated, it instead looks hand-edited and I'm assuming this will break the next time the file is generated?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are auto-generated.

"""
Convert text to speech using Azure AI Services Speech. This command takes text input and generates an audio file using advanced neural text-to-speech capabilities.
You must provide an Azure AI Services endpoint (e.g., https://your-service.cognitiveservices.azure.com/), the text to convert, and an output file path.
Optional parameters include language specification (default: en-US), voice selection, audio output format (default: Riff24Khz16BitMonoPcm), and custom voice endpoint ID.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested with other tool descriptions which teach the LLM the rest of the optional parameters? Eg more locale examples more encoding examples

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, tested with different locales and formats.

Comment on lines +79 to +82
var supportedExtensions = new HashSet<string>
{
".wav", ".mp3", ".ogg", ".raw"
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turn this into a static field

// Validate output file path
if (string.IsNullOrWhiteSpace(fileValue))
{
commandResult.AddError("Output file path cannot be empty.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there also be a check for the file already existing? I don't want to support the ability to overwrite local files at this time.

Also based on destructive=false in metadata overwriting a local file would require that being true. Which again, I don't really want to support yet.

context.Response.Status = HttpStatusCode.OK;
context.Response.Message = "Speech synthesis completed successfully.";
context.Response.Results = ResponseResult.Create(
new TtsSynthesizeCommandResult(result),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
new TtsSynthesizeCommandResult(result),
new(result),

// Parse and validate the JSON result
var jsonResult = JsonDocument.Parse(resultText);
var resultObject = jsonResult.RootElement;
Assert.True(resultObject.TryGetProperty("result", out var resultProperty));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When requiring a JSON property use AssertProperty

Suggested change
Assert.True(resultObject.TryGetProperty("result", out var resultProperty));
var resultProperty = resultObject.AssertProperty("result");

This will provide better debugging information if property retrieval fails.

Comment on lines +44 to +64
[Fact]
public void Constructor_WithValidLogger_ShouldCreateInstance()
{
var command = new TtsSynthesizeCommand(_logger);
Assert.NotNull(command);
Assert.Equal("synthesize", command.Name);
}

[Fact]
public void Properties_ShouldHaveExpectedValues()
{
Assert.Equal("synthesize", _command.Name);
Assert.Equal("Synthesize Speech from Text", _command.Title);
Assert.NotEmpty(_command.Description);
Assert.False(_command.Metadata.Destructive);
Assert.True(_command.Metadata.Idempotent);
Assert.False(_command.Metadata.OpenWorld);
Assert.False(_command.Metadata.ReadOnly);
Assert.True(_command.Metadata.LocalRequired);
Assert.False(_command.Metadata.Secret);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these tests, they aren't very useful and will make maintenance more painful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

4 participants