Release: Amazon Transcribe Streaming Client #26

jonathan343 · 2025-11-03T21:15:00Z

This PR adds a new client for the Amazon Transcribe Streaming Service.

Summary

Add support for the Amazon Transcribe Streaming service by adding the latest Smithy model and code-generating a new client package (aws-sdk-transcribe-streaming).

Changes

Add Smithy model: Added transcribe-streaming.json model definition to codegen/aws-models/
Generate client package: Installed the latest code generator from https://github.com/smithy-lang/smithy-python and used it to generate the aws-sdk-transcribe-streaming package

Testing

Change into the transcribe directory: cd clients/aws-sdk-transcribe-streaming
Create a virtual environment: uv venv -p 3.14
Activate the virtual environment: source .venv/bin/activate
Install the transcribe client and sounddevice: uv pip install . sounddevice
Copy AWS creds to my environment (I'm using the EnvironmentCredentialsResolver)
Run the following example:

import asyncio
import sounddevice

from smithy_aws_core.identity import EnvironmentCredentialsResolver
from smithy_core.aio.interfaces.eventstream import EventPublisher, EventReceiver
from aws_sdk_transcribe_streaming.client import (
    TranscribeStreamingClient,
    StartStreamTranscriptionInput,
)
from aws_sdk_transcribe_streaming.models import (
    AudioStreamAudioEvent,
    AudioEvent,
    TranscriptEvent,
    AudioStream,
    TranscriptResultStream,
)
from aws_sdk_transcribe_streaming.config import Config

AWS_REGION = "us-west-2"
ENDPOINT_URI = f"https://transcribestreaming.{AWS_REGION}.amazonaws.com"


async def mic_stream():
    # This function wraps the raw input stream from the microphone forwarding
    # the blocks to an asyncio.Queue.
    loop = asyncio.get_running_loop()
    input_queue = asyncio.Queue()

    def callback(indata, frame_count, time_info, status):
        loop.call_soon_threadsafe(input_queue.put_nowait, (bytes(indata), status))

    # Be sure to use the correct parameters for the audio stream that matches
    # the audio formats described for the source language you'll be using:
    # https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html
    stream = sounddevice.RawInputStream(
        channels=1,
        samplerate=16000,
        callback=callback,
        blocksize=1024 * 2,
        dtype="int16",
    )
    # Initiate the audio stream and asynchronously yield the audio chunks
    # as they become available.
    with stream:
        while True:
            indata, status = await input_queue.get()
            yield indata, status


class TranscriptResultStreamHandler:
    def __init__(self, stream: EventReceiver[TranscriptResultStream]):
        self.stream = stream

    async def handle_events(self):
        """Process generic incoming events from Amazon Transcribe
        and delegate to appropriate sub-handlers.
        """
        async for event in self.stream:
            if isinstance(event.value, TranscriptEvent):
                await self.handle_transcript_event(event.value)

    async def handle_transcript_event(self, event: TranscriptEvent):
        # This handler can be implemented to handle transcriptions as needed.
        # Here's an example to get started.
        results = event.transcript.results
        for result in results:
            for alt in result.alternatives:
                print(alt.transcript)


async def write_chunks(audio_stream: EventPublisher[AudioStream]):
    # This connects the raw audio chunks generator coming from the microphone
    # and passes them along to the transcription stream.
    async for chunk, _ in mic_stream():
        await audio_stream.send(
            AudioStreamAudioEvent(value=AudioEvent(audio_chunk=chunk))
        )


async def main():
    client = TranscribeStreamingClient(
        config=Config(
            endpoint_uri=ENDPOINT_URI,
            region=AWS_REGION,
            aws_credentials_identity_resolver=EnvironmentCredentialsResolver(),
        )
    )

    stream = await client.start_stream_transcription(
        input=StartStreamTranscriptionInput(
            language_code="en-US", media_sample_rate_hertz=16000, media_encoding="pcm"
        )
    )
    _, output_stream = await stream.await_output()

    handler = TranscriptResultStreamHandler(output_stream)
    print("Start talking to see transcription!")
    print("===================================")
    await asyncio.gather(write_chunks(stream.input_stream), handler.handle_events())
    await stream.close()


if __name__ == "__main__":
    asyncio.run(main())
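
Before running the example above, a small pre-flight check can help. The sketch below is illustrative only: it assumes the EnvironmentCredentialsResolver reads the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables, and the helper names (missing_credentials, chunk_stats) are hypothetical, not part of the generated client. It also works out how much audio each microphone block carries for the parameters used in mic_stream().

```python
import os

# Assumption: the resolver reads the standard AWS environment variables.
REQUIRED_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=os.environ):
    """Return the required credential variables that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def chunk_stats(blocksize, samplerate, channels=1, bytes_per_sample=2):
    """Return (bytes per chunk, milliseconds of audio per chunk).

    bytes_per_sample=2 corresponds to dtype="int16" in the example.
    """
    chunk_bytes = blocksize * channels * bytes_per_sample
    chunk_ms = 1000 * blocksize / samplerate
    return chunk_bytes, chunk_ms


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    size, duration = chunk_stats(blocksize=1024 * 2, samplerate=16000)
    print(f"Each chunk is {size} bytes, about {duration:.0f} ms of audio")
```

With blocksize=1024 * 2 at 16000 Hz, mono int16, each chunk sent to the service is 4096 bytes and covers 128 ms of audio.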

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

clients/aws-sdk-transcribe-streaming/docs/models/__init__.py

clients/aws-sdk-transcribe-streaming/src/aws_sdk_transcribe_streaming/client.py

nateprewitt · 2025-11-03T21:53:47Z

clients/aws-sdk-transcribe-streaming/src/aws_sdk_transcribe_streaming/client.py

+        * ``sample-rate``
+
+        For more information on streaming with Amazon Transcribe, see `Transcribing streaming audio <https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html>`_
+        .


This newline also seems potentially wrong?

In this case, our code generator chooses to exceed the line limit so the link isn't broken. It then puts all remaining characters on a new line. We could potentially special-case the trailing ".", but that can probably be addressed with smithy-lang/smithy-python#571
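
As a rough illustration of the special case mentioned here, a wrapper could treat an RST link and any punctuation directly after it as one unbreakable token, so the period never lands on its own line. This is a hypothetical sketch, not the smithy-python code generator's implementation; TOKEN, wrap_docstring, and the width parameter are assumed names, and only the plain "link followed by punctuation" case is handled.

```python
import re

# An RST hyperlink such as `text <url>`_ plus any punctuation that directly
# follows it, or otherwise a run of non-whitespace characters.
TOKEN = re.compile(r"`[^`]+`_[.,;:!?)]*|\S+")


def wrap_docstring(text: str, width: int = 72) -> list[str]:
    """Greedy wrap that never splits a token.

    A token longer than the width is placed on a line of its own and
    allowed to overflow, which keeps links intact.
    """
    lines = []
    current = ""
    for token in TOKEN.findall(text):
        if not current:
            current = token
        elif len(current) + 1 + len(token) <= width:
            current = f"{current} {token}"
        else:
            lines.append(current)
            current = token
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    doc = (
        "For more information on streaming with Amazon Transcribe, see "
        "`Transcribing streaming audio "
        "<https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html>`_."
    )
    for line in wrap_docstring(doc):
        print(line)
```

Because the period is part of the link token, the long link line still overflows the limit, but no line consists of a lone ".".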

clients/aws-sdk-transcribe-streaming/src/aws_sdk_transcribe_streaming/config.py

nateprewitt · 2025-11-03T22:01:52Z

clients/aws-sdk-transcribe-streaming/tests/test_protocol.py

+class RequestTestHTTPClient:
+    """An asynchronous HTTP client solely for testing purposes."""
+
+    def __init__(self, *, client_config: HTTPClientConfiguration | None = None):
+        self._client_config = client_config
+
+    async def send(
+        self,
+        request: HTTPRequest,
+        *,
+        request_config: HTTPRequestConfiguration | None = None,
+    ) -> HTTPResponse:
+        # Raise the exception with the request object to bypass actual request handling
+        raise TestHttpServiceError(request)


Did we generate this? If not, we should be using the fixtures from @alexgromero's PR.

Yeah, this is generated. I asked Jordon about this file and whether it was intentional to generate it both in codegen/protocol-test and in the generated clients. He mentioned: "It's intentional in that any service can have protocol tests. But the actual file could be omitted if there’s no trait."
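
For readers unfamiliar with the pattern in the generated RequestTestHTTPClient, the idea is that the client raises an exception carrying the serialized request, so a test can inspect the request without any network call. The sketch below shows that pattern with stand-in types; FakeRequest, CapturedRequestError, CapturingHTTPClient, and capture_request are hypothetical names and not the generated code or the smithy-python API.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Stand-in for an HTTP request object (hypothetical)."""

    method: str
    path: str


class CapturedRequestError(Exception):
    """Carries the request that would have been sent."""

    def __init__(self, request):
        super().__init__("request captured before sending")
        self.request = request


class CapturingHTTPClient:
    """Test-only client: raises instead of performing any I/O."""

    async def send(self, request):
        raise CapturedRequestError(request)


async def capture_request(client, request):
    """Send through the client and return the request it captured."""
    try:
        await client.send(request)
    except CapturedRequestError as err:
        return err.request
    raise AssertionError("expected the client to raise")


if __name__ == "__main__":
    # "/example" is a placeholder path, not a real service route.
    sent = asyncio.run(
        capture_request(CapturingHTTPClient(), FakeRequest("POST", "/example"))
    )
    print(sent)
```

A protocol test built this way asserts on the captured request's method, path, headers, or body rather than on a response.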

jonathan343 added 5 commits October 21, 2025 23:55

Add transcribe-streaming smithy model

6aeb8cb

Add aws-sdk-transcribe-streaming client

cff1fef

update version in conf.py

6d1fa13

update to the latest transcribe-streaming model

f5e75ae

minor changelog wording update

e461109

jonathan343 requested a review from a team as a code owner November 3, 2025 21:15

nateprewitt reviewed Nov 3, 2025

clients/aws-sdk-transcribe-streaming/docs/models/__init__.py (outdated)

nateprewitt suggested changes Nov 3, 2025

jonathan343 added 2 commits November 5, 2025 21:36

remove init files under docs

09cb9a1

Docstring formatting

7c4b843

nateprewitt approved these changes Nov 7, 2025

jonathan343 merged commit 278c7ca into develop Nov 7, 2025
2 checks passed

jonathan343 deleted the transcribe-streaming branch November 7, 2025 18:11
