@jonathan343
Contributor

This PR adds a new client for the Amazon Transcribe Streaming Service.

Summary

Add support for the Amazon Transcribe Streaming service by adding the latest Smithy model and code-generating a new client package (aws-sdk-transcribe-streaming).

Changes

Testing

  • Change into the transcribe directory: cd clients/aws-sdk-transcribe-streaming
  • Create a virtual environment: uv venv -p 3.14
  • Activate the virtual environment: source .venv/bin/activate
  • Install the transcribe client and sounddevice: uv pip install . sounddevice
  • Copy AWS creds to my environment (I'm using the EnvironmentCredentialsResolver)
  • Run the following example:
import asyncio
import sounddevice

from smithy_aws_core.identity import EnvironmentCredentialsResolver
from smithy_core.aio.interfaces.eventstream import EventPublisher, EventReceiver
from aws_sdk_transcribe_streaming.client import (
    TranscribeStreamingClient,
    StartStreamTranscriptionInput,
)
from aws_sdk_transcribe_streaming.models import (
    AudioStreamAudioEvent,
    AudioEvent,
    TranscriptEvent,
    AudioStream,
    TranscriptResultStream,
)
from aws_sdk_transcribe_streaming.config import Config

AWS_REGION = "us-west-2"
ENDPOINT_URI = f"https://transcribestreaming.{AWS_REGION}.amazonaws.com"


async def mic_stream():
    # This function wraps the raw input stream from the microphone, forwarding
    # the blocks to an asyncio.Queue.
    # get_running_loop() is the recommended call inside a coroutine.
    loop = asyncio.get_running_loop()
    input_queue = asyncio.Queue()

    def callback(indata, frame_count, time_info, status):
        loop.call_soon_threadsafe(input_queue.put_nowait, (bytes(indata), status))

    # Be sure to use the correct parameters for the audio stream that matches
    # the audio formats described for the source language you'll be using:
    # https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html
    stream = sounddevice.RawInputStream(
        channels=1,
        samplerate=16000,
        callback=callback,
        blocksize=1024 * 2,
        dtype="int16",
    )
    # Initiate the audio stream and asynchronously yield the audio chunks
    # as they become available.
    with stream:
        while True:
            indata, status = await input_queue.get()
            yield indata, status


class TranscriptResultStreamHandler:
    def __init__(self, stream: EventReceiver[TranscriptResultStream]):
        self.stream = stream

    async def handle_events(self):
        """Process generic incoming events from Amazon Transcribe
        and delegate to appropriate sub-handlers.
        """
        async for event in self.stream:
            if isinstance(event.value, TranscriptEvent):
                await self.handle_transcript_event(event.value)

    async def handle_transcript_event(self, event: TranscriptEvent):
        # This handler can be implemented to handle transcriptions as needed.
        # Here's an example to get started.
        results = event.transcript.results
        for result in results:
            for alt in result.alternatives:
                print(alt.transcript)


async def write_chunks(audio_stream: EventPublisher[AudioStream]):
    # This connects the raw audio chunks generator coming from the microphone
    # and passes them along to the transcription stream.
    async for chunk, _ in mic_stream():
        await audio_stream.send(
            AudioStreamAudioEvent(value=AudioEvent(audio_chunk=chunk))
        )


async def main():
    client = TranscribeStreamingClient(
        config=Config(
            endpoint_uri=ENDPOINT_URI,
            region=AWS_REGION,
            aws_credentials_identity_resolver=EnvironmentCredentialsResolver(),
        )
    )

    stream = await client.start_stream_transcription(
        input=StartStreamTranscriptionInput(
            language_code="en-US", media_sample_rate_hertz=16000, media_encoding="pcm"
        )
    )
    _, output_stream = await stream.await_output()

    handler = TranscriptResultStreamHandler(output_stream)
    print("Start talking to see transcription!")
    print("===================================")
    await asyncio.gather(write_chunks(stream.input_stream), handler.handle_events())
    await stream.close()


if __name__ == "__main__":
    asyncio.run(main())
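
The mic_stream function above relies on a thread-to-asyncio handoff: the sounddevice callback runs on a non-asyncio thread and uses loop.call_soon_threadsafe to push chunks into an asyncio.Queue. A minimal sketch of that pattern without a microphone, assuming a plain threading.Thread as a stand-in for the audio callback; fake_mic_stream, collect, and the None sentinel are illustrative names and choices, not part of the client:

```python
import asyncio
import threading


async def fake_mic_stream(chunks):
    """Yield byte chunks pushed from a producer thread, mimicking mic_stream."""
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()

    def producer():
        # Runs on a non-asyncio thread, like the sounddevice callback.
        for chunk in chunks:
            loop.call_soon_threadsafe(queue.put_nowait, chunk)
        # None is a sentinel (an assumption of this sketch) marking end of input.
        loop.call_soon_threadsafe(queue.put_nowait, None)

    thread = threading.Thread(target=producer, daemon=True)
    thread.start()
    try:
        while (chunk := await queue.get()) is not None:
            yield chunk
    finally:
        thread.join()


async def collect(chunks):
    # Drain the async generator into a list so the result is easy to inspect.
    return [chunk async for chunk in fake_mic_stream(chunks)]


received = asyncio.run(collect([b"\x00\x01", b"\x02\x03", b"\x04\x05"]))
```

Scheduling put_nowait through call_soon_threadsafe keeps every queue operation on the event loop thread, which matters because asyncio.Queue is not thread-safe.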

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jonathan343 jonathan343 requested a review from a team as a code owner November 3, 2025 21:15
* ``sample-rate``

For more information on streaming with Amazon Transcribe, see `Transcribing streaming audio <https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html>`_
.
Contributor

This newline also seems potentially wrong?

Contributor Author

In this case, our code generator chooses to go over the line limit so the link isn't broken. It then puts all remaining characters on a new line. We could potentially special-case the trailing period, but that can probably be addressed with smithy-lang/smithy-python#571
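
One plausible way this output could arise, sketched as a toy greedy wrapper. This is not the smithy-python code generator; TOKEN_RE, wrap, the 72-column width, and the glue_punctuation option are all assumptions made for illustration. The link is treated as a single unbreakable token that stays on the current line even when it overflows, so the period that follows no longer fits and lands on its own line:

```python
import re

LINK = (
    "`Transcribing streaming audio "
    "<https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html>`_"
)
TEXT = f"For more information on streaming with Amazon Transcribe, see {LINK}."

# Split into words, keeping an RST link (`text <url>`_) as one unbreakable token.
TOKEN_RE = re.compile(r"`[^`]+`_|[^\s`]+")
PUNCTUATION = {".", ",", ";", ":"}


def wrap(text, width=72, glue_punctuation=False):
    """Greedy wrap; tokens longer than `width` overflow the current line."""
    lines, current = [], ""
    for token in TOKEN_RE.findall(text):
        punct = token in PUNCTUATION
        sep = "" if punct or not current else " "
        fits = len(current) + len(sep) + len(token) <= width
        # A token that cannot fit on any line is appended rather than broken.
        unbreakable_overflow = len(token) > width
        if not current or fits or unbreakable_overflow or (punct and glue_punctuation):
            current += sep + token
        else:
            lines.append(current)
            current = token
    if current:
        lines.append(current)
    return lines


broken = wrap(TEXT)                        # period ends up alone on the last line
fixed = wrap(TEXT, glue_punctuation=True)  # period stays attached to the link
```

Special-casing trailing punctuation, as in the glue_punctuation branch, is one way to keep the period attached to the link.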

Comment on lines +19 to +32
class RequestTestHTTPClient:
    """An asynchronous HTTP client solely for testing purposes."""

    def __init__(self, *, client_config: HTTPClientConfiguration | None = None):
        self._client_config = client_config

    async def send(
        self,
        request: HTTPRequest,
        *,
        request_config: HTTPRequestConfiguration | None = None,
    ) -> HTTPResponse:
        # Raise the exception with the request object to bypass actual request handling
        raise TestHttpServiceError(request)
Contributor

Did we generate this? If not, we should be using the fixtures from @alexgromero's PR.

Contributor

@alexgromero alexgromero Nov 3, 2025

Yeah this is generated. I asked Jordon about this file and whether it was intentional to generate this in codegen/protocol-test and in the generated clients. He mentioned "It's intentional in that any service can have protocol tests. But the actual file could be omitted if there’s no trait"
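
For context, the generated RequestTestHTTPClient quoted above raises an exception that carries the request, so a protocol test can inspect the serialized request without any network traffic. A rough sketch of how such a test might use that pattern; FakeRequest, CapturedRequestError, CapturingHTTPClient, capture, and the example URL are hypothetical stand-ins, not the generated smithy types:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    # Hypothetical stand-in for the real HTTPRequest type.
    method: str
    url: str
    headers: dict = field(default_factory=dict)


class CapturedRequestError(Exception):
    """Carries the request that would have been sent."""

    def __init__(self, request):
        super().__init__("request captured before sending")
        self.request = request


class CapturingHTTPClient:
    async def send(self, request):
        # Short-circuit: hand the request back to the test instead of sending it.
        raise CapturedRequestError(request)


async def capture(client, request):
    try:
        await client.send(request)
    except CapturedRequestError as err:
        return err.request
    raise AssertionError("expected the client to raise")


captured = asyncio.run(
    capture(
        CapturingHTTPClient(),
        FakeRequest("POST", "https://example.com/stream-transcription"),
    )
)
```

The test then asserts on fields of the captured request, such as the method and path, rather than on a response.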

@jonathan343 jonathan343 merged commit 278c7ca into develop Nov 7, 2025
2 checks passed
@jonathan343 jonathan343 deleted the transcribe-streaming branch November 7, 2025 18:11