Deeptune Python Library

The official Python API for Deeptune. Deeptune brings the most human-like text to speech and voice cloning to your project in only a few lines of code.

📖 API & Docs

Check out our documentation.

Installation

pip install deeptune

Usage

Instantiate and use the client with the following:

from deeptune.client import Deeptune
from deeptune.utils import play

client = Deeptune(
    api_key="YOUR_API_KEY",
)

audio = client.text_to_speech.generate(
    text="Wow, Deeptune's text to speech API is amazing!",
    voice="d770a0d0-d7b0-4e52-962f-1a41d252a5f6",
)
play(audio)

Using Prompt Audio

If you prefer to manage voices on your own, you can use your own audio file as a reference for the voice clone.

Using a URL prompt

from deeptune.client import Deeptune
from deeptune.utils import play

client = Deeptune(
    api_key="YOUR_API_KEY",
)

audio = client.text_to_speech.generate_from_prompt(
    text="Wow, Deeptune's text to speech API is amazing!",
    prompt_audio="https://deeptune-demo.s3.amazonaws.com/Michael.wav",
)
play(audio)

Using a file prompt

import base64
from deeptune.client import Deeptune
from deeptune.utils import play

client = Deeptune(
    api_key="YOUR_API_KEY",
)

# Open the file and read its contents as bytes
with open("Michael.wav", "rb") as audio_file:
    audio_bytes = audio_file.read()

# Encode the bytes to base64
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
audio = client.text_to_speech.generate_from_prompt(
    text="Wow, Deeptune's text to speech API is amazing!",
    prompt_audio=f"data:audio/wav;base64,{audio_base64}",
)
play(audio)
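
The read-and-encode step above can be wrapped in a small reusable function. This is only a sketch: `to_data_uri` and its `mime_type` parameter are hypothetical names, not part of the Deeptune SDK, and it assumes the `data:audio/wav;base64,...` format shown above is what `prompt_audio` expects.

```python
import base64
from pathlib import Path


def to_data_uri(path: str, mime_type: str = "audio/wav") -> str:
    """Encode the file at `path` as a base64 data URI.

    Hypothetical helper for illustration; not part of the Deeptune SDK.
    """
    # Read the whole file as raw bytes
    audio_bytes = Path(path).read_bytes()
    # Base64-encode the bytes and turn them into a text string
    audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
    # Build the same data URI format used in the example above
    return f"data:{mime_type};base64,{audio_base64}"
```

The result could then be passed as `prompt_audio=to_data_uri("Michael.wav")` in the call shown above.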

Voices

You can also store and manage voices inside of Deeptune.

# Get all available voices
voices = client.voices.list()
print(voices)

# Get a specific voice
voice = client.voices.get(voice_id="d770a0d0-d7b0-4e52-962f-1a41d252a5f6")
print(voice)

# Create a new cloned voice (the with-block closes the file afterwards)
with open("./Michael.wav", "rb") as audio_file:
    voice = client.voices.create(
        name="Cool Name",
        file=audio_file,
    )
print(voice)

# Update an existing voice (the with-block closes the file afterwards)
with open("./Michael.wav", "rb") as audio_file:
    voice = client.voices.update(
        voice_id=voice.id,
        name="Updated Name",
        file=audio_file,
    )
print(voice)

# Delete an existing voice
client.voices.delete(voice.id)

Saving the output

Saving manually

The generate and generate_from_prompt endpoints return an iterator of bytes, not a single bytes object. Consume the whole iterator before writing, as demonstrated below.

audio = client.text_to_speech.generate(
    text="Wow, Deeptune's text to speech API is amazing!",
    voice="d770a0d0-d7b0-4e52-962f-1a41d252a5f6",
)
audio_bytes = b"".join(audio)

# Now, you can save however you'd like
with open("output.mp3", "wb") as audio_file:
    audio_file.write(audio_bytes)
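
If you would rather not hold the whole clip in memory, one possible variation is to write each chunk as it arrives. This is a minimal sketch, assuming the iterator yields plain `bytes` chunks as described above; `write_chunks` is a hypothetical helper, not part of the SDK.

```python
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write every chunk to `path` and return the total number of bytes written.

    Hypothetical helper for illustration; not part of the Deeptune SDK.
    """
    total = 0
    with open(path, "wb") as audio_file:
        # Iterating to the end consumes the whole iterator
        for chunk in chunks:
            audio_file.write(chunk)
            total += len(chunk)
    return total
```

For example, `write_chunks(audio, "output.mp3")` would take the place of the join-then-write steps above. Note that an iterator can only be consumed once, so the same `audio` object cannot also be passed to `play` afterwards.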

Using built-in utils

The SDK also has built-in play, save, and stream utility methods. Under the hood, these methods use ffmpeg and mpv to play audio streams.

from deeptune.utils import play, save, stream

# plays audio using ffmpeg
play(audio)
# streams audio using mpv
stream(audio)
# saves audio to file
save(audio, "my-file.mp3")

Async Client

The SDK also exports an async client so that you can make non-blocking calls to our API.

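import asyncio

# AsyncDeeptune is assumed to be exported from deeptune.client alongside Deeptune
from deeptune.client import AsyncDeeptune
from deeptune.utils import play

client = AsyncDeeptune(
    api_key="YOUR_API_KEY",
)


async def main() -> None:
    # `await` is only valid inside a coroutine, so the call is wrapped in main()
    audio = await client.text_to_speech.generate(
        text="Wow, Deeptune's text to speech API is amazing!",
        voice="d770a0d0-d7b0-4e52-962f-1a41d252a5f6",
    )
    play(audio)


asyncio.run(main())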

Contributing

While we value open-source contributions to this SDK, this library is generated programmatically. Additions made directly to this library would have to be moved over to our generation code, otherwise they would be overwritten upon the next generated release. Feel free to open a PR as a proof of concept, but know that we will not be able to merge it as-is. We suggest opening an issue first to discuss with us!

On the other hand, contributions to the README are always very welcome!