Add text-to-speech beta samples #1421

Merged
merged 9 commits into from
Mar 26, 2018
Changes from 4 commits
2 changes: 1 addition & 1 deletion scripts/prepare-testing-project.sh
@@ -50,4 +50,4 @@ echo "Creating speech resources."
gsutil cp speech/api-client/resources/audio.raw gs://$GCLOUD_PROJECT/speech/

echo "To finish setup, follow this link to enable APIs."
-echo "https://console.cloud.google.com/flows/enableapi?project=${GCLOUD_PROJECT}&apiid=bigtable.googleapis.com,bigtableadmin.googleapis.com,bigquery,bigquerydatatransfer.googleapis.com,cloudmonitoring,compute_component,datastore,datastore.googleapis.com,dataproc,dns,plus,pubsub,logging,storage_api,vision.googleapis.com"
+echo "https://console.cloud.google.com/flows/enableapi?project=${GCLOUD_PROJECT}&apiid=bigtable.googleapis.com,bigtableadmin.googleapis.com,bigquery,bigquerydatatransfer.googleapis.com,cloudmonitoring,compute_component,datastore,datastore.googleapis.com,dataproc,dns,plus,pubsub,logging,storage_api,texttospeech.googleapis.com,vision.googleapis.com"
167 changes: 167 additions & 0 deletions texttospeech/cloud-client/README.rst
@@ -0,0 +1,167 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Text-to-Speech API Python Samples
===============================================================================

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/README.rst


This directory contains samples for Google Cloud Text-to-Speech API. The `Google Cloud Text-to-Speech API`_ enables you to generate and customize synthesized speech from text or SSML.




.. _Google Cloud Text-to-Speech API: https://cloud.google.com/text-to-speech/docs/

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

This sample requires you to have authentication set up. Refer to the
`Authentication Getting Started Guide`_ for instructions on setting up
credentials for applications.

.. _Authentication Getting Started Guide:
    https://cloud.google.com/docs/authentication/getting-started

Install Dependencies
++++++++++++++++++++

#. Install `pip`_ and `virtualenv`_ if you do not already have them. You may want to refer to the `Python Development Environment Setup Guide`_ for Google Cloud Platform for instructions.

   .. _Python Development Environment Setup Guide:
       https://cloud.google.com/python/setup

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

   .. code-block:: bash

       $ virtualenv env
       $ source env/bin/activate

#. Install the dependencies needed to run the samples.

   .. code-block:: bash

       $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Quickstart
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/quickstart.py;/README.rst

To run this sample:

.. code-block:: bash

    $ python quickstart.py


List voices
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/list_voices.py;/README.rst

To run this sample:

.. code-block:: bash

    $ python list_voices.py


Synthesize text
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/synthesize_text.py;/README.rst

To run this sample:

.. code-block:: bash

    $ python synthesize_text.py

    usage: synthesize_text.py [-h] (--text TEXT | --ssml SSML)

    Google Cloud Text-To-Speech API sample application.

    Example usage:
        python synthesize_text.py --text "hello"
        python synthesize_text.py --ssml "<?xml..."

    optional arguments:
      -h, --help   show this help message and exit
      --text TEXT  The text from which to synthesize speech.
      --ssml SSML  The ssml string from which to synthesize speech.
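
The usage line ``(--text TEXT | --ssml SSML)`` describes a required, mutually exclusive pair of options. A minimal sketch of how such a command line can be declared with ``argparse`` follows. It is an illustration only: ``build_parser`` is a hypothetical helper, and the real ``synthesize_text.py`` is not shown in this diff and may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser that requires exactly one of --text or --ssml."""
    parser = argparse.ArgumentParser(
        description='Sketch of the synthesize_text.py command line.')
    # required=True makes argparse reject a call with neither option;
    # the group itself rejects a call that passes both.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument(
        '--text', help='The text from which to synthesize speech.')
    group.add_argument(
        '--ssml', help='The ssml string from which to synthesize speech.')
    return parser


# Parse an explicit argument list instead of sys.argv for the demo.
args = build_parser().parse_args(['--text', 'hello'])
print(args.text, args.ssml)
```

Passing both options, or neither, makes ``argparse`` print a usage error and exit, which matches the parenthesized form in the help text.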



Synthesize file
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/synthesize_file.py;/README.rst

To run this sample:

.. code-block:: bash

    $ python synthesize_file.py

    usage: synthesize_file.py [-h] (--text TEXT | --ssml SSML)

    Google Cloud Text-To-Speech API sample application.

    Example usage:
        python synthesize_file.py --text resources/hello.txt
        python synthesize_file.py --ssml resources/hello.ssml

    optional arguments:
      -h, --help   show this help message and exit
      --text TEXT  The text file from which to synthesize speech.
      --ssml SSML  The ssml file from which to synthesize speech.





The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
26 changes: 26 additions & 0 deletions texttospeech/cloud-client/README.rst.in
@@ -0,0 +1,26 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Text-to-Speech API
  short_name: Cloud TTS API
  url: https://cloud.google.com/text-to-speech/docs/
  description: >
    The `Google Cloud Text-to-Speech API`_ enables you to generate and customize synthesized speech from text or SSML.

setup:
- auth
- install_deps

samples:
- name: Quickstart
  file: quickstart.py
- name: List voices
  file: list_voices.py
- name: Synthesize text
  file: synthesize_text.py
  show_help: True
- name: Synthesize file
  file: synthesize_file.py
  show_help: True

cloud_client_library: true
55 changes: 55 additions & 0 deletions texttospeech/cloud-client/list_voices.py
@@ -0,0 +1,55 @@
#!/usr/bin/env python

# Copyright 2018 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Google Cloud Text-To-Speech API sample application.

Example usage:
    python list_voices.py
"""


# [START tts_list_voices]
def list_voices():
    """Lists the available voices."""
    from google.cloud import texttospeech
    client = texttospeech.TextToSpeechClient()

    # Performs the list voices request
    voices = client.list_voices()

    for voice in voices.voices:
        # Display the voice's name. Example: tpc-vocoded
        print('Name: {}'.format(voice.name))

        # Display the supported language codes for this voice. Example: "en-US"
        for language_code in voice.language_codes:
            print('Supported language: {}'.format(language_code))

        # Names of SSML voice genders from google.cloud.texttospeech.enums
        ssml_voice_genders = ['SSML_VOICE_GENDER_UNSPECIFIED', 'MALE',
                              'FEMALE', 'NEUTRAL']

        # Display the supported SSML - gender for this voice. Example: FEMALE
Review comment (Contributor):

How about:

# SSML Voice Gender values from google.cloud.texttospeech.enums
ssml_voice_genders = ['SSML_VOICE_GENDER_UNSPECIFIED', 'MALE',
                      'FEMALE', 'NEUTRAL']

# Display the SSML Voice Gender
print('SSML Voice Gender: {}'.format(ssml_voice_genders[voice.ssml_gender]))

(And apply same to Java)

Simply making it as clear as possible that the values used map to the SSML spec, see Voice element here: https://www.w3.org/TR/speech-synthesis11/#g13

Review comment (Contributor, @beccasaurus, Mar 23, 2018):

#nit

It's good as is! The only thing I really would like to change is this:

the supported SSML - gender for this voice

Rather than "SSML - gender", which separates the two words, say "SSML gender" to make it clear that the gender is SSML-defined.

Reply (Contributor, Author):

Updated

        print('SSML gender: {}'.format(ssml_voice_genders[voice.ssml_gender]))

        # Display the natural sample rate hertz for this voice. Example: 24000
        print('Natural Sample Rate Hertz: {}\n'.format(
            voice.natural_sample_rate_hertz))
# [END tts_list_voices]


if __name__ == '__main__':
    list_voices()
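
The lookup `ssml_voice_genders[voice.ssml_gender]` in the sample works only while the list order matches the integer values of the gender enum. A minimal sketch of the same name lookup using a local `IntEnum` is given here for comparison. Both `SsmlVoiceGender` and `gender_name` are hypothetical stand-ins defined for this illustration, and the values 0 to 3 are assumed from the order of the sample's list rather than taken from the client library.

```python
from enum import IntEnum


class SsmlVoiceGender(IntEnum):
    """Local stand-in; values assumed from the order of the sample's list."""
    SSML_VOICE_GENDER_UNSPECIFIED = 0
    MALE = 1
    FEMALE = 2
    NEUTRAL = 3


def gender_name(value):
    """Return the member name for an integer gender value."""
    try:
        return SsmlVoiceGender(value).name
    except ValueError:
        # An unexpected value is reported instead of raising IndexError,
        # which is what a plain list lookup would do.
        return 'UNKNOWN({})'.format(value)


print('SSML gender: {}'.format(gender_name(2)))
```

Keeping the names and numbers together in one definition avoids a silent mismatch if the list is ever reordered.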
23 changes: 23 additions & 0 deletions texttospeech/cloud-client/list_voices_test.py
@@ -0,0 +1,23 @@
# Copyright 2018, Google, Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import list_voices


def test_list_voices(capsys):
    list_voices.list_voices()
    out, err = capsys.readouterr()

    assert 'en-US' in out
    assert 'SSML gender: MALE' in out
    assert 'SSML gender: FEMALE' in out
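
The test in this file calls the live API, so it needs credentials and network access. As a rough sketch of a network-free variant, the printing logic can be exercised against a stub. Everything named here (`print_voices`, `FakeClient`, the canned voice data) is hypothetical and not part of this PR, and the sketch assumes Python 3 because it uses `contextlib.redirect_stdout`.

```python
import contextlib
import io
from types import SimpleNamespace

GENDERS = ['SSML_VOICE_GENDER_UNSPECIFIED', 'MALE', 'FEMALE', 'NEUTRAL']


def print_voices(client):
    """Hypothetical variant of list_voices() that receives its client."""
    for voice in client.list_voices().voices:
        print('Name: {}'.format(voice.name))
        for language_code in voice.language_codes:
            print('Supported language: {}'.format(language_code))
        print('SSML gender: {}'.format(GENDERS[voice.ssml_gender]))


class FakeClient(object):
    """Stand-in for the real client; returns one canned voice."""

    def list_voices(self):
        voice = SimpleNamespace(
            name='fake-voice', language_codes=['en-US'], ssml_gender=2)
        return SimpleNamespace(voices=[voice])


# Capture stdout the way pytest's capsys fixture would.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print_voices(FakeClient())
out = buffer.getvalue()

assert 'en-US' in out
assert 'SSML gender: FEMALE' in out
```

A stubbed test like this checks only the formatting; the live test is still what confirms that the service returns the expected voices.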
61 changes: 61 additions & 0 deletions texttospeech/cloud-client/quickstart.py
@@ -0,0 +1,61 @@
#!/usr/bin/env python

# Copyright 2018 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Google Cloud Text-To-Speech API sample application.

Example usage:
    python quickstart.py
"""


def run_quickstart():
    # [START tts_quickstart]
    """Synthesizes speech from the input string of text or ssml.

    Note: ssml must be well-formed according to:
        https://www.w3.org/TR/speech-synthesis/
    """
    from google.cloud import texttospeech

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Set the text input to be synthesized
    synthesis_input = texttospeech.types.SynthesisInput(text="Hello, World!")

    # Build the voice request, select the language code ("en-US") and the ssml
    # voice gender ("neutral")
    voice = texttospeech.types.VoiceSelectionParams(language_code='en-US',
                                                    ssml_gender='NEUTRAL')
Review comment (Member):

ssml_gender should have its own enums, please use enums to specify this field.

Reply (Contributor, Author):

done


    # Select the type of audio file you want returned
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    # Perform the text-to-speech request on the text input with the selected
    # voice parameters and audio file type
    response = client.synthesize_speech(synthesis_input, voice, audio_config)

    # The response's audio_content is binary.
    with open('output.mp3', 'wb') as out:
        # Write the response to the output file.
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')
    # [END tts_quickstart]


if __name__ == '__main__':
    run_quickstart()
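
The quickstart opens its output file in `'wb'` mode because `audio_content` is binary. A minimal sketch of that step in isolation follows. The `save_audio` helper and the placeholder bytes are assumptions made for this illustration; no API call is made and the bytes are not real MP3 data.

```python
import os
import tempfile


def save_audio(audio_content, path):
    """Write raw audio bytes to path and return the number of bytes on disk.

    Hypothetical helper, not part of the sample.
    """
    if not isinstance(audio_content, bytes):
        raise TypeError('audio_content must be bytes')
    # Binary mode prevents any newline or encoding translation.
    with open(path, 'wb') as out:
        out.write(audio_content)
    return os.path.getsize(path)


# Placeholder bytes stand in for response.audio_content.
fake_audio = b'\xff\xfb\x90\x00' * 4
target = os.path.join(tempfile.mkdtemp(), 'output.mp3')
written = save_audio(fake_audio, target)
print('Audio content written to file "{}"'.format(target))
```

Opening the file in text mode instead would fail on Python 3, since `write` would then expect `str` rather than `bytes`.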
1 change: 1 addition & 0 deletions texttospeech/cloud-client/requirements.txt
@@ -0,0 +1 @@
google-cloud-texttospeech==0.1.0
7 changes: 7 additions & 0 deletions texttospeech/cloud-client/resources/hello.ssml
@@ -0,0 +1,7 @@
<?xml version="1.0"?>
Review comment (Contributor):

Simplify this :)

For our samples, let's just use simple SSML:

<speak>Hello there.</speak>

Per request from product :)

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="en-US">
Hello there.
</speak>
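
Both the long form of this file and the simplified `<speak>Hello there.</speak>` suggested in review should parse as XML with a `speak` root element. A minimal sketch of a local check using the standard library is shown here. `is_speak_document` is a hypothetical helper, and it tests only XML well-formedness and the root element name; it does not validate against the SSML schema or reflect what the API itself accepts.

```python
import xml.etree.ElementTree as ET


def is_speak_document(ssml):
    """Return True if ssml parses as XML and its root element is speak."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError:
        return False
    # Drop a namespace prefix such as {http://www.w3.org/2001/10/synthesis}.
    local_name = root.tag.rsplit('}', 1)[-1]
    return local_name == 'speak'


print(is_speak_document('<speak>Hello there.</speak>'))  # True
print(is_speak_document('<speak>Hello there.'))          # False
```

A check like this can catch an unclosed tag before a request is sent, which gives a clearer error than a rejected API call.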
1 change: 1 addition & 0 deletions texttospeech/cloud-client/resources/hello.txt
@@ -0,0 +1 @@
Hello there!