Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Text to speech mixin #405

Draft
wants to merge 23 commits into
base: master
Choose a base branch
from
Draft

[WIP] Text to speech mixin #405

wants to merge 23 commits into from

Conversation

duwudi
Copy link
Contributor

@duwudi duwudi commented Jul 28, 2021

Closes #376

Note on dependencies

pyfestival Python dependency is provided here into our apt repository, with a patch that is needed in order for import to work on RPi. Workflow run that added it into the apt repo is here. There should be no need to keep https://github.com/pi-top/pyfestival/ at this point.

Example Usage

from pitop import Pitop

pt = Pitop()
pt.speak("Hello")

# change voice to British English (default is "us")
pt.speak.set_voice("english")

# non-blocking voice request
pt.speak("Hello", blocking=False)

Architecture

Uses a generalized object factory that registers backend services we wish to make available - currently we only have "DEFAULT" and "FESTIVAL" but in future we'll ideally have "GOOGLE", "AMAZON" etc. The builders will accept the kwargs relevant to that service and ignore the rest, this should make it flexible when we add new services that require API keys etc.

Options for changing tts backend

One downside to this approach is that the speech service can't be changed easily from the Pitop() class, one way to do it is as follows:

from pitop import Pitop
from pitop.processing import tts

my_speech_service = speech.services.get("GOOGLE")

pitop = Pitop()
pitop.speech = my_speech_service 

Alternatively, we could pass in the speech service to Pitop class:

pitop = Pitop(tts_service_id ="FESTIVAL")

But it feels a shame to add parameters to the Pitop() class. We could also use a class method:

pitop = Pitop.using_speech_service("FESTIVAL")

which is probably a cleaner approach.

To do

  1. Use a factory pattern for allowing different speech backends to be swapped in as it's very likely we'll want to change the TTS engine in future (for compatibility or general improvement)
  2. Discuss if speech service settings will be included in the Pitop.from_config() usage
  3. Fix SIOD ERROR: the currently assigned stack limit has been exceeded error when using the non-blocking mode (thread)

@codecov
Copy link

codecov bot commented Jul 28, 2021

Codecov Report

Merging #405 (5807d2c) into master (854e835) will decrease coverage by 3.97%.
The diff coverage is 60.00%.

❗ Current head 5807d2c differs from pull request most recent head ef0a566. Consider uploading reports for the commit ef0a566 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #405      +/-   ##
==========================================
- Coverage   58.56%   54.58%   -3.98%     
==========================================
  Files          72       81       +9     
  Lines        2756     3270     +514     
==========================================
+ Hits         1614     1785     +171     
- Misses       1142     1485     +343     
Flag Coverage Δ
unittests 54.58% <60.00%> (-3.98%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pitop/system/pitop.py 50.00% <50.00%> (-7.15%) ⬇️
pitop/core/mixins/supports_speech.py 57.14% <57.14%> (ø)
pitop/core/mixins/__init__.py 100.00% <100.00%> (ø)
pitop/robotics/simple_pid/PID.py
pitop/robotics/simple_pid/__init__.py
pitop/miniscreen/oled/core/__init__.py 100.00% <0.00%> (ø)
pitop/miniscreen/buttons/buttons.py 47.69% <0.00%> (ø)
pitop/miniscreen/oled/oled.py 33.46% <0.00%> (ø)
pitop/miniscreen/miniscreen.py 48.33% <0.00%> (ø)
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4b4d30d...ef0a566. Read the comment docs.

@duwudi duwudi changed the title Text to speech mixin [WIP] Text to speech mixin Aug 10, 2021
@duwudi duwudi requested a review from angusjfw August 11, 2021 08:24
@@ -21,3 +21,4 @@ spidev python3-spidev; PEP386
systemd_python python3-systemd; PEP386
wget python3-wget; PEP386
pyzmq python3-zmq; PEP386
pyfestival python3-pt-pyfestival
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pyfestival python3-pt-pyfestival
pyfestival python3-pyfestival; PEP386

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP386 flag will make the dependency versioned, which we want in most cases.

@@ -82,6 +82,9 @@ Replaces:
pt-device-manager (<< 4.0.0),
# pt-oled
python3-pt-oled (<< 3.0.0),
# Festival speech engine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently under Replaces: field, which is definitely not what you want to be doing.

Are you trying to make this a core dependency, an optional dependency or a dependency of the full SDK?

##################
# Text to Speech #
##################
"pt-pyfestival",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"pt-pyfestival",
"pyfestival",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not install correctly for people installing via Python on RPi for now, but we want tackle that later on.

@m-roberts
Copy link
Contributor

I am not convinced with the design pattern. Why should the user need to know anything about festival? Why does the user need to handle the importing and management of TTS directly?

from pitop import Pitop

pitop = Pitop()
pitop.speak("Hello World")
# Another option might be a 'speaker' property of the pi-top:
# pitop.speaker.speak("Hello World")

seems like a much more sensible way to approach this.

@m-roberts
Copy link
Contributor

This is currently failing to build the package as there is a spelling mistake in this commit's message that is propagating into the snapshot version changelog.

This draft PR provides an easy way for all package builds to ignore changelog spelling mistakes for snapshot builds with a simple boolean input. This hasn't been tested, and this PR is a good test of this functionality.

We should update this PR to use this experimental branch until it passes, then merge in and build against this new master commit. Once this is working here, we can update ALL package build workflows to ignore typos for snapshot builds moving forwards.

@duwudi
Copy link
Contributor Author

duwudi commented Aug 25, 2021

I am not convinced with the design pattern. Why should the user need to know anything about festival? Why does the user need to handle the importing and management of TTS directly?

from pitop import Pitop

pitop = Pitop()
pitop.speak("Hello World")
# Another option might be a 'speaker' property of the pi-top:
# pitop.speaker.speak("Hello World")

seems like a much more sensible way to approach this.

Yeah this is exactly how it is now, though I do like the “speaker” idea as we can add more speaker-specific commands to that too (beeps, songs, pre-recorded voices etc)

@m-roberts
Copy link
Contributor

pyfestival is currently failing to build: https://github.com/pi-top/pi-topOS-Upstream-Package-Build/runs/3455683743?check_suite_focus=true

We will need to create a patch for the changes needed for this build to pass in CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add text-to-speech functionality
3 participants