unmute on hotword #73

Merged
merged 3 commits into from
Nov 23, 2023

emphasize
Member

Forces an unmute ("force_unmute": True) when playing a sound in the audio callback after a hotword is detected. The counterpart PR, OpenVoiceOS/ovos-audio#40, covers the ovos-audio side.

This introduces another voice loop attribute: cmd_ready, which is normally False and gets set when a sound has finished playing (or when no sound is played at all). It is reset after a command is spoken. That way the loop keeps running while other things that have to be done are handled (in this case playing a sound, but it could be anything).
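A minimal sketch of the cmd_ready idea described above, not the actual voice loop code. The class name `CmdReadyLoopSketch` and its method names are hypothetical; only the `cmd_ready` attribute and its set/reset behaviour come from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CmdReadyLoopSketch:
    """Hypothetical stand-in for the voice loop (illustration only)."""

    cmd_ready: bool = False  # normally False
    recorded: List[bytes] = field(default_factory=list)

    def on_sound_finished(self) -> None:
        # Set once the sound has finished playing,
        # or right away if no sound is played at all.
        self.cmd_ready = True

    def feed(self, chunk: bytes) -> None:
        # The loop keeps iterating; chunks that arrive before
        # cmd_ready is set are simply not recorded.
        if self.cmd_ready:
            self.recorded.append(chunk)

    def on_command_spoken(self) -> bytes:
        # Reset the flag after the command has been captured.
        self.cmd_ready = False
        audio = b"".join(self.recorded)
        self.recorded.clear()
        return audio
```

Feeding chunks before `on_sound_finished()` records nothing; chunks fed afterwards end up in the command audio.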

Both PRs take advantage of the muted flag when polling the volume: OpenVoiceOS/ovos-PHAL-plugin-alsa#22

@JarbasAl
Member

JarbasAl commented Nov 5, 2023

what if ovos audio is not running or is an older version? we would block forever

i'm not so sure about cmd_ready.... perhaps we could make this configurable and/or add some timeout / check for ovos-audio running. but in general i don't think we want this to block on waiting for the sound playback
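One way to picture the timeout suggestion is a bounded wait on a `threading.Event`. This is only a sketch: the helper name `wait_for_playback` and the idea that the playback signal arrives as an event are assumptions, not something this PR defines.

```python
import threading


def wait_for_playback(done: threading.Event, timeout: float) -> bool:
    """Wait for a playback-finished signal for at most `timeout` seconds.

    Returns True if the signal arrived and False if the wait timed out,
    for example because ovos-audio is not running or never answers.
    """
    # Event.wait returns the flag state, so a timeout yields False
    # instead of blocking forever.
    return done.wait(timeout)
```

The caller can carry on with the command either way, so a missing or outdated ovos-audio costs at most `timeout` seconds.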

@emphasize
Member Author

emphasize commented Nov 6, 2023

First off, it can be anything that has to be processed before the command should be recorded (and it is pretty likely that future iterations will add functionality).

This already (without waiting for cmd_ready) chips away at silence_seconds (and similar). I want these to be pure, understandable variables that are true to their usage. The sound was crammed in because it just worked; this shouldn't be the standard.

@JarbasAl
Member

JarbasAl commented Nov 6, 2023

> First off, it can be anything that has to be processed before the command should be recorded (and it is pretty likely that future iterations will add functionality).
>
> This already (without waiting for cmd_ready) chips away at silence_seconds (and similar). I want these to be pure, understandable variables that are true to their usage. The sound was crammed in because it just worked; this shouldn't be the standard.

in that case cmd_ready should be set to True once STT has been handled and we start listening for ww again, not set by ovos-audio. you are supposed to be able to talk to your device even over audio playback, this is usually in the context of music but short sounds should be no different

I don't oppose the introduction of the variable per se, just the dependency on ovos-audio being the only thing that turns it back to True

@JarbasAl
Member

JarbasAl commented Nov 6, 2023

I wouldn't oppose config options to change this behaviour either (defaulting to False), for example to disable this during TTS execution. it's another case where i think we should be able to interrupt the voice assistant, but if it's configurable and there's a use case I'm all for more flexibility for downstream

@emphasize
Member Author

emphasize commented Nov 6, 2023

> in that case cmd_ready should be set to True once STT has been handled

I think you're mixing things up here.
We can set it to False (= wait before the command can be processed) after STT, but it is already toggled shortly afterwards (AFTER_CMD), actually directly after STT is processed.

Before cmd_ready is set to True, it loops caseless, dropping the queue chunks in the process, and the best indicator seems to be the response message from ovos-audio once the sound has been played. So far, audio processing is the only blocker.

@emphasize
Member Author

What are the suggestions around a config key?

@emphasize
Member Author

emphasize commented Nov 8, 2023

But generally, yes, suboptimal.

The destination check is also problematic, since with a get_response the direction is reversed.
I.e. theoretically usable, but not in practice.

@emphasize
Member Author

emphasize commented Nov 22, 2023

Rewrote the implementation:

  • the (sound) confirmation is now classified as ListeningState.CONFIRMATION, to make clear that this is a state (also handier when dealing with the ovos-audio response)
  • a confirmation event gets initialized in the hotword callback if a sound should be played
  • a 0.5 s timer sets the event (in case it has not been set by the ovos-audio response until then)
  • if in ListeningState.CONFIRMATION and the confirmation event is set, continue with ListeningState.BEFORE_CMD

If no sound is played, the loop continues directly with ListeningState.BEFORE_CMD. Continuous listening is also not affected by the changes.
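The flow listed above could be sketched roughly as follows. This is an illustration under assumptions, not the merged code: `ConfirmationLoopSketch`, its method names and the `fallback_seconds` parameter are hypothetical, the enum shows only the two states named in this comment, and the 0.5 is interpreted as seconds.

```python
import threading
from enum import Enum, auto


class ListeningState(Enum):
    # Only the two states named in this thread; the real enum has more.
    CONFIRMATION = auto()
    BEFORE_CMD = auto()


class ConfirmationLoopSketch:
    """Hypothetical stand-in for the voice loop (illustration only)."""

    def __init__(self, fallback_seconds: float = 0.5) -> None:
        self.state = None
        self.confirmation_event = threading.Event()
        self.fallback_seconds = fallback_seconds  # assumed to be seconds

    def on_hotword(self, play_sound: bool) -> None:
        if not play_sound:
            # No sound: continue directly with BEFORE_CMD.
            self.state = ListeningState.BEFORE_CMD
            return
        self.confirmation_event.clear()
        self.state = ListeningState.CONFIRMATION
        # Fallback: set the event even if ovos-audio never responds.
        timer = threading.Timer(self.fallback_seconds,
                                self.confirmation_event.set)
        timer.daemon = True
        timer.start()

    def on_audio_response(self) -> None:
        # ovos-audio reported that the sound has been played.
        self.confirmation_event.set()

    def step(self) -> None:
        # One loop iteration: leave CONFIRMATION once the event is set.
        if (self.state is ListeningState.CONFIRMATION
                and self.confirmation_event.is_set()):
            self.state = ListeningState.BEFORE_CMD
```

Because the timer sets the same event as the ovos-audio response, the loop waits at most `fallback_seconds` in the CONFIRMATION state, which addresses the "block forever" concern raised earlier in the thread.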

@JarbasAl JarbasAl added the enhancement New feature or request label Nov 23, 2023
@JarbasAl JarbasAl merged commit 782282c into OpenVoiceOS:dev Nov 23, 2023
9 checks passed
@github-actions github-actions bot mentioned this pull request Sep 2, 2024