Add input device for external STT via `ACTION_RECOGNIZE_SPEECH` #294

Stypox · 2025-02-26T23:31:40Z

I hate the ActivityForResultManager but I don't know if there is any other way to start an activity for result from an object that does not strictly belong to an activity.

Fixes part of #197 (the other part would be implementing yet another InputDevice with RecognitionService). This PR is anyway quite experimental as I didn't plan for the possibility to start the popup of another app to obtain input, but it seems to work.

References: #169 #293 woheller69/whisperIME#53 (comment). Note: with whisperIME all I get as speech is always just "you" independently of what I say, though I don't understand well how the whisperIME popup works, and I am not sure if my code is correct in the first place.

Please test this Dicio APK (app-debug.apk) along with this whisperIME APK: woheller69/whisperIME#53 (comment). Thank you @woheller69!

TODO:

- Add an explanation in the settings, along with suggestions on apps to use

Edit: I tested this with FUTO Voice Input and the code in this PR seems to behave correctly:

Screen_recording_20250227_005009.webm

Inhishonor · 2025-02-26T23:55:50Z

I downloaded the APK, and when using it with whisperIME it only outputs "you" every time. I also tested it with FUTO Voice Input and it worked well.
A little extra feedback:

- "System Popup" seems a little confusing to me as the name for an option; maybe something like "System STT" or "External STT"?
- If you cancel the STT after clicking on the STT button but before you select a provider, it pops up an error message with Error Code 0. Maybe it should just say something like what FUTO Voice Input outputs: "intent was canceled".
- Dicio is appearing as an option, and if you click on it, nothing happens. Is there a way to exclude Dicio as an option? I saw something within the pull request about that.

Thanks for implementing this!

woheller69 · 2025-02-27T05:24:18Z

Tested it. If you press and hold the button in whisper input while speaking, it works.

woheller69 · 2025-02-27T09:04:41Z

In an updated version I moved the button to the bottom, like in the input method. That makes it easier to use.

woheller69/whisperIME#53

woheller69 · 2025-02-27T16:15:03Z

When using an "external" voice input you should maybe not play the sound and display "listening". Otherwise this might interfere with the behavior of the external input. E.g. my whisper input does not use voice activity detection. In order to start it you have to press and hold the button, but your display already says "listening" before the user presses my mic button.
Instead of the sound I have a haptic feedback.

Stypox · 2025-02-28T12:02:31Z

Thank you for the feedback! New testing APK: https://github.com/Stypox/testing-apks/releases/download/14/app-debug.apk

> "System Popup" seems a little confusing to me as the name for an option; maybe something like "System STT" or "External STT"?

I wanted to distinguish it from "System service", which is the other way to access another app's STT and which would feel seamless because it would run in the background. So one would be called "popup" and the other "service". But I will add a description to the settings to explain everything better anyway.

> If you cancel the STT after clicking on the STT button but before you select a provider, it pops up an error message with Error Code 0. Maybe it should just say something like what FUTO Voice Input outputs: "intent was canceled".

Fixed, now it just switches back to the loaded-not-listening state (which is what would happen with Vosk too).

> Dicio is appearing as an option, and if you click on it, nothing happens. Is there a way to exclude Dicio as an option? I saw something within the pull request about that.

It turns out EXTRA_EXCLUDE_COMPONENTS only works on chooser intents, so I don't know if there is a way to exclude Dicio from choosing itself, unfortunately.

// Unfortunately the user could choose Dicio itself (starting SttPopupActivity), but there
// is no way to avoid this unless we use an Intent.createChooser() with
// `EXTRA_EXCLUDE_COMPONENTS`. A chooser, however, wouldn't allow the user to press
// "Always"/"Just once", and also wouldn't make it possible to check for availability like
// with .resolveIntent() (since the intent always resolves to the chooser).

> When using an "external" voice input you should maybe not play the sound and display "listening". Otherwise this might interfere with the behavior of the external input. E.g. my whisper input does not use voice activity detection. In order to start it you have to press and hold the button, but your display already says "listening" before the user presses my mic button.

Done, now it shows "Waiting...". It still plays the "no input" sound if the request is canceled though, which I think is ok.
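The result handling described in this comment can be sketched roughly as follows. This is a plain-Java model, not the code from this PR: the class `PopupSttSketch`, the state names and both methods are hypothetical. The only Android facts assumed are the numeric values of `Activity.RESULT_OK` (-1) and `Activity.RESULT_CANCELED` (0); the latter is presumably where the "Error Code 0" reported earlier came from.

```java
import java.util.Optional;

// Plain-Java model of the popup flow; no Android classes are used.
final class PopupSttSketch {
    // Same numeric values as android.app.Activity.RESULT_OK / RESULT_CANCELED.
    static final int RESULT_OK = -1;
    static final int RESULT_CANCELED = 0;

    // Hypothetical state names, loosely following the wording in this thread.
    enum State { LOADED_NOT_LISTENING, WAITING, ERROR }

    // New state plus the recognized text, if any.
    static final class Outcome {
        final State state;
        final Optional<String> text;

        Outcome(State state, Optional<String> text) {
            this.state = state;
            this.text = text;
        }
    }

    // Pressing the microphone button launches the external popup: show "Waiting...".
    static State onMicPressed(State current) {
        return current == State.LOADED_NOT_LISTENING ? State.WAITING : current;
    }

    // Called when the external activity finishes.
    static Outcome onPopupResult(int resultCode, String recognizedText) {
        if (resultCode == RESULT_OK) {
            return new Outcome(State.LOADED_NOT_LISTENING, Optional.ofNullable(recognizedText));
        }
        if (resultCode == RESULT_CANCELED) {
            // Cancelling is not an error: go back to idle with no text.
            return new Outcome(State.LOADED_NOT_LISTENING, Optional.empty());
        }
        // Any other code is treated as a failure in this sketch.
        return new Outcome(State.ERROR, Optional.empty());
    }

    public static void main(String[] args) {
        State afterPress = onMicPressed(State.LOADED_NOT_LISTENING);
        System.out.println(afterPress);
        Outcome canceled = onPopupResult(RESULT_CANCELED, null);
        System.out.println(canceled.state + " hasText=" + canceled.text.isPresent());
    }
}
```

Treating cancellation as a normal return to the idle state, rather than as an error, is the design choice described above; how the real input device represents these states may differ.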

woheller69 · 2025-02-28T12:12:40Z

> Done, now it shows "Waiting...". It still plays the "no input" sound if the request is canceled though, which I think is ok.

Much better

Stypox · 2025-02-28T16:53:25Z

Improved settings descriptions:

Inhishonor · 2025-02-28T17:17:00Z

Looks great! Thanks again!

paolo-caroni · 2025-03-02T21:49:01Z

I have tested it, seems to be functional, no bugs detected.

woheller69 · 2025-03-09T19:59:25Z

`putExtra(RecognizerIntent.EXTRA_LANGUAGE, locale)`

should probably be changed to

`putExtra(RecognizerIntent.EXTRA_LANGUAGE, locale.toString())`
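For context, here is a small Java sketch (not code from this PR) of the two string forms a `java.util.Locale` can produce. The Android documentation describes `RecognizerIntent.EXTRA_LANGUAGE` as an IETF BCP 47 language tag such as "en-US", so it may be worth checking which form the receiving app expects. The class and method names below are hypothetical; only `Locale.toString()` and `Locale.toLanguageTag()` are real API.

```java
import java.util.Locale;

final class LocaleStrings {
    // Underscore-separated form produced by Locale.toString().
    static String legacyForm(Locale locale) {
        return locale.toString();
    }

    // Hyphen-separated BCP 47 form produced by Locale.toLanguageTag().
    static String languageTag(Locale locale) {
        return locale.toLanguageTag();
    }

    public static void main(String[] args) {
        Locale locale = Locale.US;
        System.out.println(legacyForm(locale));   // en_US
        System.out.println(languageTag(locale));  // en-US
    }
}
```

For a language-only locale such as `Locale.ITALIAN` both methods return "it", so the difference only shows up once a region (or script) is set.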

This was referenced Feb 26, 2025

heads up : whisper is easier to run on android than ever #169

Open

Support external speech to text services #197

Open

Stypox added 5 commits February 28, 2025 12:25

Add SystemPopupInputDevice

430a572

Handle RESULT_CANCELED code in SystemPopupInputDevice

87c76e9

Fix not returing canceled result in SttPopupActivity

995ed9d

Add "waiting" state to STT

688b27c

The user could choose Dicio itself as STT but no way to avoid it

b0baa2b

Stypox force-pushed the system-popup-stt branch from 0e65023 to b0baa2b Compare February 28, 2025 12:02
