Add /api/speech endpoint #680

Merged
merged 8 commits into from
Apr 15, 2024

Conversation

Collaborator

@cecheta cecheta commented Apr 15, 2024

Closes #501, required by #101

Purpose

This PR adds a new /api/speech endpoint, which uses the Speech key to obtain a short-lived token that the frontend can use to call the Speech service.
The frontend calls the endpoint every time the microphone button is pressed.
When using RBAC, the Speech key is fetched directly from Azure, as it is not present in any environment variable.

https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/js/browser/README.md#token-exchange-process
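The token exchange described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names (`issue_token_url`, `resolve_speech_key`, `fetch_speech_token`) and the injectable `opener` and `fetch_key` parameters are hypothetical, and only the regional `issueToken` URL and the `Ocp-Apim-Subscription-Key` header come from the Speech service's documented token exchange.

```python
import urllib.request
from typing import Callable, Optional


def issue_token_url(region: str) -> str:
    """Build the regional token endpoint URL for the Speech service."""
    return f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"


def resolve_speech_key(env_key: Optional[str], fetch_key: Callable[[], str]) -> str:
    """Prefer the key from configuration; otherwise fetch it (the RBAC case).

    `fetch_key` is a hypothetical callback standing in for however the
    backend retrieves the key from Azure when no env var is set.
    """
    return env_key if env_key else fetch_key()


def fetch_speech_token(
    speech_key: str,
    region: str,
    opener=urllib.request.urlopen,  # injectable so the call can be faked in tests
    timeout: float = 5.0,
) -> str:
    """Exchange the Speech key for a short-lived token (returned as plain text)."""
    request = urllib.request.Request(
        issue_token_url(region),
        data=b"",  # empty body; the token endpoint expects a POST
        headers={"Ocp-Apim-Subscription-Key": speech_key},
        method="POST",
    )
    with opener(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Only the resulting token (plus the region) would need to be handed to the frontend, which is what lets the key stay on the server.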

The PR also refactors some of the tests and mocking in test_app.py, so that env_helper is mocked in one place.

Does this introduce a breaking change?

[ ] Yes
[x] No

Pull Request Type

What kind of change does this Pull Request introduce?

[ ] Bugfix
[x] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

How to Test

  • Deploy the application twice: once with keys and once with RBAC

What to Check

Verify the following:

  • Speech-to-text works in both deployments

github-actions bot commented Apr 15, 2024

Coverage

Coverage Report
File                                    Stmts  Miss  Cover  Missing
code
   create_app.py                          148     3    97%  199, 204, 327
code/backend/batch/utilities/helpers
   EnvHelper.py                           114     9    92%  188, 193–194, 197–199, 206–208
TOTAL                                    1724   833    51%

Tests Skipped Failures Errors Time
72 0 💤 0 ❌ 0 🔥 10.305s ⏱️

infra/main.bicep (outdated, resolved)
@ross-p-smith
Collaborator

I think you've just made the same comment - but maybe some co-ordination with #660. It feels like maybe this goes in first and shows how to get config? What are your thoughts?

@cecheta
Collaborator Author

cecheta commented Apr 15, 2024

> I think you've just made the same comment - but maybe some co-ordination with #660. It feels like maybe this goes in first and shows how to get config? What are your thoughts?

Actually, this PR removes the ability to fetch config from the /api/config endpoint, but now I'm thinking it would actually be better to leave that in, so it can be used in #660

Will update

@superhindupur
Collaborator

@cecheta so we're on the same page:

  • I'm now convinced that injecting the env variable at build time into the frontend TypeScript app in #660 (Make the speech recognizer's supported languages configurable. Required by #317) is overly complicating the build-time dependencies.
  • I agree that /api/config is a good place to add the speech languages
  • Should I go ahead and remove the azureSpeechKey from the /api/config endpoint in my PR so that we don't return the key from there?

@cecheta
Collaborator Author

cecheta commented Apr 15, 2024

> @cecheta so we're on the same page:
>
>   • I'm now convinced that injecting the env variable at build time into the frontend TypeScript app in #660 is overly complicating the build-time dependencies.
>   • I agree that /api/config is a good place to add the speech languages
>   • Should I go ahead and remove the azureSpeechKey from the /api/config endpoint in my PR so that we don't return the key from there?

Sounds good. I was going to try to remove the secret from the config response in #683, but I don't mind if you remove it in yours, since I haven't started yet.

infra/main.bicep (outdated, resolved)
@superhindupur
Collaborator

@cecheta and I chatted offline. The conclusion was that we will rename /api/config to /api/speech and return the token from there. Then I will add the languages to it under #660.
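To make the agreed shape concrete, here is a rough sketch of what a combined /api/speech response could look like once the languages from #660 are added. Everything in it is an assumption for illustration: the `build_speech_response` helper, the `token`, `region` and `languages` field names, and the comma-separated language setting are hypothetical and not taken from either PR.

```python
from typing import Dict, List, Union


def build_speech_response(
    token: str,
    region: str,
    languages: str = "",
) -> Dict[str, Union[str, List[str]]]:
    """Assemble a hypothetical /api/speech payload.

    The payload carries the short-lived token rather than the Speech key,
    so the key never leaves the backend. `languages` is assumed to be a
    comma-separated setting such as "en-US,fr-FR".
    """
    parsed = [lang.strip() for lang in languages.split(",") if lang.strip()]
    return {"token": token, "region": region, "languages": parsed}
```

For example, `build_speech_response("abc", "uksouth", "en-US, fr-FR")` yields a dictionary with the token, the region, and a two-item language list.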

@cecheta cecheta changed the title from "Add /api/speech/token endpoint" to "Add /api/speech endpoint" Apr 15, 2024
adamdougal
adamdougal previously approved these changes Apr 15, 2024
Collaborator

@adamdougal adamdougal left a comment

noice

ross-p-smith
ross-p-smith previously approved these changes Apr 15, 2024
Collaborator

@ross-p-smith ross-p-smith left a comment

😎

@cecheta cecheta dismissed stale reviews from ross-p-smith and adamdougal via a8e5dd5 April 15, 2024 14:50
superhindupur
superhindupur previously approved these changes Apr 15, 2024
@cecheta cecheta added this pull request to the merge queue Apr 15, 2024
Merged via the queue into main with commit 92f9557 Apr 15, 2024
5 checks passed
@cecheta cecheta deleted the cecheta/speech-token branch April 15, 2024 14:58
eduardogch pushed a commit to devopsdale/chat-with-your-data-solution-accelerator that referenced this pull request Apr 30, 2024
* Add `/api/speech/token` endpoint

* Re-introduce `/api/config`

* Move env vars

* Remove `AZURE_SUBSCRIPTION_ID`

* Log error

* Rename to `/api/speech`, remove `/api/config`
Development

Successfully merging this pull request may close these issues.

Speech from UI does not work when using RBAC
4 participants