Speech recognition privacy issues and solutions #99
Labels: Discussion topic (Topic discussed at the workshop); User's Perspective (Machine Learning Experiences on the Web: A User's Perspective)
The talk "Wreck a Nice Beach in the Browser: Getting the Browser to Recognize Speech" by @kdavis-mozilla articulates the standardization struggle around the Web Speech API, with a focus on its speech recognition part.
My interpretation is that there are two broad categories of issues for this API in terms of speech recognition: API design issues and privacy issues.
Slide 10 summarizes the pros and cons of placing the speech recognition engine on the client vs. the server.
It seems the industry at large is still undecided whether the speech recognition engine should sit on the client or on the server. The Web Speech API spec reflects that compromise. While the API design issues are generally easier to resolve, the privacy issues with their regulatory dimension are multifaceted and complex.
Questions:
What if users could set a preference that allows web sites to use the speech recognition feature only when they can be confident their privacy is preserved? With advances in both DNN-based models and hardware accelerators for speech recognition embedded in modern clients, might a client-side engine be a pragmatic solution to the privacy issues?
How does a modern client-side engine perform in key UX metrics (latency, quality) in comparison to widely used server-based recognition solutions?
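To make the latency and quality comparison concrete, here is a minimal sketch of two metrics one could compute for either engine. It assumes quality is measured as word error rate (WER), a common choice that this issue does not prescribe, and that latency samples are collected by the caller (for example, the time between end of speech and the final result). The function names `wordErrorRate` and `medianLatencyMs` are hypothetical helpers, not part of the Web Speech API.

```typescript
// Hypothetical helpers for comparing recognition engines; not part of any spec.

// Word error rate: word-level edit distance (substitutions, insertions,
// deletions) divided by the number of words in the reference transcript.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.toLowerCase().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.toLowerCase().split(/\s+/).filter(Boolean);
  if (ref.length === 0) {
    throw new Error("reference transcript must contain at least one word");
  }
  // prev[j] holds the edit distance between the reference words processed
  // so far and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
      const deletion = prev[j] + 1;
      const insertion = curr[j - 1] + 1;
      curr.push(Math.min(substitution, deletion, insertion));
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}

// Median of caller-supplied latency samples, in milliseconds.
function medianLatencyMs(latenciesMs: number[]): number {
  if (latenciesMs.length === 0) {
    throw new Error("at least one latency sample is required");
  }
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

Running the same set of utterances through a client-side and a server-based engine and comparing these two numbers would give a first rough answer. Note that WER can exceed 1 when the hypothesis contains more words than the reference, as in the "recognize speech" vs. "wreck a nice beach" case from the talk title.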