Your support is very much appreciated! Star the project on GitHub.
Project.Demo.mp4
- About the Project
- Source Code Directory Structure
- Usage
- Getting Started
- Prerequisites
- Package Manager
- Environment Requirements
- Run for Development
- FAQ
- Future Work
- Community
- Contributors
- License
MA stands for My Assistant. The aim of this project is to develop a programmable voice assistant desktop application
that provides users with a highly customizable and extensible interface to interact with
their devices. Our goal is to provide a voice assistant that can be tailored to the needs and
preferences of individual users, and that is optimized for desktop and laptop computers.
To achieve this aim, we have set the following objectives:
- Develop a voice assistant that is highly customizable and extensible, allowing users to add their own commands and actions based on their needs and preferences.
- Optimize the voice assistant for desktop and laptop computers, providing users with a convenient and intuitive interface to interact with their devices.
- Provide users with a seamless and integrated experience by enabling the voice assistant to interact with other tools and applications on the desktop.
- Ensure the voice assistant is secure and respects user privacy by implementing robust data privacy and security measures.
Build a customizable virtual voice assistant that:
- provides users with more control and flexibility over its features and functionality rather than limiting them to preset options;
- allows users to define their own automation scenarios and workflows, so they can craft new commands tailored to their unique needs and preferences;
- still supports all the features of a traditional voice assistant, in addition to the extra customization options.
- Customization: Customize the assistant's behavior and capabilities to suit individual needs and workflows.
- Flexibility: Design complex automation scenarios and workflows beyond predefined actions.
- Extensibility: Integrate with external services and APIs to enhance functionality.
- Open-source Community: Benefit from community-contributed resources for expanded capabilities.
- Privacy and Security: Host locally for data control and end-to-end encryption.
- Learning and Exploration: Gain insights into AI and voice-based interaction systems through hands-on experience.
The idea of voice assistants has been around for decades; the first voice recognition system was introduced in the 1950s by Bell Laboratories. However, it was not until the late 1990s that voice assistants began to gain popularity, with the introduction of software such as IBM ViaVoice and Dragon NaturallySpeaking. These early systems were limited by their inability to recognize natural language, their high cost, and the need for specialized hardware. In recent years, voice assistants have become increasingly prevalent due to the widespread adoption of smartphones and the emergence of smart speakers. Apple's Siri, Amazon's Alexa, and Google Assistant are some of the most popular voice assistants today. They allow users to interact with their devices using natural language and perform tasks such as setting reminders, playing music, and controlling smart home devices.
- Traditional voice assistants provide a predefined set of automation options that are often limited and generic in nature. These assistants offer a restricted range of actions or tasks that can be performed, limiting their usefulness in addressing diverse user needs. Users are confined to the predefined set of commands and actions, without the ability to tailor or expand the assistant's capabilities to match their specific requirements.
- Another drawback of traditional voice assistants is the lack of customization options. Users have limited control over modifying or enhancing the assistant's features to align with their preferences and unique needs. The inability to personalize or customize the assistant's behavior hinders its ability to adapt to individual users' workflows or specific requirements, limiting its overall utility.
Our voice assistant addresses the customization limitations of traditional voice assistants by providing users with extensive customization and personalization options. The key features of our solution include:
- Our voice assistant empowers users to create their own automation scenarios and complex workflows, tailored to their specific needs. Users have the flexibility to define custom commands and actions, enabling them to automate repetitive tasks and streamline their workflows effectively (a conceptual sketch follows this list).
- We offer an intuitive and user-friendly interface that simplifies the process of creating custom commands. Users can easily set up simple phrases or triggers that activate the desired automation, without the need for advanced technical knowledge.
- To further enhance customization options, our voice assistant includes a comprehensive Commands library. Users can access a collection of pre-built automation commands created by both other users and our core team. This allows users to reuse existing commands, leverage community-contributed automations, and easily expand the capabilities of their voice assistant.
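To make the customization idea concrete, here is a conceptual sketch in TypeScript of how a user-defined automation with a spoken trigger and a sequence of actions could be modelled. The names below are purely illustrative and are not part of the project's code base:

```ts
// Conceptual sketch only: how a user-defined automation could be expressed.
// The identifiers below are illustrative, not the project's real API.
type Action = () => Promise<void>;

interface Automation {
  trigger: string; // phrase the user says
  steps: Action[]; // one or more actions run in order
}

const morningRoutine: Automation = {
  trigger: 'start my morning routine',
  steps: [
    async () => { /* open the calendar app */ },
    async () => { /* read out today's schedule */ },
  ],
};

// Running an automation is just executing its steps in sequence.
async function run(automation: Automation): Promise<void> {
  for (const step of automation.steps) {
    await step();
  }
}
```

Chaining steps like this is what "complex workflows" refers to above: each step can reuse an action the assistant already knows, so workflows compose out of the same building blocks as single commands.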
- Account Creation and Login:
- Users can create an account securely to access personalized features, command management, and interaction with the application.
- The system allows users to log in with their credentials, maintaining user authentication throughout the session.
- Create New Commands:
- Users can create new custom commands by supplying metadata and uploading the required files (see the interface sketch after this list).
- Metadata information includes command name, description, parameters, patterns, script, script type, dependency file, and command icon.
- Uploaded files (script, dependency, icon) are validated, saved, and linked to the command.
- Edit Existing Commands:
- Users can edit existing commands by modifying their metadata or uploading new files.
- The system updates the command accordingly, including retraining the user model or regenerating the executable.
- Delete Commands:
- Users can easily delete their commands, and the system handles necessary cleanup tasks, such as removing the executable file and updating the user model.
- Command Approval Workflow:
- Users can submit their commands for approval by an admin to make them available in the marketplace.
- The admin reviews and approves/rejects the command, updating its visibility accordingly.
- Users receive appropriate feedback on the approval status via the notifications service.
- My Command Table:
- Users can view a table displaying all commands they own, with options to edit and delete each command.
- The table is visually organized and user-friendly, supporting sorting and filtering options.
- Marketplace Command Installation:
- Users can seamlessly install commands from the marketplace.
- The command is added to the user's installed commands list, and the corresponding executable file is downloaded.
- The system handles all necessary tasks, such as updating the user model and installing dependencies.
- Users receive appropriate feedback on the installation status via the notifications service.
- Uninstall Installed Commands:
- Users can uninstall commands they no longer need, and the system handles confirmation and cleanup tasks.
- The command is removed from the user's installed commands list, and the corresponding executable file is deleted.
- The system handles all necessary tasks, such as updating the user model and removing dependencies.
- Users receive appropriate feedback on the uninstallation status via the notifications service.
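The metadata fields and the approval workflow listed above translate naturally into a data model. The following TypeScript interface is only an illustrative sketch based on this list; the real model in the code base may use different names or types:

```ts
// Illustrative sketch of the command metadata described above; field names
// are assumptions drawn from the list, not the project's actual schema.
type ApprovalStatus = 'draft' | 'pending' | 'approved' | 'rejected';

interface CommandMetadata {
  name: string;             // command name shown in the My Commands table
  description: string;      // what the command does
  parameters: string[];     // parameters the command's script accepts
  patterns: string[];       // spoken phrases that trigger the command
  script: string;           // uploaded script file
  scriptType: string;       // e.g. "python" or "shell"
  dependencyFile?: string;  // optional dependency file
  icon?: string;            // uploaded command icon
  approval: ApprovalStatus; // state in the marketplace approval workflow
}
```

The `approval` field mirrors the Command Approval Workflow above: a command starts as a draft, is submitted for admin review, and becomes visible in the marketplace only once approved.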
- User Interaction: Users interact with the voice assistant through a desktop app with a user-friendly interface.
- Voice Input: Users can record voice commands using the app's microphone feature or enter text commands if they prefer.
- Speech-to-Text (STT) Conversion: The recorded voice commands are sent to the Speech-to-Text engine, which converts the audio input into text.
- Natural Language Processing (NLP): The text input is processed by the Natural Language Processing (NLP) module, powered by the Rasa framework. The NLP module extracts intent and entities from the user's input, understanding the user's request.
- Command Mapping: The NLP module maps the user's intent to specific commands available in the system, determining the appropriate action to be taken.
- Command Execution: Based on the command mapping, the system executes the corresponding action or task, such as opening an application, performing a specific operation, or retrieving information.
- Text-to-Speech (TTS) Conversion: Upon completing the requested task, the response is sent to the Text-to-Speech engine, converting the text into an audible response.
- Response Playback: The voice assistant plays back the response to the user, providing real-time feedback on the executed action.
- Customization and Personalization: The voice assistant stands out by allowing users to create, edit, and manage their own commands, adding a high level of customization and personalization to the user experience.
- Integration with Marketplace: The app features a marketplace where users can browse and install commands created by others, extending the assistant's capabilities through community-contributed resources.
- Approval Workflow: Users can submit their custom commands for admin approval. The admin reviews and approves or rejects the command, updating its visibility accordingly.
- Data Security and Privacy: The voice assistant prioritizes data security and privacy. The application is self-hosted, ensuring user data remains on the user's device, and end-to-end encryption is applied for secure interactions.
Architecture:
- The core components of the system are the Desktop App, which serves as the user-facing interface, and the API, which acts as the central component handling communication between various components and external services. The NLP Manager is responsible for natural language processing, while the Executable Builder generates executable files for the commands. The system also integrates with Google's Speech-to-Text and Text-to-Speech APIs for voice-based interactions.
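Putting the steps above together, the interaction loop can be sketched as a single asynchronous function. The wrapper functions below (`speechToText`, `parseWithRasa`, `executeCommand`, `textToSpeech`) are hypothetical stand-ins for the STT engine, the Rasa NLP module, the command executor, and the TTS engine; they are not the project's real APIs:

```ts
// Simplified sketch of the voice interaction loop, assuming hypothetical
// wrappers around the STT, NLP, command-execution, and TTS components.
interface NlpResult {
  intent: string;
  entities: Record<string, string>;
}

declare function speechToText(audio: Buffer): Promise<string>;
declare function parseWithRasa(text: string): Promise<NlpResult>;
declare function executeCommand(intent: string, entities: Record<string, string>): Promise<string>;
declare function textToSpeech(text: string): Promise<Buffer>;

async function handleVoiceInput(audio: Buffer): Promise<Buffer> {
  const text = await speechToText(audio);                   // STT conversion
  const { intent, entities } = await parseWithRasa(text);   // NLP: extract intent and entities
  const response = await executeCommand(intent, entities);  // map intent to a command and run it
  return textToSpeech(response);                            // TTS response played back to the user
}
```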
The tools used in this project.
Tool | Description
---|---
ElectronJS | A framework for building cross-platform desktop applications using web technologies.
Angular | Platform for building dynamic web applications.
SASS | CSS preprocessor for creating scalable and maintainable styles.
PrimeNG | UI component library to enhance the visual and interactive aspects of the application.
Angular Material | UI component library that follows Google's Material Design guidelines.
Rasa Framework | Framework for natural language processing to understand user commands and interactions.
YAML | YAML library used for automating the training process.
Webhook | A way for two applications to communicate with each other by sending HTTP requests.
Multitenancy | A way to allow multiple users to share the same application without interfering with each other.
Model Linguistic Feature | A feature used to represent the linguistic content of a text.
Google Calendar API | API for interacting with Google Calendar to schedule events.
Google Cloud | Cloud platform used for hosting and deploying the application.
Redis | In-memory data store used for caching and performance optimization.
Django | Web framework used for the backend server and database management.
Async Channels | A library for creating asynchronous communication channels in Django.
PostgreSQL | Relational database management system used for data storage.
Daphne | ASGI server used to deploy Django applications.
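As a rough illustration of how the Rasa and YAML entries above fit together, the snippet below generates a Rasa-style NLU training fragment from a command's trigger patterns using the js-yaml package. This is a sketch of the general idea, not the project's actual training automation:

```ts
// Sketch (not the project's NLP Manager code): turn a command's trigger
// patterns into a Rasa-style NLU training snippet using js-yaml.
import * as yaml from 'js-yaml';

function toNluSnippet(commandName: string, patterns: string[]): string {
  // Rasa expects examples as a block of "- phrase" lines.
  const examples = patterns.map((p) => `- ${p}`).join('\n');
  return yaml.dump({
    version: '3.1',
    nlu: [{ intent: commandName, examples }],
  });
}

// Example: a command with two spoken trigger phrases.
console.log(toNluSnippet('open_daily_report', [
  'open my daily report',
  "show today's report",
]));
```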
A quick look at the top-level files and directories:
.
├── electronApp
│   ├── build
│   ├── CommandManger
│   ├── DB
│   │   ├── models
│   │   └── queries
│   ├── scriptRunner
│   ├── stt
│   ├── textToScript
│   │   └── models
│   ├── tray
│   └── tts
├── src
│   ├── app
│   │   ├── auth
│   │   │   ├── _helper
│   │   │   ├── interface
│   │   │   ├── pipes
│   │   │   │   └── only-one-error
│   │   │   ├── register-component
│   │   │   ├── services
│   │   │   │   ├── auth-service
│   │   │   │   └── not-match-validation
│   │   │   └── user-card
│   │   ├── core
│   │   │   └── services
│   │   │       ├── electron
│   │   │       └── notification
│   │   ├── recorder
│   │   │   ├── components
│   │   │   │   ├── audio-visualizer
│   │   │   │   ├── chat
│   │   │   │   ├── home-page
│   │   │   │   └── mic
│   │   │   └── services
│   │   │       ├── rasa
│   │   │       │   └── rasa.socket
│   │   │       ├── stt
│   │   │       └── tts
│   │   ├── scripts-table
│   │   │   ├── components
│   │   │   │   ├── command-management
│   │   │   │   │   ├── abstract-commands
│   │   │   │   │   ├── create-command-form
│   │   │   │   │   │   └── parameter-field
│   │   │   │   │   ├── edit-command-form
│   │   │   │   │   ├── installed-commands
│   │   │   │   │   │   └── installed-commands-service
│   │   │   │   │   └── my-commands
│   │   │   │   │       └── my-command-service
│   │   │   │   └── marketplace-component
│   │   │   │       ├── card-preview
│   │   │   │       └── command-card
│   │   │   ├── interfaces
│   │   │   └── services
│   │   ├── shared
│   │   │   ├── components
│   │   │   │   ├── google-token
│   │   │   │   ├── loader
│   │   │   │   ├── modal
│   │   │   │   ├── notifications
│   │   │   │   │   ├── interfaces
│   │   │   │   │   ├── notification-card
│   │   │   │   │   └── notification-list
│   │   │   │   └── sidebar
│   │   │   ├── directives
│   │   │   │   └── webview
│   │   │   └── snackbar-service
│   │   └── tray
│   ├── assets
│   │   ├── fonts
│   │   ├── i18n
│   │   └── icons
│   └── environments
├── stt
│   └── temp
└── test-files
- electronApp: contains all the source code for the Electron app.
- src: contains all the source code for the Angular app.
  - app: contains all the source code for the app.
    - auth: contains all the source code for the authentication module.
    - core: contains all the source code for the core module.
    - recorder: contains all the source code for the recorder module and its components (audio-visualizer, chat, home-page, mic).
    - scripts-table: contains all the source code for the scripts-table module and its components (command-management, marketplace-component).
    - shared: contains all the source code for the shared module and its components (google-token, loader, modal, notifications, sidebar).
    - tray: contains all the source code for the tray module.
  - assets: contains all the assets (e.g. images, fonts, icons).
  - environments: contains all the environment variables.
- stt: contains all the source code for the speech-to-text module.
- test-files: contains all the test files.
- Install the voice assistant application on your desktop or device.
- Launch the application and create a new account or log in securely with your credentials.
- Customize your voice assistant by creating new commands. Provide metadata such as name, description, and patterns for each command.
This project uses npm as a package manager:
npm install npm@latest -g
Angular CLI: Install the Angular Command Line Interface (CLI) globally on your system. You can do this by running the following command in your terminal or command prompt:
npm install -g @angular/cli
Electron: Install ElectronJS, which is used for building cross-platform desktop applications. You can install it globally via npm:
npm install -g electron
PrimeNG: Install PrimeNG, which is a collection of rich UI components for Angular. You can install it via npm:
npm install primeng --save
npm install primeicons --save
Configuration: Before running the application, you may need to configure some settings. Please refer to the configuration files or documentation provided with the project.
Running the App: Now that you've installed the necessary dependencies and configured the project, you can run the app using the Angular CLI:
ng serve
Congratulations! You have successfully set up the project and can now explore and interact with the customizable voice assistant application. Happy coding!
_To run this project, you will need the following:_
- A Google Cloud bucket with the following files:
  - google-credentials.json
  - google-token.json
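How these files are consumed is specific to this project, but a common pattern with Google's Speech-to-Text is to hand the client library a service-account key. The sketch below assumes google-credentials.json is such a key and uses the official @google-cloud/speech client; treat it as an illustration rather than the project's actual integration code:

```ts
// Assumption: google-credentials.json is a Google service-account key used
// for the Speech-to-Text API; adjust the path and config to your setup.
import { SpeechClient } from '@google-cloud/speech';

const client = new SpeechClient({ keyFilename: 'google-credentials.json' });

async function transcribe(audioContent: string): Promise<string> {
  const [response] = await client.recognize({
    audio: { content: audioContent }, // base64-encoded audio
    config: { languageCode: 'en-US', encoding: 'LINEAR16', sampleRateHertz: 16000 },
  });
  return (response.results ?? [])
    .map((r) => r.alternatives?.[0]?.transcript ?? '')
    .join(' ');
}
```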
- Clone the repository
git clone https://github.com/your-username/your-repo.git
- Install dependencies
npm install
- Run the app
npm run start
- Add Integration with Third-Party Services: Automate interactions with online platforms using voice commands.
- Expanded Language Support: Allow users to write custom commands in their preferred programming languages.
- Workflow and Visualization: Create workflows with multiple commands and intuitive visualization.
- Multilingual Support: Include Arabic and more languages for broader accessibility.
- Integration with LLM Models: Improve natural language understanding with LLM models.
The MA community can be found on:
where you can ask questions, suggest new ideas, and get support.
- Mohamed Zaky
- Ahmad Eid
- Ahmad Bedeir
Licensed under the GPL-v3 License.