Prompt based GUI and terminal automation #3463
Replies: 5 comments
-
For such an assistant to become usable, it must be both real-time and always-learning. Real-time means it listens to the user, monitors feedback from the computer, and reacts accordingly. Always-learning means it watches and imitates the user's actions, asks the user questions, and searches online to learn more.
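To make the two requirements concrete, here is a minimal sketch of such a loop, not an implementation from any existing project. The names `run_agent`, `decide`, and `act` are hypothetical, and "learning" is reduced to keeping a log that the decision function can consult.

```python
def run_agent(events, decide, act):
    """Process (source, payload) events in order and return the interaction log.

    events: iterable of (source, payload) pairs, e.g. ("user", "open terminal")
            or ("computer", "terminal opened"). Hypothetical event format.
    decide: callable (source, payload, log) -> action or None.
    act:    callable (action) -> feedback from the computer.
    """
    log = []
    for source, payload in events:
        # Real-time part: react to each event as it arrives.
        action = decide(source, payload, log)
        feedback = act(action) if action is not None else None
        # Always-learning part (assumed): record everything so that
        # later decisions can draw on past events and outcomes.
        log.append((source, payload, action, feedback))
    return log
```

A real assistant would replace `events` with live microphone, screen, and terminal streams, and `decide` with a model that actually learns from the log.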
-
Indeed, that would be very cool. Efforts like this already exist, e.g. https://robotme.org/. We probably won't get into this in the first version, but for subsequent versions it is definitely on the table!
-
Sounds a bit like speech recognition software (e.g. Dragon NaturallySpeaking) that can perform specific actions like clicking somewhere, opening programs, or dictating text, combined with intent recognition like that of current voice assistants (Alexa, Siri, ...), but more flexible in what it can understand. The end product could be an app that runs in the background on your PC or smartphone and that you can talk to: ask it any question and command it to do things on the computer for you. Here is my research on "Linux Voice Interface": https://pad.nixnet.services/d1W89tL8Qj6-65-UJcp5SA?view Especially check out Almond, aka Genie, from Stanford. Maybe you can collaborate with them to create an open-source, privacy-preserving voice assistant. Integration with Home Assistant would also be great.
-
This has now been partly implemented, and as part of my ideology, the project Cybergod has been released. Here's the program in action: cybergod_with_background.mp4. If anyone is interested in Cybergod, please join the official Discord group.
-
I developed a terminal interaction environment for agents that converts all information from the terminal into meaningful text, including cursor and styling information. The terminal environment can also be captured as an image with the cursor marked in red. OpenDevin is working on this right now.
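As a rough illustration of the text conversion described above, here is a minimal sketch under my own assumptions, not the actual code of this environment. The `Cell` type, the `<cursor>` and `<b>` markers, and the helper names are all hypothetical; a real version would read cells from a terminal emulator and cover more styles than bold.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One character cell of a terminal screen (hypothetical model)."""
    char: str = " "
    bold: bool = False


def make_row(text, bold=()):
    """Build a row of cells from text; `bold` holds the bold column indices."""
    bold_cols = set(bold)
    return [Cell(ch, i in bold_cols) for i, ch in enumerate(text)]


def render_screen(rows, cursor_row, cursor_col):
    """Serialize the screen to text, tagging bold runs and the cursor position."""
    out = []
    for r, row in enumerate(rows):
        parts = []
        bold_open = False
        for c, cell in enumerate(row):
            # Open or close a bold run whenever the style changes.
            if cell.bold and not bold_open:
                parts.append("<b>")
                bold_open = True
            elif not cell.bold and bold_open:
                parts.append("</b>")
                bold_open = False
            # The marker goes directly before the cell the cursor sits on.
            if r == cursor_row and c == cursor_col:
                parts.append("<cursor>")
            parts.append(cell.char)
        if bold_open:
            parts.append("</b>")
        # A cursor past the last cell (e.g. after a typed command) is appended.
        if r == cursor_row and cursor_col >= len(row):
            parts.append("<cursor>")
        out.append("".join(parts).rstrip())
    return "\n".join(out)
```

For example, `render_screen([make_row("$ ls")], 0, 4)` returns `"$ ls<cursor>"`, which gives a text-only agent the cursor position without needing a screenshot.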
-
I have always wanted to make a bot that executes GUI and terminal tasks like a human, such as "check and clean up disks", "make a funny video and upload it to YouTube", "edit and test this bash script until it is bug-free", or "talk to people on Twitter and post ads".
Of course, these tasks can be done by domain-specific software, but since ChatGPT shows promising capabilities and Open-Assistant is working toward the same, I wonder whether Open-Assistant could target human-level computer operation and become a real killer assistant.