docs: release update #283

Merged · 8 commits · Oct 22, 2024
96 changes: 34 additions & 62 deletions README.md
@@ -1,10 +1,9 @@
# RAI

> [!IMPORTANT]
> **RAI is in beta phase now, expect friction. Early contributors are the most welcome!** \
> **RAI is developing fast towards a glorious release in time for ROSCon 2024.**
> **RAI is meant for R&D. Make sure to understand its limitations.**

RAI is a flexible AI agent framework to develop and deploy Gen AI features for your robots.
RAI is a flexible AI agent framework to develop and deploy Embodied AI features for your robots.

---

@@ -38,38 +37,43 @@ The RAI framework aims to:
- Supply a general multi-agent system, bringing Gen AI features to your robots.
- Add human interactivity, flexibility in problem-solving, and out-of-box AI features to existing robot stacks.
- Provide first-class support for multi-modalities, enabling interaction with various data types.
- Incorporate an advanced database for persistent agent memory.
- Include ROS 2-oriented tooling for agents.
- Support a comprehensive task/mission orchestrator.

## Limitations

- Limitations of LLMs and VLMs in use apply: poor spatial reasoning, hallucinations, jailbreaks, latencies, costs, ...
- Resource use (memory, CPU) is not addressed yet.
- Requires connectivity and/or an edge platform.

## Table of Contents

- [Features](#features)
- [Setup](#setup)
- [Usage examples (demos)](#planned-demos)
- [Usage examples (demos)](#simulation-demos)
- [Developer resources](#developer-resources)
- [ROSCon 2024 Talk](#roscon-2024)

## Features

- [x] Voice interaction (both ways).
- [x] Customizable robot identity, including constitution (ethical code) and documentation (understanding own capabilities).
- [x] Accessing camera ("What do you see?") sensor, utilizing VLMs.
- [x] Reasoning about its own state through ROS logs.
- [x] Accessing camera ("What do you see?"), utilizing VLMs.
- [x] Summarizing own state through ROS logs.
- [x] ROS 2 action calling and other interfaces. The Agent can dynamically list interfaces, check their message type, and publish.
- [x] Integration with LangChain to abstract vendors and access convenient AI tools.
- [x] Tasks in natural language to nav2 goals.
- [x] NoMaD integration.
- [x] [NoMaD](https://general-navigation-models.github.io/nomad/) integration.
- [x] Tracing.
- [ ] Grounded SAM 2 integration.
- [ ] Improved Human-Robot Interaction with voice and text.
- [x] Grounded SAM 2 integration.
- [x] Improved Human-Robot Interaction with voice and text.
- [x] Additional tooling such as GroundingDino.
- [x] Support for at least 3 different AI vendors.
- [ ] SDK for RAI developers.
- [ ] Support for at least 3 different AI vendors.
- [ ] Additional tooling such as GroundingDino.
- [ ] UI for configuration to select features and tools relevant for your deployment.

## Setup

Before going further, make sure you have ROS 2 (Jazzy or Humble) installed and sourced on your system.
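For reference, sourcing a binary ROS 2 installation typically looks like this (paths assume a standard apt install; swap in your distro name):

```bash
# Source the ROS 2 underlay (humble shown; use jazzy on Ubuntu 24.04)
source /opt/ros/humble/setup.bash

# Confirm the environment is active
echo $ROS_DISTRO
```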

### 1. Setting up the workspace:

#### 1.1 Install poetry
@@ -96,6 +100,13 @@ poetry install
rosdep install --from-paths src --ignore-src -r -y
```

> [!TIP]
> If you want to use features such as Grounded SAM 2 or NoMaD install additional dependencies:
>
> ```bash
> poetry install --with openset,nomad
> ```

### 2. Build the project:

#### 2.1 Build RAI workspace
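
The collapsed lines below hold the repository's actual build commands; as a rough sketch, a colcon-based workspace build usually amounts to the following (illustrative, not necessarily the exact flags used here):

```bash
# Build every package in the workspace (typical colcon invocation)
colcon build --symlink-install

# Overlay the freshly built packages onto the current shell
source install/setup.bash
```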
@@ -123,37 +134,6 @@ Pick your local solution or service provider and follow one of these guides:
- **[OpenAI](https://platform.openai.com/docs/quickstart)**
- **[AWS Bedrock](https://console.aws.amazon.com/bedrock/home?#/overview)**

## Running RAI

You are now ready to run RAI!

![rosbot-xl-example](./docs/imgs/rosbot-xl-example.gif)

You can start by running the following examples:

1. **Hello RAI:** Interact directly with your ROS 2 environment through an intuitive Streamlit chat interface.
2. **O3DE Husarion ROSbot XL demo"** give tasks to a simulated robot using natural language.

### Hello RAI

Chat seamlessly with your ROS 2 environment, retrieve images from cameras, adjust parameters, and get information about your ROS interfaces.

```bash
streamlit run src/rai_hmi/rai_hmi/text_hmi.py
```

Remember to run this command in a sourced shell.

### O3DE Rosbot XL Demo

This demo provides a practical way to interact with and control a virtual Husarion ROSbot XL within a simulated environment.
Using natural language commands, you can assign tasks to the robot, allowing it to perform a variety of actions.

Given that this is a beta release, consider this demo as an opportunity to explore the framework's capabilities, provide feedback, and contribute.
Try different commands, see how the robot responds, and use this experience to understand the potential and limitations of the system.

Follow this guide: [husarion-rosbot-xl-demo](docs/demos.md)

## What's next?

Once you know your way around RAI, try the following challenges, with the aid of the [developer guide](docs/developer_guide.md):
@@ -162,15 +142,16 @@ Once you know your way around RAI, try the following challenges, with the aid th
- Implement additional tools and use them in your interaction.
- Try a complex, multi-step task for your robot, such as going to several points to perform observations!

Soon you will have an opportunity to work with new RAI demos across several domains.

### Planned demos
### Simulation demos

| Application | Robot | Description | Link |
| ------------------------------------------ | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- |
| Mission and obstacle reasoning in orchards | Autonomous tractor | In a beautiful scene of a virtual orchard, RAI goes beyond obstacle detection to analyze best course of action for a given unexpected situation. | [🌾 demo](https://github.com/RobotecAI/rai-agriculture-demo) |
| Manipulation tasks with natural language | Robot Arm (Franka Panda) | Complete flexible manipulation tasks thanks to RAI and Grounded SAM 2 | [🦾 demo](https://github.com/RobotecAI/rai-manipulation-demo) |
| Quadruped inspection demo | A robot dog (ANYbotics ANYmal) | Perform inspection in a warehouse environment, find and report anomalies | link TBD |
Try RAI yourself with these demos:

| Application | Robot | Description | Demo Link | Docs Link |
| ------------------------------------------ | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- | -------------------------------- |
| Mission and obstacle reasoning in orchards | Autonomous tractor | In a beautiful scene of a virtual orchard, RAI goes beyond obstacle detection to analyze the best course of action for a given unexpected situation. | [🌾 demo](https://github.com/RobotecAI/rai-agriculture-demo) | [📚](docs/demos/agriculture.md) |
| Manipulation tasks with natural language | Robot Arm (Franka Panda) | Complete flexible manipulation tasks thanks to RAI and Grounded SAM 2 | [🦾 demo](https://github.com/RobotecAI/rai-manipulation-demo) | [📚](docs/demos/manipulation.md) |
| Autonomous mobile robot demo | Husarion ROSbot XL | Demonstrate RAI's interaction with an autonomous mobile robot platform for navigation and control | [🤖 demo](https://github.com/RobotecAI/rai-rosbot-xl-demo) | [📚](docs/demos/rosbot_xl.md) |
| Turtlebot demo | Turtlebot | Showcase RAI's capabilities with the popular Turtlebot platform | [🐢 demo](docs/demos/turtlebot.md) | [📚](docs/demos/turtlebot.md) |
| Speech-to-speech interaction with autonomous taxi | Simulated car | Demonstrate RAI's speech-to-speech interaction capabilities for specifying destinations to an autonomous taxi in the AWSIM with Autoware environment | [🚕 demo](docs/demos/taxi.md) | [📚](docs/demos/taxi.md) |

## Community

@@ -185,12 +166,3 @@ See our [Developer Guide](docs/developer_guide.md) for a deeper dive into RAI, i
### Contributing

You are welcome to contribute to RAI! Please see our [Contribution Guide](CONTRIBUTING.md).

### RAI release and talk

RAI will be released on **October 15th**, right before [ROSCon 2024](https://roscon.ros.org/2024/).
If you are going to the conference, come join us at the RAI talk on October 23rd.

<p align="center">
<img width="400" src="./docs/imgs/talk.png" />
</p>
2 changes: 1 addition & 1 deletion docs/create_robots_whoami.md
@@ -9,7 +9,7 @@ Your robot's `whoami` package serves as a configuration package for the `rai_who
> [!TIP]
> The Human-Machine Interface (HMI), both text and voice versions, relies heavily on the whoami package. It uses the robot's identity, constitution, and documentation to provide context-aware responses and ensure the robot behaves according to its defined characteristics.

## Example (Franka Emika Panda arm)
## Configuration example - Franka Emika Panda arm

1. Set up the repository using steps 1 and 2 from [Setup](../README.md#setup)

14 changes: 4 additions & 10 deletions docs/demos/agriculture.md
@@ -10,16 +10,10 @@ This demo showcases autonomous tractors operating in an agricultural field using

Download the latest binary release for your ROS 2 distribution.

- [ros2-humble-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/RAIAgricultureDemo_1.0.0_jammyhumble.zip)
- [ros2-jazzy-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/RAIAgricultureDemo_1.0.0_noblejazzy.zip)
- [ros2-humble-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIAgricultureDemo_1.0.0_jammyhumble.zip)
- [ros2-jazzy-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIAgricultureDemo_1.0.0_noblejazzy.zip)

2. **Install Required Packages**

```bash
sudo apt install ros-${ROS_DISTRO}-ackermann-msgs ros-${ROS_DISTRO}-gazebo-msgs ros-${ROS_DISTRO}-control-toolbox ros-${ROS_DISTRO}-nav2-bringup
```

3. **Unpack the Binary and Run the Simulation**
2. **Unpack the Binary and Run the Simulation**
Unpack the binary

- For Jazzy:
@@ -39,7 +33,7 @@ This demo showcases autonomous tractors operating in an agricultural field using
./RAIAgricultureDemoGamePackage/RAIAgricultureDemo.GameLauncher -bg_ConnectToAssetProcessor=0
```

4. **Start the Tractor Node**
3. **Start the Tractor Node**

```bash
python examples/agriculture-demo.py --tractor_number 1
3 changes: 3 additions & 0 deletions docs/demos/manipulation.md
@@ -0,0 +1,3 @@
# Manipulation tasks with natural language

Work in progress.
6 changes: 3 additions & 3 deletions docs/demos/rosbot_xl.md
@@ -8,8 +8,8 @@ This demo utilizes Open 3D Engine simulation and allows you to work with RAI on

1. Download the newest binary release:

- Ubuntu 22.04 & ros2 humble: [link](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EU4kUlXRLShMo3PBYnyYFP0B-_pw1Vv6FcmqSQHiUbrhfw?e=qo2T9K)
- Ubuntu 24.04 & ros2 jazzy: [link](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EaCcsJXzxqZFvOzmHnEAFQwBV89pRQ9yKQmSrVC-JYv2ug?e=s4ryDO)
- Ubuntu 22.04 & ros2 humble: [link](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIROSBotDemo_1.0.0_jammyhumble.zip)
- Ubuntu 24.04 & ros2 jazzy: [link](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIROSBotDemo_1.0.0_noblejazzy.zip)

2. Install required packages

@@ -50,7 +50,7 @@ Please refer to [rai husarion rosbot xl demo][rai rosbot demo] for more details.
2. Running RAI nodes and agents, the navigation stack, and the O3DE simulation.

```bash
ros2 launch ./examples/rosbotxl.launch.xml game_lanucher:=path/to/RARAIROSBotXLDemo.GameLauncher
ros2 launch ./examples/rosbotxl.launch.xml game_launcher:=path/to/RARAIROSBotXLDemo.GameLauncher
```

3. Play with the demo, adding tasks to the RAI agent. Here are some examples:
3 changes: 3 additions & 0 deletions docs/demos/taxi.md
@@ -0,0 +1,3 @@
# Speech-to-speech interaction with autonomous taxi

Work in progress.
11 changes: 4 additions & 7 deletions docs/human_robot_interface.md
@@ -1,22 +1,19 @@
# RAI: Human-Robot Interaction

You can utilize RAI Human-Robot Interaction (HRI) package to converse with your robots.
This package allows you to simply chat with your robot, or to give it tasks and receive feedback and reports.
You have the following options:
RAI provides a Human-Robot Interaction (HRI) package that enables communication with your robots. This package allows you to chat with your robot, give it tasks, and receive feedback and reports. You have the following options for interaction:

- [Voice communication](human_robot_interface/voice_interface.md) using ASR and TTS models ([OpenAI Whisper](https://openai.com/index/whisper/))
- [Voice communication](human_robot_interface/voice_interface.md) using Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models
- [Text communication](human_robot_interface/text_interface.md) using [Streamlit](https://streamlit.io)

If your environment is noisy, voice communication might be tricky.
In noisy environments, it is better to use text channel.
Voice communication might be challenging in noisy environments. In such cases, it's recommended to use the text channel.

## How does it work?

### General Architecture

![General HRI interface](./imgs/HRI_interface.png)

The general architecture follows the diagram above. Text is captured from the input source, transported to the HMI, processed according to the given tools and robot's rules, and then sent to the output source.
The general architecture follows the diagram above. Text is captured from the input source, transported to the Human-Machine Interface (HMI), processed according to the given tools and robot's rules, and then sent to the output source.

### Voice Interface

18 changes: 13 additions & 5 deletions docs/human_robot_interface/text_interface.md
@@ -1,12 +1,20 @@
# Human Robot Interface via Streamlit
# Human-Robot Interface via Streamlit

## Running example
## Running the Example

When your robot's whoami package is ready, run the following:
When your robot's whoami package is ready, run the following command:

```bash
streamlit run src/rai_hmi/rai_hmi/text_hmi.py <my_robot_whoami> # eg rosbot_xl_whoami
streamlit run src/rai_hmi/rai_hmi/text_hmi.py <my_robot_whoami> # e.g., rosbot_xl_whoami
```

> [!NOTE]
> Agent's responses can take longer time for complex tasks.
> The agent's responses may take longer for complex tasks.

## Customization

Currently, customization capabilities are limited due to the internal API design. We are planning to deliver a solution for seamless expansion in the near future.

If you want to customize the available tools, you can do so by editing the `src/rai_hmi/rai_hmi/agent.py` file.

If you have a RaiStateBasedLlmNode running (see e.g., [examples/rosbot-xl-demo.py](examples/rosbot-xl-demo.py)), the Streamlit GUI will communicate with the running node via task_tools defined in the `rai_hmi/rai_hmi/agent.py` file.
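
For context, a typical two-terminal session could look like this (a sketch reusing the example script and whoami package named elsewhere in this PR; both shells must be sourced):

```bash
# Terminal 1: start the state-based LLM node (example script from this repo)
python examples/rosbot-xl-demo.py

# Terminal 2: start the Streamlit HMI, pointing it at the robot's whoami package
streamlit run src/rai_hmi/rai_hmi/text_hmi.py rosbot_xl_whoami
```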
2 changes: 1 addition & 1 deletion docs/manipulation.md
@@ -56,7 +56,7 @@ We explored various aspects of the dataset and training process, including:

### Challenges and Limitations

During the experiments we encountered several obstacles and challanges:
During the experiments we encountered several obstacles and challenges:
During the experiments we encountered several obstacles and challenges:

1. **Computational Requirements**:

Expand Down
19 changes: 0 additions & 19 deletions docs/vendors.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,22 +39,3 @@ llm = ChatBedrock(
    model="anthropic.claude-3-opus-20240229-v1:0",
)
```
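
Using Bedrock this way assumes AWS credentials are available in the environment; a minimal sketch with placeholder values (your actual keys, region, and auth method may differ):

```bash
# Placeholder credentials -- substitute your own, or use an AWS profile/SSO instead
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"
```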

# Caching

## Redis

`ScenarioRunner` supports a Redis cache through LangChain. Make sure to set

```bash
export REDIS_CACHE_HOST="redis://<host>"
```

Self-hosting Redis:

```bash
docker run -p 6379:6379 -d redis:latest
export REDIS_CACHE_HOST="redis://localhost:6379"
```

For more information, see [redis.io/docs/latest/operate/oss_and_stack/install/install-stack/docker/](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/docker/)
4 changes: 2 additions & 2 deletions examples/rosbot-xl-demo.py
@@ -91,10 +91,10 @@ def main():
- if you are asked to drive towards some object, it's good to:
1. check the camera image and verify if objects can be seen
2. if only driving forward is required, do it
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your currect position, then very accurately estimate the goal pose.
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your current position, then very accurately estimate the goal pose.
- it is good to verify using given information if the robot is not stuck
- navigation actions sometimes fail. Their output can be read from rosout. You can also tell if they partially worked by checking the robot position and rotation.
- before using any ros2 interfaces, always make sure to check you are usig the right interface
- before using any ros2 interfaces, always make sure to check you are using the right interface
- processing camera image takes 5-10s. Take it into account that if the robot is moving, the information can be outdated. Handle it by good planning of your movements.
- you are encouraged to use wait tool in between checking the status of actions
- to find some object navigate around and check the surrounding area
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "rai"
version = "0.1.0"
version = "1.0.0"
description = "RAI is a framework for building general multi-agent systems, bringing Gen AI features to ROS enabled robots."
readme = "README.md"
authors = ["Maciej Majek <maciej.majek@robotec.ai>", "Bartłomiej Boczek <bartlomiej.boczek@robotec.ai>"]
Expand Down
4 changes: 2 additions & 2 deletions src/examples/turtlebot4/README.md
@@ -55,7 +55,7 @@ using Turtlebot4 simulation. The step by step video tutorial is available [here]
```

> **TIP**
> Skip steps 2-4 by downlading generated files [here](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EbPZSEdXYaRGoeecu6oJg6QBsI4ZOe_mrU3uOtOflnIjQg?e=HX8ZHB) unzip them to `src/examples/turtlebot4_whoami/description/generated` with a command:
> Skip steps 2-4 by downloading the generated files [here](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EbPZSEdXYaRGoeecu6oJg6QBsI4ZOe_mrU3uOtOflnIjQg?e=HX8ZHB) and unzipping them to `src/examples/turtlebot4_whoami/description/generated` with the command:
> `unzip -d src/examples/turtlebot4_whoami/description turtlebot4_whoami_generated.zip`

2. Download official turtlebot4 [data sheet](https://bit.ly/3KCp3Du) into
@@ -138,7 +138,7 @@ _My robot doesn't have an identity._
4.0K documentation 4.0K images 28K index.faiss 8.0K index.pkl 4.0K robot_constitution.txt 4.0K robot_identity.txt
```

You can also check the contents of `robot_indentify.txt` file (it is generated by LLM, but should be simillar to the one below).
You can also check the contents of `robot_identity.txt` file (it is generated by LLM, but should be similar to the one below).

```bash
cat src/examples/turtlebot4_whoami/description/robot_identity.txt
2 changes: 1 addition & 1 deletion src/examples/turtlebot4/turtlebot_demo.py
@@ -66,7 +66,7 @@ def main(allowlist: Optional[Path] = None):
- if you are asked to drive towards some object, it's good to:
1. check the camera image and verify if objects can be seen
2. if only driving forward is required, do it
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your currect position, then very accurately estimate the goal pose.
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your current position, then very accurately estimate the goal pose.
- to spin right use negative yaw, to spin left use positive yaw
"""
