docs: release update #283

Merged · 8 commits · Oct 22, 2024
96 changes: 34 additions & 62 deletions README.md
@@ -1,10 +1,9 @@
# RAI

> [!IMPORTANT]
> **RAI is in beta phase now, expect friction. Early contributors are the most welcome!** \
> **RAI is developing fast towards a glorious release in time for ROSCon 2024.**
> **RAI is meant for R&D. Make sure to understand its limitations.**

RAI is a flexible AI agent framework to develop and deploy Gen AI features for your robots.
RAI is a flexible AI agent framework to develop and deploy Embodied AI features for your robots.

---

@@ -38,38 +37,43 @@ The RAI framework aims to:
- Supply a general multi-agent system, bringing Gen AI features to your robots.
- Add human interactivity, flexibility in problem-solving, and out-of-box AI features to existing robot stacks.
- Provide first-class support for multi-modalities, enabling interaction with various data types.
- Incorporate an advanced database for persistent agent memory.
- Include ROS 2-oriented tooling for agents.
- Support a comprehensive task/mission orchestrator.

## Limitations

- Limitations of LLMs and VLMs in use apply: poor spatial reasoning, hallucinations, jailbreaks, latencies, costs, ...
- Resource use (memory, CPU) is not addressed yet.
- Requires connectivity and/or an edge platform.

## Table of Contents

- [Features](#features)
- [Setup](#setup)
- [Usage examples (demos)](#planned-demos)
- [Usage examples (demos)](#simulation-demos)
- [Developer resources](#developer-resources)
- [ROSCon 2024 Talk](#roscon-2024)

## Features

- [x] Voice interaction (both ways).
- [x] Customizable robot identity, including constitution (ethical code) and documentation (understanding own capabilities).
- [x] Accessing camera ("What do you see?") sensor, utilizing VLMs.
- [x] Reasoning about its own state through ROS logs.
- [x] Accessing camera ("What do you see?"), utilizing VLMs.
- [x] Summarizing own state through ROS logs.
- [x] ROS 2 action calling and other interfaces. The Agent can dynamically list interfaces, check their message type, and publish.
- [x] Integration with LangChain to abstract vendors and access convenient AI tools.
- [x] Tasks in natural language to nav2 goals.
- [x] NoMaD integration.
- [x] [NoMaD](https://general-navigation-models.github.io/nomad/) integration.
- [x] Tracing.
- [ ] Grounded SAM 2 integration.
- [ ] Improved Human-Robot Interaction with voice and text.
- [x] Grounded SAM 2 integration.
- [x] Improved Human-Robot Interaction with voice and text.
- [x] Additional tooling such as GroundingDino.
- [x] Support for at least 3 different AI vendors.
- [ ] SDK for RAI developers.
- [ ] Support for at least 3 different AI vendors.
- [ ] Additional tooling such as GroundingDino.
- [ ] UI for configuration to select features and tools relevant for your deployment.

## Setup

Before going further, make sure you have ROS 2 (Jazzy or Humble) installed and sourced on your system.
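For reference, sourcing a binary ROS 2 installation typically looks like this (paths assume a standard apt install; swap in your distro name):

```bash
# Source the ROS 2 underlay (humble shown; use jazzy on Ubuntu 24.04)
source /opt/ros/humble/setup.bash

# Confirm the environment is active
echo $ROS_DISTRO
```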

### 1. Setting up the workspace:

#### 1.1 Install poetry
@@ -96,6 +100,13 @@ poetry install
rosdep install --from-paths src --ignore-src -r -y
```

> [!TIP]
> If you want to use features such as Grounded SAM 2 or NoMaD install additional dependencies:
>
> ```bash
> poetry install --with openset,nomad
> ```

### 2. Build the project:

#### 2.1 Build RAI workspace
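
The collapsed lines below hold the repository's actual build commands; as a rough sketch, a colcon-based workspace build usually amounts to the following (illustrative, not necessarily the exact flags used here):

```bash
# Build every package in the workspace (typical colcon invocation)
colcon build --symlink-install

# Overlay the freshly built packages onto the current shell
source install/setup.bash
```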
@@ -123,37 +134,6 @@ Pick your local solution or service provider and follow one of these guides:
- **[OpenAI](https://platform.openai.com/docs/quickstart)**
- **[AWS Bedrock](https://console.aws.amazon.com/bedrock/home?#/overview)**

## Running RAI

You are now ready to run RAI!

![rosbot-xl-example](./docs/imgs/rosbot-xl-example.gif)

You can start by running the following examples:

1. **Hello RAI:** Interact directly with your ROS 2 environment through an intuitive Streamlit chat interface.
2. **O3DE Husarion ROSbot XL demo"** give tasks to a simulated robot using natural language.

### Hello RAI

Chat seamlessly with your ROS 2 environment, retrieve images from cameras, adjust parameters, and get information about your ROS interfaces.

```bash
streamlit run src/rai_hmi/rai_hmi/text_hmi.py
```

Remember to run this command in a sourced shell.

### O3DE Rosbot XL Demo

This demo provides a practical way to interact with and control a virtual Husarion ROSbot XL within a simulated environment.
Using natural language commands, you can assign tasks to the robot, allowing it to perform a variety of actions.

Given that this is a beta release, consider this demo as an opportunity to explore the framework's capabilities, provide feedback, and contribute.
Try different commands, see how the robot responds, and use this experience to understand the potential and limitations of the system.

Follow this guide: [husarion-rosbot-xl-demo](docs/demos.md)

## What's next?

Once you know your way around RAI, try the following challenges, with the aid of the [developer guide](docs/developer_guide.md):
@@ -162,15 +142,16 @@ Once you know your way around RAI, try the following challenges, with the aid th
- Implement additional tools and use them in your interaction.
- Try a complex, multi-step task for your robot, such as going to several points to perform observations!

Soon you will have an opportunity to work with new RAI demos across several domains.

### Planned demos
### Simulation demos

| Application | Robot | Description | Link |
| ------------------------------------------ | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- |
| Mission and obstacle reasoning in orchards | Autonomous tractor | In a beautiful scene of a virtual orchard, RAI goes beyond obstacle detection to analyze best course of action for a given unexpected situation. | [🌾 demo](https://github.com/RobotecAI/rai-agriculture-demo) |
| Manipulation tasks with natural language | Robot Arm (Franka Panda) | Complete flexible manipulation tasks thanks to RAI and Grounded SAM 2 | [🦾 demo](https://github.com/RobotecAI/rai-manipulation-demo) |
| Quadruped inspection demo | A robot dog (ANYbotics ANYmal) | Perform inspection in a warehouse environment, find and report anomalies | link TBD |
Try RAI yourself with these demos:

| Application | Robot | Description | Demo Link | Docs Link |
| ------------------------------------------ | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- | -------------------------------- |
| Mission and obstacle reasoning in orchards | Autonomous tractor | In a beautiful scene of a virtual orchard, RAI goes beyond obstacle detection to analyze the best course of action for a given unexpected situation. | [🌾 demo](https://github.com/RobotecAI/rai-agriculture-demo) | [📚](docs/demos/agriculture.md) |
| Manipulation tasks with natural language | Robot Arm (Franka Panda) | Complete flexible manipulation tasks thanks to RAI and Grounded SAM 2 | [🦾 demo](https://github.com/RobotecAI/rai-manipulation-demo) | [📚](docs/demos/manipulation.md) |
| Autonomous mobile robot demo | Husarion ROSbot XL | Demonstrate RAI's interaction with an autonomous mobile robot platform for navigation and control | [🤖 demo](https://github.com/RobotecAI/rai-rosbot-xl-demo) | [📚](docs/demos/rosbot_xl.md) |
| Turtlebot demo | Turtlebot | Showcase RAI's capabilities with the popular Turtlebot platform | [🐢 demo](docs/demos/turtlebot.md) | [📚](docs/demos/turtlebot.md) |
| Speech-to-speech interaction with autonomous taxi | Simulated car | Demonstrate RAI's speech-to-speech interaction capabilities for specifying destinations to an autonomous taxi in the AWSIM with Autoware environment | [🚕 demo](docs/demos/taxi.md) | [📚](docs/demos/taxi.md) |

## Community

@@ -185,12 +166,3 @@ See our [Developer Guide](docs/developer_guide.md) for a deeper dive into RAI, i
### Contributing

You are welcome to contribute to RAI! Please see our [Contribution Guide](CONTRIBUTING.md).

### RAI release and talk

RAI will be released on **October 15th**, right before [ROSCon 2024](https://roscon.ros.org/2024/).
If you are going to the conference, come join us at the RAI talk on October 23rd.

<p align="center">
<img width="400" src="./docs/imgs/talk.png" />
</p>
2 changes: 1 addition & 1 deletion docs/create_robots_whoami.md
@@ -9,7 +9,7 @@ Your robot's `whoami` package serves as a configuration package for the `rai_who
> [!TIP]
> The Human-Machine Interface (HMI), both text and voice versions, relies heavily on the whoami package. It uses the robot's identity, constitution, and documentation to provide context-aware responses and ensure the robot behaves according to its defined characteristics.

## Example (Franka Emika Panda arm)
## Configuration example - Franka Emika Panda arm

1. Set up the repository using steps 1 and 2 from [Setup](../README.md#setup)

14 changes: 4 additions & 10 deletions docs/demos/agriculture.md
@@ -10,16 +10,10 @@ This demo showcases autonomous tractors operating in an agricultural field using

Download the latest binary release for your ROS 2 distribution.

- [ros2-humble-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/RAIAgricultureDemo_1.0.0_jammyhumble.zip)
- [ros2-jazzy-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/RAIAgricultureDemo_1.0.0_noblejazzy.zip)
- [ros2-humble-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIAgricultureDemo_1.0.0_jammyhumble.zip)
- [ros2-jazzy-agriculture-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIAgricultureDemo_1.0.0_noblejazzy.zip)

2. **Install Required Packages**

```bash
sudo apt install ros-${ROS_DISTRO}-ackermann-msgs ros-${ROS_DISTRO}-gazebo-msgs ros-${ROS_DISTRO}-control-toolbox ros-${ROS_DISTRO}-nav2-bringup
```

3. **Unpack the Binary and Run the Simulation**
2. **Unpack the Binary and Run the Simulation**
Unpack the binary

- For Jazzy:
@@ -39,7 +33,7 @@ This demo showcases autonomous tractors operating in an agricultural field using
./RAIAgricultureDemoGamePackage/RAIAgricultureDemo.GameLauncher -bg_ConnectToAssetProcessor=0
```

4. **Start the Tractor Node**
3. **Start the Tractor Node**

```bash
python examples/agriculture-demo.py --tractor_number 1
3 changes: 3 additions & 0 deletions docs/demos/manipulation.md
@@ -0,0 +1,3 @@
# Manipulation tasks with natural language

Work in progress.
6 changes: 3 additions & 3 deletions docs/demos/rosbot_xl.md
@@ -8,8 +8,8 @@ This demo utilizes Open 3D Engine simulation and allows you to work with RAI on

1. Download the newest binary release:

- Ubuntu 22.04 & ros2 humble: [link](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EU4kUlXRLShMo3PBYnyYFP0B-_pw1Vv6FcmqSQHiUbrhfw?e=qo2T9K)
- Ubuntu 24.04 & ros2 jazzy: [link](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EaCcsJXzxqZFvOzmHnEAFQwBV89pRQ9yKQmSrVC-JYv2ug?e=s4ryDO)
- Ubuntu 22.04 & ros2 humble: [link](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIROSBotDemo_1.0.0_jammyhumble.zip)
- Ubuntu 24.04 & ros2 jazzy: [link](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIROSBotDemo_1.0.0_noblejazzy.zip)

2. Install required packages

@@ -50,7 +50,7 @@ Please refer to [rai husarion rosbot xl demo][rai rosbot demo] for more details.
2. Running RAI nodes and agents, the navigation stack, and the O3DE simulation.

```bash
ros2 launch ./examples/rosbotxl.launch.xml game_lanucher:=path/to/RARAIROSBotXLDemo.GameLauncher
ros2 launch ./examples/rosbotxl.launch.xml game_launcher:=path/to/RARAIROSBotXLDemo.GameLauncher
```

3. Play with the demo, adding tasks to the RAI agent. Here are some examples:
3 changes: 3 additions & 0 deletions docs/demos/taxi.md
@@ -0,0 +1,3 @@
# Speech-to-speech interaction with autonomous taxi

Work in progress.
11 changes: 4 additions & 7 deletions docs/human_robot_interface.md
@@ -1,22 +1,19 @@
# RAI: Human-Robot Interaction

You can utilize RAI Human-Robot Interaction (HRI) package to converse with your robots.
This package allows you to simply chat with your robot, or to give it tasks and receive feedback and reports.
You have the following options:
RAI provides a Human-Robot Interaction (HRI) package that enables communication with your robots. This package allows you to chat with your robot, give it tasks, and receive feedback and reports. You have the following options for interaction:

- [Voice communication](human_robot_interface/voice_interface.md) using ASR and TTS models ([OpenAI Whisper](https://openai.com/index/whisper/))
- [Voice communication](human_robot_interface/voice_interface.md) using Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models
- [Text communication](human_robot_interface/text_interface.md) using [Streamlit](https://streamlit.io)

If your environment is noisy, voice communication might be tricky.
In noisy environments, it is better to use text channel.
Voice communication might be challenging in noisy environments. In such cases, it's recommended to use the text channel.

## How does it work?

### General Architecture

![General HRI interface](./imgs/HRI_interface.png)

The general architecture follows the diagram above. Text is captured from the input source, transported to the HMI, processed according to the given tools and robot's rules, and then sent to the output source.
The general architecture follows the diagram above. Text is captured from the input source, transported to the Human-Machine Interface (HMI), processed according to the given tools and robot's rules, and then sent to the output source.

### Voice Interface

18 changes: 13 additions & 5 deletions docs/human_robot_interface/text_interface.md
@@ -1,12 +1,20 @@
# Human Robot Interface via Streamlit
# Human-Robot Interface via Streamlit

## Running example
## Running the Example

When your robot's whoami package is ready, run the following:
When your robot's whoami package is ready, run the following command:

```bash
streamlit run src/rai_hmi/rai_hmi/text_hmi.py <my_robot_whoami> # eg rosbot_xl_whoami
streamlit run src/rai_hmi/rai_hmi/text_hmi.py <my_robot_whoami> # e.g., rosbot_xl_whoami
```

> [!NOTE]
> Agent's responses can take longer time for complex tasks.
> The agent's responses may take longer for complex tasks.

## Customization

Currently, customization capabilities are limited due to the internal API design. We are planning to deliver a solution for seamless expansion in the near future.

If you want to customize the available tools, you can do so by editing the `src/rai_hmi/rai_hmi/agent.py` file.

If you have a RaiStateBasedLlmNode running (see e.g., [examples/rosbot-xl-demo.py](examples/rosbot-xl-demo.py)), the Streamlit GUI will communicate with the running node via task_tools defined in the `rai_hmi/rai_hmi/agent.py` file.
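
For context, a typical two-terminal session could look like this (a sketch reusing the example script and whoami package named elsewhere in this PR; both shells must be sourced):

```bash
# Terminal 1: start the state-based LLM node (example script from this repo)
python examples/rosbot-xl-demo.py

# Terminal 2: start the Streamlit HMI, pointing it at the robot's whoami package
streamlit run src/rai_hmi/rai_hmi/text_hmi.py rosbot_xl_whoami
```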
2 changes: 1 addition & 1 deletion docs/manipulation.md
@@ -56,7 +56,7 @@ We explored various aspects of the dataset and training process, including:

### Challenges and Limitations

During the experiments we encountered several obstacles and challanges:
During the experiments we encountered several obstacles and challenges:
During the experiments we encountered several obstacles and challenges:

1. **Computational Requirements**:

Expand Down
19 changes: 0 additions & 19 deletions docs/vendors.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,22 +39,3 @@ llm = ChatBedrock(
    model="anthropic.claude-3-opus-20240229-v1:0",
)
```
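
Using Bedrock this way assumes AWS credentials are available in the environment; a minimal sketch with placeholder values (your actual keys, region, and auth method may differ):

```bash
# Placeholder credentials -- substitute your own, or use an AWS profile/SSO instead
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"
```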

# Caching

## Redis

`ScenarioRunner` supports a Redis cache through LangChain. Make sure to set

```bash
export REDIS_CACHE_HOST="redis://<host>"
```

Self-hosting Redis:

```bash
docker run -p 6379:6379 -d redis:latest
export REDIS_CACHE_HOST="redis://localhost:6379"
```

For more information, see [redis.io/docs/latest/operate/oss_and_stack/install/install-stack/docker/](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/docker/)
4 changes: 2 additions & 2 deletions examples/rosbot-xl-demo.py
@@ -91,10 +91,10 @@ def main():
- if you are asked to drive towards some object, it's good to:
1. check the camera image and verify if objects can be seen
2. if only driving forward is required, do it
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your currect position, then very accurately estimate the goal pose.
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your current position, then very accurately estimate the goal pose.
- it is good to verify using given information if the robot is not stuck
- navigation actions sometimes fail. Their output can be read from rosout. You can also tell if they partially worked by checking the robot position and rotation.
- before using any ros2 interfaces, always make sure to check you are usig the right interface
- before using any ros2 interfaces, always make sure to check you are using the right interface
- processing camera image takes 5-10s. Take it into account that if the robot is moving, the information can be outdated. Handle it by good planning of your movements.
- you are encouraged to use wait tool in between checking the status of actions
- to find some object navigate around and check the surrounding area
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "rai"
version = "0.1.0"
version = "1.0.0"
description = "RAI is a framework for building general multi-agent systems, bringing Gen AI features to ROS enabled robots."
readme = "README.md"
authors = ["Maciej Majek <maciej.majek@robotec.ai>", "Bartłomiej Boczek <bartlomiej.boczek@robotec.ai>"]
Expand Down
4 changes: 2 additions & 2 deletions src/examples/turtlebot4/README.md
@@ -55,7 +55,7 @@ using Turtlebot4 simulation. The step by step video tutorial is available [here]
```

> **TIP**
> Skip steps 2-4 by downlading generated files [here](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EbPZSEdXYaRGoeecu6oJg6QBsI4ZOe_mrU3uOtOflnIjQg?e=HX8ZHB) unzip them to `src/examples/turtlebot4_whoami/description/generated` with a command:
> Skip steps 2-4 by downloading the generated files [here](https://robotecai-my.sharepoint.com/:u:/g/personal/bartlomiej_boczek_robotec_ai/EbPZSEdXYaRGoeecu6oJg6QBsI4ZOe_mrU3uOtOflnIjQg?e=HX8ZHB) and unzipping them to `src/examples/turtlebot4_whoami/description/generated` with the command:
> `unzip -d src/examples/turtlebot4_whoami/description turtlebot4_whoami_generated.zip`

2. Download official turtlebot4 [data sheet](https://bit.ly/3KCp3Du) into
@@ -138,7 +138,7 @@ _My robot doesn't have an identity._
4.0K documentation 4.0K images 28K index.faiss 8.0K index.pkl 4.0K robot_constitution.txt 4.0K robot_identity.txt
```

You can also check the contents of `robot_indentify.txt` file (it is generated by LLM, but should be simillar to the one below).
You can also check the contents of `robot_identity.txt` file (it is generated by LLM, but should be similar to the one below).

```bash
cat src/examples/turtlebot4_whoami/description/robot_identity.txt
2 changes: 1 addition & 1 deletion src/examples/turtlebot4/turtlebot_demo.py
@@ -66,7 +66,7 @@ def main(allowlist: Optional[Path] = None):
- if you are asked to drive towards some object, it's good to:
1. check the camera image and verify if objects can be seen
2. if only driving forward is required, do it
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your currect position, then very accurately estimate the goal pose.
3. if obstacle avoidance might be required, use ros2 actions navigate_*, but first check your current position, then very accurately estimate the goal pose.
- to spin right use negative yaw, to spin left use positive yaw
"""
