Thinking-Machines-RL/OpenRobotGPT

OpenRobotGPT

Open source implementation of RobotGPT

Paper

Abstract: We present RobotGPT, an innovative decision framework for robotic manipulation that prioritizes stability and safety. The execution code generated by ChatGPT cannot guarantee the stability and safety of the system. ChatGPT may provide different answers for the same task, leading to unpredictability. This instability prevents the direct integration of ChatGPT into the robot manipulation loop. Although setting the temperature to 0 can generate more consistent outputs, it may cause ChatGPT to lose diversity and creativity. Our objective is to leverage ChatGPT’s problem-solving capabilities in robot manipulation and train a reliable agent. The framework includes an effective prompt structure and a robust learning model. Additionally, we introduce a metric for measuring task difficulty to evaluate ChatGPT’s performance in robot manipulation. Furthermore, we evaluate RobotGPT in both simulation and real-world environments. Compared to directly using ChatGPT to generate code, our framework significantly improves task success rates, with an average increase from 38.5% to 92.5%. Therefore, training a RobotGPT by utilizing ChatGPT as an expert is a more stable approach compared to directly using ChatGPT as a task planner.

ROS2 structure:

The ROS nodes are designed around three modular components.

  • LLM component: calls an LLM through the API described in the robot_api_node node, then converts the LLM request into a traditional action for the environment. This component can also be replaced by a traditional RL agent.
  • ENV component: manages the connection with an LLM or RL agent. The agent can request that a given action be executed and receives the associated next_state. Actions take the form of the end effector's final position-orientation-grip_status; how these poses are reached is handled by the trajectory_generator node. The underlying physics environment is PyBullet.
  • Trajectory generator: responsible for generating trajectories. MoveIt or another custom package can be used.
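The split between the three components can be illustrated with a minimal, ROS-free Python sketch. It is not the repository's code: the names `EndEffectorAction`, `LinearTrajectory`, and `ToyEnv`, the quaternion convention, and the straight-line planner are all assumptions made for illustration. It only shows the idea that an agent sends a final pose plus grip status, the environment delegates the motion to a swappable trajectory generator, and a next state comes back.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]


@dataclass(frozen=True)
class EndEffectorAction:
    """Hypothetical action: final end-effector pose plus gripper state."""
    position: Vec3       # target x, y, z (units assumed to be metres)
    orientation: Quat    # quaternion (x, y, z, w); convention is an assumption
    grip_closed: bool    # stands in for grip_status


class LinearTrajectory:
    """Stand-in trajectory generator: straight-line position interpolation.

    A real setup would use MoveIt or a custom planner instead.
    """

    def __init__(self, steps: int = 5) -> None:
        self.steps = steps

    def plan(self, start: Vec3, goal: Vec3) -> List[Vec3]:
        # Evenly spaced waypoints from start (exclusive) to goal (inclusive).
        return [
            tuple(s + (g - s) * i / self.steps for s, g in zip(start, goal))
            for i in range(1, self.steps + 1)
        ]


class ToyEnv:
    """Stand-in for the ENV component (no physics, no ROS)."""

    def __init__(self, planner: LinearTrajectory) -> None:
        self.planner = planner
        self.position: Vec3 = (0.0, 0.0, 0.0)
        self.grip_closed = False

    def step(self, action: EndEffectorAction) -> Dict[str, object]:
        # The environment does not decide how to move; the planner does.
        for waypoint in self.planner.plan(self.position, action.position):
            self.position = waypoint
        self.grip_closed = action.grip_closed
        # The returned dictionary plays the role of next_state.
        return {
            "position": self.position,
            "orientation": action.orientation,
            "grip_closed": self.grip_closed,
        }


env = ToyEnv(LinearTrajectory(steps=4))
next_state = env.step(
    EndEffectorAction((0.4, 0.0, 0.2), (0.0, 0.0, 0.0, 1.0), grip_closed=True)
)
```

Because `ToyEnv` only depends on a `plan(start, goal)` method, the planner can be swapped without touching the agent side, which mirrors the modularity described above.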

How to run

Build an image from the Dockerfile

docker build -t <image_name> .

Run the container with GPU access, the SSH port mapped to host port 2222, and X11 forwarding

docker run --gpus all -it --rm \
    -p 2222:22 \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    -v $(pwd)/workspace:/root/workspace:rw \
    <image_name>

From another terminal, connect to the container over SSH

ssh -X -p 2222 root@localhost

The password is "password"

Each time the container is recreated, its SSH host key changes, so reconnecting fails with a host key verification error. Remove the stale entry from known_hosts with this command

ssh-keygen -f "$HOME/.ssh/known_hosts" -R "[localhost]:2222"

On an Ubuntu host, allow X11 clients to connect to the display with the following command. Note that xhost + disables X access control for all clients.

xhost +

Testing of the pipeline:

Launch the nodes in the following order:

  • GymNode
  • the trajectory node (e.g. MoveNodebasic, MoveIt)
  • robot_api_node
  • code_node
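The launch order can be written down as a small Python helper that only builds the `ros2 run` command lines without executing them. This is a sketch under assumptions: the package name is a placeholder, the README does not state the executable names, so reusing the node names as executables (and picking `MoveNodebasic` as the trajectory node) is a guess to adapt to your workspace.

```python
import shlex
from typing import List, Sequence

# Order taken from the list above; executable names are assumed to match
# the node names, which the README does not confirm.
LAUNCH_ORDER = ("GymNode", "MoveNodebasic", "robot_api_node", "code_node")


def build_launch_commands(
    package: str, executables: Sequence[str] = LAUNCH_ORDER
) -> List[List[str]]:
    """Return one `ros2 run <package> <executable>` command per node, in order."""
    return [["ros2", "run", package, exe] for exe in executables]


# "my_package" is a hypothetical placeholder for the actual ROS2 package.
commands = build_launch_commands("my_package")
for cmd in commands:
    # Print instead of executing; run each line in its own terminal.
    print(shlex.join(cmd))
```

Each printed line is meant to be run in its own terminal inside the container, starting the next node only once the previous one is up.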
