
Add support for Stretch (hello-robot) #409

Merged — 45 commits, Oct 4, 2024
Changes from 31 commits

Commits:
0e4b9e2
Add stretch_body
aliberts Sep 4, 2024
7992d2e
Fix poetry relax
aliberts Sep 4, 2024
bea022f
Fix install
aliberts Sep 6, 2024
90e7895
Add stretch boilerplate
aliberts Sep 6, 2024
bf5b6e1
Fix install
aliberts Sep 6, 2024
e5a4fa3
Add teleop
aliberts Sep 6, 2024
2e36039
Fix '--camera-ids' empty, add init_from_name to IntelRealSenseCamera
aliberts Sep 10, 2024
9a6150c
Add camera rotation
aliberts Sep 10, 2024
38f6be8
Add StretchRobotConfig, cameras obs
aliberts Sep 10, 2024
f2269c1
Add safe_disconnect
aliberts Sep 10, 2024
8b28a3d
Add todo
aliberts Sep 10, 2024
e9682af
Reduce workers
aliberts Sep 10, 2024
4219527
Fixes
aliberts Sep 10, 2024
aaf7fd4
Fix typo
aliberts Sep 10, 2024
2bf4dcb
Invert has_method args
aliberts Sep 10, 2024
f5651a5
Add nav cam, add rotation to opencv cams
aliberts Sep 11, 2024
e755d10
Update lerobot/common/datasets/compute_stats.py
aliberts Sep 11, 2024
a98556b
Fix install
aliberts Sep 11, 2024
6bc1c6b
Fix CI install
aliberts Sep 11, 2024
bd4fdc8
Add copyrights
aliberts Sep 11, 2024
12a2f2c
Remove default robot_type
aliberts Sep 12, 2024
16dd73f
Remove StretchRobot import
aliberts Sep 12, 2024
88526d0
Add capture_observation
aliberts Sep 12, 2024
9e2d98c
Add send_action, type-hints
aliberts Sep 12, 2024
1053845
Add todo
aliberts Sep 12, 2024
652eb01
Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09…
aliberts Sep 24, 2024
3fbd680
Fixes
aliberts Sep 24, 2024
42d9556
Update stretch-body
aliberts Sep 25, 2024
db09ebf
Add pynput
aliberts Sep 25, 2024
ddd8059
Add connection errors
aliberts Sep 25, 2024
d1fc630
Add doc
aliberts Sep 25, 2024
7c87025
Add logging step
aliberts Sep 26, 2024
e0fa5e4
Change stretch log levels
aliberts Sep 26, 2024
d5fbfd4
Add system check
aliberts Sep 26, 2024
78e8435
Update pyproject.toml
aliberts Sep 26, 2024
37572eb
Add todo
aliberts Sep 26, 2024
eb48452
Add suggestion
aliberts Sep 26, 2024
722092a
Cleanup camera utils
aliberts Sep 27, 2024
56d3845
Add nav cam
aliberts Sep 27, 2024
306a1ac
Add more info in markdown
Cadene Oct 4, 2024
c24b137
Add comment
aliberts Oct 4, 2024
2c0feed
fps=20, homogenization intelrealsense opencv
Cadene Oct 4, 2024
d28f556
Merge remote-tracking branch 'origin/main' into user/aliberts/2024_09…
Cadene Oct 4, 2024
d210d2e
fix unit tests
Cadene Oct 4, 2024
d716511
fix unit tests
Cadene Oct 4, 2024
8 changes: 6 additions & 2 deletions .github/workflows/test.yml
@@ -35,7 +35,9 @@ jobs:
lfs: true # Ensure LFS files are pulled

- name: Install apt dependencies
run: sudo apt-get update && sudo apt-get install -y libegl1-mesa-dev ffmpeg
run: |
sudo apt-get update && \
sudo apt-get install -y libegl1-mesa-dev ffmpeg portaudio19-dev

- name: Install poetry
run: |
@@ -110,7 +112,9 @@ jobs:
lfs: true # Ensure LFS files are pulled

- name: Install apt dependencies
run: sudo apt-get update && sudo apt-get install -y libegl1-mesa-dev
run: |
sudo apt-get update && \
sudo apt-get install -y libegl1-mesa-dev portaudio19-dev

- name: Install poetry
run: |
2 changes: 1 addition & 1 deletion examples/7_get_started_with_real_robot.md
@@ -45,7 +45,7 @@ poetry install --sync --extras "dynamixel"
```bash
conda install -c conda-forge ffmpeg
pip uninstall opencv-python
conda install -c conda-forge opencv>=4.10.0
conda install -c conda-forge "opencv>=4.10.0"
```

You are now ready to plug the 5V power supply to the motor bus of the leader arm (the smaller one) since all its motors only require 5V.
111 changes: 111 additions & 0 deletions examples/8_use_stretch.md
@@ -0,0 +1,111 @@
This tutorial explains how to use [Stretch 3](https://hello-robot.com/stretch-3-product) with LeRobot.

## Setup

Familiarize yourself with Stretch by following its [tutorials](https://docs.hello-robot.com/0.3/getting_started/hello_robot/) (recommended).

To use LeRobot on Stretch, three options are available:
- [tethered setup](https://docs.hello-robot.com/0.3/getting_started/connecting_to_stretch/#tethered-setup)
- [untethered setup](https://docs.hello-robot.com/0.3/getting_started/connecting_to_stretch/#untethered-setup)
- SSH directly into Stretch (you will first need to install and configure `openssh-server` on Stretch using one of the two setups above)


## Install LeRobot

On Stretch's CLI, follow these steps:

1. [Install Miniconda](https://docs.anaconda.com/miniconda/#quick-command-line-install):
```bash
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash
```

2. Comment out these lines in `~/.profile` (they can interfere with the paths used by conda, and `~/.local/bin` should already be in your PATH):
```
# set PATH so it includes user's private bin if it exists
if [ -d "$HOME/.local/bin" ] ; then
PATH="$HOME/.local/bin:$PATH"
fi
```

3. Restart your shell or run `source ~/.bashrc`

4. Create and activate a fresh conda environment for LeRobot:
```bash
conda create -y -n lerobot python=3.10 && conda activate lerobot
```

5. Clone LeRobot:
```bash
git clone https://github.com/huggingface/lerobot.git ~/lerobot
```

6. Install LeRobot
```bash
cd ~/lerobot && pip install -e ".[stretch]"

conda install -y -c conda-forge ffmpeg
pip uninstall -y opencv-python
conda install -y -c conda-forge "opencv>=4.10.0"
```

## Teleoperate, record a dataset and run a policy

> **Note:** As indicated in Stretch's [doc](https://docs.hello-robot.com/0.3/getting_started/stretch_hardware_overview/#turning-off-gamepad-teleoperation), you may need to free the "robot process" after booting Stretch by running `stretch_free_robot_process.py`.

Before operating Stretch, you first need to [home](https://docs.hello-robot.com/0.3/getting_started/stretch_hardware_overview/#homing) it. In the scripts used below, the robot will be homed automatically if it isn't already. Be mindful to give Stretch some space, as this procedure will move the robot's arm and gripper. If you want to home Stretch manually first, simply run this command:
```bash
python lerobot/scripts/control_robot.py calibrate \
--robot-path lerobot/configs/robot/stretch.yaml
```
This is equivalent to running `stretch_robot_home.py`.

Try out teleoperation (you can learn about the controls in Stretch's [documentation](https://docs.hello-robot.com/0.3/getting_started/hello_robot/#gamepad-teleoperation)):
```bash
python lerobot/scripts/control_robot.py teleoperate \
--robot-path lerobot/configs/robot/stretch.yaml
```
This is essentially the same as running `stretch_gamepad_teleop.py`.

Store your Hugging Face repository name in a variable to run these commands:

> **Reviewer:** Should there be a huggingface-cli login step?
>
> **@aliberts (Sep 26, 2024):** Yes, you're correct. Technically it's already pointed out in `7_get_started_with_real_robot.md`, along with some other details, and I didn't want to repeat it to avoid too much redundancy. I've added a word on that (7c87025). Eventually we might rearrange these markdowns as the number of robots grows (cc @Cadene).

```bash
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
```

Once you're familiar with the gamepad controls and after a bit of practice, try to record your first dataset with Stretch.

Record one episode:
```bash
python lerobot/scripts/control_robot.py record \
--robot-path lerobot/configs/robot/stretch.yaml \
--fps 30 \
--root data \
--repo-id ${HF_USER}/stretch_test \
--tags stretch tutorial \
--warmup-time-s 3 \
--episode-time-s 40 \
--reset-time-s 10 \
--num-episodes 1 \
--push-to-hub 0
```

Note that if you're using SSH to connect to Stretch and run this script, you won't be able to visualize its camera feeds (though they'll still be recorded).

Now try to replay this episode (make sure the robot's initial position is the same):
```bash
python lerobot/scripts/control_robot.py replay \
--robot-path lerobot/configs/robot/stretch.yaml \
--fps 30 \
--root data \
--repo-id ${HF_USER}/stretch_test \
--episode 0
```
2 changes: 1 addition & 1 deletion lerobot/common/datasets/compute_stats.py
@@ -68,7 +68,7 @@ def get_stats_einops_patterns(dataset, num_workers=0):
return stats_patterns


def compute_stats(dataset, batch_size=32, num_workers=16, max_num_samples=None):
def compute_stats(dataset, batch_size=8, num_workers=8, max_num_samples=None):
"""Compute mean/std and min/max statistics of all data keys in a LeRobotDataset."""
if max_num_samples is None:
max_num_samples = len(dataset)
73 changes: 60 additions & 13 deletions lerobot/common/robot_devices/cameras/intelrealsense.py
@@ -9,6 +9,7 @@
import threading
import time
import traceback
from collections import Counter
from dataclasses import dataclass, replace
from pathlib import Path
from threading import Thread
@@ -28,22 +29,23 @@
SERIAL_NUMBER_INDEX = 1


def find_camera_indices(raise_when_empty=True) -> list[int]:
def find_cameras_info(raise_when_empty=True) -> dict[int, str]:
"""
Find the serial numbers of the Intel RealSense cameras
Find the names and the serial numbers of the Intel RealSense cameras
connected to the computer.
"""
camera_ids = []
cameras_info = {}
for device in rs.context().query_devices():
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX)))
camera_ids.append(serial_number)
name = device.get_info(rs.camera_info.name)
cameras_info[serial_number] = name

if raise_when_empty and len(camera_ids) == 0:
if raise_when_empty and len(cameras_info) == 0:
raise OSError(
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware."
)

return camera_ids
return cameras_info
> **Reviewer:** I am not a fan of modifying this function, to stay consistent with the one in `opencv.py`. Instead, could we have another function that, given camera indices / serial numbers, retrieves the camera name?
>
> **Author:** The issue is that we would still need to iterate through `rs.context().query_devices()` in that function and compare `device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX))` against the given serial number before accessing the name. We can do it, but:
> 1. It would make things harder to read and to understand why the function does those convolutions.
> 2. It would be slower, as you'd iterate over the devices twice.
>
> **Reviewer:** What can we do to unify opencv and intelrealsense on this function? I would imagine `find_cameras_info(raise_when_empty=True) -> list[dict]` for both, with `camera_infos[0]["index"]` being the camera index port for opencv or the intelrealsense serial, and `camera_infos[0]["name"]` not existing for opencv.
>
> **Reviewer:** Nit: `camera_infos` instead of `cameras_info`, for consistency with `camera_ids` and how we usually add an "s" at the end of our list-typed variable names.
>
> **@aliberts (Sep 26, 2024):** As we discussed, I think we should eventually converge to a single `Camera` class with different backends (realsense, opencv...). This would probably simplify the design and avoid having to monkeypatch those classes in the tests. Then we could think about the design of this function (which would be a single one). Wdyt?
> On the nit: I wondered about the exact same thing, but the plural of "info" or "information" in English is without an "s".
>
> **Reviewer:** 1. I agree, but I think we should homogenise opencv and intelrealsense at this level before merging. What do you think of my suggestion?
> 2. I know, but we don't write in English, we write in Python (which is an English dialect), so the English grammar rules don't apply the same way ^^

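The unified return shape floated in this thread could be sketched as follows. This is only a shape sketch under the reviewer's proposal, not the merged implementation: the two backend query helpers are stand-ins for the real `pyrealsense2` and OpenCV probing code.

```python
# Sketch of the proposed unified find_cameras_info(...) -> list[dict].
# The two _query_* helpers are hypothetical stand-ins for device probing.

def _query_realsense_devices() -> list[tuple[int, str]]:
    # Stand-in for iterating rs.context().query_devices()
    return [(128422271347, "Intel RealSense D405")]

def _query_opencv_ports() -> list[int]:
    # Stand-in for probing camera ports with cv2.VideoCapture
    return [0, 2]

def find_cameras_info(backend: str, raise_when_empty: bool = True) -> list[dict]:
    """One dict per camera: "index" is the OpenCV port or the RealSense serial
    number; "name" is present only when the backend can report one."""
    if backend == "intelrealsense":
        cameras = [{"index": serial, "name": name} for serial, name in _query_realsense_devices()]
    elif backend == "opencv":
        cameras = [{"index": port} for port in _query_opencv_ports()]
    else:
        raise ValueError(f"Unknown backend: {backend}")
    if raise_when_empty and not cameras:
        raise OSError(f"Not a single {backend} camera was detected.")
    return cameras
```

With this shape, both backends share one calling convention, and callers simply test `"name" in camera_info` when they need the human-readable name.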


def save_image(img_array, camera_idx, frame_index, images_dir):
Expand All @@ -59,7 +61,7 @@ def save_image(img_array, camera_idx, frame_index, images_dir):

def save_images_from_cameras(
images_dir: Path,
camera_ids: list[int] | None = None,
camera_ids: list[int | None],
fps=None,
width=None,
height=None,
Expand All @@ -69,12 +71,13 @@ def save_images_from_cameras(
Initializes all the cameras and saves images to the directory. Useful to visually identify the camera
associated to a given camera index.
"""
if camera_ids is None:
camera_ids = find_camera_indices()
if len(camera_ids) == 0:
camera_ids = find_cameras_info()

print("Connecting cameras")
cameras = []
for cam_idx in camera_ids:

> **Reviewer:** IMO, the variable should be renamed to `cameras_info`.
>
> **Author:** I agree, this was because it was copied from a very similar function for opencv cameras (see this discussion). I'll wait for that discussion to resolve before changing it.

print(f"{cam_idx=}")
camera = IntelRealSenseCamera(cam_idx, fps=fps, width=width, height=height)
camera.connect()
print(
@@ -93,7 +96,7 @@ save_images_from_cameras(
frame_index = 0
start_time = time.perf_counter()
try:
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
while True:
now = time.perf_counter()

@@ -140,6 +143,7 @@ class IntelRealSenseCameraConfig:
IntelRealSenseCameraConfig(90, 640, 480)
IntelRealSenseCameraConfig(30, 1280, 720)
IntelRealSenseCameraConfig(30, 640, 480, use_depth=True)
IntelRealSenseCameraConfig(30, 640, 480, rotation=90)
```
"""

@@ -149,6 +153,7 @@ class IntelRealSenseCameraConfig:
color_mode: str = "rgb"
use_depth: bool = False
force_hardware_reset: bool = True
rotation: int | None = None

def __post_init__(self):
if self.color_mode not in ["rgb", "bgr"]:
@@ -162,11 +167,15 @@ def __post_init__(self):
f"but {self.fps=}, {self.width=}, {self.height=} were provided."
)

if self.rotation not in [-90, None, 90, 180]:
raise ValueError(f"`rotation` must be in [-90, None, 90, 180] (got {self.rotation})")


class IntelRealSenseCamera:
"""
The IntelRealSenseCamera class is similar to OpenCVCamera class but adds additional features for Intel Real Sense cameras:
- camera_index corresponds to the serial number of the camera,
- can be instantiated with the camera's name — if it's unique — using IntelRealSenseCamera.init_from_name(),
- camera_index won't randomly change as it can be the case of OpenCVCamera for Linux,
- read is more reliable than OpenCVCamera,
- depth map can be returned.
Expand All @@ -181,8 +190,10 @@ class IntelRealSenseCamera:

Example of usage:
```python
camera_index = 128422271347
camera = IntelRealSenseCamera(camera_index)
# Instantiate with camera index (its serial number)
camera = IntelRealSenseCamera(128422271347)
# Or by its name if it's unique
camera = IntelRealSenseCamera.init_from_name("Intel RealSense D405")
camera.connect()
color_image = camera.read()
# when done using the camera, consider disconnecting
@@ -237,6 +248,36 @@ def __init__(
self.depth_map = None
self.logs = {}

# TODO(alibets): Do we keep original width/height or do we define them after rotation?

> **Reviewer:** Definitely nice to keep width/height consistent with the rotated image.

self.rotation = None
if config.rotation == -90:
self.rotation = cv2.ROTATE_90_COUNTERCLOCKWISE
elif config.rotation == 90:
self.rotation = cv2.ROTATE_90_CLOCKWISE
elif config.rotation == 180:
self.rotation = cv2.ROTATE_180

@classmethod
def init_from_name(cls, name: str, config: IntelRealSenseCameraConfig | None = None, **kwargs):
cameras_info = find_cameras_info()
this_name_count = Counter(cameras_info.values())[name]
if this_name_count > 1:
# TODO(aliberts): Test this with multiple identical cameras (Aloha)
raise ValueError(
f"Multiple {name} cameras have been detected. Please use their serial number to instantiate them."
)

name_to_serial_dict = {name: serial for serial, name in cameras_info.items()}
serial = name_to_serial_dict[name]

if config is None:
config = IntelRealSenseCameraConfig()

# Overwrite the config arguments using kwargs
config = replace(config, **kwargs)

return cls(camera_index=serial, config=config, **kwargs)

def connect(self):
if self.is_connected:
raise RobotDeviceAlreadyConnectedError(
@@ -270,7 +311,7 @@ def connect(self):
# valid cameras.
if not is_camera_open:
# Verify that the provided `camera_index` is valid before printing the traceback
available_cam_ids = find_camera_indices()
available_cam_ids = find_cameras_info()
if self.camera_index not in available_cam_ids:
raise ValueError(
f"`camera_index` is expected to be one of these available cameras {available_cam_ids}, but {self.camera_index} is provided instead. "
@@ -323,6 +364,9 @@ def read(self, temporary_color: str | None = None) -> np.ndarray | tuple[np.ndar
f"Can't capture color image with expected height and width ({self.height} x {self.width}). ({h} x {w}) returned instead."
)

if self.rotation is not None:
color_image = cv2.rotate(color_image, self.rotation)

# log the number of seconds it took to read the image
self.logs["delta_timestamp_s"] = time.perf_counter() - start_time

@@ -342,6 +386,9 @@ def read(self, temporary_color: str | None = None) -> np.ndarray | tuple[np.ndar
f"Can't capture depth map with expected height and width ({self.height} x {self.width}). ({h} x {w}) returned instead."
)

if self.rotation is not None:
depth_map = cv2.rotate(depth_map, self.rotation)

return color_image, depth_map
else:
return color_image
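The name-to-serial resolution inside `init_from_name` above can be sketched standalone. Here `cameras_info` mimics the `{serial: name}` mapping returned by `find_cameras_info`; the error raised for an unknown name is an addition for illustration, not part of the diff.

```python
from collections import Counter

# Standalone sketch of the name -> serial lookup used by
# IntelRealSenseCamera.init_from_name: reject ambiguous names,
# then invert the {serial: name} mapping.

def resolve_serial(cameras_info: dict[int, str], name: str) -> int:
    if Counter(cameras_info.values())[name] > 1:
        raise ValueError(
            f"Multiple {name} cameras have been detected. "
            "Please use their serial number to instantiate them."
        )
    name_to_serial = {n: s for s, n in cameras_info.items()}
    if name not in name_to_serial:
        # Illustrative addition: the unknown-name case
        raise ValueError(f"No camera named {name} was detected.")
    return name_to_serial[name]
```

For example, with a single detected D405, `resolve_serial({128422271347: "Intel RealSense D405"}, "Intel RealSense D405")` resolves to the serial `128422271347`, while two cameras sharing a name raise a `ValueError`.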
20 changes: 18 additions & 2 deletions lerobot/common/robot_devices/cameras/opencv.py
@@ -78,13 +78,13 @@ def save_image(img_array, camera_index, frame_index, images_dir):


def save_images_from_cameras(
images_dir: Path, camera_ids: list[int] | None = None, fps=None, width=None, height=None, record_time_s=2
images_dir: Path, camera_ids: list[int | None], fps=None, width=None, height=None, record_time_s=2
):
"""
Initializes all the cameras and saves images to the directory. Useful to visually identify the camera
associated to a given camera index.
"""
if camera_ids is None:
if len(camera_ids) == 0:
camera_ids = find_camera_indices()

print("Connecting cameras")
@@ -156,13 +156,17 @@ class OpenCVCameraConfig:
width: int | None = None
height: int | None = None
color_mode: str = "rgb"
rotation: int | None = None

def __post_init__(self):
if self.color_mode not in ["rgb", "bgr"]:
raise ValueError(
f"`color_mode` is expected to be 'rgb' or 'bgr', but {self.color_mode} is provided."
)

if self.rotation not in [-90, None, 90, 180]:
raise ValueError(f"`rotation` must be in [-90, None, 90, 180] (got {self.rotation})")


class OpenCVCamera:
"""
@@ -223,6 +227,15 @@ def __init__(self, camera_index: int, config: OpenCVCameraConfig | None = None,
self.color_image = None
self.logs = {}

# TODO(alibets): Do we keep original width/height or do we define them after rotation?
self.rotation = None
if config.rotation == -90:
self.rotation = cv2.ROTATE_90_COUNTERCLOCKWISE
elif config.rotation == 90:
self.rotation = cv2.ROTATE_90_CLOCKWISE
elif config.rotation == 180:
self.rotation = cv2.ROTATE_180

def connect(self):
if self.is_connected:
raise RobotDeviceAlreadyConnectedError(f"OpenCVCamera({self.camera_index}) is already connected.")
@@ -328,6 +341,9 @@ def read(self, temporary_color_mode: str | None = None) -> np.ndarray:
f"Can't capture color image with expected height and width ({self.height} x {self.width}). ({h} x {w}) returned instead."
)

if self.rotation is not None:
color_image = cv2.rotate(color_image, self.rotation)

# log the number of seconds it took to read the image
self.logs["delta_timestamp_s"] = time.perf_counter() - start_time
