
3.6 Vision

Antoine Dangeard edited this page Sep 12, 2023 · 15 revisions



Overview

The vision package is responsible for everything to do with computer vision on the AUV. This includes getting data, training models, streaming cameras to Python, performing object detection on the camera feeds, estimating the position (and orientation or other features) of detected objects, and building a discrete map of the various objects in the AUV's environment to send to the Planner package.

This diagram shows how the vision package works at a high level: (image)
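The overall flow (detect objects in a frame, estimate their positions, update a discrete map) can be sketched as plain Python. This is a toy illustration, not the package's actual code: the `Detection`, `ObjectMap`, and `run_frame` names are hypothetical, and the real pipeline works on ROS messages rather than callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical detection record: a label plus an estimated 3D position.
@dataclass
class Detection:
    label: str
    position: Tuple[float, float, float]

@dataclass
class ObjectMap:
    """Discrete map of detected objects, keyed by label (illustrative only)."""
    objects: Dict[str, Tuple[float, float, float]] = field(default_factory=dict)

    def update(self, detections: List[Detection]) -> None:
        # Naive policy for the sketch: the latest estimate for each label wins.
        for d in detections:
            self.objects[d.label] = d.position

def run_frame(image, detector: Callable, estimator: Callable,
              world_map: ObjectMap) -> None:
    """One pass of the pipeline: detect, estimate positions, update the map."""
    detections = [Detection(label, estimator(label, image))
                  for label in detector(image)]
    world_map.update(detections)
```

In the real package the "detector" is a trained object detection model, the "estimator" lives in `object_detection_utils.py`, and the resulting map is what gets sent on to the Planner package.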

Usage

This package handles multiple aspects of computer vision, which can be used as follows:

  1. The model_pipeline folder contains all the code and instructions necessary for training object detection models on large datasets of images.
  2. The Visualizer (used to visualize what the AUV thinks its environment looks like) and the Image View GUI (used to view camera feeds and to view and debug detections made by the CV models) can be run using the visualization.launch and gui_image_view.launch launch files, respectively.
  3. To start streaming camera data to ROS topics without any detection, run the stream*******cam.launch files.
  4. To start object detection as well as stream camera data to ROS topics, run object_detection.launch, setting the sim argument (e.g. `sim:=true`) depending on whether you are running vision for a pool test or for the sim.
  5. To turn .bag files (files which record all messages published to specific topics) containing recorded camera streams back into the individual images the camera took, run the convert_bag_to_png.sh script, then play back the recording from the .bag file (the ROS documentation explains how to do this).

Nodes

  • object_detection.py: This node defines the callback for image data from camera topics, runs object detection models on those streams, calls the necessary imported functions to find the position (and other features) of detected objects, and publishes this information as ObjectDetectionFrames.
  • object_detection_utils.py: This file defines all the parameters, functions, and Subscribers/Publishers required to perform object detection on a camera stream, estimate object positions (and other features), and any other utility function needed by object_detection.py.
  • lane_marker_measure.py: This file contains all the computation necessary to estimate one or two bearings from an image of a lane marker.
  • debug_thresholding.py: This node publishes debug image streams to topics which can be viewed with the Image View GUI. It is useful for understanding the steps lane_marker_measure.py takes to estimate bearings, and for debugging and calibrating this process.
  • point_cloud.py: This node is responsible for subscribing to camera feeds, and publishing the point cloud (an image where each pixel is an XYZ position in space instead of an RGB value) resulting from that feed to the appropriate topic, for use by object_detection_utils.py. This is only used for the front camera, since the down camera does not support 3D imaging.
  • render_visualization.py: This node is responsible for listening to all topics related to the AUV's understanding of its environment and its location within that environment, and passing this data to RViz for 3D visualization. This is the file to run when debugging vision, state estimation, or controls.
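To make the "one or two bearings" idea from lane_marker_measure.py concrete, here is a minimal, self-contained sketch. It assumes line segments have already been extracted from the thresholded image (which is the hard part the real node does with contour processing); the function names and the 5-degree merge tolerance are invented for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def segment_bearing(p0: Point, p1: Point) -> float:
    """Bearing in degrees of the segment p0 -> p1 in image coordinates.

    0 degrees points along +x; angles increase toward +y. The real node
    works on thresholded contours; this sketch assumes endpoints are known.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    return math.degrees(math.atan2(dy, dx))

def lane_marker_bearings(segments: List[Tuple[Point, Point]]) -> List[float]:
    """Return one or two bearings: nearly-parallel segments merge into one."""
    unique: List[float] = []
    for a, b in segments:
        bearing = segment_bearing(a, b)
        # Wrap the angular difference into (-180, 180] before comparing,
        # so e.g. 359 and 1 degree count as nearly parallel.
        if all(abs((bearing - u + 180) % 360 - 180) > 5.0 for u in unique):
            unique.append(bearing)
    return unique
```

A straight lane marker yields a single bearing, while a bent one yields two, matching the "one or two bearings" behaviour described above.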

Configuration

The only configuration in the Vision package is in src/config/thresholding_values.txt (values used for thresholding during lane marker bearing estimation) and in object_detection_utils.py (hard-coded information about the dimensions of detectable objects, detection parameters, etc.).
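The exact layout of thresholding_values.txt is not documented here, but a plain-text threshold file is typically loaded with a small parser along these lines. This is a sketch under an assumed format of one `name min max` triple per line with `#` comments; the real file may differ.

```python
from typing import Dict, Tuple

def parse_thresholds(text: str) -> Dict[str, Tuple[int, int]]:
    """Parse 'name min max' threshold triples from plain text.

    ASSUMED format (not confirmed by the package docs): one triple per
    line, blank lines ignored, '#' starts a comment.
    """
    thresholds: Dict[str, Tuple[int, int]] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        name, lo, hi = line.split()
        thresholds[name] = (int(lo), int(hi))
    return thresholds
```

Keeping the values in a text file rather than in code means they can be recalibrated at the pool (e.g. with debug_thresholding.py) without touching the detection logic.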

Dependencies

  • auv_msgs
  • sensor_msgs
  • rospy
  • usb_cam