3.6 Vision
The vision package is responsible for everything to do with computer vision on the AUV. This includes getting data, training models, streaming cameras to Python, performing object detection on the camera feeds, estimating the position (and orientation or other features) of detected objects, and building a discrete map of the various objects in the AUV's environment to send to the Planner package.
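The end product of this pipeline is a stream of detection records sent to the Planner. As a rough illustration only, here is a minimal Python sketch of the kind of data an `ObjectDetectionFrame` message might carry; every field name below is a hypothetical placeholder, not the actual message definition:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DetectedObject:
    """Hypothetical sketch of one detection; the real message fields may differ."""
    label: str                # e.g. "lane_marker", "gate" (placeholder names)
    x: float                  # estimated position relative to the AUV (metres)
    y: float
    z: float
    theta_z: Optional[float]  # estimated bearing/orientation, if applicable
    confidence: float         # model confidence in [0, 1]

# Example: a lane marker detected 2 m ahead and slightly below the AUV
det = DetectedObject(label="lane_marker", x=2.0, y=0.0, z=-0.5,
                     theta_z=30.0, confidence=0.87)
print(det.label, det.confidence)
```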
This is a diagram of how the vision package works at a high level:
This package handles multiple aspects of computer vision, which can be used as follows:
- The `model_pipeline` folder contains all the code and instructions necessary for training object detection models on large datasets of images.
- The Visualizer (used to visualize what the AUV thinks its environment looks like) and the Image View GUI (used to view camera feeds and to view and debug detections made by the CV models) can be run using the `visualization.launch` and `gui_image_view.launch` launch files respectively.
- To start streaming camera data to ROS topics without any detection, run the `stream*******cam.launch` files.
- To start object detection as well as stream camera data to ROS topics, run `object_detection.launch` with the argument `sim` set depending on whether you are running vision for a pool test or for the sim.
- To turn .bag files (files which record all messages published to specific topics) containing recorded camera streams into the individual images the camera took, run the `convert_bag_to_png.sh` file, then play back the recording from the .bag file (the ROS documentation explains how to do this well).
- `object_detection.py`: This node defines the callback for image data from camera topics, runs object detection models on those streams, calls the necessary imported functions to find the position (and other features) of detected objects, and publishes this information as `ObjectDetectionFrame`s.
- `object_detection_utils.py`: This file defines all the parameters, functions, and Subscribers/Publishers required to perform object detection on a camera stream and estimate object positions (and other features), as well as any other utility functions needed by `object_detection.py`.
- `lane_marker_measure.py`: This file contains all the computation necessary to estimate one or two bearings from an image of a lane marker.
- `debug_thresholding.py`: This node publishes debug image streams to topics which can be viewed with the Image View GUI. It is useful for understanding the steps `lane_marker_measure.py` takes to estimate bearings, and for debugging issues with or calibrating this process.
- `point_cloud.py`: This node subscribes to camera feeds and publishes the resulting point cloud (an image where each pixel is an XYZ position in space instead of an RGB value) to the appropriate topic, for use by `object_detection_utils.py`. This is only used for the front camera, since the down camera does not support 3D imaging.
- `render_visualization.py`: This node listens to all topics related to the AUV's understanding of its environment and its location within that environment, and passes this data to RViz for 3D visualization. This is the file to run when debugging vision, state estimation, or controls.
The only configuration in the vision package is in `src/config/thresholding_values.txt` (these values are used for thresholding during lane marker bearing estimation) and in `object_detection_utils.py` (hard-coded information about the dimensions of detectable objects, detection parameters, etc.).
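The exact layout of `thresholding_values.txt` is not documented here; purely as an illustration, this is a parser sketch assuming a hypothetical `name lower upper` line format, which may not match the real file:

```python
def parse_thresholds(text):
    """Parse hypothetical 'name lower upper' lines into a dict,
    skipping blank lines and '#' comments. The real file format may differ."""
    thresholds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, lower, upper = line.split()
        thresholds[name] = (int(lower), int(upper))
    return thresholds

sample = """
# hue/saturation thresholds for lane marker detection (example values)
hue 10 35
saturation 80 255
"""
print(parse_thresholds(sample))
```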
This package depends on the following ROS packages:
- `auv_msgs`
- `sensor_msgs`
- `rospy`
- `usb_cam`