Skip to content

Latest commit

 

History

History
105 lines (78 loc) · 5.27 KB

README.md

File metadata and controls

105 lines (78 loc) · 5.27 KB

3D-Computer-Vision

  • Affine transformation and 3D rotation
  • Camera Calibration
  • Line Detection
  • Plane Detection
  • Pong Game
  • Image Transformations
  • Stereo Reconstruction
  • Homography Estimation

Task 1: Affine transformation and 3D rotation

The code allows users to either interactively manipulate the 3D point cloud or watch it animate with constant rotation and scaling effects.

Affine Transformation

  • Purpose: Transforms 2D images through translation, scaling, rotation, and shearing.
  • Matrix:
 ⎡ a b tx ⎤
 ⎣ c d ty ⎦

3D Rotation

  • Purpose: Rotates 3D objects around x, y, or z axes.
  • Matrix:
  • Matrix: Matrix:
    • Rotation around x-axis:
      [ 1       0          0 ]
      [ 0   cos(θ)   -sin(θ) ]
      [ 0   sin(θ)    cos(θ) ]
      
    • Rotation around y-axis:
      [  cos(θ)   0   sin(θ) ]
      [      0    1     0   ]
      [ -sin(θ)   0   cos(θ) ]
      
    • Rotation around z-axis:
      [ cos(θ)   -sin(θ)    0 ]
      [ sin(θ)    cos(θ)    0 ]
      [     0         0     1 ]
      

Summary

  • Affine Transformation: 2D image manipulation.
  • 3D Rotation: Rotating 3D models around axes.

Task 2: Camera Calibration

This code calculates a camera projection matrix from 3D-2D point correspondences, normalizes the point data, solves for the projection matrix, and then decomposes it into intrinsic parameters, rotation matrix, and translation vector.

Camera calibration is the process of determining a camera's intrinsic parameters (focal length, optical center, and distortion coefficients) and extrinsic parameters (camera position and orientation relative to a world coordinate system). It enables accurate mapping between 3D world coordinates and 2D image coordinates, crucial for tasks like 3D reconstruction and accurate image measurement.

Task 3: Line Detection

The code reads an image, detects edges, fits lines to these edges using RANSAC, and visualizes the result. It Shows the intermediate and final results: the blurred grayscale image, edge-detected image, and the image with fitted lines.

Line detection is a computer vision technique used to identify and locate straight lines within an image. It involves:

  1. Edge Detection: Finding boundaries or edges in the image where there is a significant change in intensity.
  2. Line Fitting: Using algorithms like RANSAC (our case) or Hough Transform to identify straight lines that best fit the detected edges or points.
  3. Visualization: Drawing the detected lines on the image for analysis or further processing. In essence, line detection helps in recognizing linear structures and patterns within an image.

Task 4: Plane Detection

The PlaneFitting.cpp file provides a complete application for plane fitting using RANSAC and Least Squares methods. It reads 3D point data, applies both fitting methods, calculates errors, and saves the results. The MatrixReaderWriter class in MatrixReaderWriter.h and MatrixReaderWriter.cpp is responsible for reading and writing matrix data from files, facilitating the handling of the input and output data.

Plane detection is a fundamental technique for identifying and modeling flat surfaces in 3D data. By applying methods like RANSAC and LSQ, it helps in analyzing and utilizing 3D point clouds in various practical applications.

Task 5: Pong Game

This game serves as a basic example of interactive graphics and event handling in a simple 2D simulation environment using OpenCV.

Task 6: Image Transformations

The code applies affine and perspective transformations to an image using OpenCV, including translation, rotation, scaling, skew, and perspective distortion. It reads an image, transforms it with specified parameters, and displays the result.

In OpenCV, transformation refers to altering the position, orientation, or size of an image using mathematical operations. This includes:

  • Translation: Shifting the image position.
  • Rotation: Rotating the image around a point.
  • Scaling: Changing the image size.
  • Skewing: Distorting the image by slanting.
  • Affine Transformation: A combination of translation, rotation, scaling, and skewing.
  • Perspective Transformation: Altering the image to simulate different viewpoints.

These transformations are typically achieved using transformation matrices applied to image coordinates.

Task 7: Stereo Reconstruction

The code performs 3D reconstruction from stereo images by:

  • Loading and Normalizing Data: Reads and normalizes feature points from two images.
  • Estimating Fundamental Matrix: Uses RANSAC to find the fundamental matrix relating the images.
  • Calculating Essential Matrix and Projection Matrices: Computes these matrices to derive camera perspectives.
  • Triangulating Points: Converts 2D feature points into 3D coordinates using the projection matrices.
  • Outputting Results: Displays matches and saves the 3D point cloud.

It combines feature matching, matrix computations, and 3D point triangulation to reconstruct the scene.

Stereo reconstruction in computer vision is the process of creating a 3D model of a scene from two or more 2D images taken from different viewpoints. By analyzing the disparities between corresponding points in these images, the depth and spatial relationships of objects in the scene can be estimated.