-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reconstruction from many views #35
Comments
In linear algebra, a rotation matrix is a square matrix that is used to perform rotations in n-dimensional Euclidean space. A rotation matrix is an orthogonal matrix, meaning that its columns and rows are orthonormal vectors (i.e., they have length 1 and are perpendicular to each other). To check if a given matrix R is a coherent rotation matrix, we need to verify two conditions: R is a square matrix with the same number of rows and columns. The function check_coherent_rotation(R: np.ndarray) -> bool takes a numpy array R as input and returns True if it is a coherent rotation matrix, or False otherwise. It uses the numpy linalg.det() function to compute the determinant of R, and compares it to 1.0 within the tolerance specified by epsilon. If the absolute difference between det(R) and 1.0 is less than or equal to epsilon, the function returns True, indicating that R is a coherent rotation matrix. |
In the given C++ code, the 3D-2D correspondences are found by iterating through the good views (already used views), and for each view, the code checks if there are matches between the working view and the old view. If there is a match, it then iterates through the 3D point cloud to find the corresponding 3D point to the 2D point in the old view. This is done by comparing the index of the 2D point in the old view with the indices stored in the index_of_2d_origin attribute of the CloudPoint struct. Here's an example of how you can find 3D-2D correspondences in Python within the generate_point_cloud_general function: def find_3d_2d_correspondences(pcloud, good_views, img_keypoints, matches_matrix):
img_points = []
ppcloud = []
for old_view in good_views:
matches_from_old_to_working = matches_matrix[(old_view, working_view)]
for match in matches_from_old_to_working:
idx_in_old_view = match.queryIdx
for pcldp, cloud_point in enumerate(pcloud):
if idx_in_old_view == cloud_point.index_of_2d_origin[old_view]:
ppcloud.append(cloud_point.pt)
pt_ = img_keypoints[working_view][match.trainIdx].pt
img_points.append(pt_)
break
return ppcloud, img_points You can call this function inside the generate_point_cloud_general function before running solvePnPRansac. To find the camera pose using the 3D-2D correspondences, the code uses the solvePnPRansac function, which takes the 3D points, 2D points, camera matrix K, and distortion coefficients as input and returns the rotation and translation vectors. The rotation vector is then converted to a 3x3 rotation matrix using the Rodrigues function. Finally, the projection matrix P1 is constructed using the rotation matrix and the translation vector. In the given C++ and Python code, the reference image is chosen based on the good views that have already been processed. The reference image is the one with the most inliers or the best quality matches. If you have images [1, 2, 3, 4] and image 1 is the reference image, you will need to find 3D-2D correspondences for each image to estimate the camera poses for each image relative to the reference image. In this case, image 1 would be the base view, and you would find correspondences between image 1 and images 2, 3, and 4. Then, use solvePnPRansac to estimate the camera poses. Here's an example of using solvePnPRansac in the Python code: retval, rvec, tvec, inliers = cv2.solvePnPRansac(ppcloud, img_points, K, dist_coeffs)
# Convert the rotation vector to a rotation matrix
R, _ = cv2.Rodrigues(rvec)
# Construct the projection matrix P1
P1 = np.hstack((R, tvec)) You should incorporate this code into the generate_point_cloud_general function after finding the 3D-2D correspondences. |
Importantimport cv2
import numpy as np
def find_3d_2d_correspondences(pcloud, good_views, img_keypoints, matches_matrix, working_view):
img_points = []
ppcloud = []
for old_view in good_views:
matches_from_old_to_working = matches_matrix[(old_view, working_view)]
for match in matches_from_old_to_working:
idx_in_old_view = match.queryIdx
for cloud_point in pcloud:
if idx_in_old_view == cloud_point.index_of_2d_origin[old_view]:
ppcloud.append(cloud_point.pt)
pt_ = img_keypoints[working_view][match.trainIdx].pt
img_points.append(pt_)
break
return ppcloud, img_points
def generate_point_cloud_general(images: Images, K_matrix: np.ndarray, dist_coeffs: np.ndarray, **kwargs) -> np.ndarray:
point_cloud = []
good_views = set()
for i in range(len(images.images) - 1): # "0", 1, 2, 3 -> (0, 1) and (0, 2) and (0, 3)
img1 = images.images[i]
img2 = images.images[i + 1]
keypoints1 = np.array([img1.keypoints[m.queryIdx].pt for m in images.feature_matches[i].matches])
keypoints2 = np.array([img2.keypoints[m.trainIdx].pt for m in images.feature_matches[i].matches])
P2 = find_camera_matrices(K_matrix, keypoints1, keypoints2, images.feature_matches[i].matches)
if P2 is None:
continue
if i == 0: # Intial Structure
P1 = K_matrix @ np.hstack((np.eye(3), np.zeros((3, 1))))
P2 = K_matrix @ P2
points_4D: list[OpenCV.points4D] = cv2.triangulatePoints(P1, P2, keypoints1.T, keypoints2.T)
points_3D: list[OpenCV.points3D] = (points_4D / points_4D[3])[:3]
point_cloud.append(points_3D)
good_views.add(i)
good_views.add(i + 1)
if i > 0:
working_view = i + 1
ppcloud, img_points = find_3d_2d_correspondences(point_cloud[-1], good_views, [img.keypoints for img in images.images], images.matches_matrix, working_view)
retval, rvec, tvec, inliers = cv2.solvePnPRansac(ppcloud, img_points, K_matrix, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)
P1 = np.hstack((R, tvec))
P2 = K_matrix @ P1
points_4D = cv2.triangulatePoints(K_matrix @ np.eye(3, 4), P2, keypoints1.T, keypoints2.T)
points_3D = (points_4D / points_4D[3])[:3]
point_cloud.append(points_3D)
good_views.add(working_view)
return np.hstack(point_cloud).T |
import numpy as np
import cv2 as OpenCV
from typing import List, Tuple
def check_coherent_rotation(R: np.ndarray) -> bool:
return np.abs(np.linalg.det(R) - 1.0) <= 1e-6
def find_camera_matrices(K: np.ndarray, keypoints_one: np.ndarray, keypoints_two: np.ndarray, matches: List) -> Tuple[np.ndarray, np.ndarray]:
E, mask = OpenCV.findEssentialMat(keypoints_one, keypoints_two, K, method=OpenCV.RANSAC, prob=0.999, threshold=1.0)
_, R, t, _ = OpenCV.recoverPose(E, keypoints_one, keypoints_two, K)
# Check if the resulting rotation is coherent
return (None, None) if not check_coherent_rotation(R) else (R, t)
def generate_point_cloud_general(images: Images, K_matrix: np.ndarray, **kwargs) -> np.ndarray:
point_cloud = []
P1 = K_matrix @ np.hstack((np.eye(3), np.zeros((3, 1))))
for feature_match in images.feature_matches: # "0", 1, 2, 3 -> (0, 1) and (0, 2) and (0, 3)
image_one = feature_match.image_one
image_two = feature_match.image_two
# Extract matched keypoints
keypoints_one = np.array([image_one.keypoints[m.queryIdx].pt for m in feature_match.matches])
keypoints_two = np.array([image_two.keypoints[m.trainIdx].pt for m in feature_match.matches])
# Find Camera Matrices
P2 = find_camera_matrices(K_matrix, keypoints_one, keypoints_two, feature_match.matches)
if P2 is None:
continue
if feature_match == images.feature_matches[0]: # Intial Structure
P2 = K_matrix @ P2
else:
ppcloud, img_points = find_3d_2d_correspondences(point_cloud[-1], [image_one, image_two], [img.keypoints for img in images.images], images.matches_matrix, image_two)
retval, rvec, tvec, inliers = OpenCV.solvePnPRansac(ppcloud, img_points, K_matrix, None)
R, _ = OpenCV.Rodrigues(rvec)
P2 = K_matrix @ np.hstack((R, tvec))
points_4D = OpenCV.triangulatePoints(P1, P2, keypoints_one.T, keypoints_two.T)
points_3D = (points_4D / points_4D[3])[:3]
point_cloud.append(points_3D)
return np.hstack(point_cloud).T |
The solvePnP function in OpenCV estimates the pose (the rotation and translation) of a camera given a set of 3D points in the world coordinate system and their corresponding 2D projections in the image plane. This problem is known as the Perspective-n-Point (PnP) problem.
The PnP problem can be solved using various methods, and OpenCV defaults to the iterative Levenberg-Marquardt optimization algorithm. Other methods available in OpenCV include P3P, EPnP, and DLS. Each method has different assumptions and trade-offs in terms of accuracy, speed, and the minimum number of point correspondences required.
The logic behind the solvePnP function can be summarized as follows:
Input
: The function takes as input a set of 3D points in the world coordinate system, their corresponding 2D projections in the image plane, the intrinsic camera matrix (including focal length and principal point), and an optional distortion coefficients vector.Estimate camera pose
: The function estimates the rotation vector (rvec) and translation vector (tvec) that describe the transformation from the world coordinate system to the camera coordinate system. This is done by minimizing the re-projection error, which is the difference between the observed 2D points and the 2D points obtained by projecting the 3D points using the estimated camera pose.Output
: The function returns the estimated rotation and translation vectors. You can convert the rotation vector to a rotation matrix using the cv2.Rodrigues function if needed.The estimated camera pose can be used to understand the position and orientation of the camera in the world coordinate system or to project additional 3D points onto the image plane. It is an essential component in many computer vision and robotics tasks, such as 3D reconstruction, augmented reality, and visual odometry.
The text was updated successfully, but these errors were encountered: