To get started: Fork this repository then issue
git clone --recursive http://github.com/[username]/geometry-processing-registration.git
See introduction.
Once built, you can execute the assignment from inside the build/
using
./registration [path to mesh1.obj] [path to mesh2.obj]
In this assignment, we will be implementing a version of the iterative closest point (ICP), not to be confused with Insane Clown Posse.
Rather than registering multiple point clouds, we will register multiple triangle mesh surfaces.
This algorithm and its many variants has been use for quite some time to align discrete shapes. One of the first descriptions is given in "A Method for Registration of 3-D Shapes" by Besl & McKay 1992. However, the award-winning PhD thesis of Sofien Bouaziz ("Realtime Face Tracking and Animation" 2015, section 3.2-3.3) contains a more modern view that unifies many of the variants with respect to how they impact the same core optimization problem.
For our assignment, we will assume that we have a triangle mesh representing a
complete scan of the surface
These meshes will not have the same number of vertices or the even the same
topology. We will first explore different ways to measure how well aligned
two surfaces are and then how to optimize the rigid alignment of the partial
surface
We would like to compute a single scalar number that measures how poorly two surfaces are matched. In other words, we would like to measure the distance between two surfaces. Let's start by reviewing more familiar distances:
The usually Euclidean distance between two points
\[ d(\x,\y) = ‖\x - \y‖. \]
When we consider the distance between a point
\[ d(\x,Y) = \inf_{\y ∈ Y} d(\x,\y). \]
written in this way the
infimum considers all
possible points
\[ d(\x,Y) = d((\x,P_Y(\x)) = ‖\x - P_Y(\x)‖. \]
If
We might be tempted to define the distance from surface
\[ D_\text{inf}(X,Y) = \inf_{\x ∈ X} ‖\x - P_Y(\x)‖, \]
but this will not be useful for registering two surfaces: it will measure zero
if even just a single point of
Instead, we should take the supremum of point-to-projection distances over
all points
\[ D_{\overrightarrow{H}}(X,Y) = \sup_{\x ∈ X} ‖\x - P_Y(\x)‖. \]
This surface-to-surface distance measure is called the directed Hausdorff
distance. We may interpret
this as taking the worst of the best: we
let each point
It is easy to verify that
The converse is not true: if
\[ D_{\overrightarrow{H}}(X,Y) ≠ D_{\overrightarrow{H}}(Y,X). \]
We can approximate a lower bound on the Hausdorff distance between two meshes
by densely sampling surfaces
\[ D_{\overrightarrow{H}}(X,Y) ≥ D_{\overrightarrow{H}}(\P_X,Y) = \max_{i=1}^k ‖\p_i - P_Y(\p_i)‖, \]
where we should be careful to ensure that the projection
As our sampling
Even if it were cheap to compute, Hausdorff distance is difficult to
optimize when aligning two surfaces. If we treat the Hausdorff distance
between surfaces
Hausdorff distance can serve as a validation measure, while we turn to
We would like a distance measure between two surfaces that---like Hausdorff
distance---does not require a shared parameterization. Unlike Hausdorff
distance, we would like this distance to diffuse the measurement over the
entire surfaces rather than generate it from the sole worst offender. We can
accomplish this by replacing the supremum in the Hausdorff distance (
\[ D_{\overrightarrow{C}}(X,Y) = \sqrt{\ ∫\limits_{\x∈X} ‖\x - P_Y(\x) ‖² ;dA }. \]
This distance will only be zero if all points on
This distance is suitable to define a matching energy, but is not necessarily
welcoming for optimization: the function inside the square is non-linear. Let's
dig into it a bit. We'll define a directed matching energy
\[ E_{\overrightarrow{C}}(Z,Y) = ∫\limits_{\z∈Z} ‖\z - P_Y(\z) ‖² ;dA = ∫\limits_{\z∈Z} ‖f_Y(\z) ‖² ;dA \]
where we introduce the proximity function
\[ \f(\z) = \z - P_Y(\z). \]
Suppose
Similarly, suppose
But in general, if
In optimization, a common successful strategy to minimize energies composed of
squaring a non-linear functions
This is the core idea behind gradient descent and the Gauss-Newton methods:
minimize f(z)²
z₀ ← initial guess
repeat until convergence
f₀ ← linearize f(z) around z₀
z₀ ← minimize f₀(z)²
Since our
If we make the convenient---however unrealistic---assumption that in the
neighborhood of the closest-point projection
\[ \f(\z) \approx \f_\text{point}(\z) = \z-P_Y(\z₀) \]
In effect, we are assuming that the surface
Minimizing
If we make make a slightly more appropriate assumption that in the neighborhood
of the
\[ \f(\z) \approx \f_\text{plane}(\z) = ((\z-P_Y(\z₀))⋅\n) \n. \]
where the plane that best approximates
Minimizing
Equipped with these linearizations, we may now describe an optimization
algorithm
for minimizing the matching energy between a surface
So far we have derived distances between a surface
Our matching problem can be written as an optimization problem to find the best
possible rotation
\[ \mathop{\text{minimize}}{\t∈\R³,\ \Rot ∈ SO(3)} ∫\limits{\x∈X} ‖\Rot \x + \t - P_Y(T(\x)) ‖² ;dA \]
Even if
\[ \mathop{\text{minimize}}{\t∈\R³,\ \Rot ∈ SO(3)} ∑{i=1}^k ‖\Rot \x_i + \t - P_Y(T(\x_i)) ‖², \]
where
As the name implies, the method proceeds by iteratively finding the closest
point on
If
icp V_X, F_X, V_Y, F_Y
R,t ← initialize (e.g., set to identity transformation)
repeat until convergence
X ← sample source mesh (V_X,F_X)
P0 ← project all X onto target mesh (V_Y,F_Y)
R,t ← update rigid transform to best match X and P0
V_X ← rigidly transform original source mesh by R and t
We would like to find the rotation matrix
In either case, this is still a non-linear optimization problem. This time due to the constraints rather than the energy term.
We require that
\[ \Rot \approx \I + \left(\begin{array}{ccc} 0 & -γ & β \ γ & 0 & -α \ -β & α & 0 \ \end{array}\right) \]
where we can now work directly with the three scalar unknowns
If we apply our linearization of
\[ \mathop{\text{minimize}}{\t∈\R³, α, β, γ} ∑{i=1}^k \left| \left(\begin{array}{ccc} 0 & -γ & β \ γ & 0 & -α \ -β & α & 0 \ \end{array}\right) \x_i + \t - \p_i \right|^2. \]
This energy is quadratic in the translation vector
\[ \mathop{\text{minimize}}{\u∈\R⁶} ∑{i=1}^k \left| \left(\begin{array}{cccccc} 0 & x_{i,3} & -x_{i,2} & 1 & 0 & 0 \ -x_{i,3} & 0 & x_{i,1} & 0 & 1 & 0 \ x_{i,2} & -x_{i,1} & 0 & 0 & 0 & 1 \end{array}\right) \u + \x_i - \p_i \right|^2. \]
This can be written compactly in matrix form as:
\[
\mathop{\text{minimize}}{\u∈\R⁶}
\left|
\underbrace{
\left(\begin{array}{cccccc}
0 & \X_3 & -\X_2 & \One & 0 & 0 \
-\X_3 & 0 & \X_1 & 0 & \One & 0 \
\X_2 & -\X_1 & 0 & 0 & 0 & \One
\end{array}\right)
}{\A}
\u +
\left[\begin{array}{c}
\X_1-\P_1 \
\X_2-\P_2 \
\X_3-\P_3
\end{array}\right]
\right|_F^2,
\]
where we introduce the matrix
This quadratic energy is minimized with its partial derivatives with respect to
entries in
\[ \begin{align} \A^\transpose \A \u & = -\A^\transpose \left[\begin{array}{c} \X_1-\P_1 \ \X_2-\P_2 \ \X_3-\P_3 \end{array}\right] , \ \u & = \left(\A^\transpose \A\right)^{-1} \left(-\A^\transpose \left[\begin{array}{c} \X_1-\P_1 \ \X_2-\P_2 \ \X_3-\P_3 \end{array}\right] \right), \end{align} \]
Solving this small 6×6 system gives us our translation vector
\[ \Rot ← \M := \I + \left(\begin{array}{ccc} 0 & -γ & β \ γ & 0 & -α \ -β & α & 0 \ \end{array}\right) \]
then our transformation will not be rigid. Instead, we should project
In an effort to provide an alternative from "Least-Squares Rigid Motion Using SVD" [Sorkine 2009], this derivation purposefully avoids the trace operator and its various nice properties.
If
In general, it is better to find the closest rotation matrix to
\[
\Rot^* = \argmin_{\Rot ∈ SO(3)} \left| \Rot - \M \right|_F^2,
\]
where $|\X|F^2$ computes the squared Frobenius
norm of the matrix
$\X$ (i.e., the sum of all squared element values. In MATLAB syntax:
sum(sum(A.^2))
). We can expand the norm by taking advantage of the associativity
property of the Frobenius
norm:
\[
\Rot^* = \argmin{\Rot ∈ SO(3)} \left| \M \right|_F^2 + \left| \Rot
\right|_F^2 - 2 \left<\Rot, \M \right>_F,
\]
where sum(sum(A.*B))
). We can drop the Frobenius norm
of
We now take advantage of the singular value
decomposition of
\[ \Rot^* = \argmax_{\Rot ∈ SO(3)} \left<\Rot,\U Σ \V^\transpose \right>_F. \]
The Frobenius inner product will let us move the products by
Recall some linear algebra properties:
- Matrix multiplication (on the left) can be understood as acting on each column: $\A \B = \A [\B_1 \ \B_2 \ … \ \B_n] = [\A \B_1 \ \A \B_2 \ …
\A \B_n]$,- The Kronecker product
$\I ꕕ \A$ of the identity matrix$\I$ of size$k$ and a matrix$\A$ simply repeats$\A$ along the diagonal k times. In MATLAB,repdiag(A,k)
,- Properties 1. and 2. imply that the vectorization of a matrix product
$\B\C$ can be written as the Kronecker product of the #-columns-in-$\C$ identity matrix and$\B$ times the vectorization of$\C$ :$\text{vec}(\B\C) = (\I ꕕ \B)\text{vec}(\C)$ ,- The transpose of a Kronecker product is the Kronecker product of transposes:
$(\A ꕕ \B)^\transpose = \A^\transpose ꕕ \B^\transpose$ ,- The Frobenius inner product can be written as a dot product of vectorized matrices:
$<\A,\B>_F = \text{vec}(\A) ⋅ \text{vec}(\B) = \text{vec}(\A)^\transpose \text{vec}(\B)$ ,- Properties 3., 4., and 5. imply that Frobenius inner product of a matrix
$\A$ and the matrix product of matrix$\B$ and$\C$ is equal to the Frobenius inner product of the matrix product of the transpose of$\B$ and$\A$ and the matrix$\C$ :$<\A,\B\C>_F = \text{vec}(\A)^\transpose \text{vec}(\B\C) = \text{vec}(\A)^\transpose (\I ꕕ \B)\text{vec}(\C) = \text{vec}(\A)^\transpose (\I ꕕ \B^\transpose)^\transpose \text{vec}(\C) = \text{vec}(\B^\transpose\A)^\transpose \text{vec}(\C) = <\B^\transpose \A,\C>_F$ .
\[ \Rot^* = \argmax_{\Rot ∈ SO(3)} \left<\U^\transpose \Rot \V, Σ \right>_F. \]
Now,
\[ \Rot^* = \U \left( \argmax_{Ω ∈ O(3),\ \det{Ω} = \det{\U\V^\transpose}} \left<Ω, Σ \right>_F \right) \V^\transpose. \]
This ensures that as a result $\Rot^$ will be a rotation: $\det{\Rot^} = 1$.
Recall that
$Σ∈\R^{3×3}$ is a non-negative diagonal matrix of singular values sorted so that the smallest value is in the bottom right corner.
Because
\[
Ω^*_{ij} = \begin{cases}
1 & \text{ if
Finally, we have a formula for our optimal rotation:
\[ \Rot = \U Ω^* \V^\transpose. \]
Interestingly, despite the non-linear constraint on
$\Rot$ there is actually a closed-form solution to the point-to-point matching problem:\[ \mathop{\text{minimize}}{\t∈\R³,\ \Rot ∈ SO(3)} ∑{i=1}^k ‖\Rot \x_i + \t - \p_i‖², \]
This is a variant of what's known as a Procrustes problem, named after a mythical psychopath who would kidnap people and force them to fit in his bed by stretching them or cutting off their legs. In our case, we are forcing
$\Rot$ to be perfectly orthogonal (no "longer", no "shorter).This energy is quadratic in
$\t$ and there are no other constraints on$\t$ . We can immediately solve for the optimal$\t^*$ ---leaving$\Rot$ as an unknown---by setting all derivatives with respect to unknowns in$\t$ to zero:\[ \begin{align} \t^* &= \argmin_{\t} ∑_{i=1}^k ‖\Rot \x_i + \t - \p_i‖² \\ &= \argmin_\t \left|\Rot \X^\transpose + \t \One^\transpose - \P^\transpose\right|^2_F, \end{align} \] where
$\One ∈ \R^{k}$ is a vector ones. Setting the partial derivative with respect to$\t$ of this quadratic energy to zero finds the minimum: \[ \begin{align} 0 &= \frac{∂}{∂\t} \left|\Rot \X^\transpose + \t \One^\transpose - \P^\transpose\right|^2_F \\ &= \One^\transpose \One \t + \Rot \X^\transpose \One - \P^\transpose \One, \end{align} \]Rearranging terms above reveals that the optimal
$\t$ is the vector aligning the centroids of the points in$\P$ and the points in$\X$ rotated by the---yet-unknown---$\Rot$. Introducing variables for the respective centroids$\hat{\p} = \tfrac{1}{k} ∑_{i=1}^k \p_i$ and$\hat{\x} = \tfrac{1}{k} ∑_{i=1}^k \x_i$ , we can write the formula for the optimal$\t$ :\[ \begin{align} \t &= \frac{\P^\transpose \One - \Rot \X^\transpose \One}{ \One^\transpose \One} \\ &= \hat{\p} - \Rot \hat{\x}. \end{align} \]
Now we have a formula for the optimal translation vector
$\t$ in terms of the unknown rotation$\Rot$ . Let us substitute this formula for all occurrences of$\t$ in our energy written in its original summation form:\[ \mathop{\text{minimize}}{\Rot ∈ SO(3)} ∑\limits{i=1}^k \left| \Rot \x_i + ( \hat{\p} - \Rot\hat{\x}) - \p_i \right|^2 \
\mathop{\text{minimize}}{\Rot ∈ SO(3)} ∑\limits{i=1}^k \left| \Rot (\x_i - \hat{\x}) - (\p_i - \hat{\p}) \right|^2 \\ \mathop{\text{minimize}}{\Rot ∈ SO(3)} ∑\limits{i=1}^k \left| \Rot \overline{\x}_i - \overline{\p}i \right|^2 \\ \mathop{\text{minimize}}{\Rot ∈ SO(3)} \left| \Rot \overline{\X}^\transpose - \overline{\P}^\transpose \right|_F^2, \]where we introduce
$\overline{\X} ∈ \R^{k × 3}$ where the ith row contains the relative position of the ith point to the centroid$\hat{\x}$ : i.e.,$\overline{\x}_i = (\x_i - \hat{\x})$ (and analagously for$\overline{\P}$ ).Now we have the canonical form of the orthogonal procrustes problem. To find the optimal rotation matrix
$\Rot^*$ we will massage the terms in the minimization until we have a maximization problem involving the Frobenius inner-product of the unknown rotation$\Rot$ and covariance matrix of$\X$ and$\P$ :\[ \begin{align} \Rot^* &= \argmin_{\Rot ∈ SO(3)} \left| \Rot \overline{\X}^\transpose - \overline{\P}^\transpose \right|F^2 \\ &= \argmin{\Rot ∈ SO(3)} \left<\Rot \overline{\X}^\transpose - \overline{\P}^\transpose , \Rot \overline{\X}^\transpose - \overline{\P}^\transpose \right>F\\ &= \argmin{\Rot ∈ SO(3)} \left| \overline{\X} \right|_F^2 + \left| \overline{\P} \right|_F^2 - 2 \left<\Rot \overline{\X}^\transpose , \overline{\P}^\transpose \right>F\\ &= \argmax{\Rot ∈ SO(3)} \left<\Rot,\overline{\P}^\transpose,\overline{\X}\right>F\\ &= \argmax{\Rot ∈ SO(3)} \left<\Rot,\M\right>_F\ \end{align} \]
Letting
$\M = \overline{\P}^\transpose,\overline{\X}$ we can now follow the steps above using singular value decomposition to find the optimal$\Rot$ .
If we apply our linearization of
\[ \mathop{\text{minimize}}{\t∈\R³, α, β, γ} ∑{i=1}^k \left( \left( \left(\begin{array}{ccc} 0 & -γ & β \ γ & 0 & -α \ -β & α & 0 \ \end{array}\right)\x_i + \x_i + \t - \p_i \right)⋅\n_i \right)^2. \]
We can follow similar steps as above. Let's gather a vector of unknowns:
\[ \mathop{\text{minimize}}{\u∈\R⁶} ∑{i=1}^k \left(\n_i^\transpose \left(\begin{array}{cccccc} 0 & x_{i,3} & -x_{i,2} & 1 & 0 & 0 \ -x_{i,3} & 0 & x_{i,1} & 0 & 1 & 0 \ x_{i,2} & -x_{i,1} & 0 & 0 & 0 & 1 \end{array}\right) \u + \n_i^\transpose(\x_i - \p_i) \right)^2. \]
This can be written compactly in matrix form as:
\[ \mathop{\text{minimize}}_{\u∈\R⁶} \left( \left[\begin{array}{ccc} \text{diag}(\N_1) & \text{diag}(\N_2) & \text{diag}(\N_2)\end{array}\right] \left( \A \u + \left[\begin{array}{c} \X_1-\P_1 \ \X_2-\P_2 \ \X_3-\P_3 \end{array}\right]\right) \right)^2, \]
where
This energy is quadratic in
To the best of my knowledge, no known closed-form solution exists. I am not sure whether it can not exist or just whether no one has figured it out (or they did and I just do not know about it).
Our last missing piece is to sample the surface of a triangle mesh
We would like our random
variable
Suppose we have a way to evaluate a continuous random point
\[ h(T) g_T(\x) = \frac{A_T}{A_X} \frac{1}{A_T} = \frac{1}{A_X} = f(\x). \]
In order to pick a point uniformly randomly in a triangle with corners
\[ \x = \v_1 + α (\v_2-\v_1) + β (\v_3 - \v_1) \]
where
Assuming we know how to draw a continuous uniform random variable
We can achieve this by first computing the cumulative
sum
\[ C_i = ∑_{j=1}^i \frac{A_j}{A_X}, \]
Then our random index is found by identifying the first entry in
Try profiling your code. Where is most of the computation time spent?
If you have done things right, the majority of time is spent computing
point-to-mesh distances. For each query point, the computational
complexity of
computing its distance to a mesh with
This can be dramatically improved (e.g., to
This reading task is not directly graded, but it's expected that you read and understand sections 3.2-3.3 of Sofien Bouaziz's PhD thesis "Realtime Face Tracking and Animation" 2015. Understanding this may require digging into wikipedia, other online resources or other papers.
You may not use the following libigl functions:
igl::AABB
igl::fit_rotations
igl::hausdorff
igl::point_mesh_squared_distance
igl::point_simplex_squared_distance
igl::polar_dec
igl::polar_svd3x3
igl::polar_svd
igl::random_points_on_mesh
You are encouraged to use the following libigl functions:
igl::cumsum
computes cumulative sumigl::doublearea
computes triangle areasigl::per_face_normals
computes normal vectors for each triangle face
Generate n
random points uniformly sampled on a given triangle mesh with
vertex positions VX
and face indices FX
.
Compute the distance d
between a given point x
and the closest point p
on
a given triangle with corners a
, b
, and c
.
Compute the distances D
between a set of given points X
and their closest
points P
on a given mesh with vertex positions VY
and face indices FY
.
For each point in P
also output a corresponding normal in N
.
It is OK to assume that all points in
P
lie inside (rather than exactly at vertices or exactly along edges) for the purposes of normal computation inN
.
Compute a lower bound on the directed Hausdorff distance from a given mesh
(VX
,FX
) to another mesh (VY
,FY
). This function should be implemented by
randomly sampling the
Given a 3×3 matrix M
, find the closest rotation matrix R
.
Given a set of source points X and corresponding target points P, find the optimal rigid transformation (R,t) that aligns X to P, minimizing the point-to-point matching energy.
You may implement either that "Approximate" solution via linearizing the rotation matrix or the "closed form" solution
Given a set of source points X
and corresponding target points P
and their
normals N
, find the optimal rigid transformation (R
,t
) that aligns X
to
planes passing through P
orthogonal to N
, minimizing the point-to-point
matching energy.
Conduct a single iteration of the iterative closest point method align
(VX
,FX
) to (VY
,FY
) by finding the rigid transformation (R
,t
)
minimizing the matching energy.
The caller can specify the number of samples num_samples
used to approximate
the integral over method
(point-to-point or
point-to-plane).