# Open source Structure-from-Motion and Multi-View Stereo tools with C++

Structure-from-Motion (SFM) is genuinely an interesting topic in computer vision, Basically making 3D structure from something 2D is absolutely mesmerizing 🙂

There two open source yet very robust tools for SFM, which sometimes compiling them might be complicated, here I will share my experience with you.

## 1)VisualSFM

### 1)Glew

Download the glew from SF at http://glew.sourceforge.net/. Do NOT get it from Github as to seems to have some problems.

### Prerequisite:

Install DevIl Image library

open makefile and enable siftgpu_enable_cuda

now go to bin directory and libsiftgpu.so to your vsfm bin directory

### VSFM

you can see the some of my results here:

Note: if you don't have cuda or Nvidia driver you can use your CPU but then you have to limit your file image size and sift binary should be available in your vsfm/bin directory.

Download http://www.cs.ubc.ca/~lowe/keypoints/siftDemoV4.zip and make it and copy the sift binary to vsfm/bin.

## 2)Colmap

The installation fo this one is almost straightforward except you need the latest version of Ceres Solver (do not use the one binary package from Ubuntu they are old and they won't work). So download and make and install the Ceres Solver using the following:

Now in the colmap CMakeList.txt add the following line:

just before "find_package(Ceres REQUIRED)"

and now you can make and install it. You can see some of my results here:

In the next post, I will show you how can you do that in OpenCV.

# Finding Homography Matrix using Singular-value Decomposition and RANSAC in OpenCV and Matlab



Solving a Homography problem leads to solving a set of homogeneous linear equations such below:

\left(
\begin{array}{ccccccccc}
-x1 & -y1 & -1 & 0 & 0 & 0 & x1*xp1 & y1*xp1 & xp1\\
0 & 0 & 0 & -x1 & -y1 & -1 & x1*yp1 & y1*yp1 &  yp1\\
-x2 & -y2 & -1 & 0 & 0 & 0 & x2*xp2 & y2*xp2 & xp2\\
0 & 0 & 0 & -x2 & -y2 & -1 & x2*yp2 & y2*yp2 & yp2\\
-x3 & -y3 & -1 & 0 & 0 & 0 & x3*xp3 & y3*xp3 & xp3\\
0 & 0 & 0 & -x3 & -y3 & -1 & x3*yp3 & y3*yp3 & yp3\\
-x4 & -y4 & -1 & 0 & 0 & 0 & x4*xp4 & y4*xp4 & xp4\\
0 & 0 & 0 & -x4 & -y4 & -1 & x4*yp4 & y4*yp4 & yp4\\
\end{array}
\right) *H=0

$$H^{*} \underset{H}{\mathrm{argmin}}= \|AH\|^{2}$$ subject to $$\|H\|=1$$

We can't use least square since it's a homogeneous linear equations (the other side of equation is 0 therfore we can't just multyly it by the psudo inverse). To solve this problem we use Singular-value Decomposition (SVD).

For any given matrix $$A_{M{\times}N}$$

\underbrace{\mathbf{A}}_{M \times N} = \underbrace{\mathbf{U}}_{M \times M} \times \underbrace{\mathbf{\Sigma}}_{M\times N} \times \underbrace{\mathbf{V}^{\text{T}}}_{N \times N}

$$U$$ is an $${M\times M}$$ matrix with orthogonal matrix (columns are eigen vectors of A).
$$\Sigma$$ is an $${M\times N}$$ matrix with non-negative entries, termed the singular values  (diagonal entries are eigen values of A).
$$V$$ is an $${N\times N}$$ orthogonal matrix.

$$H^{*}$$ is the last column of $$V$$

# Finding Affine Transform with Linear Least Squares



linear least squares is a method of fitting a model to data in which the relation between data and unknown paramere can be expressed in a linear form.
$$Ax=b$$
$$X^{*}= \underset{x}{\mathrm{argmin}}= \|Ax-b\|^{2} =(A^{T}A)^{-1}A^{T}b$$

And testing the code:

# Roll, pitch, yaw using Eigen and KDL Frame

From Eigen documentation:

If you are working with OpenGL 4x4 matrices then Affine3f and Affine3d are what you want.
Since Eigen defaults to column-major storage, you can directly use the Transform::data()  method to pass your transformation matrix to OpenGL.

construct a Transform:

or like this:

But note that unfortunately, because of how C++ works, you can not do this:

and with KDL Frame:

# Generating multivariate normal distribution samples using C++11 and Eigen library.

Eigen is a great tool for matrix operations, here I found a small piece of code in Github that enables you to generate multivariate normal distribution samples using C++11 and Eigen library. The below is an example:

# 2D pose estimation of human body using CNS and PCA

This work is the second part of my master thesis (part I). In this part, I developed an algorithm for 2D pose estimation of the human body. To do this, I created a software with QT that could generate 2D contours representing human body. Then I send these contours for evaluation to CNS(Contrast Normalized Sobel) [1] and finally picked the contours with the highest response.

A 2D contour is a set of points connected to each other. By changing the relative
position of these points we can generate contours that can describe a human, car,
tree or any arbitrary shape in 2D space. In this work first, we need a system that
can generate 2D contour of a human in the different pose. After this step, we need
a system for evaluating the generated contour on an image to see how close is
the contour to the pose of player in the image.

We considered 44 points on the human body to create a contour. These points are
pointing prominent part of the human body like the top of the head, ears, neck, shoulder,
elbow, wrist, knee, ankle and feet. There are several ways to connect these points
to each other i.e. straight lines, pronominal, different kind of curve fitting. In
order to make the contour smooth in a way, it could describe human body curves-
turns meanwhile keep it computationally inexpensive we used cubic splines for
connecting these point.

Now by changing the location of any these 44 points, the interpolated points would
also change and a new contour would be formed. But creating the contour by this method is pretty complicated and computationally expensive. We don’t know
how should we move these points to generate a contour of a human, furthermore
searching through 44 dimensions is computationally expensive.
To make an automatic way for generating human contours we created a training
set of several images of a player in different poses. Then we manually registered
these 44 key points on the body of the player in each individual image. Based on
the items in the training set, we can generate contours of human in the different pose
in detection phase.
We also tried to reduce the dimensions of our problem from 44 to some smaller
meaningful number. For that purpose, we used dimensionality reduction by PCA
(Principal component analysis).

We put the data regarding the x and y position of the point in these contours plus
the interpolated point as a raw entry in a matrix of data. In the next step, we tried to
find principal components of these data by finding the eigenvalue and eigenvector
of these data. After calculating the eigenvalues of these data in the matrix, we
sort them from largest value to smallest one and we pick those who contribute
90%. In the other words we sum up all the eigenvalues and then from sorted
eigenvalues in descending order (largest first) we contribute eigenvalues until the
sum of them is less than 90% of total eigenvalues.

To create a training set for the PCA (principal component analysis) we recorded
the activity of a player in different poses and we manually labeled these 44 key
points.

By selecting 44 points on each image and generating 3 interpolated points for each
pair of points and considering the (x,y) coordinates of each point our data would
have 44 × 2 × 4 = 352 dimension. To calculate PCA we have used OpenCV.
By taking into account the 90%, we found that
The optimal number of dimensions is 4 and we reduce our problem into 4 DOF.

To visualize the generated contour with these 4 parameters a software has been
designed with four sliders for each principal component respectively. By chang-
ing the position of each slider, a new value would be set for respective principal
component and by doing a back projecting with PCA, a new contour would be
generated and displayed on the image.

In figure a principal component values are set to p Φ = [143, 0, 0, 0] and In figure b principal component values are set to p Φ = [−149, 0, 0, 0].

## CNS response from optimal contour on image with clutter noise

In this experiment for each image, several contours have been generated by brute
forcing all the four principal components in PCA space and the CNS response for
them has been calculated. The contour with the highest CNS response has been
selected. It should be remembered that the variance of each component is equal
to the corresponding eigenvalue. In other words, the standard deviation of the
component is equal to the square root of the eigenvalue. To determine the range
of values for each principal component, first, we calculate the square root of each
eigenvalue. Then the range of the search space for each principal component is
set to two times of the standard deviation. The maximum and minimum value for
each principal component observed in the labeled contour also endorses this range
and falls within this interval range.

CNS response for optimal contour on images with clutter background.

### CNS response from optimal contour on image with a clear background

CNS response from optimal contour on the image with the clear background.

The CNS images of the experiment with a clear background. Highlighted part on the image indicate the magnitude of gradient on the image (the more highlighted, the greater gradient magnitude)

estimated contour from brute force has been labeled rejected if at least one of the limbs of player (arms, hands, legs or feet) has been wrongly estimated otherwise it has been labeled as accepted. Overall 0.75% of estimated contours labeled as accepted contours.

[1]Thomas Röfer Judith Müller, Udo Frese. Grab a mug - object detection and grasp motion planning with the nao robot. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots (HUMANOIDS 2012), Osaka, Japan, 2012. URL http://www.informatik.uni-bremen.de/agebv2/downloads/published/muellerhumanoids12.pdf.

# Stitching image using SIFT and Homography

This Matlab tutorial I use SIFT, RANSAC, and homography to find corresponding points between two images. Here I have used vlfeat to find SIFT features.

Full code is available at my GitHub repository

Major steps are:

1.Detect key points and extract descriptors. In the image below you can see some SIFT key points and their descriptor.

2.Matching features:

3.Pruning features

In this step, I only took first k top matches.

number_of_top_matches_to_select=90;

4.Estimating transformation using RANSAC

I used RANSAC to estimate the transformation.

5.Compute optimal transformation

Finally, I used the least square method to find the affine transformation between two images.

# Lucas–Kanade method optical flow with MATLAB

In this tutorial, I will show you how to estimate optical flow based on Lucas–Kanade method.  This project has the following scripts: Optical_flow_estimation, myFlow, myWarp, computeColor, flowToColor.

The myFlow does the main job, it gets two images and a window length (patch length) and a threshold for accepting the optical flow.  In the following, you see the myFlow.  You can uncomment figure function calls to see output result of each step. The other scripts are just for visualization. you can access the code and image in my Github repository.

 first image second image estimated optical flow color map of optical flow