In this video, I explain the geometry of stereo vision. Please have a look at my other tutorial about finding the essential and fundamental matrix.
Open source Structure-from-Motion and Multi-View Stereo tools with C++
Structure-from-Motion (SFM) is a genuinely interesting topic in computer vision. Recovering 3D structure from 2D images is absolutely mesmerizing 🙂
There are two open-source yet very robust tools for SFM, and compiling them can sometimes be complicated, so here I will share my experience with you.
1)VisualSFM
Prerequisite:
1)Glew
Download Glew from SourceForge at http://glew.sourceforge.net/. Do NOT get it from GitHub, as it seems to have some problems.
cd build/cmake/ && mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX:PATH=~/usr ..
make -j8 all install
2)SiftGPU
Prerequisite:
Install the DevIL image library:
sudo apt-get install libdevil-dev
git clone https://github.com/pitzer/SiftGPU
Open the makefile and enable siftgpu_enable_cuda:
siftgpu_enable_cuda=1
make -j8
Now go to the bin directory and copy libsiftgpu.so to your vsfm bin directory.
VSFM
1) Download the vsfm program from http://ccwu.me/vsfm/download/VisualSFM_linux_64bit.zip
1-1) cd vsfm
1-2) make
2) Download PBA from https://grail.cs.washington.edu/projects/mcba/pba_v1.0.5.zip
2-1) If you have no GPU, rename makefile_no_gpu to makefile
2-2) make
2-3) Rename bin/libpba_no_gpu.so to libpba.so and copy it into vsfm/bin
3) Download CMVS-PMVS from https://github.com/pmoulon/CMVS-PMVS
3-1) cd CMVS-PMVS-master/program
3-2) mkdir build && cd build
3-3) cmake ../ && make -j8
3-4) Copy cmvs, genOption, and pmvs2 from build/main into vsfm/bin
4) Copy libsiftgpu.so from the previous section to vsfm/bin
5) Run:
export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH
./VisualSFM
You can see some of my results here:
Note: if you don’t have CUDA or an NVIDIA driver, you can use your CPU, but then you have to limit your image size, and the sift binary must be available in your vsfm/bin directory.
Download http://www.cs.ubc.ca/~lowe/keypoints/siftDemoV4.zip, build it, and copy the sift binary to vsfm/bin.
2)Colmap
The installation of this one is almost straightforward, except that you need the latest version of Ceres Solver (do not use the binary packages from Ubuntu; they are old and they won’t work). So download, build, and install Ceres Solver using the following:
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX:PATH=~/usr .. && make -j8 all install
Now, in the COLMAP CMakeLists.txt, add the following line
set(Ceres_DIR "$ENV{HOME}/usr/lib/cmake/Ceres")
just before “find_package(Ceres REQUIRED)”
Now you can make and install it. You can see some of my results here:
In the next post, I will show you how you can do that in OpenCV.
Effect of focal length on the image
Have you ever wondered why the selfie you take with your phone looks so bad, while when you take a photo with a decent Nikon or Canon DSLR you get a nice portrait?
This happens due to the size of the sensor and the focal length. Since the sensor in your cell phone is very small, a lens with a very short focal length is needed to collect the light. You can see a nice demonstration of changing the focal length on the same scene:
Another demonstration:
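To put rough numbers on this: the horizontal field of view follows from the sensor width and the focal length as FOV = 2·atan(sensor width / (2·focal length)). The short C++ sketch below compares a typical phone camera with a full-frame DSLR carrying an 85 mm portrait lens; the sensor and lens figures are assumptions picked only for illustration. The phone’s much wider field of view means you hold it very close to the face, which exaggerates perspective and is a big part of why the selfie looks unflattering.

#include <cmath>
#include <cstdio>

// Horizontal field of view (degrees) from sensor width and focal length, both in mm.
double fov_deg(double sensor_width_mm, double focal_length_mm) {
    const double pi = std::acos(-1.0);
    return 2.0 * std::atan(sensor_width_mm / (2.0 * focal_length_mm)) * 180.0 / pi;
}

int main() {
    // Illustrative numbers only: a phone sensor roughly 6 mm wide behind a ~4 mm lens,
    // versus a full-frame DSLR (36 mm wide sensor) with an 85 mm portrait lens.
    std::printf("phone: %.1f degrees\n", fov_deg(6.0, 4.0));   // ~74 degrees, wide angle
    std::printf("DSLR : %.1f degrees\n", fov_deg(36.0, 85.0)); // ~24 degrees, narrow
    return 0;
}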
Creating a 3D model of buildings and monuments using structure from motion
Stitching images using SIFT and homography
In this MATLAB tutorial, I use SIFT, RANSAC, and homography to find corresponding points between two images. Here I have used vlfeat to find SIFT features.
The full code is available in my GitHub repository.
The major steps are:
0. Add vlfeat to your MATLAB workspace:
run('<path_to_vlfeat>/toolbox/vl_setup')
1. Detect keypoints and extract descriptors. In the image below, you can see some SIFT keypoints and their descriptors.
image_left=imread('images/image-left.jpg');
scale=0.20;
image_left=imresize(image_left,scale);
image_left=single(rgb2gray(image_left));
[f_image_left,d_image_left] = vl_sift(image_left);
2. Matching features:
[dims, image_right_features]=size(d_image_right);
[dims, image_left_features]=size(d_image_left);
M=zeros(image_right_features,image_left_features);
for i=1:image_right_features
    for j=1:image_left_features
        diff = double(d_image_right(:,i)) - double(d_image_left(:,j));
        distance = sqrt(diff' * diff);
        M(i,j)=distance;
    end
end
3. Pruning features
In this step, I only keep the first k top matches (here, k = 90):
number_of_top_matches_to_select=90;
M_Vec=sort(M(:));
M=(M < M_Vec( number_of_top_matches_to_select ) );
4. Estimating the transformation using RANSAC
I used RANSAC to estimate the transformation.
for i=1:M_Rows
    for j=1:M_Cols
        if M(i,j)>0
            x_right=round(f_image_right(1,i));
            y_right=round(f_image_right(2,i));
            x_left=round(f_image_left(1,j));
            y_left=round(f_image_left(2,j));
            left_matches=vertcat(left_matches, [x_left,y_left]);
            right_matches=vertcat(right_matches, [x_right,y_right]);
        end
    end
end
p=99/100;
e=50/100;
[s,dim]=size(left_matches);
s=3
N=(log(1-p))/(log(1-(1-e)^s))
iter=N; % chosen so that with probability 99% at least one random sample is free from outliers
num=3;  % minimum number of samples to make the model, i.e. 2 for a line and 3 for an affine transform
threshDist=5; % 5 pixels
[best_number_of_inliers,best_M,index_of_matches]=affine_ransac_est(left_matches,right_matches,num,iter,threshDist);
5. Compute the optimal transformation
Finally, I used the least-squares method to find the affine transformation between the two images.
best_M=affine_least_square_generalized(best_left_matches, best_right_matches);
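For readers who prefer C++, the same least-squares affine fit can be sketched with Eigen as below. This is my own illustrative version, not the affine_least_square_generalized function from the repository; the direction of the mapping (left points onto right points) is an assumption, and the two point lists are assumed to be ordered, equally sized correspondences.

#include <Eigen/Dense>
#include <vector>

// Least-squares estimate of a 2x3 affine transform mapping "left" points to "right" points.
// Each correspondence contributes two rows of the linear system A * p = b,
// where p = [a11 a12 tx a21 a22 ty]^T.
Eigen::Matrix<double, 2, 3> affineLeastSquares(
    const std::vector<Eigen::Vector2d>& left,
    const std::vector<Eigen::Vector2d>& right) {
    const int n = static_cast<int>(left.size());
    Eigen::MatrixXd A(2 * n, 6);
    Eigen::VectorXd b(2 * n);
    for (int i = 0; i < n; ++i) {
        A.row(2 * i)     << left[i].x(), left[i].y(), 1, 0, 0, 0;
        A.row(2 * i + 1) << 0, 0, 0, left[i].x(), left[i].y(), 1;
        b(2 * i)     = right[i].x();
        b(2 * i + 1) = right[i].y();
    }
    // QR-based solve of the over-determined system minimizes ||A p - b||^2.
    Eigen::VectorXd p = A.colPivHouseholderQr().solve(b);
    Eigen::Matrix<double, 2, 3> M;
    M << p(0), p(1), p(2),
         p(3), p(4), p(5);
    return M;
}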
Colour-based object tracking with OpenCV
In many applications, you need to track an object. One simple method is colour-based tracking. I have developed a simple tool for that with OpenCV. All you have to do is adjust the high and low HSV values with the sliders in the left window until the image is filtered so that you only see your desired object; here I’m tracking a green pen, a blue water container, and a red bottle top. The code is pretty easy and straightforward, but I found the different pieces of code scattered all over the internet, so I changed and adapted them to work together.
The code is on my GitHub account.
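The core idea boils down to an HSV threshold followed by a little clean-up. Below is a minimal OpenCV (C++) sketch of that idea; the threshold values are hard-coded guesses for a green object, whereas in the actual tool they come from the HSV sliders, and the centroid drawing is just one simple way to mark the tracked object.

#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);          // default webcam
    if (!cap.isOpened()) return -1;

    // Example bounds roughly covering a green object; in practice, tune with sliders.
    cv::Scalar low(40, 80, 80), high(80, 255, 255);

    cv::Mat frame, hsv, mask;
    while (cap.read(frame)) {
        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
        cv::inRange(hsv, low, high, mask);

        // Remove speckle noise, then grow the blob back.
        cv::erode(mask, mask, cv::Mat(), cv::Point(-1, -1), 2);
        cv::dilate(mask, mask, cv::Mat(), cv::Point(-1, -1), 2);

        // Centroid of the remaining blob gives the object position.
        cv::Moments m = cv::moments(mask, true);
        if (m.m00 > 1e3) {
            cv::Point c(static_cast<int>(m.m10 / m.m00),
                        static_cast<int>(m.m01 / m.m00));
            cv::circle(frame, c, 8, cv::Scalar(0, 0, 255), 2);
        }

        cv::imshow("tracking", frame);
        cv::imshow("mask", mask);
        if (cv::waitKey(30) == 27) break;  // Esc to quit
    }
    return 0;
}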
Lucas–Kanade method optical flow with MATLAB
In this tutorial, I will show you how to estimate optical flow based on the Lucas–Kanade method. This project has the following scripts: Optical_flow_estimation, myFlow, myWarp, computeColor, flowToColor.
myFlow does the main job: it takes two images, a window length (patch size), and a threshold for accepting the optical flow. The other scripts are just for visualization; you can uncomment the figure calls in myFlow to see the output of each step. You can access the code and images in my GitHub repository.
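For reference, the core per-patch Lucas–Kanade computation that such an estimator performs can be sketched as follows. This is a rough C++ equivalent of what myFlow does per window, not the MATLAB code itself; the function and variable names are my own, the images are assumed to be CV_32F grayscale, and the caller is assumed to keep the patch inside the image bounds.

#include <opencv2/opencv.hpp>
#include <cmath>

// Single-window Lucas-Kanade step.
// I1, I2: consecutive grayscale float images; (x, y): patch centre;
// win: patch half-width; tau: threshold on the smallest eigenvalue of the structure tensor.
bool lucasKanadeAt(const cv::Mat& I1, const cv::Mat& I2,
                   int x, int y, int win, double tau,
                   cv::Point2d& flow) {
    double sxx = 0, sxy = 0, syy = 0, sxt = 0, syt = 0;
    for (int r = y - win; r <= y + win; ++r) {
        for (int c = x - win; c <= x + win; ++c) {
            // Central-difference spatial gradients and temporal derivative.
            double ix = (I1.at<float>(r, c + 1) - I1.at<float>(r, c - 1)) * 0.5;
            double iy = (I1.at<float>(r + 1, c) - I1.at<float>(r - 1, c)) * 0.5;
            double it = I2.at<float>(r, c) - I1.at<float>(r, c);
            sxx += ix * ix; sxy += ix * iy; syy += iy * iy;
            sxt += ix * it; syt += iy * it;
        }
    }
    // Accept the patch only if the structure tensor is well conditioned,
    // i.e. its smaller eigenvalue exceeds tau (rejects flat, textureless regions).
    double tr = sxx + syy, det = sxx * syy - sxy * sxy;
    double lambda_min = tr / 2.0 - std::sqrt(tr * tr / 4.0 - det);
    if (lambda_min < tau) return false;
    // Solve the 2x2 normal equations  G * [u v]^T = -b.
    flow.x = (-syy * sxt + sxy * syt) / det;
    flow.y = ( sxy * sxt - sxx * syt) / det;
    return true;
}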
[Figures: first image, second image, estimated optical flow, and color map of the optical flow]
6DOF pose estimation with ArUco markers and ROS
ArUco is a simple yet great library for augmented reality applications. In this tutorial, I’m gonna show you how to track ArUco markers and estimate their 6DOF pose with ROS.
For this tutorial, you only need a USB camera. You need to calibrate your camera first. If you don’t know how to do that, just follow my other tutorial on camera calibration with ROS.
1. First, let’s install the required packages:
sudo apt-get install ros-indigo-usb-cam ros-indigo-aruco-ros
2. You need two launch files. One of them will publish images from your USB cam:
<launch>
  <arg name="video_device" default="/dev/video0" />
  <arg name="image_width" default="640" />
  <arg name="image_height" default="480" />

  <node name="usb_cam" pkg="usb_cam" type="usb_cam_node" output="screen" >
    <param name="video_device" value="$(arg video_device)" />
    <param name="image_width" value="$(arg image_width)" />
    <param name="image_height" value="$(arg image_height)"/>
    <param name="pixel_format" value="mjpeg" />
    <param name="camera_frame_id" value="usb_cam" />
    <param name="io_method" value="mmap"/>
  </node>
</launch>
Save it as usb_cam_stream_publisher.launch.
3. The other launch file will find the ArUco marker in the image and publish its 6DOF pose. Open your editor and paste the following into it:
<launch>
  <arg name="markerId" default="701"/>
  <arg name="markerSize" default="0.05"/> <!-- in meters -->
  <arg name="eye" default="left"/>
  <arg name="marker_frame" default="marker_frame"/>
  <arg name="ref_frame" default=""/> <!-- leave empty and the pose will be published wrt param parent_name -->
  <arg name="corner_refinement" default="LINES" /> <!-- NONE, HARRIS, LINES, SUBPIX -->

  <node pkg="aruco_ros" type="single" name="aruco_single">
    <remap from="/camera_info" to="/usb_cam/camera_info" />
    <remap from="/image" to="/usb_cam/image_raw" />
    <param name="image_is_rectified" value="True"/>
    <param name="marker_size" value="$(arg markerSize)"/>
    <param name="marker_id" value="$(arg markerId)"/>
    <param name="reference_frame" value="$(arg ref_frame)"/> <!-- frame in which the marker pose will be referred -->
    <param name="camera_frame" value="base_link"/>
    <param name="marker_frame" value="$(arg marker_frame)" />
    <param name="corner_refinement" value="$(arg corner_refinement)" />
  </node>
</launch>
Save it as aruco_marker_finder.launch.
4. Now start publishing images from your camera:
roslaunch usb_cam_stream_publisher.launch
and find the markers:
roslaunch aruco_marker_finder.launch markerId:=701 markerSize:=0.05
5. To see the results, open rqt_gui:
rosrun rqt_gui rqt_gui
6. The marker pose is published on the topic /aruco_single/pose. You can monitor it with the command below (a minimal C++ subscriber sketch is also given at the end of this post):
rostopic echo /aruco_single/pose
7. Generate your ArUco marker at:
http://terpconnect.umd.edu/~jwelsh12/enes100/markergen.html
Special thanks to sauravag.com.
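If you want to use the pose in your own node rather than just echoing it, a minimal C++ subscriber looks roughly like the sketch below. The node and callback names are my own choices; the aruco_ros single node publishes a geometry_msgs/PoseStamped message on the pose topic used above.

#include <ros/ros.h>
#include <geometry_msgs/PoseStamped.h>

// Print the translation part of the marker pose every time a new message arrives.
void poseCallback(const geometry_msgs::PoseStamped::ConstPtr& msg) {
    ROS_INFO("marker at x=%.3f y=%.3f z=%.3f",
             msg->pose.position.x,
             msg->pose.position.y,
             msg->pose.position.z);
}

int main(int argc, char** argv) {
    ros::init(argc, argv, "aruco_pose_listener");
    ros::NodeHandle nh;
    ros::Subscriber sub = nh.subscribe("/aruco_single/pose", 10, poseCallback);
    ros::spin();
    return 0;
}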
PCL point cloud pairwise registration
Registration is the process of aligning 3D point clouds with each other so that together they form a complete model. To achieve this, you need to find the relative positions and orientations of the point clouds such that the overlapping areas between them are maximized [1].
I got the idea from here and implemented a piece of software based on it. In the following, you can see the main idea, the steps I took, and finally the results:
Main Flowchart of pairwise point cloud registration

Image source: http://pointclouds.org/documentation/tutorials/registration_api.php
1) Import point clouds acquired from different angles, downsample them, and select a keypoint extraction method (SIFT, NARF, Harris, or SUSAN) and its respective parameters.
2) The selected keypoints are highlighted in green; for each keypoint, a descriptor (PFH or FPFH) is estimated.
3) Correspondences between keypoint descriptors are estimated (by histogram distance), and corresponding points are connected.
4) Bad correspondences are rejected via several algorithms, and from the remaining correspondences a 4×4 transformation matrix is computed.
5) The 4×4 transformation matrix is used as the initial estimate for the ICP algorithm, and the two point clouds are merged into one.
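As a rough illustration of steps 1–5 in code, the following PCL (C++) sketch registers one pair of clouds. It is a simplified sketch rather than the actual tool: explicit keypoint detection is skipped (the downsampled points are described directly with FPFH), only one rejection step is used, and all radii, leaf sizes, and thresholds are placeholder values that depend on the scale of your data.

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/features/normal_3d.h>
#include <pcl/features/fpfh.h>
#include <pcl/registration/correspondence_estimation.h>
#include <pcl/registration/correspondence_rejection_sample_consensus.h>
#include <pcl/registration/icp.h>

using PointT = pcl::PointXYZ;
using CloudT = pcl::PointCloud<PointT>;

Eigen::Matrix4f registerPair(const CloudT::Ptr& src, const CloudT::Ptr& tgt) {
    // Downsample and compute FPFH descriptors for one cloud.
    auto process = [](const CloudT::Ptr& in, CloudT::Ptr& down,
                      pcl::PointCloud<pcl::FPFHSignature33>::Ptr& feats) {
        down.reset(new CloudT);
        pcl::VoxelGrid<PointT> vg;
        vg.setInputCloud(in);
        vg.setLeafSize(0.01f, 0.01f, 0.01f);   // placeholder leaf size
        vg.filter(*down);

        pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
        pcl::NormalEstimation<PointT, pcl::Normal> ne;
        ne.setInputCloud(down);
        ne.setRadiusSearch(0.03);              // placeholder radius
        ne.compute(*normals);

        feats.reset(new pcl::PointCloud<pcl::FPFHSignature33>);
        pcl::FPFHEstimation<PointT, pcl::Normal, pcl::FPFHSignature33> fpfh;
        fpfh.setInputCloud(down);
        fpfh.setInputNormals(normals);
        fpfh.setRadiusSearch(0.05);            // placeholder radius
        fpfh.compute(*feats);
    };

    CloudT::Ptr src_down, tgt_down;
    pcl::PointCloud<pcl::FPFHSignature33>::Ptr src_feats, tgt_feats;
    process(src, src_down, src_feats);
    process(tgt, tgt_down, tgt_feats);

    // Match descriptors, then reject bad correspondences with RANSAC.
    pcl::CorrespondencesPtr all(new pcl::Correspondences);
    pcl::registration::CorrespondenceEstimation<pcl::FPFHSignature33,
                                                 pcl::FPFHSignature33> est;
    est.setInputSource(src_feats);
    est.setInputTarget(tgt_feats);
    est.determineCorrespondences(*all);

    pcl::Correspondences inliers;
    pcl::registration::CorrespondenceRejectorSampleConsensus<PointT> rej;
    rej.setInputSource(src_down);
    rej.setInputTarget(tgt_down);
    rej.setInlierThreshold(0.05);              // placeholder threshold
    rej.setInputCorrespondences(all);
    rej.getCorrespondences(inliers);

    // The rejector's transform is the coarse alignment; ICP refines it.
    Eigen::Matrix4f initial = rej.getBestTransformation();

    pcl::IterativeClosestPoint<PointT, PointT> icp;
    icp.setInputSource(src_down);
    icp.setInputTarget(tgt_down);
    CloudT aligned;
    icp.align(aligned, initial);
    return icp.getFinalTransformation();
}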
The moon is a sphere! Structure from motion confirms that :)
The other day I saw this beautiful 360-degree video of the moon, and I decided to see what I would get if I applied an SFM (structure from motion) algorithm to the images in the video. So I extracted the frames from the video, fed them to the SFM algorithm, and the result was exactly what I expected: a perfect sphere. You can download the model from here.
You can open it with MeshLab.