# Kernel Density Estimation (KDE) for estimating probability distribution function



There are several approaches for estimating the probability distribution function of a given data set:

1) Parametric
2) Semi-parametric
3) Non-parametric

An example of a parametric approach is a Gaussian mixture model (GMM) fitted with an algorithm such as expectation maximization (EM). Here is my other post on expectation maximization.

An example of a non-parametric approach is the histogram: each data point is assigned to exactly one bin, and the height of each bar is determined by the number of data points that fall within that bin's interval.

Kernel Density Estimation (KDE) is another non-parametric method for estimating the probability distribution function. It is very similar to a histogram, but we don't assign each data point to a single bin. Instead, KDE uses a kernel function that weights each data point $$x_i$$ depending on how far it is from the point $$x$$.

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^n k\bigg(\frac{x-x_i}{h}\bigg)$$

where $$h$$ is the bandwidth parameter, $$n$$ is the number of data points, and $$k$$ is the kernel function. One common choice of kernel is the Gaussian (normal distribution), but other kernels (uniform, triangular, biweight, triweight, Epanechnikov) can be used as well. The bandwidth controls the smoothness of the estimate: too small a value overfits the data (a spiky estimate), while too large a value underfits it (an oversmoothed estimate). A rule of thumb for choosing the bandwidth is Silverman's rule.
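As a rough illustration of the estimator above, here is a minimal Python sketch of a Gaussian-kernel KDE with a Silverman-style bandwidth. The function names (`gaussian_kernel`, `silverman_bandwidth`, `kde`) and the sample data are made up for this example, and the bandwidth uses the common variant $$h = 0.9\,\min(\sigma, \mathrm{IQR}/1.34)\,n^{-1/5}$$, which is an assumption about which form of the rule is meant.

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard normal density, used as the kernel k(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def silverman_bandwidth(data):
    """Silverman's rule of thumb: h = 0.9 * min(std, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    std = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    spread = min(std, iqr / 1.34) if iqr > 0 else std
    return 0.9 * spread * n ** (-0.2)


def kde(x, data, h):
    """Evaluate f_hat(x) = 1/(n*h) * sum_i k((x - x_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)


# Hypothetical sample, only for illustration.
data = [1.0, 1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 4.8, 5.0, 5.1]
h = silverman_bandwidth(data)
density_at_2 = kde(2.0, data, h)
```

Evaluating `kde` on a grid of `x` values gives the smooth curve that replaces the histogram; passing a much smaller or much larger `h` than the rule-of-thumb value shows the spiky and oversmoothed extremes.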

# Expectation Maximization algorithm to obtain Gaussian mixture models for ROS

I found some really good code on GitHub for fitting a Gaussian Mixture Model (GMM) with Expectation Maximization (EM) in ROS. There are many parameters that you can change. Some of the most important ones are:

To find the optimal number of components, it uses the Bayesian information criterion (BIC). Other criteria for choosing the number of components are minimum description length (MDL), the Akaike information criterion (AIC), and minimum message length (MML).
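To make the idea of BIC-based model selection concrete, here is a small self-contained Python sketch. It is not the code of the ROS node: it fits 1-D GMMs with a plain EM loop (function names `fit_gmm_1d` and `bic`, the quantile-based initialization, and the synthetic data are all assumptions made for this example) and scores each candidate with $$\mathrm{BIC} = m\ln n - 2\ln\hat{L}$$, where $$m$$ is the number of free parameters.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, iters=100):
    """Fit a k-component 1-D GMM with EM; returns (weights, means, variances, log_lik)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid collapsing components
    log_lik = sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    """BIC for a 1-D GMM: (k - 1) weights + k means + k variances free parameters."""
    num_params = 3 * k - 1
    return num_params * math.log(n) - 2.0 * log_lik


# Synthetic data from two Gaussians (illustrative only).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(6.0, 1.0) for _ in range(200)]

fits = {k: fit_gmm_1d(data, k) for k in range(1, 5)}
bic_scores = {k: bic(fits[k][3], k, len(data)) for k in fits}
best_k = min(bic_scores, key=bic_scores.get)  # lowest BIC wins
```

The model with the lowest BIC is selected; for data that really come from two well-separated Gaussians, one would expect the two-component model to score clearly better than the single-component one.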

Here is my code for generating two Gaussians and sending them to this node:

and you need to put them in to send them to the node:

and the results are what we expect:

It also makes it possible to visualize the data in RViz, but first you have to publish your tf data and set the frame name and topic names correctly in gmm_rviz_converter.h

and add a MarkerArray display in RViz and set its topic to “gmm_rviz_converter_output”.

References: [1], [2]

# Analytic distance metric for Gaussian mixture models

In many applications, you need to compare two or more data sets to see how similar or different they are. For instance, you have measured the heights of men and women in Japan and the Netherlands, and now you would like to know how much they differ.



Two commonly used methods for measuring the distance between distributions are the Kullback-Leibler divergence and the Bhattacharyya distance.

The Kullback-Leibler (KL) divergence measures the difference between two probability distributions $$p$$ and $$q$$.

$$D_{KL}(p\|q)=\int_{-\infty}^\infty p(x)\log\frac{p(x)}{q(x)} \,\mathrm{d}x$$

However, this integral has a closed-form solution only when $$p$$ and $$q$$ are each a single Gaussian. No closed form exists when the data are modeled by a mixture of Gaussians.
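The single-Gaussian case can be checked numerically. The sketch below compares the well-known closed form for two univariate Gaussians, $$\ln\frac{\sigma_q}{\sigma_p} + \frac{\sigma_p^2 + (\mu_p-\mu_q)^2}{2\sigma_q^2} - \frac{1}{2}$$, with a brute-force numerical integration of the definition. The same numerical routine also works for mixtures, where only an approximation like this is available. All function names and the example parameters are chosen for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """components: list of (weight, mean, std) tuples."""
    return sum(w * normal_pdf(x, mu, sigma) for w, mu, sigma in components)


def kl_gaussians_closed_form(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) for two univariate Gaussians."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)


def kl_numerical(p, q, lo, hi, steps=20000):
    """Midpoint-rule approximation of the KL integral on [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        px, qx = p(x), q(x)
        if px > 0.0 and qx > 0.0:  # skip points where the densities underflow
            total += px * math.log(px / qx) * dx
    return total


# Single Gaussians: closed form and numerical integration should agree.
closed = kl_gaussians_closed_form(0.0, 1.0, 1.0, 2.0)
numeric = kl_numerical(lambda x: normal_pdf(x, 0.0, 1.0),
                       lambda x: normal_pdf(x, 1.0, 2.0), -12.0, 12.0)

# Mixtures: only the numerical approximation is available.
mix_p = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]
mix_q = [(0.3, -1.0, 1.5), (0.7, 3.0, 1.0)]
kl_mix = kl_numerical(lambda x: mixture_pdf(x, mix_p),
                      lambda x: mixture_pdf(x, mix_q), -15.0, 15.0)
```

Numerical integration like this becomes expensive in higher dimensions, which is one motivation for an analytic distance between mixtures.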

Sfikas et al. [1] have extended the Kullback-Leibler divergence to GMMs and proposed a distance metric that uses the values $$(\mu, \Sigma, \pi)$$ of each of the two distributions, in the following form:

$$C2(p\|q)=-\log\left[\frac{2\sum_{i,j}\pi_{i}\pi_{j}^{\prime}\sqrt{\frac{|V_{ij}|}{e^{k_{ij}}|\Sigma_{i}||\Sigma_{j}^{\prime}|}}}{\sum_{i,j}\pi_{i}\pi_{j}\sqrt{\frac{|V_{ij}|}{e^{k_{ij}}|\Sigma_{i}||\Sigma_{j}|}}+\sum_{i,j}\pi_{i}^{\prime}\pi_{j}^{\prime}\sqrt{\frac{|V_{ij}|}{e^{k_{ij}}|\Sigma_{i}^{\prime}||\Sigma_{j}^{\prime}|}}}\right]$$

where

$$V_{ij}=(\Sigma_{i}^{-1}+\Sigma_{j}^{-1})^{-1}$$

and

$$k_{ij}=\mu_{i}^{T}\Sigma_{i}^{-1}(\mu_{i}-\mu_{j}^{\prime})+\mu_{j}^{\prime T}\Sigma_{j}^{\prime -1}(\mu_{j}^{\prime}-\mu_{i})$$
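For readers who want to experiment, here is a minimal Python sketch that transcribes the formula as written above for the 1-D case, where every covariance is a scalar variance and determinants and inverses reduce to plain arithmetic. It is not the MATLAB code from this post. The function names are hypothetical, and the sketch assumes that $$V_{ij}$$ and $$k_{ij}$$ are evaluated with the parameters of whichever two components are being paired in each sum.

```python
import math


def pair_term(mu_i, var_i, mu_j, var_j):
    """sqrt(|V_ij| / (e^{k_ij} |Sigma_i| |Sigma_j|)) for two 1-D components."""
    v_ij = 1.0 / (1.0 / var_i + 1.0 / var_j)
    k_ij = mu_i / var_i * (mu_i - mu_j) + mu_j / var_j * (mu_j - mu_i)
    return math.sqrt(v_ij / (math.exp(k_ij) * var_i * var_j))


def cross_sum(gmm_a, gmm_b):
    """Double sum over all component pairs; each GMM is a list of (weight, mean, variance)."""
    return sum(wa * wb * pair_term(ma, va, mb, vb)
               for wa, ma, va in gmm_a
               for wb, mb, vb in gmm_b)


def c2_distance(p, q):
    """C2(p || q) = -log( 2 * S(p, q) / (S(p, p) + S(q, q)) )."""
    return -math.log(2.0 * cross_sum(p, q) / (cross_sum(p, p) + cross_sum(q, q)))


# Two illustrative 1-D mixtures: (weight, mean, variance) per component.
p = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
q = [(0.5, 1.0, 1.0), (0.5, 6.0, 1.0)]

d_pq = c2_distance(p, q)
d_pp = c2_distance(p, p)
```

By construction the distance of a mixture to itself is zero and the measure is symmetric in `p` and `q`; a multivariate version would replace the scalar operations with matrix inverses and determinants.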

Code in MATLAB:

Update: Here is a very nice interactive visualization of the Kullback-Leibler divergence.

# Learning From Demonstration

In this work, I first recognize the object in the scene and estimate its 6 DOF pose. Then I track the object using a particle filter. RGB data are acquired from a Kinect 2 and converted into a PCL point cloud.
I demonstrate a task to the robot several times. In this case, I move an object (a detergent) along an “S”-shaped path to get an “S”-shaped trajectory.

In the following, you can see the result of repeating the task 6 times. The trajectories are very noisy, and each repetition gives a new “S” shape.
Then I compute the GMM (Gaussian mixture model) of the trajectory in each dimension. The number of kernels can be set by the user or determined automatically based on the BIC (Bayesian information criterion).
After that, I compute the Gaussian mixture regression to generalize the task and get the learned trajectory.
DTW (dynamic time warping) can be used to align the trajectories to a reference trajectory, which removes the timing differences between the demonstrations.
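The alignment step can be sketched with the classic dynamic-programming formulation of DTW. The Python below is only an illustration on 1-D sequences (the actual implementation in this work is in C++); the function name `dtw` and the toy sequences are assumptions for this example.

```python
def dtw(a, b):
    """Return (total_cost, path) aligning 1-D sequences a and b.

    path is a list of index pairs (i, j) meaning a[i] is matched with b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# A reference trajectory and a slower demonstration of the same motion.
reference = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
demo = [0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0]
distance, path = dtw(reference, demo)
```

The returned `path` tells you which sample of each demonstration corresponds to which sample of the reference, so every demonstration can be resampled onto the reference's time axis before the GMM is fitted.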

Finally, you can see the learned trajectory in black.
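For reference, the Gaussian mixture regression step mentioned above can be sketched as follows, assuming each dimension is modeled by a GMM over (time, coordinate) pairs. Given a query time $$t$$, each component contributes its conditional mean $$\mu_{y} + \frac{\sigma_{ty}}{\sigma_{tt}}(t - \mu_{t})$$, weighted by how responsible that component is for $$t$$. This is a generic Python sketch, not the C++ code of this project; the function name `gmr_predict` and the component parameters are invented for illustration.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_predict(t, components):
    """Conditional expectation E[y | t] under a 2-D GMM over (t, y).

    components: list of (weight, mu_t, mu_y, var_t, cov_ty) tuples.
    """
    # Responsibility of each component for the query time t.
    activations = [w * normal_pdf(t, mu_t, var_t)
                   for w, mu_t, _, var_t, _ in components]
    total = sum(activations)
    y = 0.0
    for act, (_, mu_t, mu_y, var_t, cov_ty) in zip(activations, components):
        conditional_mean = mu_y + cov_ty / var_t * (t - mu_t)
        y += (act / total) * conditional_mean
    return y


# Hypothetical two-component model of one coordinate over time.
model = [
    (0.5, 1.0, 0.0, 0.25, 0.10),
    (0.5, 4.0, 2.0, 0.25, -0.05),
]
learned = [gmr_predict(0.5 * step, model) for step in range(11)]  # t = 0.0 .. 5.0
```

Sweeping `t` over the duration of the task produces one smooth curve per dimension, and together these curves form the generalized trajectory.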

All the code is written in C++ in ROS (Indigo).

Ref: [1]