Analytic distance metric for Gaussian mixture models

In many applications, you need to compare two or more data sets with each other to see how much they are similar or different. For instance, you have measured the height of men and women in Japan and Netherlands and now you like to know how much they are different.

\( \)

Two commonly used method for measuring distances are the Kullback-Liebler divergence and the Bhattacharyya distance.

KL divergence (Kullback-Liebler divergence) measures the difference between two probability distributions p and q.

\begin{equation} \label{Kullback_Liebler_divergence}
D_{KL}(p||q)=\int_{-\infty}^\infty p(x)\log\frac{p(x)}{q(x)} \,\mathrm{d}x

But it only works if your data is made of a single Gaussian and it is not applicable If your data is made of a mixture of Gaussians.

Sfikas et al [1]  have extended the Kullback Liebler divergence for GMM and proposed a distance metric using the values \(\mu ,\Sigma,\pi \) for each one of the two distributions in the following form:

\begin{equation} \label{analytical_Kullback_Liebler_divergence}
C2(p||q)=-\log \large[ \frac{2\sum_{i,j}\pi_{i}\pi_{j}^{\prime} \sqrt{ \frac{|V_{ij}|}{e^{k_{ij}}|\sum_{i}| |\sum_{j}^{\prime}|} } }
\sum_{i,j}\pi_{i}\pi_{j} \sqrt{ \frac{|V_{ij}|}{e^{k_{ij}}|\sum_{i}| |\sum_{j}|} }+
\sum_{i,j}\pi_{i}^{\prime}\pi_{j}^{\prime} \sqrt{ \frac{|V_{ij}|}{e^{k_{ij}}|\sum_{i}^{\prime}| |\sum_{j}^{\prime}|} }
V_{ij}=(\Sigma_{i}^{-1} +\Sigma_{j}^{-1})^{-1}
K_{ij}=\mu_{i}^{T}\Sigma_{i}^{-1}(\mu_{i}-\mu_{j}^{\prime})+\mu_{j}^{\prime T}\Sigma_{j}^{\prime -1}(\mu_{j}^{\prime}-\mu_{i})

Code in matlab:


Leave a Reply


This site uses Akismet to reduce spam. Learn how your comment data is processed.

Notify of