# machine learning

## Installing NVIDIA DIGITS on Ubuntu 16.04

Prerequisites: Protobuf 3 and Caffe. Install Caffe as explained in my other post here.

For DIGITS, visit https://github.com/NVIDIA/DIGITS/. The steps are: install the dependencies (repo packages), then build DIGITS.

Once DIGITS is running, open in the browser: http://localhost:5000/

## Hierarchical Clustering in Python

Hierarchical clustering is a clustering method that builds a hierarchy of clusters. It can be agglomerative or divisive. Agglomerative (bottom-up): at the first step, every item is its own cluster; the closest clusters are then merged step by step into bigger clusters until all the data is in one cluster. The complexity is \( O(n^2 \log n) \). Divisive: at the beginning, …
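To make the agglomerative (bottom-up) idea concrete, here is a minimal sketch in plain Python. It is not the code from the full post: the single-linkage distance, the function names and the sample points are my assumptions for illustration, and the naive pair search is slower than the \( O(n^2 \log n) \) bound, which needs a priority queue.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b, points):
    """Cluster distance (assumed here): distance of the closest pair of points."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerative(points, n_clusters=1):
    """Merge the two closest clusters until only n_clusters remain.

    Returns the final clusters (lists of point indices) and the merge history.
    """
    clusters = [[i] for i in range(len(points))]  # every item starts as its own cluster
    history = []
    while len(clusters) > n_clusters:
        # Naive search for the pair of clusters with the smallest distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        d = single_linkage(clusters[a], clusters[b], points)
        history.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, history


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, history = agglomerative(points, n_clusters=3)
print(sorted(sorted(c) for c in clusters))
```

Running it with `n_clusters=1` instead merges everything into one cluster, and `history` then holds the full hierarchy of merges.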

## Maximum likelihood estimation explained

In this tutorial, I explain “Maximum likelihood” and MLE (maximum likelihood estimation) for the binomial and Gaussian distributions.
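As a small illustration, the closed-form MLEs for both distributions fit in a few lines. This is only a sketch: the function names and the sample numbers are assumptions of mine, not taken from the tutorial.

```python
from math import log


def binomial_mle(successes, trials):
    """MLE of the success probability p: the observed success rate."""
    return successes / trials


def binomial_log_likelihood(p, successes, trials):
    """Log-likelihood of p, dropping the binomial coefficient (constant in p)."""
    return successes * log(p) + (trials - successes) * log(1 - p)


def gaussian_mle(samples):
    """MLE of the mean and variance; note the variance divides by n, not n - 1."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


# Hypothetical data: 7 successes in 10 trials.
p_hat = binomial_mle(successes=7, trials=10)
# Brute-force check: find the p on a fine grid with the highest log-likelihood.
grid = [k / 1000 for k in range(1, 1000)]
best_on_grid = max(grid, key=lambda p: binomial_log_likelihood(p, 7, 10))
print(p_hat, best_on_grid)

mu_hat, var_hat = gaussian_mle([2.0, 4.0, 4.0, 6.0])
print(mu_hat, var_hat)
```

The grid search is there only to show that the closed-form estimate really is the maximizer of the likelihood.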

## Naive Bayes Classifier Explained

In this video, I explain the “Naive Bayes Classifier”. The example is solved with Python in my other post here.

## Naive Bayes Classifier Example with Python Code

In the example below, I implement a “Naive Bayes classifier” in Python, then solve the same problem again with the “sklearn” package and show the output.
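The original code is not reproduced on this page, so here is a minimal stand-in sketch of a categorical Naive Bayes classifier in plain Python. The toy weather data, the function names and the Laplace smoothing parameter `alpha` are assumptions for illustration only.

```python
from collections import Counter, defaultdict
from math import log


def train(rows, labels, alpha=1.0):
    """Collect the counts needed for a categorical Naive Bayes model."""
    class_counts = Counter(labels)
    # feature_counts[(feature index, class)][value] = number of occurrences
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[(j, label)][value] += 1
            values[j].add(value)
    return class_counts, feature_counts, values, alpha


def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = log(count / total)  # log prior
        for j, value in enumerate(row):
            # Laplace-smoothed estimate of P(value | class), features assumed independent.
            num = feature_counts[(j, label)][value] + alpha
            den = count + alpha * len(values[j])
            score += log(num / den)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]
model = train(rows, labels)
print(predict(model, ("sunny", "hot")), predict(model, ("rainy", "cool")))
```

Working in log space avoids multiplying many small probabilities, and the smoothing keeps an unseen feature value from zeroing out a whole class.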

## Markov Localization Explained

In this tutorial, I explain the math and theory of robot localization and solve an example of Markov localization.
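A common way to illustrate Markov localization is a one-dimensional cyclic grid world where the robot alternates between sensing a cell colour and moving. The sketch below assumes that setup; the world, the sensor probabilities and the motion noise are hypothetical values, not the ones from the tutorial.

```python
def normalize(belief):
    """Scale the belief so that it sums to 1."""
    total = sum(belief)
    return [b / total for b in belief]


def sense(belief, world, measurement, p_hit=0.6, p_miss=0.2):
    """Measurement update: weight each cell by how well it explains the reading."""
    weighted = [
        b * (p_hit if cell == measurement else p_miss)
        for b, cell in zip(belief, world)
    ]
    return normalize(weighted)


def move(belief, step, p_exact=0.8, p_undershoot=0.1, p_overshoot=0.1):
    """Motion update on a cyclic world: the robot may move step - 1, step or step + 1 cells."""
    n = len(belief)
    return [
        p_exact * belief[(i - step) % n]
        + p_undershoot * belief[(i - step + 1) % n]
        + p_overshoot * belief[(i - step - 1) % n]
        for i in range(n)
    ]


world = ["green", "red", "red", "green", "green"]
belief = [1.0 / len(world)] * len(world)  # no prior knowledge: uniform belief

for measurement, step in [("red", 1), ("red", 1)]:
    belief = sense(belief, world, measurement)
    belief = move(belief, step)

print([round(b, 3) for b in belief])
```

After seeing red twice while moving one cell to the right each time, most of the probability mass ends up on cell 3, the cell just past the two red cells.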

## Density-Based Spatial Clustering (DBSCAN) with Python Code

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a data clustering algorithm. It is density-based because it finds a number of clusters starting from the estimated density distribution of the corresponding nodes. It starts with an arbitrary starting point that has not been visited. This point’s epsilon-neighborhood is retrieved, and if it …
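The procedure can be sketched in plain Python as follows. This is a minimal illustration rather than the code from the full post: the names `eps` and `min_pts`, the brute-force neighbourhood search and the sample points are assumptions.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels


# Hypothetical data: two dense squares and one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9), (20, 20)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike k-means, the number of clusters is not given up front; it falls out of `eps` and `min_pts`, and isolated points are reported as noise.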

## Bayes Filter Explained

In this tutorial, I explain the Bayes filter from scratch.
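For a discrete state space, one step of the Bayes filter is a prediction with the motion model followed by a measurement update. The sketch below uses a two-state door (open or closed) as an example; the function names and all probabilities are illustrative assumptions, not values from the tutorial.

```python
def predict_step(belief, transition):
    """Prediction: push the belief through the motion model.

    transition[prev][nxt] is P(nxt | action, prev) for the chosen action.
    """
    states = list(belief)
    return {s: sum(transition[p][s] * belief[p] for p in states) for s in states}


def update_step(belief, likelihood):
    """Measurement update: multiply by P(measurement | state) and normalize."""
    unnormalized = {s: likelihood[s] * belief[s] for s in belief}
    eta = 1.0 / sum(unnormalized.values())
    return {s: eta * v for s, v in unnormalized.items()}


# Assumed models for the door example.
push = {"open": {"open": 1.0, "closed": 0.0},
        "closed": {"open": 0.8, "closed": 0.2}}
senses_open = {"open": 0.6, "closed": 0.2}  # P(sensor says "open" | state)

belief = {"open": 0.5, "closed": 0.5}
belief = update_step(belief, senses_open)   # sensor reports "open"
after_first_reading = dict(belief)
belief = predict_step(belief, push)         # robot pushes the door
belief = update_step(belief, senses_open)   # sensor reports "open" again
print(after_first_reading, belief)
```

The same two functions work for any finite set of states; only the transition and likelihood tables change.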

## Kernel Density Estimation (KDE) for estimating probability distribution function

There are several approaches for estimating the probability distribution function of given data: 1) parametric, 2) semi-parametric, 3) non-parametric. A parametric example is a GMM fitted with an algorithm such as expectation maximization. Here is my other post on expectation maximization. An example of a non-parametric approach is the histogram, where each data point is assigned to only one bin and depending on the number bins that fall within …
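In contrast to a histogram, a kernel density estimate places a smooth bump on every data point and averages them. Here is a minimal sketch with a Gaussian kernel; the kernel choice, the fixed `bandwidth` and the sample values are assumptions for illustration.

```python
from math import exp, pi, sqrt


def gaussian_kernel(u):
    """Standard normal density, used as the kernel."""
    return exp(-0.5 * u * u) / sqrt(2 * pi)


def kde(x, samples, bandwidth):
    """Kernel density estimate at x: the average of kernels centred on the samples."""
    n = len(samples)
    return sum(gaussian_kernel((x - s) / bandwidth) for s in samples) / (n * bandwidth)


# Hypothetical one-dimensional data with two groups.
samples = [1.0, 1.5, 2.0, 4.0, 4.5]
bandwidth = 0.5

# Sanity check: a density should integrate to (approximately) 1.
dx = 0.01
area = sum(kde(-5.0 + k * dx, samples, bandwidth) for k in range(1600)) * dx
print(round(area, 3), kde(1.5, samples, bandwidth) > kde(3.0, samples, bandwidth))
```

The bandwidth plays the role of the bin width in a histogram: a small value gives a spiky estimate, a large value oversmooths.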

## Silhouette coefficient for finding optimal number of clusters

The silhouette coefficient is another method for determining the optimal number of clusters; I introduced the C-index earlier here. The silhouette coefficient of a data point measures how well it is assigned to its own cluster and how far it is from the other clusters. A silhouette close to 1 means the data point is in an appropriate cluster, and a silhouette …
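For a point with mean distance \( a \) to the other points of its own cluster and smallest mean distance \( b \) to any other cluster, the silhouette is \( (b - a) / \max(a, b) \). The sketch below computes it in plain Python; the function names, the Euclidean distance, the score of 0 for singleton clusters and the sample data are my assumptions, and at least two clusters are required.

```python
from math import dist


def silhouette_samples(points, labels):
    """Silhouette value of every point (needs at least two clusters)."""
    clusters = set(labels)
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # assumed convention for a cluster with one point
            continue
        a = sum(own) / len(own)  # mean distance inside the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return scores


def silhouette_score(points, labels):
    """Mean silhouette over all points."""
    scores = silhouette_samples(points, labels)
    return sum(scores) / len(scores)


# Hypothetical data: two well-separated groups, labelled well and badly.
points = [(0, 0), (0, 1), (1, 0), (8, 8), (8, 9), (9, 8)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1, 0, 1])
print(round(good, 3), round(bad, 3))
```

To pick the number of clusters, one would run the clustering for several candidate values of k and keep the one with the highest mean silhouette.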