Naive Bayes Classifier Explained
In this video, I explain the “Naive Bayes Classifier”. The example has been solved with Python in my other post here.
Naive Bayes Classifier Explained Read More »
In the example below, I implemented a “Naive Bayes classifier” in Python, and in the following I used the “sklearn” package to solve it again. The output is:
male posterior is: 1.54428667821e-07
female posterior is: 0.999999845571
Then our data must belong to the female class
Then our data must belong to the class number: [2]
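Below is a minimal sketch of the scikit-learn step, assuming the classic height/weight/foot-size training data from the textbook gender example (the exact values used in the post may differ):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Training data: rows are [height, weight, foot size]; labels 1 = male, 2 = female.
# These are the classic textbook values, assumed here for illustration.
X = np.array([[6.00, 180, 12],
              [5.92, 190, 11],
              [5.58, 170, 12],
              [5.92, 165, 10],
              [5.00, 100,  6],
              [5.50, 150,  8],
              [5.42, 130,  7],
              [5.75, 150,  9]])
y = np.array([1, 1, 1, 1, 2, 2, 2, 2])

# Gaussian Naive Bayes fits one Gaussian per feature per class.
clf = GaussianNB().fit(X, y)

sample = np.array([[6.00, 130, 8]])
print("posteriors (male, female):", clf.predict_proba(sample))
print("Then our data must belong to the class number:", clf.predict(sample))
```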
Naive Bayes Classifier Example with Python Code Read More »
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a data clustering algorithm. It is a density-based clustering algorithm because it finds the number of clusters starting from the estimated density distribution of the corresponding nodes. It starts with an arbitrary point that has not been visited. This point’s epsilon-neighborhood is retrieved, and if it …
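As a minimal sketch of this idea with scikit-learn (the dataset and the eps/min_samples values below are illustrative assumptions, not the ones from the post):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Synthetic two-moons data, a shape density-based clustering handles well.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# eps is the radius of the epsilon-neighborhood described above;
# min_samples is the minimum number of neighbors for a core point.
db = DBSCAN(eps=0.2, min_samples=5).fit(X)

labels = db.labels_                      # label -1 marks noise points
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("estimated clusters:", n_clusters)
print("noise points:", np.sum(labels == -1))
```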
Density-Based Spatial Clustering (DBSCAN) with Python Code Read More »
There are several approaches for estimating the probability distribution function of given data: 1) parametric, 2) semi-parametric, and 3) non-parametric. A parametric example is the GMM, fitted via an algorithm such as expectation maximization; here is my other post on expectation maximization. A non-parametric example is the histogram, where each data point is assigned to only one bin and, depending on the number of bins that fall within …
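A minimal sketch contrasting the histogram with a Gaussian kernel density estimate, assuming illustrative synthetic data and an illustrative bandwidth:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Synthetic one-dimensional data drawn from a mixture of two Gaussians.
rng = np.random.RandomState(0)
data = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 1.0, 700)])

# Non-parametric estimate 1: histogram (each sample counts toward exactly one bin).
hist, edges = np.histogram(data, bins=30, density=True)

# Non-parametric estimate 2: Gaussian KDE (each sample contributes a smooth kernel).
kde = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(data[:, None])
grid = np.linspace(-5, 7, 200)[:, None]
density = np.exp(kde.score_samples(grid))   # score_samples returns log-density
print(density[:5])
```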
Kernel Density Estimation (KDE) for estimating probability distribution function Read More »
The silhouette coefficient is another method to determine the optimal number of clusters. I introduced the C-index earlier here. The silhouette coefficient of a data point measures how well it is assigned to its own cluster and how far it is from the other clusters. A silhouette close to 1 means the data point is in an appropriate cluster, and a silhouette …
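A minimal sketch of computing the mean silhouette for several candidate cluster counts with scikit-learn (the synthetic data and the range of k are illustrative assumptions):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with a known number of blobs.
X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k={k}  mean silhouette={silhouette_score(X, labels):.3f}")
# The k whose mean silhouette is closest to 1 is taken as the optimal number of clusters.
```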
Silhouette coefficient for finding optimal number of clusters Read More »
This module finds the optimal number of components (number of clusters) for a given dataset. To find the optimal number of components, we first used the k-means algorithm with different numbers of clusters, ranging from 1 to a fixed maximum. Then we checked the cluster validity by deploying the \( C \)-index algorithm and …
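A hedged sketch of that procedure: run k-means for a range of cluster counts and score each clustering with a generic C-index implementation (lower is better). The data, the range of k, and the helper below are illustrative, not the module’s own code:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

def c_index(X, labels):
    # C-index = (S - S_min) / (S_max - S_min), where S is the sum of within-cluster
    # pairwise distances and S_min / S_max sum the same number of smallest / largest
    # pairwise distances over the whole dataset. Lower values indicate better clustering.
    d = squareform(pdist(X))
    within = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        for i in range(len(idx)):
            for j in range(i + 1, len(idx)):
                within.append(d[idx[i], idx[j]])
    n_w = len(within)
    all_d = np.sort(pdist(X))
    s_min, s_max = all_d[:n_w].sum(), all_d[-n_w:].sum()
    return (np.sum(within) - s_min) / (s_max - s_min)

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
for k in range(2, 7):           # k = 1 is skipped here, since the C-index is undefined for a single cluster
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k={k}  C-index={c_index(X, labels):.4f}")
```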
Finding optimal number of Clusters by using Cluster validation Read More »
Gradient descent is a very popular method for finding the maximum/minimum point of a given function. It is very simple yet powerful, but it may get trapped in a local minimum. Here I try to find the minimum of the following function: $$ z= -( 4 \times e^{- ( (x-4)^2 +(y-4)^2 ) }+ 2 \times e^{- ( (x-2)^2 +(y-2)^2 …
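A hedged sketch of gradient descent using a numerically estimated gradient; since the expression above is cut off, the code assumes only the two visible Gaussian bumps, and the learning rate and starting point are illustrative choices:

```python
import numpy as np

def z(p):
    # Assumed surface with two Gaussian wells, consistent with the visible part of the formula.
    x, y = p
    return -(4 * np.exp(-((x - 4)**2 + (y - 4)**2))
             + 2 * np.exp(-((x - 2)**2 + (y - 2)**2)))

def numeric_grad(f, p, h=1e-6):
    # Central-difference approximation of the gradient.
    g = np.zeros_like(p)
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (f(p + e) - f(p - e)) / (2 * h)
    return g

p = np.array([3.0, 3.0])   # starting point; a poor start can leave us stuck in a local minimum
lr = 0.1                   # learning rate (step size)
for _ in range(500):
    p = p - lr * numeric_grad(z, p)

print("minimum found at:", p, "value:", z(p))
```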
Gradient descent method for finding the minimum Read More »