Category Archives: Python

SSH connection to Google Colaboratory and Transferring files from Google Drive

Would you like to access Google Colaboratory directly from your machine, run your scripts via the terminal, and have shell access? Then there is good news; just do the following:

SSH to Google Colaboratory

1. Install colab_ssh on Google Colaboratory
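In a Colab cell, run the usual pip invocation for the colab_ssh package:

!pip install colab_ssh --upgrade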

Add cloudflared and a password for the root user:
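A minimal sketch using colab_ssh's launch_ssh_cloudflared helper; the password below is a placeholder you should change:

from colab_ssh import launch_ssh_cloudflared
launch_ssh_cloudflared(password="put_a_strong_password_here")  # prints the tunnel hostname used below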

2. Install cloudflared on your machine
Download it from here.

3. Append the following to your SSH config file (the file is at ~/.ssh/config):
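The entry below follows the colab_ssh documentation; the cloudflared path is a placeholder for wherever you put the binary you downloaded in step 2:

Host *.trycloudflare.com
    HostName %h
    User root
    Port 22
    ProxyCommand /path/to/cloudflared access ssh --hostname %h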

4. To connect using your terminal, type this command:
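The hostname is the *.trycloudflare.com address printed by launch_ssh_cloudflared (the value below is a placeholder):

ssh your-tunnel-id.trycloudflare.com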

The following steps are optional; do them if you want to install Anaconda/PyTorch.

5. Installing Anaconda on Google Colaboratory machine
Prerequisite:

Download Anaconda, install it and add it to the path:
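A sketch of the usual steps over the SSH session; the installer version here is an assumption, so check the Anaconda archive for the current file name:

wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
bash Anaconda3-2020.02-Linux-x86_64.sh
export PATH=~/anaconda3/bin:$PATH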

Refs: [1]

Accessing Google Drive in Google Colaboratory

You can access files with:

1. Mounting Google Drive
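Run this in a notebook cell (the standard google.colab API):

from google.colab import drive
drive.mount('/content/drive')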

Visit the URL in a browser and enter your authorization code. Your drive will be mounted, and you should see:

Mounted at /content/drive

You can then access your drive in your Jupyter notebook via Python code:
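For example, listing the top-level folder ("My Drive" is how Colab names the mounted root):

import os
print(os.listdir('/content/drive/My Drive'))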

2. Using PyDrive

Authenticate and create the PyDrive client:
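This is the standard authentication boilerplate from the Colab I/O examples:

!pip install -U -q PyDrive

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate the Colab user and build the PyDrive client
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)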

Access your files:
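For example, downloading a file by its Drive id (the id and file name below are placeholders):

downloaded = drive.CreateFile({'id': 'REPLACE_WITH_FILE_ID'})
downloaded.GetContentFile('my_file.csv')  # saves the file locally in the Colab VM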


To prevent Google Colab from disconnecting, just create a new cell at the bottom containing the following line:
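The original line is not preserved; one common trick (an assumption here) is to keep the runtime busy, for example:

while True: pass  # keeps the runtime busy; interrupt the cell to stop it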

Refs: [1], [2]

Metrics for Evaluating Machine Learning Models – Classification

Confusion Matrix

Let’s say we have a binary classifier for cats and non-cats, and we have 1100 test images: 1000 non-cats and 100 cats. The output of the classifier is either Positive, which means “cat”, or Negative, which means “non-cat”. The following is called the confusion matrix:
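With the numbers used throughout this section, the matrix is:

                        Predicted: cat (Positive)    Predicted: non-cat (Negative)
Actual: cat (100)       True Positive = 90           False Negative = 10
Actual: non-cat (1000)  False Positive = 60          True Negative = 940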

These terms are interpreted as follows: the first word (True/False) tells whether the labeling was correct, and the second word (Positive/Negative) is the predicted class.

True Positive:

The observation is positive and is predicted to be positive: 90 cat images are correctly labeled as cats.

True Negative:

The observation is negative and is predicted to be negative: 940 images are labeled as non-cats, and they truly are non-cats.

False Positive:

The observation is negative, but is predicted to be positive: 60 non-cat images are wrongly labeled as cats.

False Negative:

The observation is positive, but is predicted to be negative: 10 cat images are wrongly labeled as non-cats.

Accuracy:

Accuracy is defined as the number of correct predictions divided by the total number of predictions. Classification accuracy= (90+940)/(1000+100)= 1030/1100= 93.6%

Precision:

Accuracy, however, is not always a good indicator of the performance of a classifier. If one class is much more frequent in the set and the classifier predicts it correctly while mislabeling the smaller class, accuracy can be very high even though the classifier performs badly. Precision addresses this:

Precision = True Positive/(True Positive + False Positive)

Precision cat = 90/(90 + 60) = 60%

Precision non-cat = 940/950 = 98.9%

Recall

Recall is the ratio of the number of correctly classified positive examples to the total number of positive examples. A high-recall classifier is a kind of optimistic classifier.

Recall = True Positive/(True Positive + False Negative)

Recall cat = 90/100 = 90%
Recall non-cat = 940/1000 = 94%

High recall, low precision: this means that our classifier finds almost all positive examples in the test set, but it also labels a lot of negative examples as positive.

Low recall, high precision: this means our classifier is very certain about positive examples (if it labels an example as positive, it is positive with high confidence), but it misses a lot of positive examples; a kind of conservative classifier.

F1-Score

Depending on the application, you might be interested in a conservative or an optimistic classifier. But sometimes you are not biased toward any of the classes in your set, so you need to combine precision and recall. The F1-score is the harmonic mean of precision and recall:

F1-score = 2 * Precision * Recall/(Precision + Recall)

F1-score cat = 2 * 0.6 * 0.9/(0.6 + 0.9) = 72%
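All of the metrics above are one-liners in Python; a minimal sketch using the numbers from the cat example:

TP, TN, FP, FN = 90, 940, 60, 10

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1_score = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1_score)  # 0.936..., 0.6, 0.9, 0.72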

Sensitivity and Specificity

Sensitivity and specificity are two other popular metrics, mostly used in medicine and biology. They are basically recall computed for the positive and the negative class, respectively:
Sensitivity = Recall = TP/(TP + FN)
Specificity = True Negative Rate = TN/(TN + FP)

Receiver Operating Characteristic (ROC) Curve

The output of a classifier is usually a probability, and based on a cut-off value we decide which label to assign. The ROC curve plots the TPR against the FPR for various threshold values. It is a popular way to look at overall model performance and to pick a good cut-off threshold for the model.

False Positive Rate

A high value means False Positive > True Negative: the classifier labels many examples as Positive while they are actually Negative, more than it correctly labels as Negative.

A small value means True Negative > False Positive: the classifier correctly labels most of the truly negative examples, more than it mislabels as Positive.
False Positive Rate = 1 - Specificity = False Positive/(False Positive + True Negative)

True Positive Rate

True Positive Rate = Sensitivity = Recall = True Positive/(True Positive + False Negative)

Area Under the Curve (AUC)
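The ROC curve and its AUC can be produced with scikit-learn; a minimal sketch, assuming scikit-learn and matplotlib are available and using made-up labels and scores:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])                    # hypothetical labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3])  # hypothetical probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label='ROC (AUC = %.2f)' % auc(fpr, tpr))
plt.plot([0, 1], [0, 1], '--')  # chance line
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.show()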

Sørensen–Dice Coefficient

Confidence Interval

Refs: [1], [2], [3]


Extended Kalman Filter Explained with Python Code

In the following code, I have implemented an Extended Kalman Filter for modeling the movement of a car with a constant turn rate and velocity (the CTRV motion model). The code is mainly based on this work (I did some bug fixing and some adaptation so that the code runs similarly to the Kalman filter that I implemented earlier).
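The full implementation follows the references below; as a flavor of the model, here is a minimal sketch of the CTRV prediction step (the state layout and the zero-yaw-rate special case follow the standard CTRV formulation, not necessarily the original code):

import numpy as np

def ctrv_predict(x, dt):
    # State: [x position, y position, heading, velocity, yaw rate]
    px, py, yaw, v, w = x
    if abs(w) > 1e-6:
        # Turning: integrate the position along a circular arc
        px += (v / w) * (np.sin(yaw + w * dt) - np.sin(yaw))
        py += (v / w) * (np.cos(yaw) - np.cos(yaw + w * dt))
    else:
        # Near-zero yaw rate: straight-line motion
        px += v * dt * np.cos(yaw)
        py += v * dt * np.sin(yaw)
    return np.array([px, py, yaw + w * dt, v, w])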

Extended Kalman filter

Trajectory of the car; click on the image for a larger version.



References: [1] [2] [3] [4] [5]

Particle Filter Explained With Python Code From Scratch

In the following code, I have implemented a localization algorithm based on a particle filter.

I have used conda to run my code; you can run the following to install the dependencies:
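The exact environment file is not preserved; the code needs at least NumPy and Matplotlib, so something like the following (environment name is a placeholder):

conda create -n particle_filter python=3 numpy matplotlib
conda activate particle_filter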

and the code:
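The full script is longer; its core loop boils down to the classic predict / weight / resample steps, sketched here (the motion model and noise parameters are illustrative, not the original values):

import numpy as np

def particle_filter_step(particles, control, measurement,
                         motion_std=0.1, meas_std=0.5):
    # Predict: propagate every particle with the control input plus noise
    particles = particles + control + np.random.normal(0, motion_std, particles.shape)
    # Update: weight each particle by the likelihood of the measurement
    dist = np.linalg.norm(particles - measurement, axis=1)
    weights = np.exp(-0.5 * (dist / meas_std) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]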


Kalman Filter Explained With Python Code From Scratch

This snippet shows how to track the mouse cursor with Python code from scratch and compares the result with OpenCV. The CSV files that were used are created with the C++ code below. Samples can be downloaded from here: 1, 2, 3.

Python Kalman Filter
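The original snippet is not reproduced here; what it implements are the standard predict/update equations, sketched below (F is the state transition, H the measurement matrix, Q and R the process and measurement noise covariances):

import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                   # innovation
    S = H @ P @ H.T + R             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P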

C++ and OpenCV Kalman Filter

Rapidcsv has been downloaded from here


How to Develop a GUI Application with PyQt (Python Qt)

There are two main methods for developing a GUI application with Qt:
1) Adding all widgets in your code (your C++ or Python code).
2) Creating Qt UI files, adding widgets there, and loading everything into your application.

1) Adding all widgets in your code

Here is the snippet for adding all widgets and their slots in code:
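A minimal sketch (PyQt5 is assumed; the widget and slot names are illustrative) of building the window and wiring a slot entirely in code:

import sys
from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton

class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle('mainwindow')
        self.button = QPushButton('Click me', self)
        self.button.clicked.connect(self.on_click)  # connect signal to slot
        self.setCentralWidget(self.button)

    def on_click(self):
        print('button clicked')

app = QApplication(sys.argv)
window = MainWindow()
window.show()
sys.exit(app.exec_())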

2) Creating qt UI files, adding widgets there and load everything into your application

Now let’s do what we did in the first method in a UI file and load it. First, create a text file, put the following content in it, and save it as “mainwindow.ui”:

Now call it in your python file like this:
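A minimal sketch using PyQt5's uic module (the file name matches the one saved above):

import sys
from PyQt5 import uic
from PyQt5.QtWidgets import QApplication

app = QApplication(sys.argv)
window = uic.loadUi('mainwindow.ui')  # builds the widget tree from the file
window.show()
sys.exit(app.exec_())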

The results should be the same as what you got in the first method.

Installing NVIDIA DIGITS on Ubuntu 16.04

Prerequisite

Protobuf 3

caffe

Install Caffe as explained in my other post here.

DIGITS

Visit https://github.com/NVIDIA/DIGITS/

Dependencies

Install the repo packages:

Building DIGITS
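The build steps follow the DIGITS repository's documentation, roughly (treat the clone path as an example):

git clone https://github.com/NVIDIA/DIGITS.git digits
cd digits
sudo pip install -r requirements.txt
./digits-devserver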

Open in the browser:

http://localhost:5000/

Installing Caffe on Ubuntu 16.04

CUDA Toolkit 9.1

Visit https://developer.nvidia.com/cuda-downloads and download the correct .deb file, then:
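The usual deb installation looks like this (the exact file and repo names depend on the file you downloaded, so treat these as placeholders):

sudo dpkg -i cuda-repo-ubuntu1604-9-1-local_9.1.85-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-1-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda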

Basic Linear Algebra Subprograms (BLAS)

Protocol Buffers

Or you can install protobuf v3 from source:

Lightning Memory-Mapped Database

LevelDB

HDF5

gflags

glog

Snappy
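The per-library snippets are not preserved; on Ubuntu 16.04, all of the dependencies above are available as apt packages, roughly:

sudo apt-get install libatlas-base-dev                  # BLAS
sudo apt-get install libprotobuf-dev protobuf-compiler  # Protocol Buffers
sudo apt-get install liblmdb-dev                        # LMDB
sudo apt-get install libleveldb-dev                     # LevelDB
sudo apt-get install libhdf5-serial-dev                 # HDF5
sudo apt-get install libgflags-dev                      # gflags
sudo apt-get install libgoogle-glog-dev                 # glog
sudo apt-get install libsnappy-dev                      # Snappy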

Caffe

Breadth-first search (BFS) and Depth-first search (DFS) Algorithm with Python and C++

Python Implementation

BFS traversal:
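A minimal sketch of BFS over an adjacency-list graph (the graph itself is a made-up example):

from collections import deque

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}

def bfs(start):
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()  # FIFO: visit level by level
        print(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)

bfs('A')  # prints A B C D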

DFS traversal:
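And a recursive DFS sketch over the same kind of adjacency list:

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}

def dfs(node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    print(node)  # visit on the way down
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(neighbor, visited)

dfs('A')  # prints A B D C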

C++ Implementation


Populating directed graph in networkx from CSV adjacency matrix
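The post's code is not preserved; a minimal sketch (pandas and networkx assumed; the CSV name is a placeholder) would be:

import pandas as pd
import networkx as nx

# The CSV is assumed to hold a square adjacency matrix with node
# names as both the header row and the first column.
adj = pd.read_csv('adjacency_matrix.csv', index_col=0)
G = nx.from_pandas_adjacency(adj, create_using=nx.DiGraph)
print(G.edges(data=True))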