
SSH connection to Google Colaboratory and Transferring files from Google Drive

Would you like to access Google Colaboratory directly from your machine, run your scripts from a terminal and have shell access? Then there is good news, just do the following:

SSH to Google Colaboratory

1. Install colab_ssh on Google Colaboratory

Set up cloudflared and a password for the root user:

2. Install cloudflared on your machine
Download it from here

3. Append the following to your SSH config file (located at ~/.ssh/config):

4. To connect using your terminal, type this command:

The following steps are optional, in case you want to install Anaconda/PyTorch:

5. Installing Anaconda on Google Colaboratory machine
Prerequisite:

Download Anaconda, install it and add it to the path:

Refs: [1]

Accessing Google Drive in Google Colaboratory

You can access files with:

1. Mounting Google Drive

Visit the URL in a browser and enter your authorization code
Your drive will be mounted on:

Mounted at /content/drive

You can access your drive in your Jupyter notebook:

or via Python code:

2. Using PyDrive

Authenticate and create the PyDrive client:

Access your files:

 

To prevent Google Colab from disconnecting, just create a new cell at the bottom containing the following line:

Refs: [1], [2]

Developing Redis applications with C++

I have found the following libraries for developing Redis applications with C++:

  1. hiredis (C only)
  2. redis-plus-plus (based on hiredis, available for C++)
  3. cpp_redis (asynchronous, multi-platform, no dependencies)

Installation of redis-plus-plus

First, let’s install hiredis:

Then we install redis-plus-plus

Using redis-plus-plus in your application

I have created the following CMake file to make usage smoother:

Installation of cpp_redis

Because this repository uses some sub-modules, we have to pull them as well:

Using cpp_redis in your application

Since cpp_redis does not come with a CMake file for this, I wrote one:

Now you can run Redis from Docker and expose the port:

Getting started with Redis

Redis (Remote Dictionary Server) is a BSD licensed, in-memory data structure store, used as a distributed, in-memory key–value database, cache and message broker.
It can store different kinds of abstract data structures, such as

  • Strings
  • Lists
  • Hash maps
  • Sets
  • Sorted sets

When to use Redis / Example use case

Redis is NOT a NoSQL replacement for classic relational databases, as it doesn’t support many features of an RDBMS, such as rich querying of your data, which might slow it down.
Replacements for RDBMS are rather document databases like MongoDB or CouchDB.
Redis is great at specific functionality where speed and support for advanced data structures is crucial.

Refs: [1], [2]

Running the server

I have installed Redis via Docker, so:

Now open another terminal, get into the docker container and connect to the server:

You can log into redis server via:
redis-cli -h <host-address> -p <port> -a <password>
or
redis-cli -u redis://username:password@host:port
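The two login forms above carry the same information. As a small illustration (a sketch using Python's standard urllib.parse, not part of redis-cli; the user name, password and host below are hypothetical), the URI form can be split back into the values you would pass to -h, -p and -a:

```python
from urllib.parse import urlsplit

def redis_cli_args(uri):
    """Split a redis:// URI into the pieces used by the -h/-p/-a form."""
    parts = urlsplit(uri)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "username": parts.username,
        "password": parts.password,
    }

# Hypothetical credentials, for illustration only.
args = redis_cli_args("redis://alice:s3cret@localhost:6379")
print(args)
```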

Redis CONFIG

The configuration file “redis.conf” is located in the root directory of Redis. You can get/set all Redis configuration settings with the Redis CONFIG command.

Get all configuration:

This will set the password temporarily (until Redis or the server restarts):

To get the location of the data directory in Redis

 

Keys

Set key to hold the string value. If key already holds a value, it is overwritten.

Get the value of key. If the key does not exist the special value nil is returned. An error is returned if the value stored at the key is not a string, because GET only handles string values.

This command checks whether the key exists or not.

Sets the expiry of the key after the specified time.

Gets the remaining time until the key expires.

Removes the expiration from the key.

Get all the keys matching pattern.

Returns the data type of the value stored in the key.

Refs: [1]

Lists

Prepends one or multiple values to a list

Gets a range of elements from a list

Refs: [1]

Hash maps

Hash maps are the perfect data type to represent objects.

Sets the string value of a hash field

Sets multiple hash fields to multiple values

Gets the value of a hash field stored at the specified key.

Gets all the fields and values stored in a hash at the specified key

Gets all the fields in a hash

Sets / Sorted sets

 

Publish/ Subscribe

Listens for messages published to the given channels.

Posts a message to a channel.

See which channels are currently available:

Returns the number of subscribers of the channel chat:

To see the number of subscriptions to patterns (those made with the PSUBSCRIBE command), use PUBSUB NUMPAT

ACL

ACL stands for Access Control List. It allows certain connections to be limited in terms of the commands that can be executed and the keys that can be accessed. The way it works is that, after connecting, a client is required to authenticate by providing a username and a valid password.

Setting Username and Password

This will set the password temporarily (until Redis or the server restarts):

CONFIG SET requirepass mypass

CONFIG GET requirepass
AUTH <username> <password>

Refs: [1], [2], [3]

Client Connections

To get information about clients and to manage them (e.g. killing, tracking, etc.), you can use the CLIENT command:

Returns information and statistics about the client connections

Returns information and statistics about the current client

When tracking is enabled, Redis remembers the keys that the connection requested, in order to send invalidation messages later when those keys are modified (client-side caching).

Example:

  • Client 1 -> Server: CLIENT TRACKING ON
  • Client 1 -> Server: GET foo
  • (The server remembers that Client 1 may have the key “foo” cached)
  • (Client 1 may remember the value of “foo” inside its local memory)
  • Client 2 -> Server: SET foo SomeOtherValue
  • Server -> Client 1: INVALIDATE “foo”
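The exchange above can be mimicked with a toy in-memory model. This is only a sketch of the bookkeeping idea in plain Python, not the Redis protocol or any client library API; the class and method names are hypothetical.

```python
from collections import defaultdict

class ToyTrackingServer:
    """Toy model of key tracking: remember readers, notify them on writes."""

    def __init__(self):
        self.data = {}
        self.tracking = set()            # clients that turned tracking on
        self.readers = defaultdict(set)  # key -> clients that may cache it
        self.invalidations = []          # (client, key) messages "sent"

    def client_tracking_on(self, client):
        self.tracking.add(client)

    def get(self, client, key):
        # Remember that this client may now hold the key in its local cache.
        if client in self.tracking:
            self.readers[key].add(client)
        return self.data.get(key)

    def set(self, client, key, value):
        self.data[key] = value
        # Tell every remembered reader to drop its copy, then forget them.
        for reader in sorted(self.readers.pop(key, set())):
            self.invalidations.append((reader, key))

server = ToyTrackingServer()
server.client_tracking_on("client1")
server.get("client1", "foo")
server.set("client2", "foo", "SomeOtherValue")
print(server.invalidations)
```

In this model a second SET on the same key produces no further message until Client 1 reads the key again, because the reader list is cleared once the invalidation has been recorded.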

Refs: [1], [2]

Notifications

Keyspace notifications enable clients to subscribe to Pub/Sub channels to receive events that affect the Redis data set in some way. Because they cost some performance, keyspace notifications are disabled by default.

There are two types of keyspace notifications: keyspace (which events happen to a given key) and keyevent (which keys are affected by a given event).

For instance, a DEL operation on a key delivers two messages, equivalent to the following PUBLISH commands:
PUBLISH __keyspace@<db_number>__:<your-key> del
PUBLISH __keyevent@<db_number>__:del <your-key>
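To make the channel naming concrete, here is a minimal sketch in Python that only builds the channel names and payloads following the pattern above. It does not talk to Redis, and the function names are hypothetical.

```python
def keyspace_channel(db_number, key):
    """Channel that carries the events happening to one key."""
    return f"__keyspace@{db_number}__:{key}"

def keyevent_channel(db_number, event):
    """Channel that carries the keys affected by one kind of event."""
    return f"__keyevent@{db_number}__:{event}"

def notifications_for(db_number, key, event):
    """Return the (channel, message) pairs delivered for one operation."""
    return [
        (keyspace_channel(db_number, key), event),
        (keyevent_channel(db_number, event), key),
    ]

for channel, message in notifications_for(0, "mykey", "del"):
    print(channel, message)
```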

Now let’s have a complete example, first enabling the notifications:

CONFIG SET notify-keyspace-events AKE

The “AKE” string enables both keyspace and keyevent notifications for all the events except “m” (key-miss events, generated when a key that doesn’t exist is accessed):
A : Alias for “g$lshztxe” (all the events except “m”)
E : Keyevent events, published with __keyevent@<db_number>__ prefix.
K : Keyspace events, published with __keyspace@<db_number>__ prefix.

Now let’s subscribe to delete events on keys:

Now if we delete a key, we are able to see the deletion event:

Another example would be all events on all keys on all databases:

As another example, let’s create the key greeting with the value hello that expires in 10 seconds:

You should be able to see the event with PSUBSCRIBE '__key*__:*'

Mastering git, Part 12, git rebase

Combining Git commits with squash

Imagine you have made lots of commits, e.g. several commits for fixing a bug, but you don’t need all of them and want to meld (squash) them into a single commit. You can use rebase for this. Git always squashes a newer commit into an older commit, i.e. “upward”. Let’s create some files:

Your tree would be something like this:

But now you would like to squash b-1, b-2 and b-3 into one commit. So copy the hash ID of b-3 and execute rebase:

This will open your editor and the first few lines are:

Git melds newer commits into older ones, and we want to squash b-3 and b-2 into b-1, so change it to:

Save the file and close all instances of your editor. This will open another editor to write a new commit message.

Write your new commit message and save it. Your new tree should look like this:

Rebasing branches on to the base

When you create a feature branch and start working on it, both your master and your branch might grow. Let’s create a repository and add “a” and “b”:

Then let’s create a feature branch for “c”:

Now let’s get back to master and add “d” and “e”:

At a certain point you will decide to merge into the master branch. Your tree looks like this:

By merging you will get the following tree, which in a big project might become very complicated and difficult to track.

As you can see, the branch feature-c is 2 commits behind the master branch. You can rebase your feature branch onto the master branch, so that it looks as if your feature branch had just branched off from your master.

By doing so, you finally end up with a “linear history”. You can also give tags to your commits so you make your tree more readable:

Rebase: rewriting Git history (reword, delete, reorder and split)

Refs 1,2, 3, 4

Mastering git, Part 11, View git history (commit log) of specific lines of code in a file

git blame

The git blame command is used to find out who (and which commit) is responsible for the latest changes made to a file. The author and commit of each line can also be seen. git blame does not show the per-line modification history in the chronological sense. It only shows who was the last person to have changed a line in a document up to the last commit in HEAD.

Lines 2 to 4:

Two lines starting from line 2:

Make the output shorter:

Only display the email of the author:

Get the blame for a specific commit:

When the commit ID is 00000000000, it means the line has been changed locally and not committed yet.

To see the full history/log of a line, you would need to run git blame path/to/file for each commit in your git log. Since Git 1.8.4, you can use git log with the -L option to view the evolution of a range of lines.

Refs: 1, 2

Mastering git, Part 10, Setting up your home Git server

In this tutorial I will show you how to set up your own Git server. Here I have used an Ubuntu 20.04 server on a virtual machine, which is called homeserver, and my client machine is called client. The convention for the bash scripts is user@host:~$

The Protocols

Git can use four distinct protocols to transfer data: Local, HTTP, SSH and Git

Local

The URLs will be in the following form:

or

HTTP Protocols

Git can communicate over HTTP using Smart HTTP or Dumb HTTP.

Smart HTTP

Smart HTTP intelligently negotiates data transfer in a manner similar to how Git does it over SSH. It operates very similarly to the SSH or Git protocols but runs over standard HTTPS ports and can use various HTTP authentication mechanisms, meaning it’s often easier on the user than something like SSH, since you can use things like username/password authentication rather than having to set up SSH keys. It can be set up both to serve anonymously like the git:// protocol and to be pushed over with authentication and encryption like the SSH protocol.

Dumb HTTP

Dumb HTTP is very simple and generally read-only. If a server does not respond with a Git HTTP smart service, the Git client will try to fall back to the simpler Dumb HTTP protocol. The Dumb protocol expects the bare Git repository to be served like normal files from the web server. To set it up, all you have to do is put a bare Git repository under your HTTP document root and set up a specific post-update hook.

The post-update hook that comes with Git by default runs the git update-server-info to make HTTP fetching and cloning work properly. This command is run when you push to this repository (over SSH for example). You can clone via:

SSH Protocol

To clone a Git repository over SSH, you can specify an ssh:// URL like this:

Or you can use the shorter scp-like syntax for the SSH protocol:

The main drawback is that you need SSH authentication even for read-only repositories.

Git Protocol

The Git protocol is served by a daemon (that comes packaged with Git) which listens on a dedicated port (9418) and provides a service similar to the SSH protocol, but with absolutely no authentication. In order for a repository to be served over the Git protocol, you must create a git-daemon-export-ok file. Either the Git repository is available for everyone to clone, or it isn’t. You can enable push access but, given the lack of authentication, anyone who can reach the server could push to your repository. Generally, you’ll pair it with SSH or HTTPS access for the few developers who have push (write) access and have everyone else use git:// for read-only access. This protocol requires xinetd or systemd configuration or the like, and also requires firewall access to port 9418.
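As a rough illustration of how the four protocols show up in repository URLs, the sketch below classifies a URL by its form. It is a simplification written for this tutorial, not Git's own parsing code, and the example URLs are hypothetical.

```python
import re

def git_protocol(url):
    """Guess which transfer protocol a repository URL refers to."""
    if url.startswith(("http://", "https://")):
        return "http"
    if url.startswith("ssh://"):
        return "ssh"
    if url.startswith("git://"):
        return "git"
    if url.startswith("file://"):
        return "local"
    # scp-like syntax ([user@]host:path): a colon with no slash before it.
    if re.match(r"^[^/:]+:", url):
        return "ssh"
    # Anything else is treated as a plain path on the local file system.
    return "local"

examples = [
    "/srv/git/project.git",
    "file:///srv/git/project.git",
    "https://example.com/project.git",
    "ssh://user@server/project.git",
    "user@server:project.git",
    "git://server/project.git",
]
for url in examples:
    print(f"{git_protocol(url):5}  {url}")
```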

Now let’s set up our server:

1) First on the server side make sure that the essential packages are installed:

2) Create a group for Git users:

3) Create users and add them to the gituser group:

By the way, if you want to:
List all groups:

List groups of a user:

List all users:

And if you want to completely delete a user and remove the user’s home directory:

Hint: the user should log out and log in again to make the group assignment effective.

4) Create a git repository and tell it to share based on the group the user belongs to

By the way, if you want to create a non-bare repository that you can push into:

5) Give gituser group permission to access the repository directory

Hint:
All files in Linux belong to an owner and a group. You can set the owner with the chown command and the group with the chgrp command. -R means do it recursively. You can see the owner and permissions with:

This command will give the gituser group the permission of read and write.

chmod g+s sets the setgid mode bit on the current directory, which causes any new file or directory placed in this directory to inherit the group owner (gituser); existing files and directories are not affected. To apply the change to them, we call the command on them directly by using find /home/git/repos1 -type d.
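To see what the setgid bit looks like in a permission listing, here is a small sketch using Python's standard stat module. It only formats mode bits and does not touch any file; the mode 775 is an assumed example, not necessarily the mode of your repository directory.

```python
import stat

# Assumed example: a directory with rwxrwxr-x permissions (octal 775).
base_mode = stat.S_IFDIR | 0o775

# chmod g+s adds the setgid bit (octal 2000) on top of the existing mode.
setgid_mode = base_mode | stat.S_ISGID

before = stat.filemode(base_mode)
after = stat.filemode(setgid_mode)
print(before)  # drwxrwxr-x
print(after)   # drwxrwsr-x
```

The group execute position changes from x to s, which is how ls -l shows a directory whose new entries inherit its group.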

6) Create and config the clients
Create two users on the client machine:

SSH key pairs can be used to authenticate a client to an SSH server. That’s why we used ssh-copy-id to add them to ~/.ssh/authorized_keys, so we can log in without a password later.

Now copy the public key to the server and append it to the authorized_keys file as above.

Append each client’s public key to the authorized keys on the server:

Now you should be able to clone the repository:

And you should be able to push:

And pull the changes as the other user:

If you want to see the URL that your local Git repository was originally cloned from:

Refs: 1, 2, 3, 4, 5, 6

Metrics for Evaluating Machine Learning Models – Classification

Confusion Matrix

Let’s say we have a binary classifier for cats and non-cats, and we have 1100 test images: 1000 non-cats and 100 cats. The output of the classifier is either Positive, which means “cat”, or Negative, which means “non-cat”. The following is called a confusion matrix:

These terms are read as follows: the first word is the correctness of the labeling, the second is the predicted class.

True Positive:

Observation is positive, and is predicted to be positive. 90 cats correctly labeled.

True Negative:

Observation is negative, and is predicted to be negative. 940 images labeled as non-cats, and they are non-cats.

False Positive:

Observation is negative, but is predicted positive. 60 images labeled as cats, but they are actually non-cats.

False Negative:

Observation is positive, but is predicted negative. 10 images labeled as non-cats, but they are actually cats.

Accuracy:

Accuracy is defined as the number of correct predictions divided by the total number of predictions. Classification accuracy= (90+940)/(1000+100)= 1030/1100= 93.6%
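The same arithmetic as a minimal Python sketch, using the four counts from the cat / non-cat example (the function name is just for illustration):

```python
# Counts from the cat / non-cat example.
TP, TN, FP, FN = 90, 940, 60, 10

def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

acc = accuracy(TP, TN, FP, FN)
print(f"{acc:.1%}")  # 93.6%
```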

Precision:

Accuracy is not always a good indicator of the performance of our classifier. If one class is more frequent in our set and we predict it correctly while the classifier wrongly labels the smaller class, accuracy could be very high while the performance of the classifier is bad. So we look at precision:

Precision= True Positive/ (True Positive+ False Positive)

Precision cat=  90/(90+60) = 60%

Precision non-cat= 940/950= 98.9%
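A short sketch of the two precision values in Python. For the non-cat class, "non-cat" is treated as the positive class, so its true positives are the 940 correctly labeled non-cats and its false positives are the 10 cats labeled as non-cats.

```python
def precision(true_positive, false_positive):
    """Share of the positive predictions that are actually positive."""
    return true_positive / (true_positive + false_positive)

# "cat" as the positive class: 90 correct, 60 non-cats labeled as cats.
precision_cat = precision(90, 60)

# "non-cat" as the positive class: 940 correct, 10 cats labeled as non-cats.
precision_non_cat = precision(940, 10)

print(f"{precision_cat:.1%}")      # 60.0%
print(f"{precision_non_cat:.1%}")  # 98.9%
```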

Recall

Recall is the ratio of correctly classified positive examples to the total number of positive examples. A classifier tuned for high recall is a kind of optimistic classifier.

Recall= True Positive/ (True Positive+ False Negative)

Recall cat= 90/100= 90%
Recall non-cat= 940/1000= 94%
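The recall numbers can be reproduced with a few lines of Python. As with precision, the non-cat value treats "non-cat" as the positive class.

```python
def recall(true_positive, false_negative):
    """Share of the actual positives that the classifier found."""
    return true_positive / (true_positive + false_negative)

# "cat" as the positive class: 90 found, 10 cats missed.
recall_cat = recall(90, 10)

# "non-cat" as the positive class: 940 found, 60 non-cats labeled as cats.
recall_non_cat = recall(940, 60)

print(f"{recall_cat:.0%}")      # 90%
print(f"{recall_non_cat:.0%}")  # 94%
```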

High recall, low precision:  This means that our classifier finds almost all positive examples in our test set but also recognizes a lot of negative examples as positive examples.

Low recall, high precision: This means our classifier is very certain about positive examples (if it has labeled an example as positive, it is positive with high confidence), but it has missed a lot of positive examples. This is a kind of conservative classifier.

F1-Score

Depending on application, you might be interested in a conservative or optimistic classifier. But sometimes you are not biased toward any of the classes in your set, so you need to combine precision and recall together. F1-score is the harmonic mean of precision and recall:

F1-score= 2*Precision*Recall/(Precision+Recall)

F1-score cat= 2*0.6*0.9/(0.6+0.9)= 72%
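A minimal Python sketch of the F1-score for the cat class. It also shows the equivalent form written directly in terms of the counts, 2*TP/(2*TP+FP+FN), which gives the same value.

```python
import math

TP, FP, FN = 90, 60, 10

precision = TP / (TP + FP)  # 0.6
recall = TP / (TP + FN)     # 0.9

# Harmonic mean of precision and recall.
f1_from_rates = 2 * precision * recall / (precision + recall)

# Same quantity expressed with the raw counts.
f1_from_counts = 2 * TP / (2 * TP + FP + FN)

print(f"{f1_from_rates:.0%}")  # 72%
print(math.isclose(f1_from_rates, f1_from_counts))  # True
```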

Sensitivity and Specificity

Sensitivity and specificity are two other popular metrics, mostly used in medicine and biology. They are basically recall computed for the positive and the negative class respectively.
Sensitivity= Recall= TP/(TP+FN)
Specificity= True Negative Rate= TN/(TN+FP)
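Applied to the cat / non-cat counts, the two formulas look like this in Python (a small sketch, with hypothetical function names):

```python
TP, TN, FP, FN = 90, 940, 60, 10

def sensitivity(tp, fn):
    """True positive rate: recall of the positive class."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: recall of the negative class."""
    return tn / (tn + fp)

sens = sensitivity(TP, FN)
spec = specificity(TN, FP)
print(f"{sens:.0%}")  # 90%
print(f"{spec:.0%}")  # 94%
```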

Receiver Operating Characteristic (ROC) Curve

The output of a classifier is usually a probability, and based on a cut-off value we decide to accept or reject it. The ROC curve plots TPR against FPR for various threshold values. It is a popular curve for looking at overall model performance and picking a good cut-off threshold for the model.

False Positive Rate

A high value (above 0.5) means False Positive > True Negative: our classifier labels many negative examples as Positive, and there are more of them than negative examples correctly labeled as Negative.

A small value means True Negative > False Positive: our classifier correctly labels most negative examples, and there are more of them than negative examples labeled as Positive.
False Positive Rate=1-Specificity=False Positive/ (False Positive + True Negative)

True Positive Rate

True Positive Rate = Sensitivity= Recall= True Positive /(True Positive +False Negative)
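To show how the points of a ROC curve come about, the sketch below sweeps the cut-off over a tiny made-up set of scores and computes FPR and TPR at each threshold. The labels and scores are hypothetical toy data, and the code assumes an example is predicted Positive when its score is at or above the threshold.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for every distinct score used as cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Count predicted positives (score >= threshold) by their true label.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points

# Toy data: 1 = positive (cat), 0 = negative (non-cat).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
for threshold, fpr, tpr in points:
    print(f"threshold={threshold:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only move the point up or to the right, so the curve ends at (1, 1) once every example is labeled Positive.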

Area Under the Curve (AUC)

Sørensen–Dice Coefficient

Confidence Interval

Refs: [1], [2], [3]