Tag Archives: git

Mastering git, Part 11, View git history (commit log) of specific lines of code in a file

git blame

The git blame command shows who and which commit is responsible for the latest change to each line of a file, so the author and commit of every line can be seen.

Lines between 2 and 4:

2 lines after 2:

Make the output shorter:

Only display the email of the author:

Get the blame for a specific commit:

When the commit ID is 00000000000, it means the line has been changed locally and not yet committed.
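Here is a minimal sketch of the blame options listed above, run in a throwaway repository. The file name notes.txt, its contents and the demo identity are made up for illustration, and the exact flags are my assumption of what each caption refers to.

```shell
# Throwaway repository; file name and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'one\ntwo\nthree\nfour\nfive\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git blame -L 2,4 notes.txt     # lines 2 to 4
git blame -L 2,+2 notes.txt    # two lines starting at line 2
git blame -s notes.txt         # shorter output: no author name or timestamp
git blame -e notes.txt         # show the author's email instead of the name
git blame HEAD -- notes.txt    # blame as of a specific commit (here HEAD)

# An uncommitted edit is attributed to the all-zero commit ID.
printf 'one\nTWO\nthree\nfour\nfive\n' > notes.txt
local_edit=$(git blame -s -L 2,2 notes.txt)
echo "$local_edit"
```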

git blame does not show the per-line modifications history in the chronological sense. It only shows who was the last person to have changed a line in a document up to the last commit in HEAD.
In order to see the full history/log of a document line, you would need to run git blame path/to/file for each commit in your git log. Since Git 1.8.4, you can use git log -L to view the evolution of a range of lines.
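As a small sketch of git log -L, the following builds a three-commit history in a temporary repository and then asks for the history of line 2 only. The file name and commit messages are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'one\ntwo\nthree\nfour\nfive\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"
printf 'one\ntwo\nthree\nfour\nFIVE\n' > notes.txt
git commit -q -am "Change line 5"
printf 'one\nTWO\nthree\nfour\nFIVE\n' > notes.txt
git commit -q -am "Change line 2"

# History of line 2 only: every commit that touched it, with its diff.
git log -L 2,2:notes.txt

# Count those commits; "Change line 5" should not be among them.
line2_commits=$(git log -L 2,2:notes.txt --format='subject: %s' | grep -c '^subject: ')
echo "commits touching line 2: $line2_commits"
```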


Mastering git, Part 10, Setting up your home Git server

In this tutorial I will show you how to set up your own Git server. Here I have used an Ubuntu 20.04 server on a virtual machine called homeserver, and my client machine is called client. The convention for the bash prompts is user@host:~$

The Protocols

Git can use four distinct protocols to transfer data: Local, HTTP, SSH and Git

Local

The URLs will be in the following form:

or
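A minimal sketch of the two local URL forms, assuming a bare repository at a temporary path (project.git is a hypothetical name):

```shell
base=$(mktemp -d)
git init -q --bare "$base/project.git"

# Plain path: Git copies or hardlinks the files directly.
git clone -q "$base/project.git" "$base/clone-path"

# file:// URL: Git uses the same machinery it uses for network transfers.
git clone -q "file://$base/project.git" "$base/clone-url"

git -C "$base/clone-path" remote get-url origin
git -C "$base/clone-url" remote get-url origin
```

Cloning an empty repository prints a warning, which is harmless here.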

HTTP Protocols

Git can communicate over HTTP using Smart HTTP or Dumb HTTP.

Smart HTTP

Smart HTTP negotiates data transfer intelligently, in a manner similar to how Git does over SSH. It operates very similarly to the SSH or Git protocols but runs over standard HTTPS ports and can use various HTTP authentication mechanisms, meaning it’s often easier on the user than something like SSH, since you can use things like username/password authentication rather than having to set up SSH keys. It can be set up both to serve anonymously like the git:// protocol and to be pushed over with authentication and encryption like the SSH protocol.

Dumb HTTP

Dumb HTTP is very simple and generally read-only. If a server does not respond with a Git HTTP smart service, the Git client will try to fall back to the simpler Dumb HTTP protocol. The Dumb protocol expects the bare Git repository to be served like normal files from the web server. To set it up, all you have to do is put a bare Git repository under your HTTP document root and set up a specific post-update hook.

The post-update hook that comes with Git by default runs git update-server-info to make HTTP fetching and cloning work properly. This command is run when you push to this repository (over SSH for example). You can clone via:

SSH Protocol

To clone a Git repository over SSH, you can specify an ssh:// URL like this:

Or you can use the shorter scp-like syntax for the SSH protocol:

The main drawback is that you need SSH authentication even for read-only repositories.

Git Protocol

It is a daemon (that comes packaged with Git) that listens on a dedicated port (9418) and provides a service similar to the SSH protocol, but with absolutely no authentication. In order for a repository to be served over the Git protocol, you must create a git-daemon-export-ok file. Either the Git repository is available for everyone to clone, or it isn’t. You can enable push access but, given the lack of authentication, anyone who can reach the server could push to your repository. Generally, you’ll pair it with SSH or HTTPS access for the few developers who have push (write) access and have everyone else use git:// for read-only access. This protocol requires xinetd or systemd configuration or the like and also requires firewall access to port 9418.

Now let’s set up our server:

1) First on the server side make sure that the essential packages are installed:

2) Create a group for Git users:

3) Create users and add them to the gituser group:

BTW If you want to:
List all groups:

List groups of a user:

List all users:

And if you want to completely delete a user and remove the user's home directory:

Hint: the user should log out and log in again to make the group assignment effective.

4) Create a git repository and tell it to share based on the group the user belongs to
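As a sketch of this step, git init accepts --bare and --shared=group, which records in the repository configuration that files should stay group-writable. The path below is a temporary stand-in, not the path used on homeserver.

```shell
base=$(mktemp -d)

# Bare repository whose files are shared with the owning group.
git init -q --bare --shared=group "$base/repos1.git"

git -C "$base/repos1.git" rev-parse --is-bare-repository   # prints "true"
git -C "$base/repos1.git" config core.sharedRepository     # the stored sharing mode
```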

BTW, if you want to create a non-bare repository that you can push into:

5) Give gituser group permission to access the repository directory

Hint:
All files in Linux belong to an owner and a group. You can set the owner with the chown command and the group with the chgrp command; -R means do it recursively. You can see the owner and permissions by

This command will give the gituser group the permission of read and write.

chmod g+s sets the setgid mode bit on the directory, which causes any new file or directory placed in it to inherit the group owner (gituser), but existing files and directories will not be affected. To apply the change to them, we run the command on them directly by using find /home/git/repos1 -type d
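The permission steps above can be tried safely on a scratch directory. In this sketch the current user's primary group stands in for gituser, and the directory layout is made up.

```shell
repo_dir=$(mktemp -d)
mkdir -p "$repo_dir/objects/pack"
group=$(id -gn)                      # stand-in for the gituser group

chgrp -R "$group" "$repo_dir"        # hand the whole tree to the group
chmod -R g+rw "$repo_dir"            # group members may read and write
# Set the setgid bit on every existing directory so new entries inherit the group.
find "$repo_dir" -type d -exec chmod g+s {} +

ls -ld "$repo_dir/objects/pack"      # an "s" in the group bits marks setgid
```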

6) Create and configure the clients
Create two users on the client machine:

SSH key pairs can be used to authenticate a client to an SSH server. That’s why we used ssh-copy-id to add them into ~/.ssh/authorized_keys so we can log in without password later.

Now copy the public key to the server and append it to the authorized_keys file as above

Append the public key of each client to the authorized keys on the server

Now you should be able to clone the repository:

And you should be able to push:

and pull the changes with other user:

If you want to see the URL that your local Git repository was originally cloned from:


Everything you need to know to master git, Part 10, git fork

Forking Projects

You can “fork” a project if you want to contribute to an existing project to which you do not have push access. When you “fork” a project, GitHub makes a copy of the project that is fully yours; it resides in your namespace, and you can push to it.

Developers can fork a project, push to their fork, and contribute their modifications back to the original repository by creating what is called a Pull Request.

1. Click on Fork to get a copy of the repository in your account.
2. Clone the repository onto your machine.

3. Add the original repository as a remote named upstream into the clone of your forked repository:

You can see the remotes via:

If you made a mistake or you want to change the upstream you can delete it with:

4. Pull the changes from original repo and update your fork to keep up with their changes:
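The steps above can be rehearsed locally without GitHub. In this sketch a bare copy plays the role of your fork, and all repository names (original.git, fork.git, maintainer, work) are hypothetical.

```shell
base=$(mktemp -d)

# "original.git" stands in for the project you forked.
git init -q --bare "$base/original.git"
git clone -q "$base/original.git" "$base/maintainer"
cd "$base/maintainer"
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "v1" > README
git add README
git commit -q -m "Initial commit"
git push -q origin HEAD
branch=$(git symbolic-ref --short HEAD)

# Steps 1 and 2: "fork" (here just a bare copy) and clone your fork.
git clone -q --bare "$base/original.git" "$base/fork.git"
git clone -q "$base/fork.git" "$base/work"

# Step 3: add the original repository as a remote named upstream.
cd "$base/work"
git remote add upstream "$base/original.git"
git remote -v

# Meanwhile the original project moves on.
cd "$base/maintainer"
echo "v2" >> README
git commit -q -am "Upstream change"
git push -q origin HEAD

# Step 4: pull the upstream changes and update your fork.
cd "$base/work"
git pull -q upstream "$branch"
git push -q origin "$branch"
```

A mistaken upstream could be dropped again with git remote remove upstream.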

Git fork workflow, credit: arthdatascience.org


Difference between origin and upstream/downstream on GitHub

Git is a distributed system, and being distributed largely means there is no inherent “upstream” or “downstream” in the system.

  • upstream generally refers to the original repo that you have forked.
  • origin is your fork: your own repo on GitHub, a clone of the original repo on GitHub.

 


Mastering git, Part 7, git branch

It quite often happens that a software product is on a git server and developers add new features to the product. To make the job easier, developers usually create branches beside the master branch (which is the default branch) and work on their branch and once the changes are confirmed and tested, they will be merged to the master branch. So let’s get started:

OK, now let’s create two branches, one for developer1 and one for developer2:

git branch <BRANCH_NAME> creates a new branch but leaves you on the same branch.
git checkout -b <BRANCH_NAME> creates a new branch and checks out the new branch.

In fact, it does the following:
git branch # create a new branch
git switch # then switch to the new branch
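A short sketch of the difference, in a throwaway repository (the branch names b1 and b2 and the demo identity are placeholders):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
start=$(git symbolic-ref --short HEAD)

git branch b1                                # creates b1, you stay where you were
after_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b b2                        # creates b2 and switches to it
after_checkout=$(git symbolic-ref --short HEAD)

echo "after git branch:      $after_branch"
echo "after git checkout -b: $after_checkout"
```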

Let’s do some work on the first branch:

and some changes in the second branch:

Now you should see this in git gui (Repository > Visualize All Branch History), or with:

You can add the latter as an alias so you don’t have to type it every time:

Now in the git gui, go to merge>local merge and choose b1, you will get the following error because of the merge conflict:

so call the following to solve the conflict:

choose C and save and then:

 

Working with branches in a nutshell:

Creating a local branch

or short version:

Creating a remote branch

First, you create your branch locally:

The remote branch is automatically created when you push it to the remote server:

you might want to make the relation persistent by using:

Other people in your team can reach your branch, by doing:

if instead of git checkout <branch-name> you use the following:

you will enter a detached HEAD state.
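Here is a sketch of the whole remote-branch flow against a local bare repository that plays the server. The names remote.git, alice, bob and feature are invented for the example.

```shell
base=$(mktemp -d)
git init -q --bare "$base/remote.git"

git clone -q "$base/remote.git" "$base/alice"
cd "$base/alice"
git config user.name "Alice"
git config user.email "alice@example.com"
git commit -q --allow-empty -m "Initial commit"
git push -q origin HEAD

# Create the branch locally, then push it; -u makes the relation persistent.
git checkout -q -b feature
git commit -q --allow-empty -m "Work on feature"
git push -q -u origin feature
upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')

# A teammate reaches the branch by name and gets a local tracking branch.
git clone -q "$base/remote.git" "$base/bob"
cd "$base/bob"
git checkout -q feature
bob_branch=$(git symbolic-ref --short HEAD)

# Checking out the remote-tracking ref instead detaches HEAD.
git checkout -q origin/feature
git symbolic-ref -q HEAD || echo "detached HEAD"
```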

Track a remote branch

If you have already created a local branch and you want it to track a remote branch you just pulled down, or you want to change the upstream branch you’re tracking:

Please pay attention to the -to at the end of “--set-upstream-to”. There is a deprecated flag “--set-upstream” (without -to at the end) which is no longer supported.

View branches

To view local branches

To see only remote branches:

To view both remote branches and local branches

To view branches of a remote: git remote show <remote-name>, i.e:

To see the tracked branches:
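The tracking and viewing commands from the last two sections can be tried together. This is only a sketch: the remote is a local bare repository and b1 is a placeholder branch name.

```shell
base=$(mktemp -d)
git init -q --bare "$base/remote.git"
git clone -q "$base/remote.git" "$base/work"
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
git push -q origin HEAD

git branch b1
git push -q origin b1                        # no -u, so b1 has no upstream yet
git branch --set-upstream-to=origin/b1 b1    # start tracking origin/b1

git branch                 # local branches
git branch -r              # remote-tracking branches only
git branch -a              # both local and remote-tracking branches
git branch -vv             # local branches with the upstream they track
git remote show origin     # branches of the remote named origin

tracked=$(git rev-parse --abbrev-ref 'b1@{upstream}')
```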

Deleting both local and remote branch

Local and remote branches actually have nothing to do with each other, even if you’ve established a tracking connection. They are completely separate objects in Git, and deleting one wouldn’t delete the other. If you want any branch item to be deleted, you need to delete it explicitly.

To delete a local branch:

-D is an alias for --delete --force, which deletes the branch “irrespective of its merged status”.

To delete a remote branch:

As of Git v1.7.0, you can delete a remote branch using:

summary:

In most cases the remote name is origin.
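To make the summary concrete, here is a sketch that deletes a branch both locally and on the remote. The remote is a local bare repository and b1 is a placeholder name.

```shell
base=$(mktemp -d)
git init -q --bare "$base/remote.git"
git clone -q "$base/remote.git" "$base/work"
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
git push -q origin HEAD
git branch b1
git push -q origin b1

git branch -d b1                  # local delete; -D would force it even if unmerged
git push -q origin --delete b1    # remote delete, a separate and explicit step

local_left=$(git branch --list b1)
remote_left=$(git ls-remote --heads origin b1)
```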

Pushing into the wrong branch

Deeper Look

The way Git thinks about its data is what sets it apart from all other VCSs (CVS, Subversion, Perforce, Bazaar, and so on). Most other systems store data in the form of a list of file-based changes. This is commonly described as delta-based version control.

Image courtesy of git-scm.com


Git considers its data to be a set of snapshots of a miniature filesystem.
Any time you commit, or save the state of your project, Git takes a snapshot of all your files at that particular moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored.

When you make a commit, Git stores a commit object that contains a pointer to the snapshot of the content you staged. This object also contains the author’s name and email address, the message that you typed, and pointers to the commit or commits that directly came before this commit (its parent or parents):

  1. zero parents for the initial commit,
  2. one parent for a normal commit,
  3. multiple parents for a commit that results from a merge of two or more branches.

Let’s make an example:

Now we have 5 objects in our repository: 3 blobs, which you can see by:

one tree that lists the contents of the directory and specifies which file names are stored as which blobs, and one commit with the pointer to that root tree and all the commit metadata.
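A quick way to check the object count yourself, sketched with three made-up files that have different contents (identical contents would share one blob):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first"  > A.txt
echo "second" > B.txt
echo "third"  > C.txt
git add A.txt B.txt C.txt
git commit -q -m "Initial commit"

# Every object in the repository with its type and size.
git cat-file --batch-all-objects --batch-check

blobs=$(git cat-file --batch-all-objects --batch-check | grep -c ' blob ')
trees=$(git cat-file --batch-all-objects --batch-check | grep -c ' tree ')
commits=$(git cat-file --batch-all-objects --batch-check | grep -c ' commit ')

git cat-file -p HEAD             # the commit: tree pointer, author, message
git cat-file -p 'HEAD^{tree}'    # the tree: file names mapped to blobs
```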

If you make some changes and commit again, the next commit stores a pointer to the commit that came immediately before it.

A branch in Git is simply a lightweight movable pointer to one of these commits. The default branch name in Git is master. As you start making commits, you’re given a master branch that points to the last commit you made. Every time you commit, the master branch pointer moves forward automatically.

Because a branch in Git is actually a simple file that contains the 40 character SHA-1 checksum of the commit it points to, branches are cheap to create and destroy.

How does Git know what branch you’re currently on? It keeps a special pointer called HEAD.
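You can look at these pointers directly. The sketch below assumes a fresh repository with the default file-based ref storage, where the branch has not been packed yet.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

cat .git/HEAD                                   # a symbolic ref to the current branch
branch=$(git symbolic-ref --short HEAD)

tip_from_file=$(cat ".git/refs/heads/$branch")  # the branch is just a small file
tip_from_git=$(git rev-parse HEAD)
echo "$tip_from_file"
```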

 


Mastering git, Part 6, git stash

Sometimes you have made some changes in your local copy and you don’t want to make a commit because your code doesn’t compile, but something comes up and you have to switch branches, which could cost you your local changes. In this situation, you can stash your changes and then switch branches. The git stash command saves your local modifications away and reverts the working directory to match the HEAD commit.

Let’s have a complete example:

now we make some change:

now let’s stash changes:

stash@{0} is the most recently created stash, stash@{1} is the one before it. WIP stands for work in progress. We can list and drop stashes by:

We can use gitk to see the changes:

Now let’s create a branch and one more stash and apply the previous stash from the master branch there:

To see all stash in gitk

To checkout a specific file from stash:

To view the file in the stash:

stash@{0}^1 means the first parent of the given stash, which as stated in the explanation above is the commit at which changes were stashed away.
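The stash commands above fit together roughly like this. It is a sketch in a scratch repository; A.txt and its contents are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > A.txt
git add A.txt
git commit -q -m "Add A"

echo "work in progress" >> A.txt
git stash                                      # save the change, revert to HEAD
git stash list                                 # stash@{0}: WIP on <branch>: ...
clean=$(cat A.txt)                             # the working copy matches HEAD again

git show 'stash@{0}:A.txt'                     # view the file as stored in the stash
base_version=$(git show 'stash@{0}^1:A.txt')   # the commit the stash was made on

git checkout 'stash@{0}' -- A.txt              # take just this file out of the stash
restored=$(tail -n 1 A.txt)

git stash drop -q                              # remove the stash entry
```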

 

Mastering git, Part 5, git cherry-pick

To illustrate what cherry-pick is, let’s imagine the following situation: you and another developer start working on a project and both make some changes. After a while, the other developer finds a bug in the code and fixes it, and you are interested in applying only that hotfix without taking the other developer’s other changes. In this situation you should do a cherry-pick, so let’s do it:

and you continue with something like:

The other developer checks out the repository and will start working and make the second commit:

and in the third commit, he fixes the problem with A.txt, so he would do something like this:

And he may continue his work with another commit:

Now you are only interested in the third commit and you don’t want to add other files to your local copy so you have to cherry pick the third commit, so all you have to do is fetch all changes, find the corresponding SHA1 value for the third commit and apply the cherry pick.

And now the file A.txt in repose1 should be same as the file in repose2 while C.txt and D.txt are not added to repose1.
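The whole scenario can be replayed locally. This is a sketch under assumptions: a bare repository named server.git plays the shared server, and the file contents and commit messages are made up.

```shell
base=$(mktemp -d)
git init -q --bare "$base/server.git"

# You: first commit with a buggy A.txt.
git clone -q "$base/server.git" "$base/repose1"
cd "$base/repose1"
git config user.name "You"
git config user.email "you@example.com"
echo "buggy" > A.txt
git add A.txt
git commit -q -m "First commit"
git push -q origin HEAD

# The other developer: second, third (the fix) and fourth commits.
git clone -q "$base/server.git" "$base/repose2"
cd "$base/repose2"
git config user.name "Other"
git config user.email "other@example.com"
echo "c" > C.txt
git add C.txt
git commit -q -m "Second commit: add C.txt"
echo "fixed" > A.txt
git commit -q -am "Third commit: fix A.txt"
fix=$(git rev-parse HEAD)
echo "d" > D.txt
git add D.txt
git commit -q -m "Fourth commit: add D.txt"
git push -q origin HEAD

# Back in your repository: fetch everything, apply only the fix.
cd "$base/repose1"
git fetch -q origin
git cherry-pick "$fix"
```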

 

 

 

Mastering git, Part 4, git merge, git diff with GUI tools

Before following this post, make sure you have already set up everything from my other post regarding the configuration of git environment.

Diff

Diff command will allow you to compare two different commits, let’s set up a repository:

Now in the second repository, we make some changes:

Now let’s go back to the first repository and see the changes:

Now you should see all the differences in meld:

To find diff between current and previous commit:

As of Git 1.8.5, @ is an alias for HEAD, so you can use:

If you want to know the diff between HEAD and any commit you can use:
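A sketch of these comparisons using plain git diff in a scratch repository (git difftool takes the same revision arguments and opens your configured tool instead). A.txt and its contents are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > A.txt
git add A.txt
git commit -q -m "First"
first=$(git rev-parse HEAD)
echo "v2" > A.txt
git commit -q -am "Second"

git diff HEAD~1 HEAD       # previous commit vs. current commit
git diff @~ @              # the same, using @ as an alias for HEAD
git diff "$first" HEAD     # any commit vs. HEAD

changed=$(git diff --name-only HEAD~1 HEAD)
```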

You can add -t to select your difftool, such as kdiff3, meld, etc.
To see the diff with gitk, first go to Edit > Preferences and set the external difftool.
Now click on your revision, then right-click on the revision that you want to compare to and click on Diff this -> selected. Now in the bottom window, right-click on any of the changed files and select External diff.

Merge and conflict

Now let’s merge all the changes. If your branch has no new commits of its own, the merge is a fast-forward: you only incorporate the new changes and everything is OK afterward. If both sides have new commits but they touch different files (or different parts of a file), Git merges them automatically. But if both sides changed the same lines of the same file, you have a merge conflict and you have to resolve it with a three-way merge.

To show you the point let’s make some changes in the first repository in the file A.txt.

now if you call git merge, you will get a merge conflict, to solve this call the following:

and you will get this in meld:

and this in kdiff:

+--------------------------------+
|  LOCAL   |   BASE   |  REMOTE  |
+--------------------------------+
|             MERGED             |
+--------------------------------+

LOCAL: Your side of the conflict, i.e., your branch (HEAD) that will contain the results of the merge.
REMOTE: The remote side of the conflict, i.e., the branch you are merging into HEAD.
BASE: The common ancestor of both LOCAL and REMOTE, useful for feeding into a three-way merge tool.
MERGED: The result of the automatic merge.

go through the conflict and choose the side (A, B or C) and see the merged file in the window below and save and close.

Use command line and interpret the output of git diff

diff is based on solving the longest common subsequence problem and tries to determine the smallest set of deletions and insertions that turns one file into the other. The output is called a “diff”, or a patch, since the output can be applied with the Unix program patch.

The POSIX standard specifies the behavior of the “diff” and “patch” utilities and their file formats. You can call the following to see the difference between two files or directories:

For instance in our case:

-u indicates the unified format.

Git also uses the unified format. If you call

You will get the following:

Let’s interpret the output:
1) The first line diff --git a/A.txt b/A.txt is a “git diff” header in the form diff --git a/file1 b/file2. The a/ and b/ filenames are the same unless a rename/copy is involved. The --git means that the diff is in the “git” diff format.
2) The last line in the extended diff header is:

It contains the shortened hashes of the preimage (the version of the file before the given change) and the postimage (the version of the file after the change), followed by the mode of the given file. 100644 means that it is an ordinary file and not, e.g., a symlink, and that it doesn’t have the executable permission bit.
3) Next is the two-line unified diff header:

The original file is preceded by “---” and the new file is preceded by “+++”.

4) Following this are one or more change hunks that contain the line differences in the file. Unified format hunks start with a line like:

5) Next comes the description of where files differ. The lines common to both files begin with a space character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:

  • ‘+’: A line was added here to the first file.
  • ‘-’: A line was removed here from the first file.
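To see each part of the output described above, here is a sketch that produces a small diff in a scratch repository. The file contents are invented, and the hashes on the index line will differ from one repository to another.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'one\ntwo\nthree\n' > A.txt
git add A.txt
git commit -q -m "Add A"
printf 'one\nTWO\nthree\n' > A.txt    # change the second line only

patch=$(git diff)
echo "$patch"
# Reading the output top to bottom:
#   diff --git a/A.txt b/A.txt    the git diff header
#   index <old>..<new> 100644     shortened blob hashes and the file mode
#   --- a/A.txt                   the original file
#   +++ b/A.txt                   the new file
#   @@ -1,3 +1,3 @@               hunk header: start line and length in each version
#   lines with a leading space are common to both versions,
#   "-" marks a removed line and "+" marks an added line
```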

 


Mastering git, Part 3 (Undo local changes, git revert, git reset, git amend and git clean)

Cleaning up git environment and rolling back local changes

Once you have started working on your local copy, you might mess things up and need to clean up or roll back changes for some files. You can use git checkout and git clean for this.

To roll back changes for a particular file:

To roll back changes for the entire local copy while keeping added files:

 

If you just want to remove the files and directories that have been added but are not tracked:

-d
Remove untracked directories in addition to untracked files.

--dry-run
Don’t actually remove anything, just show what would be done.

-i
Show what would be done and clean files interactively. See “Interactive mode” for details.
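A sketch of the rollback and cleanup commands in a scratch repository. The file and directory names are placeholders, and the -f flag is needed because git clean refuses to delete anything without it by default.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > A.txt
git add A.txt
git commit -q -m "Add A"

echo "oops" >> A.txt
git checkout -- A.txt          # roll back changes for one file

mkdir scratch
echo "tmp" > scratch/t.txt
echo "tmp" > untracked.txt
git clean -d --dry-run         # only show what would be removed
git clean -d -f -q             # really remove untracked files and directories
```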

This will create a new branch and switch to that:

 

Undo local changes

Git reset

How do you undo a commit in Git when no one has pulled your changes? The answer is git reset. It has three modes: hard, soft and mixed.

1) Mixed: This is the default action. Resets the index (see note 1 below) but not the working tree (see note 2 below), i.e., the changed files are preserved but not marked for commit, and reports what has not been updated.

2) Soft: Does not touch the index file or the working tree at all but resets the head to <commit>. This leaves all your changed files as “Changes to be committed”.

3) Hard: This option will reset the index and working tree. Any changes to tracked files in the working tree since <commit> are discarded.

1 The git “index” is where you place files you want to commit to the git repository. Some names you may have heard:

  • Index
  • Cache
  • Directory cache
  • Current directory cache
  • Staging area
  • Staged files

Files in the git index don’t get committed to the repository until you use the git commit command.

2 Working tree is what is actually in the files that you are currently working.

git ls-tree only works with git refs, e.g.

Commit could be in one of the following forms:
HEAD@{2}
HEAD~2
commit-SHA

Examples:

You can undo your reset by

This works because, after your first reset, the previous position of HEAD is one entry back in the reflog.

git log will be exactly the same as before, but git reflog will show you that you have reset the HEAD.
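The three modes and the reflog-based undo can be compared side by side. This is a sketch in a scratch repository with two commits, where the second one adds C.txt; file names and messages are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "a" > A.txt
git add A.txt
git commit -q -m "First"
echo "c" > C.txt
git add C.txt
git commit -q -m "Second: add C.txt"
second=$(git rev-parse HEAD)

git reset -q --soft HEAD~1                 # C.txt stays staged
soft_status=$(git status --porcelain)
git reset -q 'HEAD@{1}'                    # undo: back to the second commit

git reset -q HEAD~1                        # mixed: C.txt is kept but no longer staged
mixed_status=$(git status --porcelain)
git reset -q 'HEAD@{1}'

git reset -q --hard HEAD~1                 # hard: C.txt is removed from disk
if [ -e C.txt ]; then hard_kept=yes; else hard_kept=no; fi
git reset -q --hard 'HEAD@{1}'             # the reflog still lets you go back

git reflog                                 # every move of HEAD, newest first
```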

A Complete setup

Some important notes:

git add -A is equivalent to

Let’s continue our work:

 

The git reflog command shows a chronological record of where HEAD has pointed in your local repository. Its full output might look like this:

C.txt and D.txt will remain (soft)

C.txt and D.txt will be deleted.

To see which files have been added and status:

Finally, let’s push the changes by:

Git revert

How to undo a push in Git when someone might have pulled your changes? The answer is Git revert. If you’ve already pushed your changes and someone else pulled in those changes, you should not use git reset to undo changes, use git revert instead.
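A small sketch of the idea: git revert adds a new commit that undoes an earlier one, so history that others may have pulled is never rewritten. The file contents and messages are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "good" > A.txt
git add A.txt
git commit -q -m "Add A"
echo "bad" > A.txt
git commit -q -am "Break A"

git revert --no-edit HEAD      # new commit that undoes "Break A"

commit_count=$(git rev-list --count HEAD)
content=$(cat A.txt)
git log --oneline
```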

Examples:

Let’s follow our work:

amend

This will fix your most recent commit message
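For instance, a typo in the last commit message could be fixed like this (a sketch in a scratch repository with a made-up message):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > A.txt
git add A.txt
git commit -q -m "Add A.tx"

git commit -q --amend -m "Add A.txt"   # replaces the most recent commit

subject=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "$subject"
```

Since amending replaces the commit, it is best kept for commits that have not been pushed yet.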

Example


Mastering git, Part 2, Creating and managing repositories

Creating a bare repository

Creating a repository

Managing remotes

You need to know how to handle your remote repositories in order to be able to work on any Git project. Remote repositories are copies of your project which are hosted elsewhere on the Internet or the network. You can have several of them.

Adding remote

git remote add <remote-name> <url>

For example, git push origin master (that is, git push <remote-name> <branch>) explicitly says “push the local branch ‘master’ to the remote named ‘origin’”. This does not define a persistent relationship; it just executes a push this one time. Note that the remote branch is assumed to be named “master”.

Or you can use git push --set-upstream origin master (or the short version: git push -u origin master), which is the same thing as git push <remote-name> <branch>, except it also adds a persistent tracking relationship between your local branch “master” and the remote named “origin”. As before, it is assumed the remote branch is named “master”. Later on you can just use git push.
You can add more remotes:

Viewing remotes

Renaming and Removing Remotes

Changing a remote’s URL

git remote set-url <existing-remote-name> <new-URL-for-the-remote> changes an existing remote repository URL.
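The remote-management commands from this section, sketched end to end. Local paths stand in for real URLs, and the remote names backup and mirror are hypothetical.

```shell
base=$(mktemp -d)
git init -q --bare "$base/server.git"
git init -q "$base/work"
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)

git remote add origin "$base/server.git"
git push -q -u origin "$branch"    # push and remember origin/<branch> as upstream
git push -q                        # later pushes need no arguments

git remote add backup "$base/backup.git"       # a second remote
git remote -v                                  # view remotes with their URLs
git remote rename backup mirror                # rename a remote
git remote set-url mirror "$base/moved.git"    # change its URL
mirror_url=$(git remote get-url mirror)
git remote remove mirror                       # remove it again
remotes=$(git remote)
```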

Cloning

When you clone a repository, you don’t need to run
git push -u <remote-name> <branch-name>

because origin/master is already set up as master’s upstream branch. In other words, master is already set to track origin/master.

Mastering git, Part 1, setting up git environment, GUI, mergetool, difftool

Installing prerequisite

Configuring Git Environment

First configure your environment and set a proper merge/diff tool. The settings in Linux are under /home/<username>/.gitconfig. If you don’t know where this file is located on your OS, you can easily edit it by:

All Git settings can be set or viewed via the terminal, e.g.:

To view all settings

Current settings of mergetool:

Set the merge tool

All available options for setting a merge tools can be listed by:

Here we configure it for meld, kdiff3 and vim.

Kdiff3

Add the following to your .gitconfig file

For Windows users:

 

Meld

For Windows users:

 

Vim

Username, Email and editor

 

Adding Alias

[alias]

or add it via terminal:
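As an illustration, an alias can be defined with git config. The alias name lg and its expansion are my own example, not necessarily the alias used in this post, and the sketch sets it only in a scratch repository; adding --global would store it in your ~/.gitconfig instead.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

# Hypothetical alias: "git lg" expands to the longer log command.
git config alias.lg "log --oneline --graph --all --decorate"

git lg
alias_value=$(git config alias.lg)
```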

SSH key

and upload your public key
/home/<username>/.ssh/id_rsa.pub
into your GitHub account

Push into GitHub without password using ssh-key

You have to change your remote URL from the HTTP form to the SSH (git@github.com) form:

For example if your repository is:

Then you have to run the following:

How to show current branch in Bash prompt

Edit ~/.bashrc and find the following lines and uncomment them

and replace them with
