Version Control System and Git

Version control systems (VCS) are tools that track the history of modifications to files. A VCS is essential to modern software development. It supports tasks such as change tracking, context switching, codebase management, team collaboration and communication, and software quality control.

Git is the most widely used distributed version control system (dVCS). It enables efficient workflows for source code and other text documents. In a command-line environment, the git command is the entry point to all of git's functionality.

Git is a command-line interface (CLI) tool, and all of its features are available from the CLI. This course focuses on CLI-based usage of git.

Due to its popularity, numerous editor/IDE extensions and GUI-based applications are available for git. Once you have a solid understanding of git, you may use GUI tools and extensions for convenience, at your own risk.

Note

GUI is short for Graphical User Interface.

Warning

On Microsoft Windows, avoid Git Bash/Git for Windows! If possible, uninstall it. It causes a lot of issues. In this course, use the git installed under WSL instead.

Learn git

You can download the git cheatsheet or learn from the official git documentation.

Installation

You may refer to the development environment doc to learn how to install git on various operating systems.

Setup before the first use

Before you can use git to track changes, you must add your name and email to identify yourself:

git config --global user.name "Your Name"
git config --global user.email id@uwf.edu

Substitute Your Name with your real name and substitute id@uwf.edu with your email address. It is recommended to use your student email address in this course.
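To confirm what git will record, the values can be read back with git config --get. The sketch below is only an illustration: it assumes a scratch directory created with mktemp and sets the values at repository level (no --global) so that it does not change your real settings. On your own machine, keep using --global as shown above.

```shell
# Hypothetical scratch repository so the demo leaves your global settings alone.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Without --global, the values are stored for this repository only.
git config user.name "Your Name"
git config user.email id@uwf.edu

# Read the values back to confirm what git will record in commits.
configured_name="$(git config --get user.name)"
configured_email="$(git config --get user.email)"
echo "$configured_name <$configured_email>"
```

Running git config --global --list shows every global setting at once, which is another way to check your setup.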

Git Concepts

Working directory (a.k.a. working tree)

The directory that holds your project files as you see and edit them. A hidden .git directory is usually present at its top level to store the database used by git.

Staging area (a.k.a. git index)

A temporary holding area for the changes you plan to commit to the database. Run git add to add files to the staging area, and run git commit to store everything in the staging area in the repository.
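The effect of staging can be observed with git status --short. The following is a minimal sketch, assuming a scratch directory and a hypothetical file named notes.txt.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

echo "first draft" > notes.txt

# Before staging, git reports the file as untracked ("??").
before_add="$(git status --short)"
echo "$before_add"

git add notes.txt

# After staging, the file is marked as added ("A ") and is ready to commit.
after_add="$(git status --short)"
echo "$after_add"
```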

Repository

The database used by git to keep track of the history of modifications.

Commit

A commit is a snapshot of the project, as recorded in the staging area, stored in the database. Each commit has a commit message and a unique hash value (ID) bound to it.

Remote

A remote is a named reference to a remote repository (e.g. one hosted on GitHub.com) that you push changes to or pull changes from. When you clone a repository with git clone <URI>, the default remote is named origin and points to that URI. For a repository started from scratch, you must set the URI for a remote with git remote add <name> <URI> before you can run git push <remote> main.
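As an illustration, a remote can be added and inspected as follows. This is a sketch under assumptions: a bare repository in a scratch directory stands in for the hosted repository, and the paths server.git and project are hypothetical. With a real host, the URI would be an HTTPS or SSH address.

```shell
scratch="$(mktemp -d)"
# A bare repository in a scratch directory stands in for the hosted one.
git init --quiet --bare "$scratch/server.git"

git init --quiet "$scratch/project"
cd "$scratch/project"

# Register the stand-in repository under the conventional name "origin".
git remote add origin "$scratch/server.git"

# List the configured remotes with their URIs.
git remote -v
origin_uri="$(git remote get-url origin)"
```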

Branch

Git supports branching, which allows users to work on several lines of development simultaneously without interfering with each other. Branching is essential in many workflows. In the linear workflow used in this course, you will stay on the main branch. Students who are interested can read further on git branching.

.gitignore

A text file that lists patterns of files for git to ignore. It is used to keep temporary files or large binary files out of the database.
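A small example may help here. The patterns and file names below (*.o, build/, main.c) are hypothetical and chosen only for illustration; the sketch runs in a scratch directory.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Hypothetical patterns: compiled object files and a build directory.
printf '%s\n' '*.o' 'build/' > .gitignore

mkdir build
touch main.c main.o build/app

# Only .gitignore and main.c show up as untracked; the rest is ignored.
visible="$(git status --short)"
echo "$visible"

# git check-ignore -v reports which pattern matches each ignored path.
git check-ignore -v main.o build/app
```

The .gitignore file itself is normally committed so that everyone working on the repository shares the same patterns.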

Subcommands

All git features can be accessed using git subcommands. The common subcommands are listed below:

git clone <URI>

Clones a repository from an existing remotely hosted repository to a local directory.
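The sketch below shows a clone end to end. It assumes a local bare repository as a stand-in for the hosted one, and the directory name project is hypothetical. Because the stand-in is empty, git may warn that you cloned an empty repository.

```shell
scratch="$(mktemp -d)"
# Stand-in for a hosted repository; a real <URI> would be an HTTPS or SSH address.
git init --quiet --bare "$scratch/server.git"

cd "$scratch"
# The optional second argument names the local directory to create.
git clone --quiet "$scratch/server.git" project
cd project

# The clone already has a remote named origin pointing at the source.
cloned_from="$(git remote get-url origin)"
echo "$cloned_from"
```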

git init

Creates an empty git repository in the current directory. Use it for existing code that has not been tracked by git before.

git add

Adds files to the staging area. git add -A is the most useful form: it stages all changes in the working directory. git add . stages everything in the current directory recursively. Run git add <list-of-paths> to stage specific files or directories.
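The difference between the three forms can be seen in a scratch repository. The file and directory names in this sketch are hypothetical; git diff --cached --name-only is used here only to list what is currently staged.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
mkdir src docs
touch README.md src/main.py docs/guide.md

# Stage one specific file.
git add README.md

# From inside src, "git add ." stages only what lives under src.
cd src
git add .
cd ..
staged_so_far="$(git diff --cached --name-only)"

# "git add -A" stages every change in the whole working directory.
git add -A
staged_all="$(git diff --cached --name-only)"
echo "$staged_all"
```

After the first two commands docs/guide.md is still unstaged; git add -A picks it up.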

git status

Run git status to check the status of the current repository. It shows modified, deleted, and new untracked files that are not yet in the staging area, as well as changes that are already staged.

git commit

Run git commit -m "commit message" to store everything in the staging area to the database with a commit message.

git push

Run git push origin main to push local commits to the remote repository named origin with the remote branch being main.

git pull

Run git pull origin main to pull commits from the remote repository named origin with the remote branch being main. The remote commits will be merged into your current local branch.

git log

Run git log to see the history of the past commits.
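For a more compact view, git log --oneline prints one line per commit. The sketch below assumes a scratch repository with two hypothetical commits; the identity is set at repository level only so that the demo can commit.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Your Name"
git config user.email id@uwf.edu

echo "v1" > notes.txt
git add -A
git commit --quiet -m "Add notes"

echo "v2" >> notes.txt
git add -A
git commit --quiet -m "Extend notes"

# One line per commit: abbreviated hash and commit message, newest first.
git log --oneline

# Only the commit messages, which is handy in scripts.
subjects="$(git log --format=%s)"
```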

git config

Run git config to change configuration settings. The most common settings are the name and email address of the user, which you must provide before you can commit.

Linear workflow

Beginners, such as students learning VCS for the first time, can stick with the linear workflow without branching. You stay on a single main branch and only take advantage of the snapshot and history logging features of git. In this workflow you will:

  1. Start your repository by either:

    • cloning an existing repository: git clone <URI>

    • initializing a new repository: git init

  2. Follow the incremental development approach and develop the project iteratively in small units:

    1. Write the code for a small unit (function, class, etc.).

    2. Test it until it works.

    3. Take a snapshot and log it using a commit message.

      1. git add -A to add all changes to the staging area.

      2. git commit -m "commit message" to take a snapshot (commit) and add a commit message for the commit.

  3. Run git push origin main to synchronize your commits to the server.
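The steps above can be sketched end to end as follows. This is only an illustration under assumptions: a local bare repository stands in for the server, calc.py and its content are hypothetical, and git branch -M main is added so the branch is named main regardless of git's configured default.

```shell
scratch="$(mktemp -d)"
git init --quiet --bare "$scratch/server.git"   # stand-in for the hosted repository

# 1. Start the repository (git init here; git clone <URI> also works).
git init --quiet "$scratch/project"
cd "$scratch/project"
git config user.name "Your Name"
git config user.email id@uwf.edu
git remote add origin "$scratch/server.git"

# 2. Write a small unit and test it until it works.
printf '%s\n' 'def add(a, b):' '    return a + b' > calc.py

# 2.3 Take a snapshot and log it with a commit message.
git add -A
git commit --quiet -m "Add add() function"
git branch -M main    # make sure the branch is named main

# 3. Push to the server.
git push --quiet origin main

# After the push, the server's main points at the same commit as the local HEAD.
local_head="$(git rev-parse HEAD)"
server_head="$(git --git-dir="$scratch/server.git" rev-parse main)"
```

Steps 2 and 2.3 repeat for every unit; step 3 can wait until the work is ready to submit.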

Warning

Do not push too frequently, because the autograding quota is limited. Rely on local tests rather than autograding to test your code. Treat pushing like submitting your work.

Other Workflows (FYI)

Many tutorials are available for anyone who is interested.

Advanced topics

Synchronize to the remote server

  • The scenario: You have a local repository synchronized with the remote repository. Your instructor pushed an emergency change to the remote repository. You want to synchronize your local repository with the remote repository.

  • The solution: If you have no new commits locally, run git pull origin main to pull the changes from the remote repository into your local repository. If you do have new local commits, run git fetch origin to update origin/main, then run git rebase origin/main to replay your commits on top of the remote ones. You may need to resolve conflicts if there are any.
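The rebase case can be reproduced locally. The sketch below rests on several assumptions: a bare repository in a scratch directory plays the remote, two clones named instructor and student play the two parties, and make_clone is a helper written for this demo only. The two sides edit different files, so no conflict occurs.

```shell
scratch="$(mktemp -d)"
git init --quiet --bare "$scratch/server.git"
# Point the stand-in server's default branch at main.
git --git-dir="$scratch/server.git" symbolic-ref HEAD refs/heads/main

# Helper for this demo only: clone the server and set a local identity.
make_clone() {
  git clone --quiet "$scratch/server.git" "$scratch/$1" 2>/dev/null
  git -C "$scratch/$1" config user.name "$1"
  git -C "$scratch/$1" config user.email "$1@example.com"
}

# Shared starting point on main.
make_clone instructor
cd "$scratch/instructor"
echo "starter" > README.md
git add -A && git commit --quiet -m "Starter code"
git branch -M main && git push --quiet origin main

make_clone student

# The instructor pushes an emergency change...
echo "fix" >> README.md
git commit --quiet -am "Emergency fix"
git push --quiet origin main

# ...while the student already has a new local commit.
cd "$scratch/student"
echo "my work" > homework.txt
git add -A && git commit --quiet -m "Do homework"

# Update origin/main, then replay the local commit on top of it.
git fetch --quiet origin
git rebase --quiet origin/main

history="$(git log --format=%s)"
echo "$history"
```

The resulting history is linear, with the student's commit placed after the instructor's fix.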

Synchronization among multiple computers

  • The scenario: One user works on one branch from multiple computers.

  • The problem: When you are working on a branch on one computer, you may want to continue working on the same branch on another computer. You will need to synchronize the changes between the two computers.

  • The best practice:

    • Always pull first when you start on a new computer.

    • Do not forget to push before you leave that computer.

  • Another useful approach:

    • If you committed something before remembering to pull, run git fetch origin followed by git rebase origin/main to replay your local commits on top of the remote commits. The rebase may stop if there are conflicts, which you will need to resolve manually.

Filename case problem

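  • The scenario: A change that affects only the case of a filename (e.g. file.txt to File.txt) may not be tracked by git, typically on case-insensitive file systems such as the Windows and macOS defaults. In that case, git status and git add will not detect it.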

  • The solution: Use git mv to rename the file (e.g. git mv file.txt File.txt). The rename is then staged and shows up in git status.

  • The backup solution: If git mv does not work, back up the file, run git rm to remove it, and commit. Then copy the file back under the new name, run git add to add it, and commit again. This usually solves the problem.
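The git mv approach can be tried in a scratch repository. This is a minimal sketch with hypothetical file names; git ls-tree is used only to list what the latest commit tracks.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Your Name"
git config user.email id@uwf.edu

echo "hello" > file.txt
git add -A
git commit --quiet -m "Add file.txt"

# Rename through git so the case change is recorded in the staging area.
git mv file.txt File.txt
git commit --quiet -m "Rename file.txt to File.txt"

# List the files tracked in the latest commit.
tracked="$(git ls-tree --name-only HEAD)"
echo "$tracked"
```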

Restore files

  • The scenario: You accidentally deleted a file or modified a file and want to restore it to the last commit.

  • The solution:

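    • Use git restore <path> to restore the file in the working directory to its last committed state. If the unwanted change was already staged, git restore --staged --worktree <path> restores both the staging area and the working directory from the last commit.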
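Both accidents from the scenario can be simulated safely in a scratch repository. The sketch below assumes git 2.23 or newer (the first release with git restore) and a hypothetical file named report.txt.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Your Name"
git config user.email id@uwf.edu

echo "important" > report.txt
git add -A
git commit --quiet -m "Add report"

# Accident 1: the file is deleted from the working directory.
rm report.txt
git restore report.txt
after_delete="$(cat report.txt)"

# Accident 2: the file is overwritten with unwanted content.
echo "oops" > report.txt
git restore report.txt
after_edit="$(cat report.txt)"
echo "$after_edit"
```

Note that git restore discards the uncommitted changes to that file, so use it only when you are sure you do not need them.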