
Understanding Git: Core Concepts, Objects, and Branching Explained

This article provides a comprehensive theoretical overview of Git, covering version‑control system types, Git's snapshot model, object storage, references, branching, merging strategies, and rebasing, helping readers grasp how Git manages files and history.

Java Backend Technology

Oct 9, 2016


Version Control Systems

Git is the most popular distributed version‑control system. Version‑control systems record a series of file changes over time so you can revert to any previous version. They fall into three categories: local, centralized, and distributed.

Local version control stores each version on the local disk, often as patches. This replaces error‑prone manual copying of directories but does not support multi‑user collaboration.

Centralized version control adds a central server that stores the repository and controls access. It enables teamwork but creates a single point of failure.

Distributed version control (e.g., Git, Mercurial) gives every client a full copy of the repository rather than a checkout of the latest files. Each clone contains the entire history, so the failure of a single machine, including the server, does not lose data.

Git Basics

Git stores complete snapshots of files. Each commit creates a new snapshot, but unchanged files are stored only once and referenced by a pointer (the SHA‑1 hash). Git is optimized for text files; binary files are stored but compress less efficiently.

Git works with three areas: the working directory, the staging area (populated by git add), and the local repository. Only the staged content is recorded in a commit.

Files can be in three states: committed, modified, or staged. The typical workflow is:

Modify files in the working directory.

Run git add to stage snapshots.

Run git commit to store the snapshot permanently.
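The workflow above can be modelled with a toy in‑memory sketch. The class `ToyRepo` and its methods are hypothetical names chosen for illustration, not Git's implementation. The sketch only shows that a commit records what was staged, not what is currently in the working directory.

```python
class ToyRepo:
    """Toy model of Git's three areas (illustrative only, not Git's real code)."""

    def __init__(self):
        self.working = {}   # path -> content currently on disk
        self.index = {}     # path -> content staged for the next commit
        self.commits = []   # committed snapshots, oldest first

    def add(self, path):
        # Staging copies the file's current content into the index.
        self.index[path] = self.working[path]

    def commit(self):
        # A commit records a snapshot of the index, not of the working directory.
        self.commits.append(dict(self.index))

    def status(self, path):
        # Simplification: report a single state per file. Real Git can show a
        # file as both staged and modified at the same time.
        last = self.commits[-1] if self.commits else {}
        if self.working.get(path) != self.index.get(path):
            return "modified"
        if self.index.get(path) != last.get(path):
            return "staged"
        return "committed"


repo = ToyRepo()
repo.working["a.txt"] = "v1"
repo.add("a.txt")
repo.working["a.txt"] = "v2"      # edited again after staging
repo.commit()
print(repo.commits[-1]["a.txt"])  # v1: the staged version was committed
print(repo.status("a.txt"))       # modified: the later edit is not staged yet
```

Running `add` and `commit` once more would move the second edit through the staged state and into a new snapshot.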

Git Objects

Git is a content‑addressable file system that stores data as key‑value pairs. The key is the 40‑character hexadecimal SHA‑1 hash of the object's content prefixed with a short header (the object type and the content size); the value is the content itself.
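As a minimal sketch of how such a key is derived, the function below hashes a blob in the same way `git hash-object` does for file content: a header of the form `blob <size>` followed by a NUL byte, then the raw bytes. The function name `blob_id` is an illustrative choice, not a Git API.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(blob_id(b""))               # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the key depends only on the bytes, two files with identical content map to the same object, whatever their names or locations.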

There are three primary object types:

blob: stores file contents only; the file name is not part of the blob.

tree: represents a directory, listing the name and mode of each entry together with a reference to a blob or another tree object.

commit: points to a top‑level tree and includes author information, a timestamp, parent commit(s), and a message.
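The way these objects point at each other can be sketched by serializing a tree and a commit by hand. The helper names (`object_id`, `tree_body`, `commit_body`) and the author identity are assumptions made for this example. The tree sketch handles a flat list of files only and fixes the timezone to `+0000` for simplicity.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash an object the way Git does: '<type> <size>\\0' header plus body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, object_id) entries of a flat directory."""
    # Each entry is '<mode> <name>\0' followed by the 20 raw bytes of the ID.
    # Sorting by name is enough for a flat list of files; real Git applies an
    # extra ordering rule to subdirectories that this sketch ignores.
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(oid)
    return out


def commit_body(tree, parents, author, timestamp, message):
    """Serialize a commit: tree pointer, parent(s), identity, then message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {author} {timestamp} +0000")
    lines.append(f"committer {author} {timestamp} +0000")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


blob = object_id("blob", b"hello world\n")
tree = object_id("tree", tree_body([("100644", "hello.txt", blob)]))
commit = object_id(
    "commit",
    commit_body(tree, [], "A U Thor <author@example.com>", 0, "initial"),
)
print(blob, tree, commit, sep="\n")
```

The commit ID depends on the tree ID, which depends on the blob IDs, so changing any file changes every ID above it in the chain.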

When a file changes, a new blob is created with a new SHA‑1; unchanged files keep the previous blob pointer, which explains why many commits do not increase repository size linearly.
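A small sketch makes the storage effect concrete. The dictionary `store` is a toy stand‑in for the object database, and `put_blob` and `snapshot` are hypothetical helpers; real Git also compresses objects and packs them, which this example leaves out.

```python
import hashlib

store = {}  # object ID -> content: a toy stand-in for .git/objects


def put_blob(content: bytes) -> str:
    """Store content under its blob ID; identical content is stored once."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    oid = hashlib.sha1(header + content).hexdigest()
    store[oid] = content
    return oid


def snapshot(files):
    """Record a snapshot as a mapping of path -> blob ID."""
    return {path: put_blob(data) for path, data in files.items()}


v1 = snapshot({"README": b"docs\n", "main.py": b"print(1)\n"})
v2 = snapshot({"README": b"docs\n", "main.py": b"print(2)\n"})

print(v1["README"] == v2["README"])  # True: unchanged file, same blob
print(len(store))                    # 3: two snapshots of two files, three blobs
```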

Git References

References are human‑readable names for SHA‑1 hashes stored under .git/refs. The default branch name master (now often main) points to the latest commit on the main line.

The HEAD file records the current branch reference (e.g., ref: refs/heads/master). When you commit, Git creates a new commit object whose parent is the SHA‑1 that HEAD resolves to, then updates the current branch to point to the new commit.
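The lookup from HEAD to a commit can be sketched as two file reads. The function `resolve_head` is a hypothetical helper, and the demo builds a scratch directory that imitates the layout rather than using a real repository. The sketch assumes loose ref files only; refs that Git has moved into `.git/packed-refs` are not handled.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir):
    """Return (branch_ref, commit_sha) for the HEAD of a repository.

    Handles loose refs only; refs stored in packed-refs are ignored.
    """
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        # Detached HEAD: the file holds a commit SHA-1 directly.
        return None, head
    ref = head[len("ref: "):]
    ref_file = git_dir / ref
    # A branch with no commits yet has no ref file.
    sha = ref_file.read_text().strip() if ref_file.exists() else None
    return ref, sha


with tempfile.TemporaryDirectory() as tmp:  # scratch layout, not a real repo
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "refs" / "heads" / "master").write_text("1" * 40 + "\n")
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    print(resolve_head(git_dir))
```

In an actual repository, `git rev-parse HEAD` performs this resolution and also covers packed refs.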

Tags are also references, stored under .git/refs/tags. Unlike a branch, a tag does not move when you commit: it keeps pointing to the same commit, providing a friendly name for releases.

Git Branches

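A branch is simply a named reference to a commit. Creating a branch writes a 40‑character SHA‑1 plus a newline (41 bytes) into a new file under .git/refs/heads, which is why branch creation is fast regardless of project size. Conceptually, creating a branch named dev from master takes three steps: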

Create a file .git/refs/heads/dev.

Write the current master commit SHA‑1 (plus a newline) into it.

Done.
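The steps above can be sketched in a few lines. The function `create_branch` is a hypothetical helper and the commit ID is a placeholder; the demo writes into a scratch directory rather than a real repository. In practice `git branch dev` is the safer route, since it validates the name and the target commit.

```python
import tempfile
from pathlib import Path


def create_branch(git_dir, name, commit_sha):
    """Create a branch by writing a loose ref file, as in the steps above."""
    if len(commit_sha) != 40:
        raise ValueError("expected a full 40-character SHA-1")
    ref_file = Path(git_dir) / "refs" / "heads" / name
    if ref_file.exists():
        raise FileExistsError(f"branch {name!r} already exists")
    ref_file.parent.mkdir(parents=True, exist_ok=True)
    # 40 hex characters plus a newline; bytes avoid newline translation.
    ref_file.write_bytes((commit_sha + "\n").encode("ascii"))
    return ref_file


with tempfile.TemporaryDirectory() as tmp:  # scratch directory, not a real repo
    master_sha = "a" * 40                   # placeholder commit ID
    ref = create_branch(tmp, "dev", master_sha)
    print(ref.stat().st_size)               # 41
```

No file content is copied and no objects are written, so the cost does not depend on how large the project is.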

Switching branches updates HEAD to point to the chosen reference and restores the working directory to that commit’s snapshot.

Branch Merging

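A fast‑forward merge occurs when the branch being merged in is a direct descendant of the current branch, that is, the current branch has no commits the other branch lacks. Git simply moves the current branch pointer forward and creates no new commit.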

If branches have diverged, Git performs a three‑way merge, creating a new commit with two parents.
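The choice between the two cases comes down to an ancestry check on the commit graph. The sketch below uses a hypothetical graph of single‑letter commit names and illustrative helpers (`ancestors`, `merge_kind`); it classifies the merge only and does not combine any file content.

```python
def ancestors(parents, commit):
    """Return the set containing `commit` and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_kind(parents, current, other):
    """Classify what merging `other` into `current` would require."""
    if other in ancestors(parents, current):
        return "already up to date"
    if current in ancestors(parents, other):
        return "fast-forward"
    return "three-way"


# Hypothetical history: A <- B <- C on one line, and D branching off B.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

print(merge_kind(parents, "B", "D"))  # fast-forward
print(merge_kind(parents, "C", "D"))  # three-way

# A three-way merge records a commit with two parents.
parents["M"] = ["C", "D"]
print(merge_kind(parents, "M", "D"))  # already up to date
```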

Branch Rebasing

Rebasing rewrites a branch’s history onto another base, producing a linear commit sequence. The process replays each commit from the source branch onto the tip of the target branch. Because each replayed commit has a different parent, it is a new commit object with a new SHA‑1.

$ git checkout dev
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: added staged command

After rebasing, a fast‑forward merge can be performed.

$ git checkout master
$ git merge dev
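Why the rebased branch can be fast‑forwarded is easier to see in a toy model. In the sketch below a commit is reduced to a parent and a message, and `commit_id`, `build`, and `rebase` are hypothetical helpers. Real Git reapplies each commit's changes to the files, and it can stop on conflicts; the sketch assumes a linear branch and only tracks IDs.

```python
import hashlib


def commit_id(parent, message):
    """Toy commit ID derived from the parent ID and the message only."""
    return hashlib.sha1(f"{parent}\n{message}".encode("utf-8")).hexdigest()[:8]


def build(parent, messages, log):
    """Append one commit per message on top of `parent`; return the new tip."""
    for message in messages:
        cid = commit_id(parent, message)
        log[cid] = (parent, message)
        parent = cid
    return parent


def rebase(log, branch_tip, base, new_base):
    """Replay the commits between base and branch_tip on top of new_base."""
    messages = []
    c = branch_tip
    while c != base:  # assumes base is a linear ancestor of branch_tip
        parent, message = log[c]
        messages.append(message)
        c = parent
    return build(new_base, reversed(messages), log)


log = {}
root = build(None, ["initial"], log)
master = build(root, ["master work"], log)
dev = build(root, ["added staged command"], log)
new_dev = rebase(log, dev, root, master)

print(new_dev != dev)             # True: the replayed commit has a new ID
print(log[new_dev][0] == master)  # True: dev now sits directly on master
```

Since the tip of master is now the parent of the rebased commit, merging dev into master only needs to move the master pointer forward.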

Summary

Git stores complete file snapshots, not diffs.

Data is kept as key‑value pairs where the key is a SHA‑1 hash.

Each version of a file has a unique 40‑character SHA‑1.

The SHA‑1 acts as a pointer that distinguishes objects.

Every file version creates a blob object.

Unchanged files keep the previous blob pointer.

Git implements version control by maintaining a graph of objects: commits point to trees, and trees point to blobs and other trees.

The workflow moves files among the working directory, staging area, and repository.

Frequent branching facilitates team collaboration.

A branch is merely a reference to a commit.
