What Happens Inside Git? Uncover the Secrets of init, add, commit, and Branches
This guide demystifies Git’s inner workings by walking through repository initialization, the structure of the .git directory, how objects (blobs, trees, and commits) are created, and the roles of the index, HEAD, branches, and remote configuration, complete with commands and examples.
Understanding Git as a System
Git is a distributed version‑control system that stores data as snapshots of the entire project tree. Grasping its underlying concepts—objects, hashes, and the three‑area model (working directory, index, repository)—helps you use Git confidently and avoid common pitfalls.
Creating a Repository with git init
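Running mkdir git-demo && cd git-demo && git init creates a .git folder containing all the metadata needed for version control. To keep the listing short, this walkthrough then deletes the sample hook scripts that git init installs (rm -rf .git/hooks/*.sample); that step is optional and is not part of git init itself. In a second terminal, watch -n 1 -d find . refreshes a listing of the directory tree every second, so you can see files appear as each command runs.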
# terminal 1: create the repository and remove the sample hooks
$ mkdir git-demo
$ cd git-demo && git init
$ rm -rf .git/hooks/*.sample
# terminal 2: run inside git-demo to watch the file tree change
$ watch -n 1 -d find .
.git Directory Layout
The top‑level entries are:
HEAD – pointer to the current branch reference.
config – repository‑specific configuration (user name, email, etc.).
description – optional repository description.
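hooks – hook scripts; git init populates it with samples ending in .sample.
info – contains exclude, a repository-local list of ignore patterns that is not shared with other clones.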
objects – stores all Git objects (blobs, trees, commits, tags).
refs – contains heads (branches) and tags.
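To make the layout concrete, here is a minimal sketch that builds the core of this directory structure by hand in a temporary folder. The function name init_skeleton is hypothetical, and the sketch deliberately omits config, description, hooks, and info/exclude, so it illustrates the layout rather than replacing git init.

```python
import tempfile
from pathlib import Path


def init_skeleton(worktree: Path, branch: str = "master") -> Path:
    """Create a stripped-down .git layout by hand (illustration only)."""
    git_dir = worktree / ".git"
    # objects/ holds the data, refs/ holds the pointers to it.
    for sub in ("objects/info", "objects/pack", "refs/heads", "refs/tags"):
        (git_dir / sub).mkdir(parents=True, exist_ok=True)
    # HEAD names the branch that the next commit will advance.
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{branch}\n")
    return git_dir


with tempfile.TemporaryDirectory() as tmp:
    git_dir = init_skeleton(Path(tmp))
    for path in sorted(git_dir.rglob("*")):
        print(path.relative_to(git_dir).as_posix())
```

Note that refs/heads is empty at this point: the branch named in HEAD does not exist as a file until the first commit is made.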
➜ tree .git
.git
├── HEAD
├── config
├── description
├── hooks
├── info
│   └── exclude
├── objects
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags
Configuring User Identity
Set the local user name and email that will appear in commits:
git config user.name "demo"
git config user.email "[email protected]"
These values are written to the repository’s own .git/config and take precedence over the same keys in the global configuration file ~/.gitconfig.
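Because .git/config is an INI-style text file, simple cases can be read with Python's standard configparser. The sketch below parses a sample string rather than a real repository; the email address is a placeholder, and configparser is only an approximation of Git's own parser (it does not handle includes or repeated keys, for instance).

```python
import configparser

# Sample text in the shape of .git/config; the email is a placeholder.
SAMPLE_CONFIG = """\
[core]
\trepositoryformatversion = 0
\tbare = false
[user]
\tname = demo
\temail = demo@example.com
"""

# interpolation=None keeps '%' characters in values literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CONFIG)

print(parser["user"]["name"])
print(parser["core"].getboolean("bare"))
```

For anything beyond inspection, git config --get user.name remains the reliable way to read a value, since it applies the full local-over-global lookup.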
➜ cat .git/config
[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
	ignorecase = true
	precomposeunicode = true
[user]
	name = demo
	email = [email protected]
Understanding Objects: blobs, trees, and commits
When you add a file, Git creates a blob object that stores the file’s contents (but not its name). The object’s identifier is a SHA‑1 hash of the string blob <size>\0<content>.
# create a file
$ echo "hello git" > hello.txt
$ git add hello.txt
# inspect the object
$ git cat-file -t 8d0e41 # shows "blob"
$ git cat-file -p 8d0e41 # prints "hello git"
$ git cat-file -s 8d0e41 # size in bytes (10)
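# manual SHA-1 calculation (printf emits the NUL byte portably; echo's handling of \0 varies by shell)
$ printf 'blob 10\0hello git\n' | shasum
8d0e41234f24b6da002d962a26c2495ea16a425f  -
The .git/objects directory stores each object under a two-level path derived from its hash: the first two hex digits name a subdirectory and the remaining 38 name the file (e.g., .git/objects/8d/0e41234f24b6da002d962a26c2495ea16a425f). Objects are compressed with zlib; because of the header and the compression overhead, a very small object can occupy more bytes on disk than its content.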
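The same calculation can be sketched in Python. The helper names (encode_object, object_id, loose_object_path) are hypothetical; the sketch assumes the file content is exactly "hello git" followed by a newline, which is what echo writes.

```python
import hashlib
import zlib


def encode_object(obj_type: str, body: bytes) -> bytes:
    """Prefix the body with Git's '<type> <size>' header and a NUL byte."""
    return f"{obj_type} {len(body)}".encode() + b"\x00" + body


def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 of header + body, as a 40-character hex string."""
    return hashlib.sha1(encode_object(obj_type, body)).hexdigest()


def loose_object_path(oid: str) -> str:
    """First two hex digits name the directory, the rest name the file."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


content = b"hello git\n"  # what `echo "hello git"` writes to the file
oid = object_id("blob", content)
print(oid)
print(loose_object_path(oid))

# A loose object on disk is the zlib-compressed header + body.
stored = zlib.compress(encode_object("blob", content))
print(len(content), "bytes of content ->", len(stored), "bytes stored")
assert zlib.decompress(stored) == b"blob 10\x00hello git\n"
```

The file name never enters the hash, which is why two files with identical content share a single blob.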
# view raw compressed object (binary output)
$ cat .git/objects/8d/0e41234f24b6da002d962a26c2495ea16a425f
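xKOR04`HWH,6A%
Python can decompress an object:
import zlib
# read the loose object; the path is relative to the repository root
with open('.git/objects/8d/0e41234f24b6da002d962a26c2495ea16a425f', 'rb') as f:
    contents = f.read()
print(zlib.decompress(contents))  # b'blob 10\x00hello git\n'
The Index (Staging Area)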
The index file (.git/index) records which blobs belong to which paths. You can list staged files with:
$ git ls-files # list staged paths
$ git ls-files -s # show mode, SHA-1, and stage number
When you modify a file, git status compares the working-tree file against the index entry to report unstaged changes, and compares the index against HEAD to report staged ones.
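Each line printed by git ls-files -s has the form "mode, object id, stage number, tab, path". A minimal parsing sketch follows; the IndexEntry type and parse_ls_files_stage function are hypothetical names, and the sketch assumes plain paths that Git does not need to quote.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    mode: str   # e.g. 100644 for a regular, non-executable file
    oid: str    # SHA-1 of the staged blob
    stage: int  # 0 normally; 1 to 3 while a merge conflict is unresolved
    path: str


def parse_ls_files_stage(output: str) -> list[IndexEntry]:
    """Parse the text printed by `git ls-files -s`."""
    entries = []
    for line in output.splitlines():
        meta, path = line.split("\t", 1)
        mode, oid, stage = meta.split(" ")
        entries.append(IndexEntry(mode, oid, int(stage), path))
    return entries


sample = "100644 8d0e41234f24b6da002d962a26c2495ea16a425f 0\thello.txt\n"
for entry in parse_ls_files_stage(sample):
    print(entry.path, "->", entry.oid[:7], "stage", entry.stage)
```

The index file itself is binary and also caches file metadata such as size and timestamps, so parsing the command output is usually easier than reading .git/index directly.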
Creating Commits
A commit records a snapshot of the entire tree and references a parent commit (except for the initial commit). The commit object contains metadata (author, date, message) and a pointer to the root tree.
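Tree and commit objects use the same "type, size, NUL, body" envelope as blobs; only the body differs. The sketch below builds both bodies by hand. All function names are hypothetical, the identity and timestamp are made up, and the tree builder assumes a flat list of files (Git's real ordering rule treats directory names specially).

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: list[tuple[str, str, str]]) -> bytes:
    """Entries are (mode, name, hex object id); ids are stored as 20 raw bytes."""
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return body


def commit_body(tree: str, parents: list[str], ident: str, message: str) -> bytes:
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # empty for the first commit
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode()


blob_id = object_id("blob", b"hello git\n")
tree_id = object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))
# Hypothetical identity and timestamp; a real commit records your own.
ident = "demo <demo@example.com> 1700000000 +0000"
commit_id = object_id("commit", commit_body(tree_id, [], ident, "1st commit"))
print("tree  ", tree_id)
print("commit", commit_id)
```

Because the commit body embeds the author, the timestamp, and the tree id, changing any of them yields a different commit id, whereas the tree id depends only on file names, modes, and contents.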
# make a commit
$ git commit -m "1st commit"
# inspect commit and tree objects
$ git cat-file -t 6e4a700 # "commit"
$ git cat-file -p 6e4a700 # shows metadata and tree SHA‑1
$ git cat-file -t 64d6ef5 # "tree"
$ git cat-file -p 64d6ef5 # lists blobs and sub-trees
HEAD and Branches
HEAD is a symbolic reference that names the current branch (e.g., ref: refs/heads/master); that branch, in turn, records the tip commit. Branches are simply files under .git/refs/heads that each store a commit SHA-1.
# view current HEAD and branch tip
$ cat .git/HEAD
ref: refs/heads/master
$ cat .git/refs/heads/master
Creating a new branch writes a new ref file; switching branches updates HEAD to point at the new ref.
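The file-level effect of these two operations can be imitated in a throwaway directory. This is a sketch under simplifying assumptions: the function names are hypothetical, the commit id is a placeholder, and a real checkout also rewrites the index and the working tree (and Git may keep refs in a packed-refs file instead of one file per branch).

```python
import tempfile
from pathlib import Path
from typing import Optional


def current_ref(git_dir: Path) -> Optional[str]:
    """Return the ref named by HEAD, or None when HEAD holds a raw commit id."""
    head = (git_dir / "HEAD").read_text().strip()
    return head[len("ref: "):] if head.startswith("ref: ") else None


def create_branch(git_dir: Path, name: str, commit_id: str) -> None:
    """A new branch is one small file containing a commit id."""
    (git_dir / "refs" / "heads" / name).write_text(commit_id + "\n")


def switch_branch(git_dir: Path, name: str) -> None:
    """Switching rewrites HEAD; nothing under refs/ changes."""
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{name}\n")


FAKE_COMMIT = "1" * 40  # placeholder id, not a real commit

with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    create_branch(git_dir, "master", FAKE_COMMIT)
    switch_branch(git_dir, "master")
    print(current_ref(git_dir))  # refs/heads/master

    create_branch(git_dir, "dev", FAKE_COMMIT)
    switch_branch(git_dir, "dev")
    print(current_ref(git_dir))  # refs/heads/dev
```

Since a branch costs one 41-byte file, creating branches is cheap regardless of repository size.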
$ git branch dev # creates .git/refs/heads/dev
$ git checkout dev # HEAD now points to dev
$ cat .git/HEAD
ref: refs/heads/dev
Detached HEAD and Reflog
Checking out a specific commit puts Git into a detached-HEAD state, in which HEAD holds a commit SHA-1 directly instead of a branch reference. Git logs every move of HEAD in the reflog, and git reflog displays that log, so you can recover commits that became unreachable, for example after deleting a branch.
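The reflog for HEAD lives in .git/logs/HEAD as plain text, one line per move: old id, new id, identity, timestamp, timezone, then a tab and a message. A minimal parser sketch follows; ReflogEntry and parse_reflog_line are hypothetical names, and the sample line uses placeholder ids and a made-up identity.

```python
import re
from typing import NamedTuple


class ReflogEntry(NamedTuple):
    old: str        # where HEAD was before the move
    new: str        # where HEAD is after the move
    who: str
    timestamp: int  # seconds since the Unix epoch
    message: str


LINE = re.compile(
    r"^([0-9a-f]{40}) ([0-9a-f]{40}) (.*) (\d+) ([+-]\d{4})\t(.*)$"
)


def parse_reflog_line(line: str) -> ReflogEntry:
    match = LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not a reflog line: {line!r}")
    old, new, who, timestamp, _tz, message = match.groups()
    return ReflogEntry(old, new, who, int(timestamp), message)


# Placeholder ids; the all-zero old id marks the very first entry.
SAMPLE = (
    "0" * 40 + " " + "1" * 40
    + " demo <demo@example.com> 1700000000 +0000"
    + "\tcommit (initial): 1st commit"
)

entry = parse_reflog_line(SAMPLE)
print(entry.new[:7], entry.message)
```

Recovery then amounts to finding the new id of the entry you want and pointing a branch at it, for example with git branch followed by a name and that id.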
$ git checkout 6e4a700 # detached HEAD
$ git reflog # shows recent HEAD positions
Viewing Differences
Use git diff to compare the working tree with the index, git diff --cached to compare the index with the last commit, and git diff HEAD to compare the working tree with the last commit (staged and unstaged changes together).
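Which two versions each command compares can be illustrated with Python's difflib and three hypothetical versions of one file. This is only an analogy for the comparison pairs; Git's own diff machinery and output format differ in detail.

```python
import difflib


def unified(a: str, b: str, a_label: str, b_label: str) -> str:
    """Return a unified diff between two text versions."""
    return "".join(
        difflib.unified_diff(
            a.splitlines(keepends=True),
            b.splitlines(keepends=True),
            fromfile=a_label,
            tofile=b_label,
        )
    )


# Three hypothetical versions of hello.txt.
head = "hello git\n"                                    # last commit
index = "hello git\nstaged line\n"                      # after `git add`
worktree = "hello git\nstaged line\nunstaged line\n"    # edited again

print(unified(index, worktree, "index", "worktree"))  # like `git diff`
print(unified(head, index, "HEAD", "index"))          # like `git diff --cached`
print(unified(head, worktree, "HEAD", "worktree"))    # like `git diff HEAD`
```

The first diff shows only the unstaged line, the second only the staged line, and the third shows both.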
$ git diff # unstaged changes
$ git diff --cached # staged changes
$ git diff HEAD # all changes since last commit
Connecting to a Remote Repository
After initializing a local repository, add a remote URL (e.g., GitHub) with:
$ git remote add origin [email protected]:escapelife/git-demo.git
The remote configuration appears in .git/config under a [remote "origin"] section.
[remote "origin"]
	url = [email protected]:escapelife/git-demo.git
	fetch = +refs/heads/*:refs/remotes/origin/*
Pushing to a Remote
Push the current branch and set its upstream with:
$ git push -u origin master
Git then creates a remote-tracking ref under .git/refs/remotes/origin, with a matching log under .git/logs/refs/remotes/origin, to record the last known state of the remote branch.
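The location of the remote-tracking ref follows from the fetch refspec in the [remote "origin"] section, which maps branch names on the remote to names under refs/remotes/origin. A minimal sketch of that mapping, assuming a simple refspec with at most one wildcard per side (map_refspec is a hypothetical helper, not a Git API):

```python
from typing import Optional


def map_refspec(refspec: str, remote_ref: str) -> Optional[str]:
    """Map a ref on the remote to its local name, or None if it does not match."""
    spec = refspec.lstrip("+")  # a leading '+' permits non-fast-forward updates
    src, dst = spec.split(":", 1)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # The text matched by '*' on the source side replaces '*' on the destination side.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


FETCH = "+refs/heads/*:refs/remotes/origin/*"
print(map_refspec(FETCH, "refs/heads/master"))  # refs/remotes/origin/master
print(map_refspec(FETCH, "refs/tags/v1.0"))     # None
```

Tags fall outside this refspec, which is why nothing under refs/remotes/origin corresponds to them.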
➜ tree .git
├── logs
│   ├── HEAD
│   └── refs
│       ├── heads
│       │   ├── dev
│       │   ├── master
│       │   └── tmp
│       └── remotes
│           └── origin
│               └── master
└── refs
    ├── heads
    │   ├── dev
    │   ├── master
    │   └── tmp
    └── remotes
        └── origin
            └── master
Summary of File States
Git tracks files in three states: untracked (present in the working tree but absent from the index), modified (working tree differs from the index), and staged (index differs from HEAD). Commands like git add, git commit, git checkout, and git reset move files between these states.
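These states reduce to two comparisons per path, which the sketch below models with three dictionaries mapping paths to blob ids. The classify function and the short ids are hypothetical, and the sketch folds a file deleted from the working tree into "modified" for brevity.

```python
def classify(path: str, head: dict, index: dict, worktree: dict) -> set:
    """Return the states of one path, given path -> blob id maps."""
    states = set()
    if path in worktree and path not in index:
        states.add("untracked")
    if path in index and worktree.get(path) != index[path]:
        states.add("modified")  # working tree differs from the index
    if index.get(path) != head.get(path):
        states.add("staged")    # index differs from HEAD
    return states or {"unchanged"}


# Made-up blob ids standing in for real SHA-1 values.
head = {"hello.txt": "aaa", "readme.txt": "rrr"}
index = {"hello.txt": "bbb", "readme.txt": "rrr"}
worktree = {"hello.txt": "ccc", "readme.txt": "rrr", "notes.txt": "nnn"}

for path in sorted(worktree):
    print(path, sorted(classify(path, head, index, worktree)))
```

Here hello.txt is both staged and modified: it was added to the index and then edited again. In this model git add copies an entry from the working tree into the index, and git commit copies the index into HEAD.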
Key Takeaways
Git stores content as immutable objects identified by SHA‑1 hashes.
The index bridges the working directory and the repository.
Branches are lightweight pointers; HEAD determines the current context.
Reflog preserves history of HEAD moves, enabling recovery of dangling commits.
Remote configuration extends the same object model to other servers.
Liangxu Linux
