What Did Git’s First Commit Implement? A Deep Dive into Its 1,000‑Line Source
This article explores the original 1,000‑line Git source from its first commit, showing how to obtain, compile, and analyze the core commands and objects—init‑db, update‑cache, cat‑file, show‑diff, write‑tree, read‑tree, commit‑tree—while explaining Git’s design principles, storage layout, SHA‑1 naming, and the limitations of the initial implementation.
Introduction
Git is the most widely used modern version‑control system, created by Linus Torvalds in 2005. The very first usable version consisted of roughly 1,000 lines of C code and already implemented the essential concepts of a distributed VCS: a workspace, an index (staging area), and a repository, as well as the three fundamental object types (blob, tree, commit).
Obtaining the Source
The original source can be cloned from the official mirror:
# Get the Git source
git clone https://github.com/git/git.git
# Show the first commit (chronological order)
git log --date-order --reverse
# Checkout the first commit by its hash
git checkout e83c5163316f89bfbde7d9ab23ca2e25604af290
File Structure and Size
Running tree -h on the checked‑out commit lists the key source files (e.g., init-db.c, update-cache.c, cat-file.c, write-tree.c, etc.), which together add up to 1,089 lines of code.
Compiling the First Commit
The original Makefile does not link against every library the code needs on a modern system: zlib (compression) and libcrypto (SHA‑1) are missing. Adding -lz -lcrypto to LIBS resolves the issue. After the change, the project can be built on Linux with:
# Compile the source
make
Note: the binaries only run on Linux platforms.
Core Command Analysis
init‑db – Initialise a Repository
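Creates the initial directory structure (.dircache and its sub‑directories). In this early version .dircache holds the repository, that is, the object database and the index, much as .git does today; the workspace is the directory that contains it.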
# Initialise repository
init-db
Running tree .dircache after execution shows 256 sub‑directories (00‑ff) under .dircache/objects.
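The layout that init-db produces can be sketched in a few lines of Python. This is an illustration, not a port of init-db.c: the function name init_db and the use of a temporary directory are assumptions made for the example.

```python
import os
import tempfile

def init_db(root: str) -> str:
    """Create <root>/.dircache/objects/00 .. ff and return the objects path."""
    objects_dir = os.path.join(root, ".dircache", "objects")
    for i in range(256):
        # One sub-directory per possible first byte of an object name.
        os.makedirs(os.path.join(objects_dir, f"{i:02x}"), exist_ok=True)
    return objects_dir

# Demo in a throwaway directory so nothing is written to the current one.
with tempfile.TemporaryDirectory() as tmp:
    created = sorted(os.listdir(init_db(tmp)))
    print(len(created), created[0], created[-1])  # 256 00 ff
```

Creating all 256 directories up front means that later writes never have to check whether the target sub-directory exists.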
update‑cache – Add Files to the Index
Reads the workspace, writes file snapshots as blob objects, and records their metadata in .dircache/index. The workflow:
Parse .dircache/index.
Traverse modified files, compute SHA‑1, and store blob objects.
Write entries to the index.
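The three steps above can be modelled in memory as follows. This is a rough sketch under stated assumptions: the index is a plain Python list kept sorted by path, the object store is a dict, and the object name is simply the SHA‑1 of a small header plus the raw content. The names stage, index, and objects are hypothetical, and the result is not byte‑compatible with what update-cache writes.

```python
import bisect
import hashlib

def stage(index, objects, path, data):
    """Store data as a blob and record it in the name-sorted index."""
    # Step 2: name the snapshot by a SHA-1 and keep it in the object store.
    # (Simplification: hashes a "blob <size>" header, a NUL, and raw content.)
    payload = b"blob %d\x00" % len(data) + data
    sha1 = hashlib.sha1(payload).hexdigest()
    objects[sha1] = payload
    # Step 3: insert or replace the entry, keeping the index sorted by path.
    names = [entry[0] for entry in index]
    pos = bisect.bisect_left(names, path)
    entry = (path, len(data), sha1)
    if pos < len(index) and index[pos][0] == path:
        index[pos] = entry
    else:
        index.insert(pos, entry)
    return sha1

index, objects = [], {}   # Step 1 stand-in: an empty, already-parsed index
stage(index, objects, "README.md", b"hello git\n")
stage(index, objects, "Makefile", b"all:\n")
stage(index, objects, "README.md", b"hello world!\n")
print([entry[0] for entry in index])  # ['Makefile', 'README.md']
```

Restaging README.md replaces its index entry but leaves the older blob in the object store, which is why the store ends up with three objects for two paths.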
# Add a new file
echo "hello git" > README.md
# Stage the file
update-cache README.md
# Inspect the index (hex dump)
hexdump -C .dircache/index
cat‑file – Inspect Objects
Given a SHA‑1, locates the corresponding object file under .dircache/objects/, decompresses it, and writes the content to a temporary file for viewing.
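The lookup and decompression step can be sketched like this. It assumes that the decompressed object starts with a "<type> <size>" header terminated by a NUL byte, which is my reading of the early format rather than something this article spells out; read_object is a hypothetical helper, and the object name in the demo is a placeholder, not a real hash.

```python
import os
import tempfile
import zlib

def read_object(objects_dir, hex_sha1):
    """Return (type, content) for the object named hex_sha1."""
    # First two hex characters pick the sub-directory, the rest the filename.
    path = os.path.join(objects_dir, hex_sha1[:2], hex_sha1[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content

# Demo with a hand-made object file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    name = "ab" + "0" * 38          # placeholder name, not a real SHA-1
    os.makedirs(os.path.join(tmp, "ab"))
    with open(os.path.join(tmp, "ab", name[2:]), "wb") as f:
        f.write(zlib.compress(b"blob 10\x00hello git\n"))
    result = read_object(tmp, name)
    print(result)  # ('blob', b'hello git\n')
```

Unlike cat-file, the sketch returns the content instead of writing it to a temporary file.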
# View a blob object
cat-file 82f8604c3652fa5762899b5ff73eb37bef2da795
# The temporary file contains "hello git"
cat ./temp_git_file_tBTXFM
show‑diff – Show Workspace vs. Index Differences
Compares the current workspace files with the index entries and reports either ok or a unified diff.
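The "ok or diff" behaviour can be imitated with Python's difflib. This is only a sketch: it compares two strings directly, whereas the real command works from the index entries and the files on disk. The function name show_diff and the "(index)" and "(workspace)" labels are made up for the example.

```python
import difflib

def show_diff(name, staged_text, working_text):
    """Return "<name>: ok" if unchanged, else a unified diff as a string."""
    if staged_text == working_text:
        return f"{name}: ok"
    diff = difflib.unified_diff(
        staged_text.splitlines(keepends=True),
        working_text.splitlines(keepends=True),
        fromfile=f"{name} (index)",
        tofile=f"{name} (workspace)",
    )
    return "".join(diff)

print(show_diff("README.md", "hello git\n", "hello git\n"))
print(show_diff("README.md", "hello git\n", "hello world!\n"))
```

The first call prints README.md: ok; the second prints a unified diff that removes "hello git" and adds "hello world!".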
# No changes
show-diff
# Modify README.md
echo "hello world!" > README.md
# Show diff
show-diff
write‑tree – Create a Tree Object
Aggregates all staged blob objects into a single tree object and stores it.
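As a rough illustration of that aggregation, the sketch below serialises a list of staged entries into one tree payload. The exact byte layout (octal mode, space, name, NUL, then 20 raw SHA‑1 bytes per entry, all behind a "tree <size>" header) is my understanding of the early format and should be treated as an assumption; build_tree and the stand-in blob name are hypothetical.

```python
import hashlib

def build_tree(entries):
    """Serialise (mode, name, hex_sha1) tuples into a tree object payload."""
    body = b""
    for mode, name, hex_sha1 in sorted(entries, key=lambda e: e[1]):
        # Octal mode, space, name, NUL, then the 20 raw bytes of the SHA-1.
        body += b"%o %s\x00" % (mode, name.encode()) + bytes.fromhex(hex_sha1)
    return b"tree %d\x00" % len(body) + body

blob_sha1 = hashlib.sha1(b"placeholder").hexdigest()  # stand-in blob name
tree = build_tree([(0o100644, "README.md", blob_sha1)])
print(len(tree), tree[:8])
```

Storing the SHA‑1 as 20 raw bytes rather than 40 hex characters keeps each entry compact, at the cost of making the object unreadable as plain text.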
# Write tree
write-tree
# Output: c771b3ab2fe3b7e43099290d3e99a3e8c414ec72
read‑tree – Read a Tree Object
Displays the entries (mode, filename, SHA‑1) contained in a given tree.
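Listing the entries amounts to walking the tree body record by record. The parser below is a sketch that assumes each record is "<octal mode> <name>", a NUL byte, and 20 raw SHA‑1 bytes, with the object header already stripped; that layout is an assumption on my part, and parse_tree and the sample bytes are invented for the example.

```python
def parse_tree(body):
    """Yield (mode, name, hex_sha1) from a tree body (header already removed)."""
    pos = 0
    while pos < len(body):
        nul = body.index(b"\x00", pos)
        mode, name = body[pos:nul].split(b" ", 1)
        sha1 = body[nul + 1:nul + 21]
        if len(sha1) != 20:
            raise ValueError("truncated tree entry")
        yield int(mode, 8), name.decode(), sha1.hex()
        pos = nul + 21

# Sample entry with a dummy SHA-1 made of the bytes 0x00 .. 0x13.
sample = b"100644 README.md\x00" + bytes(range(20))
for mode, name, sha1 in parse_tree(sample):
    print(f"{mode:o} {name} ({sha1})")
```

For the sample entry this prints the mode 100644, the name README.md, and the dummy SHA‑1 in hex.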
# Read the previously written tree
read-tree c771b3ab2fe3b7e43099290d3e99a3e8c414ec72
commit‑tree – Create a Commit Object
Combines a tree SHA‑1 with author/committer metadata and an optional parent list to produce a commit object.
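The shape of the resulting object can be sketched as plain text assembly. This is a simplified illustration: the identity below is hypothetical, dates are left out, and build_commit is not a function from the Git source.

```python
def build_commit(tree, parents, author, committer, message):
    """Assemble the text of a commit object from its parts."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # zero parents for a root commit
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    # A blank line separates the metadata from the commit message.
    return "\n".join(lines) + "\n\n" + message

text = build_commit(
    tree="c771b3ab2fe3b7e43099290d3e99a3e8c414ec72",
    parents=[],
    author="A. Hacker <hacker@example.com>",      # hypothetical identity
    committer="A. Hacker <hacker@example.com>",
    message="first commit\n",
)
print(text)
```

Because the parent list is just a sequence of lines, the same structure covers a root commit (no parents), an ordinary commit (one), and a merge (several).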
# Prepare a changelog
echo "first commit" > changelog
# Create commit
commit-tree c771b3ab2fe3b7e43099290d3e99a3e8c414ec72 < changelog
# Resulting commit SHA‑1: 7ea820bd363e24f5daa5de8028d77d88260503d9
Object Storage Details
Objects are stored under .dircache/objects/ using a two‑level directory scheme: the first byte of the SHA‑1 (its first two hex characters) selects the sub‑directory (00‑ff), and the remaining 38 hex characters form the filename. Each object file is a zlib‑compressed stream consisting of a header, <type> <size>, followed by a NUL byte and then the content. The three object types are:
blob – raw file snapshot.
tree – list of entries (mode, name, SHA‑1) for a directory.
commit – references a tree, optional parents, author/committer info, and a commit message.
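A write path for this scheme might look like the sketch below. Two points are assumptions rather than facts taken from this article: the header is terminated by a NUL byte, and the name is the SHA‑1 of the compressed bytes, which is how I read the first‑commit source (current Git hashes the uncompressed data instead). The helper write_object is hypothetical, and its output will not necessarily match names produced by the original binaries, since compressed output depends on zlib settings.

```python
import hashlib
import os
import tempfile
import zlib

def write_object(objects_dir, obj_type, content):
    """Compress an object, name it by SHA-1, and store it; return the name."""
    payload = b"%s %d\x00" % (obj_type.encode(), len(content)) + content
    compressed = zlib.compress(payload)
    # Assumption: hash the compressed bytes, as the first commit appears to do.
    hex_sha1 = hashlib.sha1(compressed).hexdigest()
    subdir = os.path.join(objects_dir, hex_sha1[:2])   # first byte -> 00..ff
    os.makedirs(subdir, exist_ok=True)
    with open(os.path.join(subdir, hex_sha1[2:]), "wb") as f:  # 38 hex chars
        f.write(compressed)
    return hex_sha1

# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    name = write_object(tmp, "blob", b"hello git\n")
    exists = os.path.exists(os.path.join(tmp, name[:2], name[2:]))
    print(len(name[:2]), len(name[2:]), exists)  # 2 38 True
```

Splitting the name across a directory and a filename keeps any single directory from accumulating every object in the repository.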
Index File Format
The index (.dircache/index) is a binary file composed of a 32‑byte header followed by a series of entry records (minimum 63 bytes each). The header ends with a 20‑byte SHA‑1 checksum computed over the rest of the file, that is, the header fields that precede the checksum plus all entries; the checksum field itself is necessarily excluded. Each entry stores path, mode, timestamps, and the SHA‑1 of the corresponding blob.
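The header check can be sketched with Python's struct module. Several details here are assumptions for illustration: three 32‑bit fields (signature, version, entry count) ahead of the 20‑byte SHA‑1, little‑endian byte order, the magic value 0x44495243, and a checksum that covers those first 12 bytes plus the entries. The helpers pack_index and verify_index are hypothetical, and the entry data is a dummy block of zero bytes.

```python
import hashlib
import struct

HEADER = struct.Struct("<III20s")   # signature, version, entry count, SHA-1
SIGNATURE = 0x44495243              # assumed magic number, spells "DIRC"

def pack_index(entries_blob, count):
    """Build header + entries, with the checksum the header should carry."""
    prefix = struct.pack("<III", SIGNATURE, 1, count)
    digest = hashlib.sha1(prefix + entries_blob).digest()
    return prefix + digest + entries_blob

def verify_index(data):
    """Check signature and SHA-1; return the entry count."""
    signature, version, count, digest = HEADER.unpack_from(data)
    if signature != SIGNATURE:
        raise ValueError("bad signature")
    # Hash everything except the 20-byte checksum field itself.
    expected = hashlib.sha1(data[:12] + data[HEADER.size:]).digest()
    if digest != expected:
        raise ValueError("checksum mismatch")
    return count

index_bytes = pack_index(b"\x00" * 64, count=1)   # one dummy 64-byte entry
print(HEADER.size, verify_index(index_bytes))     # 32 1
```

A checksum in the header lets a reader reject a truncated or corrupted index before trusting any entry in it.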
Design Principles
Git was built as a decentralized system: every developer’s workspace contains a full repository, enabling offline work and easy collaboration. The core design follows the Unix philosophy—“Write programs that do one thing and do it well”—and relies on simple, composable primitives (objects, commands, and the three areas: workspace, index, repository).
Limitations of the First Commit
While the initial implementation provides the essential plumbing, many higher‑level features (branches, remote handling, reflogs, hooks, etc.) were added later. The early code also shows typical quality issues: incomplete error handling, memory leaks (e.g., in write-tree.c), and sub‑optimal data structures (arrays instead of linked lists for index entries).
Conclusion
The first Git commit, though modest in size, encapsulates the fundamental concepts that make Git powerful today. Understanding its source offers insight into the elegant design of a distributed VCS and highlights areas where the project has evolved and improved over more than a decade of iterative development.
