
What Did Git’s First Commit Implement? A Deep Dive into Its 1,000‑Line Source

This article explores the original 1,000‑line Git source from its first commit, showing how to obtain, compile, and analyze the core commands and objects—init‑db, update‑cache, cat‑file, show‑diff, write‑tree, read‑tree, commit‑tree—while explaining Git’s design principles, storage layout, SHA‑1 naming, and the limitations of the initial implementation.

ITPUB

Introduction

Git is the most widely used modern version‑control system, created by Linus Torvalds in 2005. The very first usable version consisted of roughly 1,000 lines of C code and already implemented the essential concepts of a distributed VCS: a workspace, an index (staging area), and a repository, as well as the three fundamental object types (blob, tree, commit).

Obtaining the Source

The original source can be cloned from the official mirror:

# Get the Git source
git clone https://github.com/git/git.git
# Show the first commit (chronological order)
git log --date-order --reverse
# Checkout the first commit by its hash
git checkout e83c5163316f89bfbde7d9ab23ca2e25604af290

File Structure and Size

Running tree -h on the checked‑out commit reveals the key source files (e.g., init-db.c, update-cache.c, cat-file.c, write-tree.c, etc.) and a total of 1,089 lines of code.

Compiling the First Commit

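The original Makefile fails at the link step on a modern toolchain because LIBS does not name every library the code calls into: zlib for compression and OpenSSL's libcrypto for SHA‑1. Adding -lz -lcrypto to LIBS resolves the link errors. After the change, the project can be built on Linux with: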

# Compile the source
make

Note: the resulting binaries run only on Linux.

Core Command Analysis

init‑db – Initialise a Repository

Creates the initial directory structure (.dircache and its sub‑directories). In this early version .dircache plays the role that .git plays today: it holds the repository data, while the workspace is the surrounding directory.

# Initialise repository
init-db

Running tree .dircache after execution shows 256 sub‑directories (00‑ff) under .dircache/objects.
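
The layout that init-db produces can be reproduced in a few lines. The following is a minimal Python sketch, not a port of init-db.c; the function name init_db and the use of a temporary directory are illustrative choices.

```python
import os
import tempfile


def init_db(root):
    """Create <root>/.dircache/objects/00 .. ff, mirroring the layout described above."""
    objects_dir = os.path.join(root, ".dircache", "objects")
    os.makedirs(objects_dir, exist_ok=True)
    for i in range(256):
        # One sub-directory per possible first byte of a SHA-1, named in lowercase hex.
        os.makedirs(os.path.join(objects_dir, f"{i:02x}"), exist_ok=True)
    return objects_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        names = sorted(os.listdir(init_db(tmp)))
        print(len(names), names[0], names[-1])  # prints: 256 00 ff
```

Creating all 256 directories up front means later writes never need to check whether the target directory exists.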

update‑cache – Add Files to the Index

Reads the workspace, writes file snapshots as blob objects, and records their metadata in .dircache/index. The workflow:

Parse .dircache/index.

Traverse modified files, compute SHA‑1, and store blob objects.

Write entries to the index.

# Add a new file
echo "hello git" > README.md
# Stage the file
update-cache README.md
# Inspect the index (hex dump)
hexdump -C .dircache/index
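
To make the "compute SHA‑1 and store blob objects" step concrete, here is a hedged Python sketch of how a blob could be assembled. The helper name make_blob is hypothetical. One detail is an assumption worth flagging: the first commit appears to hash the compressed bytes, whereas modern Git hashes the uncompressed header plus content, so the names produced here depend on the zlib output and will not necessarily match the hashes quoted in this article.

```python
import hashlib
import zlib


def make_blob(content: bytes):
    """Return (sha1_hex, compressed_bytes) for a blob holding `content`."""
    # Header is "<type> <size>" followed by a NUL byte, then the raw file content.
    header = b"blob %d\x00" % len(content)
    compressed = zlib.compress(header + content)
    # Assumption: the object name is the SHA-1 of the *compressed* bytes.
    sha1_hex = hashlib.sha1(compressed).hexdigest()
    return sha1_hex, compressed


if __name__ == "__main__":
    sha1_hex, compressed = make_blob(b"hello git\n")
    # The object would be written to .dircache/objects/<first 2 hex>/<remaining 38 hex>.
    print(sha1_hex[:2], sha1_hex[2:], len(compressed))
```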

cat‑file – Inspect Objects

Given a SHA‑1, locates the corresponding object file under .dircache/objects/, decompresses it, and writes the content to a temporary file for viewing.

# View a blob object
cat-file 82f8604c3652fa5762899b5ff73eb37bef2da795
# The temporary file contains the blob's content ("hello git")
cat ./temp_git_file_tBTXFM
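
The lookup that cat-file performs can be sketched as follows. This is an illustrative Python version under the assumption that an object file decompresses to a "<type> <size>" header, a NUL byte, and then the content; it returns the content instead of writing a temporary file, and the object name used in the demo is made up.

```python
import os
import tempfile
import zlib


def cat_file(objects_dir, sha1_hex):
    """Return (type, content) of the object stored under the given name."""
    # The first two hex characters pick the sub-directory, the rest is the filename.
    path = os.path.join(objects_dir, sha1_hex[:2], sha1_hex[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode(), content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as objects_dir:
        name = "ab" + "0" * 38  # hypothetical object name, not a real hash
        os.makedirs(os.path.join(objects_dir, name[:2]))
        with open(os.path.join(objects_dir, name[:2], name[2:]), "wb") as f:
            f.write(zlib.compress(b"blob 10\x00hello git\n"))
        print(cat_file(objects_dir, name))
```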

show‑diff – Show Workspace vs. Index Differences

Compares the current workspace files with the index entries and reports either ok or a unified diff.

# No changes
show-diff
# Modify README.md
echo "hello world!" > README.md
# Show diff
show-diff
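
The "ok or unified diff" behaviour can be illustrated with a short Python sketch. It compares the staged and working contents directly and uses Python's difflib to produce the diff; both choices are assumptions made for illustration, not a description of how show-diff.c detects changes.

```python
import difflib


def show_diff(name: str, staged: str, working: str) -> str:
    """Return '<name>: ok' when nothing changed, otherwise a unified diff."""
    if staged == working:
        return f"{name}: ok"
    diff = difflib.unified_diff(
        staged.splitlines(keepends=True),
        working.splitlines(keepends=True),
        fromfile=f"{name} (index)",
        tofile=f"{name} (workspace)",
    )
    return "".join(diff)


if __name__ == "__main__":
    print(show_diff("README.md", "hello git\n", "hello git\n"))
    print(show_diff("README.md", "hello git\n", "hello world!\n"))
```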

write‑tree – Create a Tree Object

Aggregates all staged blob objects into a single tree object and stores it.

# Write tree
write-tree
# Output: c771b3ab2fe3b7e43099290d3e99a3e8c414ec72
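
A tree object is essentially a serialized list of index entries. The Python sketch below assumes each entry is written as the octal mode, a space, the file name, a NUL byte, and the 20 raw SHA‑1 bytes, and that entries are ordered by name. The function name build_tree is hypothetical, and compression and hashing are left out to keep the focus on the layout.

```python
def build_tree(entries):
    """Serialize (mode, name, sha1_hex) tuples into an uncompressed tree object."""
    body = b""
    for mode, name, sha1_hex in sorted(entries, key=lambda e: e[1]):
        # "<octal mode> <name>", a NUL byte, then the SHA-1 as 20 raw bytes (not hex).
        body += b"%o %s\x00" % (mode, name.encode()) + bytes.fromhex(sha1_hex)
    # Same "<type> <size>" + NUL header as every other object type.
    return b"tree %d\x00" % len(body) + body


if __name__ == "__main__":
    blob_sha = "82f8604c3652fa5762899b5ff73eb37bef2da795"
    tree = build_tree([(0o100644, "README.md", blob_sha)])
    print(len(tree), tree[:8])
```

Storing the hash as raw bytes instead of hex saves 20 bytes per entry.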

read‑tree – Read a Tree Object

Displays the entries (mode, filename, SHA‑1) contained in a given tree.

# Read the previously written tree
read-tree c771b3ab2fe3b7e43099290d3e99a3e8c414ec72
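
Reading a tree is the reverse of writing one. The Python sketch below assumes the object has already been decompressed and that each entry has the form "<octal mode> <name>", a NUL byte, then 20 raw SHA‑1 bytes; parse_tree is a hypothetical name.

```python
def parse_tree(raw: bytes):
    """Yield (mode, name, sha1_hex) for each entry of an uncompressed tree object."""
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if obj_type != b"tree" or int(size) != len(body):
        raise ValueError("not a well-formed tree object")
    pos = 0
    while pos < len(body):
        nul = body.index(b"\x00", pos)           # end of "<mode> <name>"
        mode, name = body[pos:nul].split(b" ", 1)
        sha1 = body[nul + 1:nul + 21]            # 20 raw bytes follow the NUL
        yield int(mode, 8), name.decode(), sha1.hex()
        pos = nul + 21


if __name__ == "__main__":
    sha = "82f8604c3652fa5762899b5ff73eb37bef2da795"
    body = b"100644 README.md\x00" + bytes.fromhex(sha)
    raw = b"tree %d\x00" % len(body) + body
    for mode, name, sha1_hex in parse_tree(raw):
        print(f"{mode:o} {name} ({sha1_hex})")
```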

commit‑tree – Create a Commit Object

Combines a tree SHA‑1 with author/committer metadata, an optional list of parent commits, and a message read from standard input to produce a commit object.

# Prepare a changelog
echo "first commit" > changelog
# Create commit
commit-tree c771b3ab2fe3b7e43099290d3e99a3e8c414ec72 < changelog
# Resulting commit SHA‑1: 7ea820bd363e24f5daa5de8028d77d88260503d9
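
The pieces that go into a commit object can be laid out in a short Python sketch. The exact field layout (a tree line, zero or more parent lines, author and committer lines, a blank line, then the message) is an assumption based on the description above, and the identity and date strings in the demo are placeholders.

```python
def build_commit(tree, parents, author, committer, date, message):
    """Assemble an uncompressed commit object from its parts."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # empty for a root commit
    lines.append(f"author {author} {date}")
    lines.append(f"committer {committer} {date}")
    # A blank line separates the metadata from the free-form message.
    body = ("\n".join(lines) + "\n\n" + message).encode()
    return b"commit %d\x00" % len(body) + body


if __name__ == "__main__":
    raw = build_commit(
        tree="c771b3ab2fe3b7e43099290d3e99a3e8c414ec72",
        parents=[],
        author="A U Thor <author@example.com>",        # placeholder identity
        committer="A U Thor <author@example.com>",     # placeholder identity
        date="Thu Apr 7 15:13:13 2005",                # placeholder date
        message="first commit\n",
    )
    print(raw.partition(b"\x00")[2].decode())
```

Because a commit names its tree and its parents by hash, the commit's own hash covers the whole history reachable from it.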

Object Storage Details

Objects are stored under .dircache/objects/ using a two‑level directory scheme: the first byte of the SHA‑1 (its first two hex characters) names the sub‑directory (00‑ff), and the remaining 38 hex characters form the filename. Each object file is a zlib‑compressed stream holding a short header, <type> <size> terminated by a NUL byte, followed by the content. The three object types are:

blob – raw file snapshot.

tree – list of entries (mode, name, SHA‑1) for a directory.

commit – references a tree, optional parents, author/committer info, and a commit message.
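
The mapping from an object name to its location on disk is simple string slicing. A minimal Python sketch, using a hypothetical helper named split_object_name and the blob hash quoted earlier in this article:

```python
def split_object_name(sha1_hex):
    """Split a 40-character hex SHA-1 into (sub-directory, filename)."""
    if len(sha1_hex) != 40 or any(c not in "0123456789abcdef" for c in sha1_hex):
        raise ValueError("expected 40 lowercase hex characters")
    # One byte is two hex characters, so 40 - 2 = 38 characters remain for the filename.
    return sha1_hex[:2], sha1_hex[2:]


if __name__ == "__main__":
    directory, filename = split_object_name("82f8604c3652fa5762899b5ff73eb37bef2da795")
    print(directory, len(filename))  # prints: 82 38
```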

Index File Format

The index (.dircache/index) is a binary file composed of a 32‑byte header followed by a series of entry records (minimum 63 bytes each). The header holds a signature, a version number, the entry count, and a 20‑byte SHA‑1; that SHA‑1 is computed over the rest of the header plus all entries, so it cannot include itself. Each entry stores path, mode, timestamps, and the SHA‑1 of the corresponding blob.
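
The header layout and its checksum can be sketched in Python with the struct module. Several details here are assumptions rather than facts taken from this article: the header is treated as three 32‑bit integers (signature, version, entry count) followed by the 20‑byte SHA‑1, integers are read as little‑endian (native order on x86), the checksum is taken over the 12 bytes before the SHA‑1 field plus all entry bytes, and the SIGNATURE and VERSION constants are illustrative. Entry records are treated as an opaque byte string.

```python
import hashlib
import struct

HEADER = struct.Struct("<III20s")   # signature, version, entry count, SHA-1 (32 bytes)
SIGNATURE = 0x44495243              # assumed constant, spells "DIRC"
VERSION = 1                         # assumed constant


def pack_index(entries_blob: bytes, n_entries: int) -> bytes:
    """Build an index image: header with checksum, followed by the entry bytes."""
    prefix = struct.pack("<III", SIGNATURE, VERSION, n_entries)
    checksum = hashlib.sha1(prefix + entries_blob).digest()
    return prefix + checksum + entries_blob


def verify_index(data: bytes):
    """Return (entry_count, checksum_ok) for an index image."""
    _signature, _version, n_entries, stored = HEADER.unpack_from(data)
    # Hash everything except the 20-byte checksum field itself.
    computed = hashlib.sha1(data[:12] + data[HEADER.size:]).digest()
    return n_entries, stored == computed


if __name__ == "__main__":
    image = pack_index(b"\x00" * 64, 1)   # 64 placeholder bytes standing in for one entry
    print(HEADER.size, verify_index(image))
```

Keeping a checksum in the header lets a reader reject a truncated or corrupted index before trusting any entry in it.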

Design Principles

Git was built as a decentralized system: every developer’s workspace contains a full repository, enabling offline work and easy collaboration. The core design follows the Unix philosophy—“Write programs that do one thing and do it well”—and relies on simple, composable primitives (objects, commands, and the three areas: workspace, index, repository).

Limitations of the First Commit

While the initial implementation provides the essential plumbing, many higher‑level features (branches, remote handling, reflogs, hooks, etc.) were added later. The early code also shows typical quality issues: incomplete error handling, memory leaks (e.g., in write-tree.c), and sub‑optimal data structures (arrays instead of linked lists for index entries).

Conclusion

The first Git commit, though modest in size, encapsulates the fundamental concepts that make Git powerful today. Understanding its source offers insight into the elegant design of a distributed VCS and highlights areas where the project has evolved and improved over more than a decade of iterative development.

