Understanding How Git Works: Objects, Branches, Merges, and the Myers Diff Algorithm
This article explains the internal mechanics of Git, covering its distributed nature, object model (commits, trees, blobs), branch creation, merging strategies, conflict resolution, and the Myers algorithm used for diff operations, illustrated with command‑line examples and diagrams.
Introduction
Git is a distributed version control system: every developer holds a complete local repository, and a central server is optional. Although many developers use Git without knowing its internals, understanding how Git manages repositories can broaden one's perspective on source control.
Git Features
Differences
SVN is a centralized VCS; the repository lives on a single server and developers work on local copies, pulling updates before committing.
Git is distributed; every clone is a complete repository with the full history, and no central server is required.
Advantages
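Git identifies each version of a file by a hash of its entire content and, conceptually, records whole-file snapshots rather than line-by-line differences between versions; diffs are computed on demand.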
Git stores snapshots of files in a miniature file system; unchanged files are referenced, not duplicated.
Almost all Git operations are local, making them extremely fast compared to centralized systems.
How Git Actually Works
To explore Git's operation we look at the hidden .git directory. Inside, the most relevant sub-directory is objects, which contains three primary object types:
Commit: points to a root tree object, links to its parent commits to form history, and stores metadata such as author, committer, and message.
Tree: represents a directory, recording each entry's mode, name, and the blob or nested tree object it points to.
Blob: holds the raw content of a file (a snapshot).
Commit Objects
Listing the objects directory shows many two-character folders. Git uses the first two hexadecimal characters of an object's SHA-1 hash as the folder name and the remaining 38 characters as the file name inside that folder.
objects
├── 0c
│ ├── 8867d7e175f46d4bcd66698ac13f4ca00cf592
│ └── c8002da0403724dfaa6792885eaa97faa71bcf
├── 1b
│ └── 716fafdd3aeb3636222a0026d1d4971078db05
…
Running git log -4 --oneline shows the latest four commits with short hashes. Converting a short hash to its full form with git rev-parse reveals the 40-character identifier used in the object tree.
git rev-parse 9a5bf36
# => 9a5bf367f10390c64a3f7b3e738b78bd78a3d781
Inspecting the full object with git cat-file -p 9a5bf36 displays a commit object containing a tree hash, parent hash, author, committer, and message.
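To make the layout of a commit object concrete, here is a small sketch that splits pretty-printed commit text into its fields. It assumes the usual layout of header lines, a blank line, and then the message. The function name parse_commit is hypothetical, and every ID, name, and timestamp in the sample is a placeholder rather than data from a real repository.

```python
def parse_commit(text: str) -> dict:
    # Headers come first, then a blank line, then the commit message.
    header_text, _, message = text.partition("\n\n")
    commit = {"parents": [], "message": message.strip()}
    for line in header_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # a merge commit has several
        else:
            commit[key] = value
    return commit


# Hypothetical sample text; all values below are placeholders.
SAMPLE = """\
tree 1111111111111111111111111111111111111111
parent 2222222222222222222222222222222222222222
author Example Author <author@example.com> 1560000000 +0800
committer Example Author <author@example.com> 1560000000 +0800

Add README
"""

commit = parse_commit(SAMPLE)
print(commit["tree"])
print(commit["parents"])
print(commit["message"])
```

Following the parent IDs from one commit to the next is what produces the history shown by git log.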
Tree Objects
Running git cat-file -p <tree‑hash> shows entries such as:
100644 blob 0cc8002da0403724dfaa6792885eaa97faa71bcf README.md
040000 tree 3c121291ffc25ce6792f9350883b77cea2633048 src
This demonstrates that a tree can contain both blob files and nested tree directories, mirroring the project's directory structure.
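The nesting can be illustrated by walking tree listings recursively to recover full file paths. The sketch below keeps the listings in an in-memory dictionary instead of reading a real repository. The README.md and src entries come from the listing above, while the root tree ID, the contents of src, and the function name list_files are hypothetical.

```python
# Tree listings keyed by tree ID, in the "mode type id name" layout.
# ROOT_TREE and the contents of the src tree are hypothetical placeholders.
ROOT_TREE = "a" * 40
SRC_TREE = "3c121291ffc25ce6792f9350883b77cea2633048"
TREES = {
    ROOT_TREE: (
        "100644 blob 0cc8002da0403724dfaa6792885eaa97faa71bcf\tREADME.md\n"
        f"040000 tree {SRC_TREE}\tsrc\n"
    ),
    SRC_TREE: "100644 blob " + "b" * 40 + "\tindex.js\n",
}


def list_files(tree_id: str, prefix: str = "") -> dict:
    # Map every file path below this tree to the blob ID it points to.
    files = {}
    for line in TREES[tree_id].splitlines():
        mode, obj_type, obj_id, name = line.split(None, 3)
        path = prefix + name
        if obj_type == "tree":
            files.update(list_files(obj_id, path + "/"))  # descend
        else:
            files[path] = obj_id
    return files


for path, blob_id in sorted(list_files(ROOT_TREE).items()):
    print(blob_id[:7], path)
```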
Blob Objects
Displaying a blob with git cat-file -p <blob‑hash> reveals the raw file content, e.g., a LICENSE file.
MIT License
Copyright (c) 2019
Permission is hereby granted, free of charge, to any person obtaining a copy …
Branch Creation and Merging
A branch in Git is simply a movable pointer to a commit object. The default branch has traditionally been named master. Creating a new branch only adds a new pointer to the current commit, so it is nearly instantaneous.
The special HEAD pointer indicates the currently checked-out branch. Switching branches points HEAD at a different branch and updates the working directory to match the commit that branch points to.
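The pointer model can be sketched with a toy in-memory repository. This is only an illustration of the idea, not how Git stores references on disk; the class Repo, its methods, and the commit IDs c1 and c2 are all hypothetical.

```python
class Repo:
    def __init__(self):
        self.branches = {"master": None}  # branch name -> commit ID
        self.head = "master"              # HEAD names the current branch
        self.parents = {}                 # commit ID -> list of parent IDs

    def commit(self, commit_id: str) -> None:
        # A new commit records the old tip as parent, then the branch moves.
        tip = self.branches[self.head]
        self.parents[commit_id] = [tip] if tip else []
        self.branches[self.head] = commit_id

    def branch(self, name: str) -> None:
        # Creating a branch copies one pointer; no commits are copied.
        self.branches[name] = self.branches[self.head]

    def switch(self, name: str) -> None:
        if name not in self.branches:
            raise KeyError(name)
        self.head = name


repo = Repo()
repo.commit("c1")
repo.branch("feature")
repo.switch("feature")
repo.commit("c2")
print(repo.branches)
```

After these steps master still points at c1 while feature has advanced to c2, and only the branch that HEAD names moved when the commit was made.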
Code Merge and Conflicts
When merging, Git performs a three‑way merge using the two branch tips and their common ancestor. The result is a new commit with two parents.
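The three-way rule can be sketched at whole-file granularity. This is a simplified model under stated assumptions: each snapshot is a dictionary from path to content, a missing path means the file does not exist on that side, and the function name three_way_merge is hypothetical. Git applies the same kind of reasoning to regions of lines inside a file, which this sketch does not attempt.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o      # both sides agree
        elif o == b:
            result = t      # only their side changed it
        elif t == b:
            result = o      # only our side changed it
        else:
            conflicts.append(path)  # both sides changed it differently
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1", "index.html": "1"}
ours = {"a.txt": "2", "b.txt": "1", "index.html": "ours"}
theirs = {"a.txt": "1", "b.txt": "3", "index.html": "theirs"}
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)
print(conflicts)
```

The common ancestor is what lets the merge tell which side actually changed a file; comparing only the two tips could not distinguish an edit on one side from an edit on the other.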
If the same lines are edited in both branches, a conflict occurs. Git marks the conflicting sections with <<<<<<< HEAD, =======, and >>>>>>> markers.
// Merge conflict example
Auto-merging index.html
CONFLICT (content): Merge conflict in index.html
Automatic merge failed; fix conflicts and then commit the result.
Code Merge Algorithm (Myers)
Git’s diff engine is based on the Myers algorithm, which finds the shortest edit script between two sequences. The algorithm models the problem as a graph where moving right represents a deletion, moving down an insertion, and moving diagonally a match.
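The graph search can be sketched compactly. The code below is an illustrative version of the greedy Myers approach, not Git's actual implementation: for each edit count d it records the furthest x reached on every diagonal k = x - y, then walks the recorded states backwards to recover the edit script. The function name myers_diff and the output format are assumptions made for this example.

```python
def myers_diff(a, b):
    n, m = len(a), len(b)
    v = {1: 0}    # v[k] = furthest x reached so far on diagonal k
    trace = []    # copy of v taken before each round d
    done = False
    for d in range(n + m + 1):
        trace.append(dict(v))
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # move down: insertion
            else:
                x = v[k - 1] + 1    # move right: deletion
            y = x - k
            while x < n and y < m and a[x] == b[y]:
                x, y = x + 1, y + 1  # follow the diagonal: match
            v[k] = x
            if x >= n and y >= m:
                done = True
                break
        if done:
            break

    # Walk back from (n, m) to (0, 0) through the saved states.
    edits = []
    x, y = n, m
    for d in range(len(trace) - 1, -1, -1):
        v = trace[d]
        k = x - y
        if k == -d or (k != d and v[k - 1] < v[k + 1]):
            prev_k = k + 1
        else:
            prev_k = k - 1
        prev_x = v[prev_k]
        prev_y = prev_x - prev_k
        while x > prev_x and y > prev_y:
            x, y = x - 1, y - 1
            edits.append((" ", a[x]))
        if d > 0:
            if x == prev_x:
                edits.append(("+", b[prev_y]))
            else:
                edits.append(("-", a[prev_x]))
        x, y = prev_x, prev_y
    edits.reverse()
    return edits


for op, item in myers_diff("ABCABBA", "CBABAC"):
    print(op, item)
```

When two diagonals reach equally far, this version takes the rightward move, so it tends to emit deletions before insertions, which is the ordering people usually expect to read in a diff.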
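The optimal path yields a diff such as the following, which turns the sequence ABCABBA into CBABAC:
- A
- B
  C
+ B
  A
  B
- B
  A
+ C
In its unified diff output, Git groups changes into hunks. Each hunk starts with an @@ header giving the starting line and line count of the affected range in the old and new file, followed by the changed lines and any surrounding context lines.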
@@ -1,15 +1,5 @@
- console.log('watch')
- const add = (a,c) => { … }
+ const add = (a,b) => { … }
  add(4,8)
- console.log(reduce(-2,-9))
- console.log(new Date().getDate(),'second commit')
Conclusion
The article provides a brief overview of Git’s internal mechanisms, including objects, branching, merging, conflict resolution, and the Myers diff algorithm. Readers are encouraged to explore further to deepen their understanding of this powerful tool.
References
Pro Git
Advanced Git
The Myers diff algorithm: part 1 (https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm-part-1/)
ZCY Technology (政采云技术)
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.