Fundamentals 12 min read

How Merkle Trees Enable Efficient Data Change Detection

This article explains the principles of Merkle trees, shows how they are built from file hashes, and demonstrates their use for fast data‑change detection in cloud sync, blockchain verification, and P2P file sharing, complete with a Node.js implementation.

Full-Stack Cultivation Path
Full-Stack Cultivation Path
Full-Stack Cultivation Path
How Merkle Trees Enable Efficient Data Change Detection

In modern digital systems, verifying the integrity of large data sets efficiently is crucial for cloud storage sync, blockchain transaction validation, and peer‑to‑peer file sharing. The article introduces the Merkle tree as the core data structure that solves this problem.

Hash Functions

A hash function maps any input (text or file) to a fixed‑length digest. For example, the file foo.js hashed with SHA‑1 yields 5f44557c8c615183ddfc42e82544945ce01f3c2a. Any minor change, such as adding a space, produces a completely different hash ( 490be46b3ce0259122c266f500919022d5046cf0), allowing detection of modifications.

What Is a Merkle Tree?

A Merkle tree (hash tree) is a binary tree whose leaves are the hashes of individual data blocks. It consists of a root node, intermediate nodes, and leaf nodes. By pairing leaf hashes, hashing the pair to form a parent, and repeating upward, a single root hash represents the entire data set.

Example construction: given files file1.txtfile4.txt, their leaf hashes are paired, hashed again to form intermediate nodes, and finally the root hash 7ab2c4e is obtained.

Merkle Tree Applications

Data Synchronization

Consider a cloud‑sync folder with four files. The initial Merkle tree root is ROOT[7ab2c4e]. After modifying file2.txt, the new tree root becomes ROOT[🔴e9f8d2c], and the changed leaf hash is highlighted. By comparing nodes from the root down, the system identifies the modified file in O(log n) time, far fewer comparisons than a linear scan.

文件数量    传统方式     Merkle Tree
10         10次比较     ≤ 4次比较
100        100次比较    ≤ 7次比较
1,000      1000次比较   ≤ 10次比较
10,000     10000次比较  ≤ 14次比较
100,000    100000次比较 ≤ 17次比较

A complete Node.js implementation is provided to build a Merkle tree from a file map, compute hashes with SHA‑256, and find differences between two trees.

const crypto = require('crypto');

class MerkleNode {
  constructor(hash, filename = '') {
    this.hash = hash;
    this.filename = filename;
    this.left = null;
    this.right = null;
  }
}

class MerkleTree {
  constructor() { this.root = null; }
  static hash(data) { return crypto.createHash('sha256').update(data).digest('hex'); }
  buildTree(files) {
    const leaves = Object.entries(files).map(([filename, content]) => {
      const hash = MerkleTree.hash(content);
      return new MerkleNode(hash, filename);
    });
    this.root = this.buildTreeFromNodes(leaves);
    return this.root;
  }
  buildTreeFromNodes(nodes) {
    if (nodes.length === 0) return null;
    if (nodes.length === 1) return nodes[0];
    const parents = [];
    for (let i = 0; i < nodes.length; i += 2) {
      const left = nodes[i];
      const right = i + 1 < nodes.length ? nodes[i + 1] : null;
      const combinedHash = right ? MerkleTree.hash(left.hash + right.hash) : left.hash;
      const parent = new MerkleNode(combinedHash);
      parent.left = left;
      parent.right = right;
      parents.push(parent);
    }
    return this.buildTreeFromNodes(parents);
  }
  findDifferences(otherTree) {
    const differences = [];
    const compare = (node1, node2) => {
      if (!node1 || !node2 || node1.hash === node2.hash) return;
      if (node1.filename) { differences.push(node1.filename); return; }
      compare(node1.left, node2.left);
      compare(node1.right, node2.right);
    };
    compare(this.root, otherTree.root);
    return differences;
  }
}

function main() {
  const originalFiles = {
    'file1.txt': 'Hello',
    'file2.txt': 'World',
    'file3.txt': 'Merkle',
    'file4.txt': 'Tree'
  };
  const modifiedFiles = {
    'file1.txt': 'Hello',
    'file2.txt': 'JavaScript', // modified
    'file3.txt': 'Merkle',
    'file4.txt': 'Tree'
  };
  const cloudTree = new MerkleTree();
  cloudTree.buildTree(originalFiles);
  const localTree = new MerkleTree();
  localTree.buildTree(modifiedFiles);
  const differences = localTree.findDifferences(cloudTree);
  console.log('需要同步的文件:', differences);
  console.log('
验证过程:');
  console.log('根节点哈希值比较:');
  console.log('本地:', localTree.root.hash);
  console.log('云端:', cloudTree.root.hash);
}

main();

Blockchain Verification

In a blockchain, each block contains many transactions. Light nodes use a Merkle tree to verify a single transaction without downloading the whole block. The article shows the verification steps: request the Merkle proof, compute leaf hash, combine with sibling hashes up to the root, and compare with the block header's stored Merkle root.

P2P File Sharing

BitTorrent splits large files into blocks, builds a Merkle tree, and stores the root in the torrent metadata. During download, peers can request the Merkle proof for a block to confirm its integrity before accepting it. The article illustrates the block‑splitting, tree construction, and verification process with concrete hash examples.

Conclusion

The Merkle tree embodies the divide‑and‑conquer principle: a massive integrity‑checking problem is reduced to verifying a small number of hashes along a tree path. As data volumes continue to grow, this structure becomes increasingly valuable for synchronization, blockchain, and peer‑to‑peer systems.

References

Wikipedia: Merkle Tree https://en.wikipedia.org/wiki/Merkle_tree

An Intro to Merkle Tree: What is it and How Does it Work? https://hackernoon.com/an-intro-to-merkel-tree-what-is-it-and-how-does-it-work

Bitcoin: A Peer‑to‑Peer Electronic Cash System https://bitcoin.org/files/bitcoin-paper/bitcoin_zh_cn.pdf

Merkle Tree Structure https://yeasy.gitbook.io/blockchain_guide/05_crypto/merkle_trie

WTF Solidity Minimal Intro: Merkle Tree https://github.com/AmazingAng/WTF-Solidity/blob/main/36_MerkleTree/readme.md

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Node.jsData SynchronizationP2PBlockchainMerkle Treehash function
Full-Stack Cultivation Path
Written by

Full-Stack Cultivation Path

Focused on sharing practical tech content about TypeScript, Vue 3, front-end architecture, and source code analysis.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.