Fundamentals 14 min read

Exploring Diff Algorithms: Shortest Edit Distance, Longest Common Subsequence, and Myers Algorithm with TypeScript Implementations

This article investigates how diff tools work by presenting three algorithmic approaches—shortest edit distance, longest common subsequence, and the Myers algorithm—each explained with dynamic‑programming concepts, back‑tracing techniques, and complete TypeScript code examples.

Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Exploring Diff Algorithms: Shortest Edit Distance, Longest Common Subsequence, and Myers Algorithm with TypeScript Implementations

The author became curious about the implementation of online JSON diff tools and decided to explore three classic ways to compute the difference between two strings.

Shortest Edit Distance is introduced first. The classic dynamic‑programming solution builds a two‑dimensional DP table that records the minimal number of operations (insert, delete, replace) needed to transform one string into another. The table is filled row by row, and the final cell gives the edit distance.

function minDistance(word1: string, word2: string) {
  const m = word1.length
  const n = word2.length
  const dp = Array.from({ length: m + 1 }, () => Array.from({ length: n + 1 }, () => 0))
  for (let i = 1; i <= m; i++) {
    dp[i][0] = i
  }
  for (let i = 1; i <= n; i++) {
    dp[0][i] = i
  }
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      if (word1[i - 1] === word2[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1]
      } else {
        dp[i][j] = Math.min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]) + 1
      }
    }
  }
  console.log(dp)
}

Running minDistance('horse', 'ros') produces a DP matrix whose bottom‑right value ( dp[5][3] ) equals the minimal edit distance. By examining neighboring cells, one can backtrack to reconstruct the exact sequence of operations (delete, insert, replace) that transforms the source string into the target.

Longest Common Subsequence (LCS) is then described. Like the edit‑distance solution, it uses a two‑dimensional DP array, but the recurrence focuses on matching characters to build the longest subsequence common to both strings.

function longestCommonSubsequence(word1: string, word2: string) {
  const len1 = word1.length
  const len2 = word2.length
  const dp = Array.from({ length: len1 + 1 }, () => Array.from({ length: len2 + 1 }, () => 0))
  for (let i = 1; i <= len1; i++) {
    for (let j = 1; j <= len2; j++) {
      if (word1[i - 1] === word2[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1] + 1
      } else {
        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1])
      }
    }
  }
  console.log(dp)
}

After the DP table is filled, back‑tracing from dp[len1][len2] yields the LCS and, by comparing the remaining characters, an edit script that minimizes unnecessary operations.

type TBackTrace = { type: EOpType; index: number; str: string }

function backTraceLSC(dp: number[][], word1: string, word2: string) {
  const ses: TBackTrace[] = []
  let dpXIndex = dp.length - 1
  let dpYIndex = dp[0].length - 1
  let top = -1
  let left = -1
  let topLeft = -1
  let current = -1
  while (dpXIndex !== 0 || dpYIndex !== 0) {
    if (dpXIndex === 0) {
      // only can insert
      ses.push({ type: EOpType.InsertBefore, index: dpYIndex - 1, str: word2[dpYIndex - 1] })
      dpYIndex--
      continue
    }
    if (dpYIndex === 0) {
      // only can delete
      ses.push({ type: EOpType.Delete, index: dpXIndex - 1, str: word1[dpXIndex - 1] })
      dpXIndex--
      continue
    }
    top = dp[dpXIndex - 1][dpYIndex]
    left = dp[dpXIndex][dpYIndex - 1]
    topLeft = dp[dpXIndex - 1][dpYIndex - 1]
    current = dp[dpXIndex][dpYIndex]
    if (top === left && top === topLeft && current - 1 === topLeft) {
      ses.push({ type: EOpType.Hold, index: dpXIndex - 1, str: word1[dpXIndex - 1] })
      dpXIndex--
      dpYIndex--
      continue
    }
    if (left == current) {
      ses.push({ type: EOpType.InsertBefore, index: dpYIndex, str: word2[dpYIndex - 1] })
      dpYIndex--
      continue
    }
    if (top === current) {
      ses.push({ type: EOpType.Delete, index: dpXIndex - 1, str: word1[dpXIndex - 1] })
      dpXIndex--
      continue
    }
  }
  return ses.reverse()
}

The article then introduces the Myers algorithm , which improves both time and space complexity and yields an optimal edit path. A concise TypeScript implementation is provided, showing how the algorithm expands a frontier of “k‑lines”, follows diagonal moves while characters match, and finally back‑tracks to produce the shortest edit script.

enum Operation {
  INSERT = 1,
  DELETE = 2,
  MOVE = 3
}
function reverse(s: Operation[]) {
  const result: Operation[] = []
  const len = s.length
  for (let index = 0; index < s.length; index++) {
    result[len - 1 - index] = s[index]
  }
  return result
}
function shortestEditScript(src: string, dst: string) {
  const n = src.length
  const m = dst.length
  const max = n + m
  const trace: Record
[] = []
  let x: number
  let y: number
  let isDone = false
  for (let d = 0; d <= max; d++) {
    const v: Record
= {}
    trace.push(v)
    if (d === 0) {
      let t = 0
      while (n > t && m > t && src[t] === dst[t]) {
        t++
      }
      v[0] = t
      if (t === n && t === m) {
        break
      }
      continue
    }
    const lastV = trace[d - 1]
    for (let k = -d; k <= d; k += 2) {
      // down
      if (k === -d || (k !== d && (lastV?.[k - 1] || 0) < (lastV?.[k + 1] || 0))) {
        x = lastV?.[k + 1] || 0
      } else {
        // right
        x = (lastV?.[k - 1] || 0) + 1
      }
      y = x - k
      // diagonal
      while (x < n && y < m && src[x] === dst[y]) {
        x++
        y++
      }
      v[k] = x
      if (x === n && y === m) {
        isDone = true
        break
      }
    }
    if (isDone) break
  }
  // backtrack
  const script: Operation[] = []
  x = n
  y = m
  let k: number
  let prevK: number
  let prevX: number
  let prevY: number
  for (let d = trace.length - 1; d > 0; d--) {
    k = x - y
    const lastV = trace[d - 1]
    if (k === -d || (k !== -d && lastV[k - 1] < lastV[k + 1])) {
      prevK = k + 1
    } else {
      prevK = k - 1
    }
    prevX = lastV[prevK]
    prevY = prevX - prevK
    while (x > prevX && y > prevY) {
      script.push(Operation.MOVE)
      x--
      y--
    }
    if (x === prevX) {
      script.push(Operation.INSERT)
    } else {
      script.push(Operation.DELETE)
    }
    x = prevX
    y = prevY
  }
  if (trace[0][0] !== 0) {
    for (let i = 0; i < trace[0][0]; i++) {
      script.push(Operation.MOVE)
    }
  }
  return reverse(script)
}

Visualizations (not reproduced here) show added characters in green, deleted characters in red, and unchanged characters in black. The author notes that while these algorithms illustrate the principles of diff, production editors such as Monaco or CodeMirror already provide highly optimized diff implementations, and further work would be needed to adapt the presented code to real‑world scenarios.

typescriptAlgorithmDiffedit-distancelongest-common-subsequencemyers-algorithm
Rare Earth Juejin Tech Community
Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.