08 · Dynamic Programming
LCS · edit distance · 0/1 knapsack — three classic DP tables fill cell-by-cell, then traceback in red.
What you're watching
A DP table is a 2-D array indexed by sub-problem coordinates. Each cell stores the answer to one sub-problem; later cells combine earlier cells' answers via a recurrence. Watching the cells fill, in order, makes the recurrence concrete: every cell's value comes from at most three other cells (its top, left, and diagonal neighbours, depending on the problem). The red path at the end is the traceback — the chain of cells whose decisions add up to the optimal answer at the bottom-right corner.
The three problems
Longest Common Subsequence (LCS) O(mn)
Given two strings A and B of length m and n, find the longest sequence of characters that appears in both, in order (but not necessarily contiguous). Recurrence:
- dp[i][j] = 0 if i = 0 or j = 0
- dp[i][j] = dp[i-1][j-1] + 1 if A[i-1] = B[j-1]
- dp[i][j] = max(dp[i-1][j], dp[i][j-1]) otherwise
Final answer is at dp[m][n]. Traceback walks diagonally on a match, otherwise to whichever neighbour holds the larger value.
Edit (Levenshtein) Distance O(mn)
Minimum number of single-character edits (insert / delete / substitute) to turn A into B. Recurrence:
- dp[i][0] = i, dp[0][j] = j
- dp[i][j] = dp[i-1][j-1] if A[i-1] = B[j-1]
- dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1])
The three options at each non-match cell correspond to the three edit operations: substitute (diagonal), delete (up), insert (left).
0/1 Knapsack O(nW)
Given n items with weights and values, and a knapsack of capacity W, pick a subset of items maximising total value while keeping total weight ≤ W. The "0/1" means each item is either taken whole or not at all. Recurrence:
- dp[i][w] = 0 if i = 0 or w = 0
- dp[i][w] = dp[i-1][w] if weight[i-1] > w (can't fit)
- dp[i][w] = max(dp[i-1][w], dp[i-1][w-weight[i-1]] + value[i-1])
Note this is pseudo-polynomial: O(nW), where W is the numerical capacity, not the input size in bits. When W is huge, DP becomes infeasible — the problem is actually NP-hard in the general case.
What I learned writing this
- "DP solves a problem by filling a table" is a slogan worth seeing. The recurrences are short — three to four lines each — but watching cells light up in row-major order and seeing how each new cell consults a small number of already-filled cells makes the local-recurrence-becomes-global-answer pattern concrete in a way the formulas alone don't.
- Edit distance's traceback is more delicate than LCS's. LCS goes diagonal on a match, then picks the larger neighbour. Edit distance has three predecessor options; you have to pick the one whose value equals the recurrence (i.e., the cheapest cause of the current cell). Easy to off-by-one when the neighbours all have the same value — you have to commit to a tie-breaking order or the path can become inconsistent.
- Knapsack tables can get big fast. Even small inputs (n=10, W=20) produce 11×21 = 231 cells. When you push W up by a factor of 100, the table grows by the same factor. Hence "pseudo-polynomial" — the table size depends on the numerical value of an input, not its bit-length. In practice this is when memoisation + only the recently-needed rows (O(W) space instead of O(nW)) becomes essential.
- Colour-by-value is a useful diagnostic. When the LCS table is mostly the same shade of pale blue with a gradient toward the bottom-right, you can see the cumulative-max nature of the recurrence. When the edit-distance table forms a diagonal stripe, the two strings share a common spine. These visual signatures are easier to read than scanning the numbers.