How text diff works

Comparing two pieces of text seems trivial until you try to do it well. A good diff does not just say "these are different"; it shows the smallest set of changes that turns one version into the other. That is a real computation, and understanding it makes diffs easier to read.

What a diff is really computing

Under the hood, most diff tools solve a version of the longest common subsequence problem: find the longest sequence of units (lines, words, or characters) that appears in both texts in the same order. Everything outside that shared core is what changed. This framing is why a good diff highlights a small inserted sentence rather than marking everything after it as different. The goal is the minimal, most readable set of edits.
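
As a rough illustration of that idea, here is a minimal sketch in Python. The function names lcs and simple_diff are made up for this example, and the quadratic table is the textbook approach rather than what any particular diff tool ships; real tools use faster algorithms that give similar results.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover the shared items.
    shared = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            shared.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


def simple_diff(a, b):
    """Label each item as kept (' '), removed ('-'), or added ('+')."""
    ops = []
    i = j = 0
    for item in lcs(a, b):
        # Anything skipped before the next shared item is a change.
        while a[i] != item:
            ops.append(("-", a[i]))
            i += 1
        while b[j] != item:
            ops.append(("+", b[j]))
            j += 1
        ops.append((" ", item))
        i += 1
        j += 1
    # Leftovers after the last shared item are changes too.
    ops.extend(("-", x) for x in a[i:])
    ops.extend(("+", x) for x in b[j:])
    return ops


old = ["alpha", "beta", "gamma"]
new = ["alpha", "inserted", "beta", "gamma"]
for tag, item in simple_diff(old, new):
    print(tag, item)
```

With these inputs, only "inserted" is marked with +. The lines after it are recognized as shared, not reported as different.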

Line, word, and character granularity

Line diff treats each line as a unit. It is ideal for code and config, where changes are naturally line-based, and it is what version control systems such as Git show by default.
Word diff compares word by word. For prose, where a sentence is edited mid-line, this pinpoints the changed words instead of flagging the whole line.
Character diff is the finest grain, useful for spotting a single changed digit or a typo.

Picking the right granularity is most of the battle. A line diff of edited prose is noisy; a character diff of code is overwhelming.
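
To make the difference concrete, the sketch below runs the same one-digit edit through all three granularities using Python's standard difflib module. The helper changed_units is a hypothetical name for this example, and SequenceMatcher uses its own matching heuristic rather than a strict longest common subsequence, so treat this as an approximation of how a diff tool behaves.

```python
import difflib


def changed_units(old_units, new_units):
    """Return (removed, added) units between two sequences."""
    matcher = difflib.SequenceMatcher(a=old_units, b=new_units, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(old_units[i1:i2])
            added.extend(new_units[j1:j2])
    return removed, added


old = "The port is 8080 by default."
new = "The port is 8081 by default."

# The only thing that varies is how the text is split into units.
line_result = changed_units(old.splitlines(), new.splitlines())
word_result = changed_units(old.split(), new.split())
char_result = changed_units(list(old), list(new))

print("line:", line_result)
print("word:", word_result)
print("char:", char_result)
```

The line comparison reports the whole sentence as changed, the word comparison narrows it to 8080 versus 8081, and the character comparison isolates the single digit.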

Reading a unified diff

The unified format, the one you see in code review, marks removed lines with -, added lines with +, and surrounds them with a few unchanged context lines (three by default in most tools). Each hunk begins with a header between @@ markers that gives the starting line and line count in the old and new versions, as in @@ -12,7 +12,8 @@. Once you read - as "old" and + as "new," the whole format becomes obvious.
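
For a small, reproducible example of the format, Python's standard difflib.unified_diff can generate one. The file names old.conf and new.conf and the config lines are placeholders invented for this sketch.

```python
import difflib

old_lines = ["host = localhost", "port = 8080", "debug = false"]
new_lines = ["host = localhost", "port = 8081", "debug = false"]

# lineterm="" because the input lines carry no trailing newlines.
diff_lines = list(difflib.unified_diff(
    old_lines,
    new_lines,
    fromfile="old.conf",
    tofile="new.conf",
    lineterm="",
))

print("\n".join(diff_lines))
# Prints:
# --- old.conf
# +++ new.conf
# @@ -1,3 +1,3 @@
#  host = localhost
# -port = 8080
# +port = 8081
#  debug = false
```

Here the hunk header -1,3 +1,3 says the hunk covers three lines starting at line 1 in both versions, and the unmarked lines around the change are context.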

The Diff Checker compares two texts in your browser, with nothing uploaded, and highlights exactly what changed. For counting and analyzing a single text rather than comparing two, see the text statistics tool.