Formalizing edit operations: lifting document mutations to source code edits

Definitions

We denote $s$ to be the source code. Capitalized to represent the set of possible source code — $s \in S$.

We denote $a(s)$ to be the AST of $s$. Capitalized to represent the set of possible ASTs — $a(s) \in A$.

We denote $\alpha$ to represent data.

We denote $i(a, \alpha)$ the intermediate representation (IR) of $s$ given $\alpha$ — $i(s, \alpha) \in I$.

We denote $d(i)$ the HTML document that results from $s$ given $\alpha$ — $d(s, \alpha) \in D$.

As such we have a pipeline of $s$ given $\alpha$ that produces a singular HTML document.

$$ s \rightarrow a \rightarrow i \rightarrow d $$

Remembering that $a$ is strictly $f(s)$ and that $i$ is strictly $f(a, \alpha)$. Then it follows that, by substitution, $i$ is $f(s, \alpha)$.

Remembering that $d$ is strictly $f(i)$, then it follows that $d = f(s, \alpha)$.

Therefore we can simplify our notation by ignoring the intermerdiate representations $a$ and $i$ from our system.

For a given $\alpha$, it follows that all $s \in S$ produce a singular $d \in D$. In other words, there is only one $d$ per given $s$. We write this rendering function as:

$$f_\alpha : S \to D$$

where $f_\alpha(s) = d$. The subscript $\alpha$ reminds us that data is fixed; the function maps source to document.

Mutations on $D$

Suppose we want to mutate a document $d$ into another document $d'$. We write this as:

$$\delta : D \to D$$

where $\delta$ is a document mutation — e.g., "insert a <div> after the third child of <body>."

The problem: we do not author documents directly. We author source code. The document is a derived artifact. So we need to find a corresponding source mutation:

$$\sigma : S \to S$$

such that applying $\sigma$ to the source and then rendering produces the same result as rendering and then applying $\delta$ to the document:

$$f_\alpha(\sigma(s)) = \delta(f_\alpha(s))$$

Or equivalently:

$$f_\alpha \circ \sigma = \delta \circ f_\alpha$$

As a commuting diagram:

$$ \begin{array}{ccc} S & \xrightarrow{\sigma} & S \\ \downarrow f_\alpha & & \downarrow f_\alpha \\ D & \xrightarrow{\delta} & D \end{array} $$

Reading it: "mutating the source then rendering = rendering then mutating the document."

We call $\sigma$ a lift of $\delta$ along $f_\alpha$.

The design constraint

In general, lifting $\delta$ to $\sigma$ is problematic: $f_\alpha$ is many-to-one (multiple source codes can produce the same document), so the lift may not be unique. And arbitrary document mutations may produce documents outside the image of $f_\alpha$, so the lift may not exist.

We avoid both problems by restricting the allowed mutations. We require:

Every $\delta$ is liftable: $\delta$ only produces documents in $\text{im}(f_\alpha)$.
Every $\delta$ lifts uniquely: for each $\delta$, there is exactly one $\sigma$ satisfying the commuting square.

In other words, we design our mutation sets $N$ and $M$ together so that $\text{lift}_\alpha$ is a total, injective function:

$$\text{lift}_\alpha : N \to M$$

This is a strong constraint. It means the document mutations are "source-aware" — they carry enough structure to unambiguously determine a source edit.

Edit operations compose

We call the set of allowed source edits $M$ and the set of allowed document mutations $N$. These sets have three properties that make them useful:

There is a "do nothing" operation. Applying it changes nothing.
Any two operations can be chained into one. "Move node A" followed by "delete node B" is itself a single operation "move A then delete B."
Chaining is associative. $(a \circ b) \circ c = a \circ (b \circ c)$ — grouping doesn't matter, only order.

Each operation in $M$ takes a source file and produces a new source file. Each operation in $N$ takes a document and produces a new document:

$$m : S \to S \qquad \delta : D \to D$$

For the mathematically inclined: a set with these three properties is called a monoid, and the way it transforms a space is called a monoid action.

The lift preserves composition

Because of our design constraint, every document mutation $\delta \in N$ maps to exactly one source edit $\sigma \in M$. We write this mapping as:

$$\text{lift}_\alpha : N \to M$$

The critical property: lifting respects chaining. If you chain two document mutations and then lift, you get the same result as lifting each one and then chaining the source edits:

$$\text{lift}_\alpha(\delta_1 \circ \delta_2) = \text{lift}_\alpha(\delta_1) \circ \text{lift}_\alpha(\delta_2)$$

And lifting "do nothing" gives "do nothing":

$$\text{lift}_\alpha(\text{id}_D) = \text{id}_S$$

For the mathematically inclined: this makes $\text{lift}\alpha$ a monoid homomorphism — a structure-preserving map between the two operation sets._

This gives us two properties for free:

Composability: a sequence of document mutations lifts to a sequence of source edits. The order is preserved and the result is the same whether you compose then lift, or lift then compose.
Optimistic application: we can apply $\delta$ to the document immediately and know that a deterministic $\sigma$ exists. The source edit can be computed asynchronously.

Since the mapping is 1:1, the two operation sets are structurally identical — they differ only in whether they speak in terms of documents or source code.

charlespwd/edits-math.md

Select an option

No results found