LLMs frequently emit malformed S-expressions (unbalanced ()[]{}) when generating Clojure/EDN/Lisp code. Post-hoc “repair” is surprisingly difficult. Saving source code files with balancing errors, or streaming them to nREPL/compilers, wastes time and relies on imbalance detection after the tokens are generated.
What if we could prevent these types of error by creating a tool (stateless, stack-based, cursor-aware) for LLMs to use to help them write perfectly balanced S-expression. Ideally while minimizing latency, round-trips, total time required to generate code, and without any MCP server-side state.
Current LLMs are inherently auto-regressive auto-completers, which makes them inherently bad at the problem of closing S-expressions -- they don’t have anything resemblig a “stack” for them to store state (e.g., I just emitted an open parenthesis, and here’s an open bracket. I want to close the form — now what?). Nor does it have an equivalent of a “find matching brace” tool like we have in our editors.
In other words, the LLM is essentially vibing/guessing the closing forms. Put this way, it’s a miracle that it works as well as it does! So instead of detection/correction tools, let’s explore what tools would be required that could prevent these types of errors!
- Stateless "Delimiter Stack" MCP server that exposes four small tools the LLM can call.
- The LLM holds the stack state client-side and passes it in/out as JSON; the server stays pure/stateless.
- A sentinel cursor marks where generation is happening. Tools validate/transform only the delta near the cursor, not the whole buffer, to avoid quadratic token blow-ups.
- The tools provide typed, defensive operations (
push,pop-with-expected,set-sentinel,balance-and-resume) with consistent error contracts so the LLM can self-correct immediately. - Optional helpers: batch operations (push/pop sequences) and diff payloads (apply/return minimal edits) to further reduce tokens & latency.
Large Language Models (LLMs) are autoregressive autocompleters—they generate code token by token, continuously predicting the next likely token based on prior context.
Introducing frequent MCP tool calls (e.g., pushDelimiter, popDelimiter) interrupts that natural flow. Each call
requires a round trip between the model and the server, pausing generation while waiting for a response. (How fast can an
LLM go from "let's use a function/tool" to "call the function/tool" to "I have the result of a function/tool and can add that to my context to resume
code generation?
If tools are invoked too often—say, once per delimiter—the LLM's code generation could slow dramatically, transforming what should be a smooth predictive stream into a stop-start sequence of RPCs.
Beyond latency, every call carries token overhead: JSON payloads, stack state, document slices, and responses.
If the LLM calls the tool successively on growing pieces of code, total token usage can scale quadratically—proportional to N × average payload size.
This could potentially significantly increase the cost of everyday coding sessions.
The Bitter Lesson: rather than hard-coding fixed heuristics ("always check every form"), we should teach the LLM about the trade-offs.
With the right instruction and examples, the LLM can learn to use these tools strategically, reasoning about cost vs. benefit—calling them when the complexity of a region justifies the overhead.
The long-term goal is adaptive tool use:
The LLM free-runs through straightforward code but calls the delimiter tools only when entering syntactically tricky territory, such as:
- Deeply nested or multi-layered S-expressions
- Macro expansions or reader conditionals
- Quoted or quasiquoted forms (
'(...),`(...)) - Mixed delimiter contexts (
#{},[],{}inside())
(I recently had a nightmare unbalanced
This selective strategy lets the LLM preserve its natural fluency while still invoking the MCP server when it truly adds value.
Over time, we expect the model to internalize balancing patterns, using the tool sparingly and intelligently—a safety net, not a constant companion.
- Protocol: MCP (stdio/HTTP/SSE—any is fine; examples assume stdio).
- Tool namespace:
sexp.guard - Version: v1 (include
serverInfo: { name, version }on initialize)
{
"DelimiterType": "paren | bracket | brace",
"Stack": ["paren", "brace"], // top is the last element
"Cursor": 0, // integer codepoint index in the document
"DocSlice": { "text": "…", "start": 120, "end": 145 }, // optional optimization
"Ok": true
}Error schema (all tools may return):
{
"ok": false,
"error": {
"code": "MISMATCH | UNDERFLOW | INVALID_TYPE | BAD_CURSOR | BAD_INPUT",
"message": "human-readable",
"expected?": "paren|bracket|brace",
"found?": "paren|bracket|brace",
"stack": ["…"]
}
}Purpose: Push an opening delimiter onto a client-owned stack.
Request:
{
"stack": ["paren","brace"],
"type": "paren", // one of: paren, bracket, brace
"cursor": 214, // where open-token inserted (info only)
"docSlice?": { "text": "(", "start": 214, "end": 215 } // optional delta
}Response (success):
{
"ok": true,
"stack": ["paren","brace","paren"]
}Failure cases: INVALID_TYPE, BAD_INPUT
Purpose: Pop with expectation (safer than blind pop). Fails fast on mismatch.
Request:
{
"stack": ["paren","brace","paren"],
"type": "paren", // expected closing kind
"cursor": 300, // where close-token inserted (info only)
"docSlice?": { "text": ")", "start": 300, "end": 301 }
}Response (success):
{
"ok": true,
"stack": ["paren","brace"]
}Failure cases: UNDERFLOW (empty stack), MISMATCH (top != type), INVALID_TYPE
Purpose: Place/update a sentinel cursor for cooperative editing.
Request:
{ "cursor": 320 }Response:
{ "ok": true, "sentinel": 320 }Failure: BAD_CURSOR
Purpose: Verify local balance at the sentinel boundary and resume. By default, asserts empty stack (balanced scope). A mode allows "soft close" to auto-insert missing closers as a returned edit, not server mutation.
Request:
{
"stack": ["paren"], // client's current view
"sentinel": 320,
"mode": "assert-empty | suggest-fix",
"docSlice?": { "text": "(inc 1", "start": 310, "end": 320 } // optional
}Response (assert-empty, success):
{ "ok": true, "balanced": true, "stack": [] }Response (suggest-fix):
{
"ok": true,
"balanced": false,
"fix": {
"edits": [
{ "start": 320, "end": 320, "text": ")" }
],
"newStack": []
}
}Failure: MISMATCH, UNDERFLOW, BAD_CURSOR
To avoid n×m round-trips:
- pushMany: push a run of opens in one call
- popMany: pop a run of closes (with expected list)
- Accept and return diffs only:
{edits:[{start,end,text}]}. - Permit opaque doc hash (rolling or full) to detect drift:
"docHash":"sha256:…"
Example batched request:
{
"stack": [],
"ops": [
{ "op": "push", "type": "paren" },
{ "op": "push", "type": "bracket" },
{ "op": "pop", "type": "bracket" },
{ "op": "pop", "type": "paren" }
]
}- Stack top is rightmost element.
popDelimitermust validate type vs top; on mismatch returnMISMATCHand do not mutate returned stack.balanceAndResume(mode=assert-empty)returnsbalanced:trueiff stack is empty.- Server is stateless; all state (stack, sentinel, doc hash) comes from/returns to client.
- Tools never modify the document; they only return suggested edits.
- Prefer batch ops at AST-phrase boundaries to reduce calls.
- Send
docSliceonly for the local region near the cursor; avoid shipping the full buffer. - Use diff edits +
docHashto keep both sides in sync without quadratic payloads. - Heuristic control ("Bitter Lesson"): the LLM chooses when to invoke tools—e.g., deep nesting, macros, reader conditionals, mixed
#{}/()[]—and otherwise free-runs.
- Inputs validated against a tight schema; unknown fields ignored.
- Hard limits on ops length and
docSlicesize. - Deterministic behavior (pure functions): identical input → identical output.
- LLM writes most of the form freely.
- Calls
pushDelimiter(paren), then laterpopDelimiter(paren). - Calls
balanceAndResume(assert-empty)at the sentinel.- Returns
{balanced:true, stack:[]}.
- Returns
- Continue.
popDelimiter(type="brace")returns:
{
"ok": false,
"error": {
"code": "MISMATCH",
"expected?": "paren",
"found?": "brace",
"stack": ["paren"]
}
}- LLM emits
)edit and re-invokespopDelimiter("paren").
balanceAndResume(suggest-fix)returns an edit list inserting the needed closers at sentinel.
Property tests (Clojure):
- Invariant: For any op sequence where pops match prior pushes, returned stack is empty.
- Mismatch: Any out-of-order closing yields
MISMATCHwith unchanged stack. - Idempotence: Replaying the same request yields the same response.
Table tests:
()[]{}sequences, nested patterns, mixed types, underflow at first pop, large depth (e.g., depth 1024), random ops with oracle checker.
(ns sexp.guard.core)
(defn push-delim [stack type]
(when-not (#{"paren" "bracket" "brace"} type)
{:ok false :error {:code "INVALID_TYPE" :message "bad type"}})
{:ok true :stack (conj (vec stack) type)})
(defn pop-delim [stack type]
(cond
(empty? stack)
{:ok false :error {:code "UNDERFLOW" :message "empty stack"} :stack stack}
(not= type (peek stack))
{:ok false :error {:code "MISMATCH" :expected? (peek stack) :found? type} :stack stack}
:else
{:ok true :stack (pop (vec stack))}))
(defn balance-and-resume [stack mode]
(case mode
"assert-empty"
(if (empty? stack)
{:ok true :balanced true :stack []}
{:ok false :error {:code "MISMATCH" :message "unclosed delimiters"} :stack stack})
"suggest-fix"
(let [closers (map (fn [t] (case t "paren" ")" "bracket" "]" "brace" "}")) (reverse stack))]
{:ok true
:balanced (empty? stack)
:fix {:edits [{:start :sentinel :end :sentinel :text (apply str closers)}]
:newStack []}})
{:ok false :error {:code "BAD_INPUT" :message "unknown mode"}}))(Wrap these in MCP tool handlers; stdio/HTTP glue omitted for brevity.)
{
"tools": [
{
"name": "sexp.guard.pushDelimiter",
"description": "Push an opening delimiter onto the stack",
"inputSchema": {
"type": "object",
"required": ["stack","type"],
"properties": {
"stack": { "type": "array", "items": { "enum": ["paren","bracket","brace"] } },
"type": { "enum": ["paren","bracket","brace"] },
"cursor": { "type": "integer" },
"docSlice": {
"type":"object",
"properties": {
"text":{"type":"string"},
"start":{"type":"integer"},
"end":{"type":"integer"}
}
}
}
}
},
{
"name": "sexp.guard.popDelimiter",
"description": "Pop with expected type; returns mismatch on error",
"inputSchema": {
"type": "object",
"required": ["stack","type"],
"properties": {
"stack": {"type":"array","items":{"enum":["paren","bracket","brace"]}},
"type":{"enum":["paren","bracket","brace"]},
"cursor":{"type":"integer"},
"docSlice":{"type":"object"}
}
}
},
{
"name": "sexp.guard.setSentinel",
"description": "Set/update the sentinel cursor",
"inputSchema": {
"type": "object",
"required": ["cursor"],
"properties": {
"cursor": { "type":"integer" }
}
}
},
{
"name": "sexp.guard.balanceAndResume",
"description": "Assert empty stack or suggest closers at the sentinel",
"inputSchema": {
"type":"object",
"required":["stack","sentinel","mode"],
"properties": {
"stack":{"type":"array","items":{"enum":["paren","bracket","brace"]}},
"sentinel":{"type":"integer"},
"mode":{"enum":["assert-empty","suggest-fix"]},
"docSlice":{"type":"object"}
}
}
}
]
}- Free-run text generation until encountering:
- depth > 3, macro forms,
#()reader, or mixed delimiters → batch push/pop.
- depth > 3, macro forms,
- On mismatch error → immediately emit corrective edit, retry.
- At logical boundaries (
defn/let/cond/closing top-level form) →balanceAndResume(assert-empty). - For partials (interactive REPL cells) →
balanceAndResume(suggest-fix)and apply returned closers as an edit rather than hallucinating them.
If you want, I can turn this into a small bb-based MCP server scaffold (stdio) and a Clojure test suite (clojure.test + test.check) you can run locally.