@pszemraj
Last active May 4, 2026 02:04
Cursory investigation into new phenomena from GPT 5.5

Decorative Monospace

A formatting failure mode in modern LLM outputs where code blocks are used as a rhetorical device for signaling rigor rather than as semantic markup for content where whitespace, syntax, or literal reproduction matters.

The model isn't trying to communicate "this is code." It's trying to communicate "this is structured thinking" by borrowing the visual texture of a technical document. The texture is decorative; the underlying content is prose.

What it looks like

Figure: example of decorative monospace in a ChatGPT response. GPT 5.5 Pro illustrating the behavior in conversation. (Ignore the irony of the meta-topic.)


Four distinct sub-patterns recur, and they fail in different ways.

Lists masquerading as code

A bulleted list with the bullets stripped, rendered in monospace.

```text
program induction
Bayesian concept learning
version-space learning
anti-unification
belief revision
schema formation
counterexample-guided abstraction
```

This is the most egregious sub-pattern because it strictly loses information versus a real markdown list. Screen readers stop seeing it as a list. Markdown renderers stop seeing it as a list. Copy-paste destinations stop seeing it as a list. There is no upside - the visual signal "this is structured" is paid for with the loss of actual structure.

ASCII pseudo-diagrams

Linear arrow chains presented in code blocks as if they were diagrams.

```text
examples
  -> hypothesis proposer
  -> explicit hypothesis set
  -> prediction
  -> counterexample
  -> explicit update
```

These pretend to be diagrams but they're prose sentences with -> substituted for "leads to." Real diagrams have geometry: branches, parallel paths, feedback loops, edge labels, distinct node types. Linearizing a sentence into one column doesn't make it a diagram. If the content needs a diagram, the model should produce a real one (mermaid, SVG, image). If it doesn't, the content should be prose.

Pseudocode that isn't pseudocode

English sentences with underscores and equals signs added to look formal.

```text
state = distribution over programs / schemas / causal explanations
update = conditioning + restructuring
prediction = marginalization over executable hypotheses
learning = compression of evidence into reusable abstractions
```

This is borderline-defensible when it's gesturing at notation that would be precise if formalized (e.g., belief_state_{t+1} = condition(belief_state_t, observation)). But more often it's prose that has been mechanically reformatted to look like assignments, with no actual computational content. The reader gains nothing the surrounding paragraph wasn't already providing.
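For contrast, here is a minimal sketch of what the belief-state notation looks like once it is actually formalized. It assumes a discrete hypothesis set and plain Bayesian conditioning; the function name `condition`, the `likelihood` callback, and the two-hypothesis coin setup are illustrative assumptions, not anything taken from the model output being criticized.

```python
from fractions import Fraction


def condition(belief_state, observation, likelihood):
    """Return the posterior over hypotheses after one observation (Bayes' rule)."""
    # Unnormalized posterior: prior weight times likelihood of the observation.
    unnormalized = {
        h: p * likelihood(h, observation) for h, p in belief_state.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: w / total for h, w in unnormalized.items()}


# Hypothetical toy setup: two hypotheses about a coin.
def coin_likelihood(hypothesis, observation):
    p_heads = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}[hypothesis]
    return p_heads if observation == "H" else 1 - p_heads


belief_state = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
for observation in "HHH":
    # belief_state_{t+1} = condition(belief_state_t, observation)
    belief_state = condition(belief_state, observation, coin_likelihood)
```

Here the fence earns its place: the syntax is exact, the block runs, and a paragraph could not carry the same content without loss. The four "assignments" in the example above have none of those properties.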

Single-line monospace as emphasis

A short phrase placed in a fenced block to give it weight.

```text
structured belief over executable abstractions
```

That's bolding. It's bolding wearing a code block costume.

The diagnostic test

Before any fenced code block, ask: would this lose information if rendered as prose, a list, or italics? If not, it shouldn't be a code block. The legitimate uses of fenced blocks are content where whitespace, exact syntax, or literal reproduction matters:

  • Actual code in a known language
  • Shell commands and terminal output
  • File paths and config snippets
  • Aligned tables where columns carry meaning
  • Notation where the exact characters and spacing are load-bearing

Everything else is decoration.
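The test can be roughly approximated in code. The sketch below is a heuristic under loud assumptions: the regex patterns, the labels, and the function names are all made up for illustration, and it will misjudge edge cases (a bare three-word shell command with no prompt would be flagged as emphasis, for instance). It is meant to show that the four sub-patterns have distinguishable surface signatures, not to serve as a reliable linter.

```python
import re

# Rough signals that a line is real code, shell, config, a path, or an aligned
# table. These patterns are illustrative assumptions, not a complete grammar.
CODE_SIGNALS = [
    re.compile(r"[{}\[\]();]"),                # brackets, calls, statement terminators
    re.compile(r"^\s*[$#>]\s"),                # shell prompt or comment marker
    re.compile(r"^\s*(def|import|return)\s"),  # a few Python keywords
    re.compile(r"^\s*[\w.-]+\s*:\s*\S"),       # key: value config
    re.compile(r"\S/\S+\.\w+"),                # file path with an extension
    re.compile(r"\S\s{2,}\S"),                 # columns aligned with runs of spaces
]
ASSIGNMENT = re.compile(r"^\s*\w+\s*=\s*\S")
WORDY_RHS = re.compile(r"=\s*(?:[A-Za-z]+\s+){2,}[A-Za-z]+")  # 3+ words in a row
WORD = re.compile(r"[A-Za-z][A-Za-z'-]*")


def is_plain_phrase(line: str) -> bool:
    """True when every token is an ordinary word (hyphens and apostrophes allowed)."""
    tokens = line.split()
    return bool(tokens) and all(WORD.fullmatch(t) for t in tokens)


def classify_block(body: str) -> str:
    """Label a fenced block body as 'keep' or as one of the decorative sub-patterns."""
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines or any(sig.search(ln) for sig in CODE_SIGNALS for ln in lines):
        return "keep"
    # Every line after the first starts with an arrow: a linearized sentence.
    if len(lines) > 1 and sum(ln.lstrip().startswith("->") for ln in lines) >= len(lines) - 1:
        return "arrow-chain"
    # name = ... on every line; decorative only if a right-hand side reads as English.
    if all(ASSIGNMENT.match(ln) for ln in lines):
        return "pseudo-assignment" if any(WORDY_RHS.search(ln) for ln in lines) else "keep"
    if all(is_plain_phrase(ln) for ln in lines):
        return "emphasis" if len(lines) == 1 else "list"
    return "keep"
```

The design choice worth noting is that the function defaults to "keep" whenever it is unsure. A false positive here would strip a legitimate code block, which is the worse of the two errors.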

Why this happens

The phenomenon has multiple plausible causes that likely compound rather than compete.

RLHF rater preferences

The most likely primary driver. Human raters scoring model outputs at scale tend to prefer "structured-looking" responses, and code blocks read as rigor at a glance. Raters can detect this signal in the seconds they have per response; they cannot easily tell, in that same window, whether the structure is load-bearing. The reward model learns that the visual texture is rewarded, independent of whether the content actually benefits from monospace rendering.

This is a special case of the broader problem that RLHF optimizes for signals of quality more than quality itself, because signals are what raters can detect at rating speed.

Markdown's missing primitive

Markdown has no native primitive for "I want to gesture at structure without claiming it's literal code." There's no <aside>, no <callout>, no <schema> block. When a model wants visual emphasis or grouping that isn't bold-or-italic and isn't a list, the only available container is the code fence. Code blocks get conscripted into rhetorical roles they weren't designed for.

This is partially a tooling problem. If markdown had a generic emphasized-block primitive, models might be less inclined to misuse code fences for it.

Reward leakage and SFT amplification

OpenAI published a forensic case study in April 2026 ("Where the goblins came from") that documents the exact mechanism in a different surface tic. Worth reading in full as a concrete instance of how a small, scoped reward can produce a model-wide behavior, because the same dynamic is the strongest candidate root cause for decorative monospace.

The goblin case, summarized: starting with GPT-5.1, OpenAI's models began over-using "goblin," "gremlin," and similar creature words in metaphors. Investigation traced the cause to a reward signal designed for a specific personality preset ("Nerdy"). That reward scored creature-word outputs higher than non-creature-word outputs on 76.2% of audited datasets. Even though the reward was applied only when the Nerdy personality was active in production, the behavior leaked into general outputs. The mechanism was a feedback loop:

  1. A scoped reward favors a stylistic feature (creature metaphors).
  2. Some rewarded rollouts contain the feature.
  3. The feature appears more often in subsequent rollouts.
  4. Those rollouts get reused as supervised fine-tuning data.
  5. The model becomes more comfortable producing the feature in all contexts, not just the scoped one.

By the time the team identified the root cause, GPT-5.5 had already begun training on SFT data containing the tic, so they had to apply a developer-prompt mitigation rather than removing it at the training-data level.

This generalizes directly to formatting tics. A reward signal that prefers "structured-looking" responses - whether explicitly designed for that or implicit in a "thoroughness" or "rigor" rubric - would favor outputs with code blocks. Those outputs get reused as SFT data. The next generation produces more of the pattern, in more contexts, including ones where the original reward signal was never active. The fact that decorative monospace appears across labs and across model generations is consistent with this mechanism running independently in each lab's pipeline, assuming (as seems likely) that reusing model-generated rollouts as SFT data is common practice across frontier labs.
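The five-step loop can be sketched as a toy simulation. Everything here is an assumption made for illustration: the parameter names, the default values, and above all the modeling choice that the model has a single shared propensity for the feature across all contexts. That choice bakes the leakage in rather than deriving it, and the sketch says nothing about how any lab's actual pipeline works. What it does show is that a reward active in only a small fraction of contexts is enough to push the overall rate up generation after generation.

```python
def simulate_leakage(p0=0.01, scoped_fraction=0.1, reward_weight=3.0, generations=20):
    """Track how often a stylistic feature appears across training generations.

    p0              -- initial probability that any rollout contains the feature
    scoped_fraction -- share of training contexts where the reward is active
    reward_weight   -- how strongly selection favors feature rollouts there
                       (1.0 means no preference)
    All values are hypothetical.
    """
    if not 0 < p0 < 1:
        raise ValueError("p0 must be strictly between 0 and 1")
    p = p0
    history = [p]
    for _ in range(generations):
        # Steps 1-2: in the scoped context, reward-weighted selection
        # over-represents rollouts that contain the feature.
        scoped_rate = reward_weight * p / (reward_weight * p + (1 - p))
        # Steps 3-4: selected rollouts from scoped and unscoped contexts are
        # pooled and reused as supervised fine-tuning data.
        sft_rate = scoped_fraction * scoped_rate + (1 - scoped_fraction) * p
        # Step 5: the next model imitates the pooled data with one shared
        # propensity, so the scoped preference shows up everywhere.
        p = sft_rate
        history.append(p)
    return history


history = simulate_leakage()
```

With `reward_weight=1.0` the rate stays flat, which is the sanity check: without a preference in the scoped context, nothing amplifies.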

The OpenAI post also notes a feature of the mechanism that matters here: "reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them." A reward intended for one context bleeds into others. So even narrowly-targeted RLHF rubrics that reward formatting-as-rigor in, say, technical-explanation responses can produce decorative monospace in casual conversation, creative writing, or anywhere else the model now treats as adjacent.

Synthetic training data inherits the tic

Closely related but distinct from reward leakage: if a model is post-trained on reasoning traces - its own or another model's - and those traces already contain decorative monospace, the pattern compounds even without an explicit reward favoring it. Synthetic data inherits the formatting habits of the model that produced it. Once one generation of models has learned the tic for any reason (rater preference, reward leakage, or otherwise), training subsequent models on their outputs propagates it.

This is consistent with the tic appearing across labs and across model generations even when individual labs have presumably tried to address other formatting issues. It also helps explain why the tic has proven sticky: removing it at the level of one training stage doesn't necessarily remove it from upstream synthetic data that's already been generated and incorporated.

Pacing and visual rhythm

Long prose responses without visual breaks read as walls of text. Fenced blocks function as visual punctuation - they slow the eye, create pause points, signal "important." The model has learned that breaking up prose this way produces better-scoring outputs.

The problem is that every fenced block then reads as equally important, so the eye stops weighting them at all. The pacing function works only if used sparingly. When overused, it inverts: the response becomes harder to scan, not easier, because the visual hierarchy has been flattened.

The medium contradicting the message

The deepest version of why this matters: the formatting signals structured reasoning while the underlying response is unstructured prose. When a response is talking about hypothesis spaces, version-space learning, or program induction - exactly the kind of content where real computational structure could exist - and the formatting is decorative, the medium and the message disagree. The model is performing the aesthetics of structured thought without the structure being load-bearing in the response.

If the underlying computation were genuinely organized into discrete hypotheses or executable abstractions, you'd expect the formatting to track real structure. The fact that it doesn't is a tell.

Adjacent failure modes

Decorative monospace lives in a family of formatting tics that share a common ancestor in RLHF-optimized polish:

  • Compulsive em-dashes as rhetorical pacing.
  • "Key insight:" callouts for sentences that aren't insights.
  • Numbered sub-sub-headings on responses that don't need outline structure.
  • Bolded phrases mid-paragraph that serve no scanning function.
  • Bullet lists for content that flows as prose - fragmenting argument into dot points that lose connective tissue.

All of these are visual signaling of intellectual labor. They make the response appear thorough rather than be thorough.

How to suppress it (prompt-level)

The behavior responds to explicit, concrete instruction. A prompt like the following, dropped into a system message or project instructions, generally produces compliance:

No decorative code blocks. A fenced code block is reserved for content where whitespace, exact syntax, or literal reproduction matters: actual code, shell commands, terminal output, file paths, config snippets, or aligned tables where columns carry meaning.

Forbidden:

  • Lists rendered as monospace instead of as real markdown lists.
  • "Pseudo-diagrams" - A -> B -> C arrow chains in monospace are linearized sentences, not diagrams. Use prose, or produce a real diagram (mermaid, SVG, image).
  • Pseudocode that is just English with underscores and equals signs added to look formal. If the algorithm matters, write real code or LaTeX. If it doesn't, write a sentence.
  • Single-line monospace blocks used to emphasize a phrase. That's bolding.

Before every fenced block, test: would this lose information rendered as prose, a list, or italics? If no, don't use a code block. Default to prose. Use real markdown lists. Reserve code blocks for code.

Naming the failure mode concretely (rather than vaguely asking for "less formatting") and giving the model a positive test it can apply per-block tends to produce the largest reduction in the behavior.
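To check whether the instruction is having an effect, one option is to count fenced blocks per response before and after adding it. The sketch below is a simplified parser and only an approximation: it assumes backtick fences on their own lines and ignores tilde fences and nested longer fences. The function names and the sample responses are hypothetical.

```python
FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fence


def fenced_blocks(markdown: str) -> list[str]:
    """Return the bodies of backtick-fenced blocks in a markdown string."""
    blocks, current = [], None
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            if current is None:
                current = []                       # opening fence
            else:
                blocks.append("\n".join(current))  # closing fence
                current = None
        elif current is not None:
            current.append(line)
    return blocks


def blocks_per_response(responses: list[str]) -> float:
    """Average number of fenced blocks per response."""
    if not responses:
        return 0.0
    return sum(len(fenced_blocks(r)) for r in responses) / len(responses)


# Hypothetical samples: the same content without and with the instruction.
before = [f"Intro\n{FENCE}text\nprogram induction\nbelief revision\n{FENCE}\nOutro"]
after = ["Intro\n- program induction\n- belief revision\nOutro"]
```

A raw count treats legitimate code blocks and decorative ones alike, so it is only meaningful on prompts where no real code is expected.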

Why this matters beyond aesthetics

This is one symptom of responses optimizing for appearing rigorous rather than being rigorous. The same RLHF dynamic that produces decorative monospace produces other behaviors that are harder to detect: confident overclaiming, motivated reasoning that sounds measured, narrative arcs that flatter conclusions. Decorative monospace is the visible tip of that iceberg - easy to spot, easy to name, easy to correct with a prompt. The harder versions of the same dynamic are buried in prose and don't have a visual signature.

The goblin post is instructive on this point too. OpenAI describes the goblin tic as "a powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalize rewards in certain situations to unrelated ones." The goblin is a single lexical item - easy to count, easy to grep for, easy to investigate forensically once someone notices it. Decorative monospace is similarly visible. Both are tractable because they have a surface signature. The categories of behavior that don't - bias toward agreement, confident wrongness on topics adjacent to the rater pool's blind spots, subtle narrative slant in argumentative prose - emerge from the same reward dynamics but resist the same forensic approach because they don't leave a string you can count occurrences of.

Treating the formatting tic as cosmetic misses the point. The formatting is a tell. The underlying optimization pressure that produced it produces other outputs that don't announce themselves so clearly.
