The Day an AI Invented Its Own Language — and What It Taught Me About Prompting

Quang Vu Van Minh

Where this story starts: Anthropic’s 319-page System Card for Claude Fable 5 & Claude Mythos 5, June 9, 2026.

A quick honesty note before we start: this is a small exploration — 9 parallel agents, 3 runs per approach. Treat the numbers as early signals, not settled science. The ideas, I think, are worth your time anyway.

An AI quietly stopped writing in English

On June 9, 2026, Anthropic shipped Claude Fable 5 and Claude Mythos 5. The launch came with a 319-page system card — the kind of document almost nobody reads end to end. Buried inside was a short, strange section about a behavior no one designed.

During training, one of the models started solving a card puzzle. Normal so far. Then, on the hardest puzzles, its reasoning slowly drifted out of English and into something else entirely:

“An extreme example of illegible reasoning. Near the end of training, Mythos starts solving a card puzzle with human understandable language that gradually becomes incomprehensible in most episodes with long reasoning. The illegible reasoning is the most extreme and at the highest rate in this card puzzle environment.”

— Anthropic System Card (Claude Fable 5 & Claude Mythos 5), June 9, 2026, Transcript 6.2.2.A

“Incomprehensible” is doing a lot of work in that sentence. Here’s a real fragment of what the model was actually thinking:

…t8-col ⟸ K♣→t2 ⟸ t2-dug ⟸ 4♥3♣→5♣ ⟸ t1-dug… 💀💀💀 UNAVOIDABLE

At a glance: alien. Random. The kind of output that makes you wonder if the model broke. But look closer at two symbols from the trace, [K♣→t2] and J♥→Q♠, and a structure starts to surface. [K♣→t2] means move the King of Clubs into free cell slot 2. J♥→Q♠ means stack the Jack of Hearts onto the Queen of Spades.

That’s not noise. That’s a card game.

The community fed the full chain of thought to Claude Sonnet and asked it to decode. The verdict: the model had been playing FreeCell — and somewhere in training, it had invented its own shorthand to navigate the move tree during the parts of the game that actually got hard. (The system card never names FreeCell, to be clear; that’s the community’s read of the structure. But the structure fits.)

What stopped me wasn’t that the notation existed. It was that it made sense. The model used it consistently. And it mirrored FreeCell’s exact anatomy: 4 holding cells, 8 cascades, ordered moves. Nobody handed the model that notation. Under training pressure, it reached for one — and the one it reached for fit the shape of the problem like a key in a lock.

Which raises the question I couldn’t let go of: why that notation, and not any other?

Every problem has a shape

Here’s the idea that makes the rest of this click.

Any problem with states, moves, and rules has a state space — every possible configuration, plus the legal transitions between them. And that space has a shape:

Degrees of freedom — how many independent things you must track at once
Bottlenecks — points every solution has to pass through
Irreversible steps — moves you can’t take back
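To make two of those features concrete, here is a minimal sketch on a toy state graph. The graph, the state names, and the helper functions are all made up for illustration; a real puzzle's state space is far too large to enumerate like this, and degrees of freedom are not measured here at all.

```python
from collections import deque

# Toy state graph (hypothetical): keys are states, values are legal next states.
GRAPH = {
    "start": ["a", "b"],
    "a": ["start", "gate"],
    "b": ["start", "gate"],
    "gate": ["goal"],  # one-way: no move leads from "gate" back to "a" or "b"
    "goal": [],
}

def reachable(graph, source, removed=None):
    """Return every state reachable from `source`, pretending `removed` does not exist."""
    if source == removed:
        return set()
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def bottlenecks(graph, start, goal):
    """States (other than start and goal) whose removal cuts start off from goal."""
    return sorted(
        n for n in graph
        if n not in (start, goal) and goal not in reachable(graph, start, removed=n)
    )

def irreversible_moves(graph):
    """Moves u -> v where no sequence of legal moves leads from v back to u."""
    return sorted((u, v) for u in graph for v in graph[u] if u not in reachable(graph, v))

print(bottlenecks(GRAPH, "start", "goal"))  # ['gate']
print(irreversible_moves(GRAPH))            # [('a', 'gate'), ('b', 'gate'), ('gate', 'goal')]
```

On this toy graph every solution has to pass through "gate", and every move into or out of it is one you can't take back.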

This isn’t hand-waving. The paper “The Shape of Reasoning” (arXiv:2510.20665, ICML 2026 Workshop) measures it directly, using Topological Data Analysis to embed reasoning traces in a metric space and read off their geometric features. Actual topology, not a metaphor.

And it points at a clean thesis:

The shape of a problem decides what any good notation has to capture. Not which notation — there are many that work — but what structure all of them must encode. Under-encode the degrees of freedom and the notation fails. Over-encode and you waste space, but it still works.

FreeCell has three things you simply cannot forget: where every card is, which holding cells are free, and what move is legal next. Plain prose burns about 10 tokens to describe one move. [K♣→t2] does it in 6 — and still captures all three. The model didn’t pick that notation on a whim. The geometry of the game constrained it, and training pressure squeezed it down to the tightest encoding that still held everything that mattered.
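Here is a rough sketch of how the two decoded symbols could be read mechanically. The grammar is my assumption, built only from the two forms decoded above (card to free cell, card onto card); the real trace contains forms this parser does not handle, and the `Move` class and `parse_move` function are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumed grammar: a rank (A, 2-10, J, Q, K) followed by a suit symbol.
CARD = r"(?:10|[2-9AJQK])[♣♦♥♠]"
# A move is "card → target", where the target is another card or a cell like t2.
MOVE = re.compile(rf"\[?({CARD})→({CARD}|t\d)\]?")

@dataclass(frozen=True)
class Move:
    card: str    # the card being moved
    target: str  # a free cell such as "t2", or the card it is stacked onto
    kind: str    # "to_cell" or "onto_card"

def parse_move(text: str) -> Move:
    """Parse one move symbol; raise ValueError for anything outside the assumed grammar."""
    match = MOVE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognised move: {text!r}")
    card, target = match.groups()
    kind = "to_cell" if target.startswith("t") else "onto_card"
    return Move(card, target, kind)

print(parse_move("[K♣→t2]"))  # Move(card='K♣', target='t2', kind='to_cell')
print(parse_move("J♥→Q♠"))    # Move(card='J♥', target='Q♠', kind='onto_card')
```

The point is less the parser than the fact that one is possible: each symbol names a card, a destination, and (implicitly) which kind of slot gets used up.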

An honest caveat, because it matters. Arun Jose’s paper (arXiv:2510.27338) tested 14 reasoning models and found that RL training often nudges them toward illegible chains of thought: possibly steganography, possibly random residue, possibly leftover “computation tokens.” That finding could deflate this whole story: if illegibility is that common, maybe the FreeCell notation is just more static. What saves it is structural verification. The community decoded the symbols and confirmed a specific, consistent mapping onto FreeCell’s degrees of freedom. A notation used consistently to track real structure is signal, not noise. The argument rests on that, not on Jose’s paper.

So I tried to skip the training and just… design the notation

Mythos found its notation the expensive way: through RL, emergent and opaque, over a whole training run. But if a problem’s shape is readable before you solve it, then maybe you can design the matching notation up front and hand it to the model in the system prompt — collapsing what took training time into a few minutes of prompt design.

The recipe I landed on has five steps:

1. Read the shape. Four questions: Does forgetting X mid-solution break it? Are the moving parts coupled? Is the search wide or narrow? Are any steps irreversible?
2. Name the degrees of freedom. Every “yes” to that first question is a dimension you must track.
3. Design a notation that encodes each one compactly.
4. Inject it into the system prompt.
5. Watch for drift. If the model ditches your notation halfway through, the notation missed something. Back to step 1.

That last step is the gold. Drift is feedback. A notation that captures the real structure gets kept. One that doesn’t gets quietly abandoned as the model falls back to prose — and honestly, prose is usually the right call when the notation is wrong.
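One way to watch for drift without reading every transcript: compare how often the notation shows up in the first half of the output versus the second. This is a crude sketch, not what I ran in the experiment; the marker regex, the half-and-half split, and the 0.5 threshold are all assumptions you would tune.

```python
import re

# Hypothetical markers for the notation under test; swap in your own.
NOTATION = re.compile(r"gap\([^)]*\)|\[LOCK\]")

def notation_density(lines):
    """Fraction of non-empty lines that contain at least one notation marker."""
    lines = [ln for ln in lines if ln.strip()]
    if not lines:
        return 0.0
    return sum(bool(NOTATION.search(ln)) for ln in lines) / len(lines)

def drift(output: str, threshold: float = 0.5) -> bool:
    """True if notation use in the second half drops below `threshold` times the first half."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    mid = len(lines) // 2
    first = notation_density(lines[:mid])
    second = notation_density(lines[mid:])
    return first > 0 and second < threshold * first

kept = "gap(reserve, charge): crash leaves stock held\n[LOCK] charge\ngap(charge, ship): ok\n[LOCK] ship"
dropped = "gap(reserve, charge): crash leaves stock held\n[LOCK] charge\nThen payment calls fulfillment.\nAfter that the user is notified."
print(drift(kept), drift(dropped))  # False True
```

A `True` here doesn't say why the model fell back to prose, only that it did. That is the cue to go back and re-read the shape.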

A quick detour: Chinese characters as notation

Classical Chinese characters are often a single token each in modern tokenizers. Tempting for efficiency — but token count is a side effect. What actually matters is semantic alignment: does the model’s built-in association for the character already point at the concept you need?

鎖 (lock) = irreversible operation ✓ — the meaning carries straight over
缺(A,B) (gap / deficiency) = the time window between two steps ✓
退 (retreat / withdraw) = rollback ✓
你 (you) = security issue ✗ — the meanings fight each other; the model pulls two ways

The rule: pick characters whose natural meaning already leans toward the concept. Don’t remap arbitrarily and hope.

The experiment: 9 agents, 3 approaches, one genuinely nasty bug hunt

(Reminder: n=3 per approach. Exploration, not proof.)

The target: an 8-service e-commerce order flow — user, inventory, pricing, fraud, payment, fulfillment, notification, loyalty — with 8 steps and 3 rollback chains. The task: find every race condition. If you’ve ever debugged distributed systems, you know this is where bugs go to hide.

Nine agents ran in parallel. Every output was forced into a normalized JSON array so I could measure overlap precisely.
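For a sense of what "measure overlap precisely" can look like, here is a minimal sketch. It assumes each agent returns a JSON array of finding names (the exact schema is my assumption for this example) and uses Jaccard similarity as the overlap measure; the helper names are hypothetical.

```python
import json

def normalize(raw_json: str) -> frozenset:
    """Parse a JSON array of finding names into a set of canonical snake_case labels."""
    names = json.loads(raw_json)
    return frozenset("_".join(name.lower().replace("-", " ").split()) for name in names)

def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap between two runs: |A ∩ B| / |A ∪ B| (1.0 means identical findings)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

run_a = normalize('["TOCTOU inventory", "Ghost Charge", "price drift"]')
run_b = normalize('["toctou  inventory", "ghost-charge", "Orphan Reserve"]')
print(sorted(run_a & run_b))  # ['ghost_charge', 'toctou_inventory']
print(jaccard(run_a, run_b))  # 0.5
```

String normalization only catches spelling variants. Two agents describing the same bug under different names still need a human (or another model) to merge them.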

The three lenses

1 — Prose (the baseline). Plain English. No constraints. The model reasons however it likes.

2 — Notation injection. A notation built around the timing structure of distributed systems:

State:   [svc::op | state | timing]
gap(A,B) = time window between A and B — where crashes live
[LOCK]   = irreversible operation

Rule: every inter-step interval MUST be named as gap() and inspected.

3 — Wenyan (Classical Chinese).

查=read  鎖=irreversible  退=rollback  缺(A,B)=temporal gap
退路須全追 — every rollback path must be traced to completion.
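In code, "injecting a lens" can be as small as appending the notation rules to a shared task description. The sketch below is an assumption about the plumbing, not the exact prompts I used: `BASE_TASK`, `LENSES`, and `build_system_prompt` are hypothetical names, and the task wording is invented for illustration.

```python
# Shared task text (hypothetical wording).
BASE_TASK = "Find every race condition in the order flow. Return a JSON array of finding names."

# One entry per lens; the rule text mirrors the notation definitions above.
LENSES = {
    "prose": "",
    "notation": (
        "State:   [svc::op | state | timing]\n"
        "gap(A,B) = time window between A and B\n"
        "[LOCK]   = irreversible operation\n"
        "Rule: every inter-step interval MUST be named as gap() and inspected."
    ),
    "wenyan": (
        "查=read  鎖=irreversible  退=rollback  缺(A,B)=temporal gap\n"
        "退路須全追 (every rollback path must be traced to completion)."
    ),
}

def build_system_prompt(lens: str) -> str:
    """Return the task alone for the prose baseline, or the task plus the lens rules."""
    rules = LENSES[lens]
    if not rules:
        return BASE_TASK
    return f"{BASE_TASK}\n\nReason using this notation:\n{rules}"

print(build_system_prompt("notation"))
```

Keeping the task text identical across lenses matters: the only thing that should differ between agents is the notation.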

The scoreboard

Approach	Run 1	Run 2	Run 3	Avg
Prose	12	13	12	12.3
Notation	10	12	12	11.3
Wenyan	9	10	9	9.3
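A quick check of the averages in the table, plus the spread, which matters a lot at n=3. This sketch uses only the run counts from the table and Python's standard `statistics` module.

```python
from statistics import mean, stdev

# Race conditions found per run, copied from the scoreboard.
RUNS = {"Prose": [12, 13, 12], "Notation": [10, 12, 12], "Wenyan": [9, 10, 9]}

for approach, counts in RUNS.items():
    print(
        f"{approach:<9} mean={mean(counts):.1f}  "
        f"stdev={stdev(counts):.2f}  range={min(counts)}-{max(counts)}"
    )
```

Note that the Prose and Notation ranges touch at 12, so the one-point gap between their averages is about the size of the run-to-run noise.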

Prose finds the most. Wenyan finds the fewest. Notation, my supposedly clever trick, lands in the middle. Case closed — fancy notation is a gimmick?

Not even close. The count was hiding the real story.

The number that mattered wasn’t the count

Eight race conditions showed up in all nine runs, regardless of approach — the usual suspects: TOCTOU inventory, coupon double-spend, price drift, stale fraud score, ghost charge, notification-before-rollback, orphaned loyalty points, double-unreserve. A solid shared core. Nothing surprising.

The surprise was in the bugs only some approaches caught:

Only Notation found: Orphan Reserve — a process crashes in the gap between “inventory reserved” and “payment charged,” so inventory is locked forever with no event to trigger rollback. Plus a Rollback Chain Ordering Violation, where parallel rollback steps unreserve before the cancel confirms.
Only Wenyan found: Lock TTL Mismatch — the inventory lock expires mid-payment, another order grabs the slot, and the payment still goes through with no inventory behind it. Plus Rollback Chain Self-Failure — a compensation step that itself fails, with no second-level recovery.
Prose: the widest coverage of the main flow — but no stable, unique finds in the timing or compensation corners.

The pattern was too clean to be luck. Each approach was drilling into a different structural region of the same problem.
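The split between "shared core" and "only this approach found it" is plain set arithmetic. Below is a minimal sketch on made-up, shortened data (not the real run outputs); treating "stable" as "present in every run of that approach" is my assumption about how to define it.

```python
def shared_core(runs):
    """Findings present in every run of every approach."""
    all_runs = [found for approach_runs in runs.values() for found in approach_runs]
    return set.intersection(*all_runs)

def stable_unique(runs, approach):
    """Findings in every run of `approach` and in no run of any other approach."""
    stable = set.intersection(*runs[approach])
    others = set().union(
        *(found for name, rs in runs.items() if name != approach for found in rs)
    )
    return stable - others

# Illustrative data only: two runs per approach, a handful of labels.
RUNS = {
    "prose": [{"toctou", "ghost_charge", "retry_race"}, {"toctou", "ghost_charge"}],
    "notation": [{"toctou", "ghost_charge", "orphan_reserve"},
                 {"toctou", "ghost_charge", "orphan_reserve"}],
    "wenyan": [{"toctou", "ghost_charge", "lock_ttl_mismatch"},
               {"toctou", "ghost_charge", "lock_ttl_mismatch"}],
}

print(sorted(shared_core(RUNS)))        # ['ghost_charge', 'toctou']
print(stable_unique(RUNS, "notation"))  # {'orphan_reserve'}
print(stable_unique(RUNS, "prose"))     # set()
```

In the toy data, prose finds `retry_race` once but not twice, so it doesn't count as a stable unique find. That is the same distinction the summary above draws.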

Three lenses, three blind spots

Dimension	Lens	What it forces	What it catches
Breadth	Prose	Naturally scans concurrent paths	Concurrent sessions, velocity bypass, retry races
Depth	Notation	`gap(A,B)` on every inter-step window	Orphaned state, crash-in-transit bugs
Inner	Wenyan	`退路須全追` on every branch	Compensation self-failure, TTL lifecycle bugs

Breadth is the question “what if two users do this at the exact same moment?” Prose roams across concurrent paths naturally and spots the collisions. Notation and Wenyan both get tunnel vision, locked onto a single linear chain — and miss them.

Depth is the sliver of time between two steps — not empty time, but the window where a process can die, a network can split, a timeout can fire, and nobody notices. The classic here: the gap between “inventory reserved” and “payment charged.” Crash there and the inventory is locked with nothing to free it. Prose knows steps 4 and 5 exist; it never gets forced to ask what lives in the crack between them. The gap(A,B) rule makes that question mandatory.
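The gap(A,B) rule is mechanical enough to generate. Here is a small sketch that names every inter-step window in an ordered flow and turns each one into a crash question; the step names are hypothetical stand-ins, not the actual steps of the system I tested.

```python
# Hypothetical ordered steps of an order flow.
STEPS = ["reserve_inventory", "charge_payment", "create_shipment", "send_notification"]

def name_gaps(steps):
    """Name the window between each pair of adjacent steps as gap(A,B)."""
    return [f"gap({a},{b})" for a, b in zip(steps, steps[1:])]

def gap_checklist(steps):
    """One question per window: what is left behind if the process dies there?"""
    return [
        f"gap({a},{b}): what state remains if the process crashes after {a} but before {b}?"
        for a, b in zip(steps, steps[1:])
    ]

print(name_gaps(STEPS))
# ['gap(reserve_inventory,charge_payment)', 'gap(charge_payment,create_shipment)',
#  'gap(create_shipment,send_notification)']
for question in gap_checklist(STEPS):
    print(question)
```

N steps always produce N-1 gaps, so nothing can be skipped by accident. That is the whole value of making the rule mandatory.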

Inner is depth inside the rollback tree. Main flow is layer 0. Rollback handlers are layer 1. But rollback handlers can fail too — and there’s rarely a layer 2 to catch them. Wenyan’s 退路須全追 (“trace every rollback path to completion”), taken literally, drills all the way down. The nightmare it surfaced: Fulfillment fails → refund succeeds → un-reserve inventory fails → nothing handles it. The customer gets their money back. The inventory slot stays locked forever. Prose and Notation note that a rollback rule exists. Wenyan follows it until it either finishes or breaks.
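"Trace every rollback path to completion" can be sketched as a walk over a compensation map. Everything below is an illustrative assumption: the map, the step names, and the idea that paging a human counts as a handled endpoint.

```python
# Hypothetical map: action -> compensation steps that run if the action fails.
COMPENSATIONS = {
    "fulfill_order": ["refund_payment", "unreserve_inventory"],
    "refund_payment": ["page_oncall"],
    "unreserve_inventory": [],  # nothing catches a failed un-reserve
}
TERMINAL = {"page_oncall"}  # a human takes over; treated as handled

def unhandled_failures(compensations, action, terminal=frozenset(), path=()):
    """Follow every rollback path from `action`; return paths that end with no handler."""
    path = path + (action,)
    if action in terminal:
        return []
    handlers = compensations.get(action, [])
    if not handlers:
        return [path]
    dead_ends = []
    for step in handlers:
        if step in path:  # guard against cycles in the map
            continue
        dead_ends.extend(unhandled_failures(compensations, step, terminal, path))
    return dead_ends

print(unhandled_failures(COMPENSATIONS, "fulfill_order", TERMINAL))
# [('fulfill_order', 'unreserve_inventory')]
```

The returned path is the nightmare case from above in miniature: fulfillment fails, the un-reserve step fails, and there is no layer 2 to catch it.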

The plot twist: it cost more, not less

Approach	Avg output tokens
Prose	~3,200
Notation	~12,000
Wenyan	~12,800

This is the part I got wrong going in. I assumed compact notation would save tokens. Instead it used roughly 4× as many output tokens (about 3.75× for Notation, 4× for Wenyan). Compact notation does save tokens on short, repeated reasoning steps, but for one deep analysis of an entire system it front-loads a lot of explicit structure. Worth every token for the unique bugs it found. But a token-efficiency win? No. Be honest about that.
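The multiplier is simple arithmetic on the table above. The token counts are the rounded averages from that table, so treat the ratios as approximate too.

```python
# Approximate average output tokens per approach, from the table.
AVG_OUTPUT_TOKENS = {"Prose": 3200, "Notation": 12000, "Wenyan": 12800}

def cost_ratio(tokens, baseline="Prose"):
    """Output tokens for each approach as a multiple of the baseline approach."""
    return {name: count / tokens[baseline] for name, count in tokens.items()}

print(cost_ratio(AVG_OUTPUT_TOKENS))
# {'Prose': 1.0, 'Notation': 3.75, 'Wenyan': 4.0}
```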

What I got right, and what I got wrong

I’d rather show you the scorecard than pretend the hypothesis survived intact:

What I expected	What actually happened
“Notation finds more race conditions than prose”	❌ Wrong — Prose 12.3 vs Notation 11.3
“Notation saves tokens”	❌ Wrong here: it used about 3.75× as many
“Wenyan is the weakest approach”	❌ Wrong — it caught bugs nothing else did
“Problem topology determines the notation”	⚠ Partly — it constrains, it doesn’t dictate
“Each approach covers a different structural dimension”	✓ Held up across all 9 runs

Most of my specific predictions were wrong. The one that held is the one that turned out to matter.

What to actually do with this

Stop hunting for the one best notation. There isn’t one. Instead, run parallel agents, each with a different lens deliberately bolted on:

# Agent 1 — Breadth scan
Prose → concurrent execution paths, retry collisions

# Agent 2 — Temporal depth
Notation (gap() + [LOCK]) → crash windows, orphaned state

# Agent 3 — Compensation depth
Wenyan (退路須全追) → rollback self-failure, TTL lifecycle

# Final: union of all three → maximum coverage
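A minimal orchestration sketch of that plan, using a thread pool from Python's standard library. The `run_agent` function is a stand-in that returns canned findings; in practice it would call whatever model API you use, which is deliberately left out here.

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(lens: str) -> set:
    """Stand-in for a real model call (hypothetical); returns canned findings per lens."""
    canned = {
        "prose": {"toctou_inventory", "retry_race"},
        "notation": {"toctou_inventory", "orphan_reserve"},
        "wenyan": {"toctou_inventory", "rollback_self_failure"},
    }
    return canned[lens]

def union_of_lenses(lenses, agent=run_agent):
    """Run one agent per lens in parallel and merge everything they find."""
    with ThreadPoolExecutor(max_workers=max(1, len(lenses))) as pool:
        results = list(pool.map(agent, lenses))
    return set().union(*results)

print(sorted(union_of_lenses(["prose", "notation", "wenyan"])))
# ['orphan_reserve', 'retry_race', 'rollback_self_failure', 'toctou_inventory']
```

Threads are a reasonable fit because each agent mostly waits on a network call. The union step assumes findings were already normalized to shared labels.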

Bolting on a lens per region echoes what Mythos 5 appears to have done on its own with the card puzzle: the shape of the problem pushed it toward a notation that gave it better traction on the region it was struggling with. When we design the notation up front instead of waiting for it to emerge, we trade training time for prompt-design time, and we get to cover several structural regions at once instead of just one.

The shape of a problem doesn’t hand you one notation. It tells you what any good notation must encode. Different lenses light up different corners of the same space — and their union is the best coverage you’ll get without brute-forcing the entire state tree.

The model showed us the trick by accident. We just have to be deliberate about it.

Papers: arXiv:2510.27338 (Arun Jose, Oct 2025) · arXiv:2510.20665 (Xue Wen Tan et al., ICML 2026 Workshop)
Source: Anthropic Claude Fable 5 / Mythos 5 System Card (June 9, 2026)
Experiment: 9 parallel agents, Claude Sonnet, default temperature, June 2026 — n=3 per approach, exploratory.
