The Flat Illusion
A tiny, wiggly explorable explanation about why mechanistic interpretability is moving from flat Euclidean maps to curved neural manifolds.
For a long time, researchers thought AI models organized concepts along straight lines. Want the opposite of a concept? Just walk its line in the other direction. We call this the Linear Hypothesis. Let's try it!
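Here is a minimal sketch of what "just draw a straight line" means, using made-up toy vectors (not real model embeddings): under the linear hypothesis, a concept is a direction, and its opposite is simply the negated direction.

```python
import numpy as np

# Hypothetical 2-D "embedding space" with toy concept vectors.
happy = np.array([0.9, 0.3])

def opposite(concept_vec):
    """The linear hypothesis' answer to 'what is the opposite?': flip the arrow."""
    return -concept_vec

sad = opposite(happy)

def cos(a, b):
    """Cosine similarity: +1 means same direction, -1 means opposite."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(happy, sad))  # -1.0: a perfect straight-line opposite
```

In flat space this works beautifully: the arithmetic is one subtraction, and "opposite" always means "cosine similarity of exactly -1".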
Turns out, AI doesn't store concepts in empty space. It groups them based on massive amounts of context, creating a dense, curved landscape. We call this a Neural Manifold. Let's see what happens to our straight line when the data gets wiggly.
If you want to understand what the AI actually "thinks", you can't be a bird flying over the landscape. You have to be a surfer riding the wave. You have to measure the distance along the curves. In math, this is called Geodesic Distance.
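The bird-vs-surfer difference can be made concrete with toy data. Below, points sampled along a half-circle stand in for a curved ribbon of activations: the bird's straight-line (Euclidean) distance between the two ends is 2, but the surfer's along-the-curve (geodesic) distance is π, summed as many tiny steps along the ribbon.

```python
import numpy as np

# Sample points along a curved 1-D manifold: the top half of the unit circle.
# (Toy data; stands in for a curved ribbon of activations.)
t = np.linspace(0.0, np.pi, 1000)
points = np.stack([np.cos(t), np.sin(t)], axis=1)

a, b = points[0], points[-1]       # the two endpoints: (1, 0) and (-1, 0)

# Bird's-eye (Euclidean) distance: cut straight through the ambient space.
euclidean = np.linalg.norm(a - b)  # 2.0

# Surfer's (geodesic) distance: add up tiny steps taken along the curve.
steps = np.diff(points, axis=0)
geodesic = np.linalg.norm(steps, axis=1).sum()  # ≈ π ≈ 3.1416

print(euclidean, geodesic)
```

On real high-dimensional activations you don't get the curve for free; geodesics are typically approximated by linking each point to its nearest neighbors and running shortest paths over that graph.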
Here is the jump. Evo 2 is a genome model trained on roughly 9 trillion DNA letters from across life. If you flatten its thoughts, some organisms look like nonsense neighbors. A fold in the map makes far-away species sit next to each other on the page.
But if you stay on the ribbon, the model's hidden geometry starts looking like evolution. Now the question changes from "is the model confused?" to "what shape did biology carve into the model?"
Now imagine safety. You find a region related to deception. Great. Turn it off, right?
Wait a minute. In a flat picture, deception may sit right beside creativity, planning, humor, or logic. A crude edit might remove the bad behavior and accidentally make the model duller, less useful, or just weird in a new way.
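A toy numpy sketch of why the crude edit backfires (hypothetical directions, not real model features): if "deception" and "creativity" are nearly parallel arrows, projecting the deception direction out of an activation also shears off most of its creativity.

```python
import numpy as np

# Hypothetical concept directions in a toy activation space.
# They are close neighbors: nearly parallel arrows.
deception = np.array([1.0, 0.1, 0.0])
creativity = np.array([0.9, 0.3, 0.1])
deception = deception / np.linalg.norm(deception)
creativity = creativity / np.linalg.norm(creativity)

def ablate(activation, direction):
    """Crude flat-space edit: project the 'bad' direction out of an activation."""
    return activation - (activation @ direction) * direction

h = 2.0 * creativity            # an activation that is purely "creative"
h_edited = ablate(h, deception)

before = h @ creativity         # 2.0
after = h_edited @ creativity   # far smaller: the edit took creativity with it
print(before, after)
```

The damage scales with how entangled the two directions are: the closer the neighbors, the more the "remove deception" edit removes everything else living nearby.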
A model used by a billion people has a weird job: be good for everyone at once. Your perfect sentence may be too sharp for one person, too slow for another, too strange for a third. So the default voice drifts toward the safe middle.
That is why AI writing can feel nice-but-dead. It is averaging many tastes into one polite fog. The same thing happens outside writing: science, safety, tutoring, and post-training all need the right mode, not the average mode.
This is the important part: AGI is not just more tokens, more math, a bigger box. The magic moments come when a system can enter the right mode on purpose.
That is what the manifold unlocks: not one giant personality, but a model we can navigate. Read the shape. Pick the neighborhood. Keep the useful parts alive.