The smallest building blocks that make low-energy, reliable world models practical. Six foundations — from representation to transport — designed from the beginning for sparsity, evidence, and energy discipline.
Most AI today is built on dense vectors — long lists of numbers where almost every position is non-zero. That forces machines to move and process almost everything on every step: heavy memory traffic, constant multiply-adds, and high energy demand.
Dense systems can work in a few large data centres. They are the wrong starting point for world models that must run continuously, everywhere, and for years. Sparse Supernova was born from a simple requirement: keep the meaning, lose the waste.
Dense: most values are active, so everything is moved and processed on every step.
Sparse: most values are inactive, by design. Only the bright points matter.
The energy cost of computing comes mainly from two things: moving data (reading and writing values in memory) and doing maths (multiply-add operations). In Sparse Supernova, the system skips the zeros and computes only on the small set of active non-zeros, cutting both memory traffic and arithmetic in proportion to how sparse the signal is.
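The idea can be sketched in a few lines. In this illustrative Python comparison (our own naming, not the Sparse Supernova API), a dense dot product touches every position, while a sparse one touches only stored non-zeros:

```python
def dense_dot(a, b):
    # Dense: touches every position, including all the zeros.
    return sum(x * y for x, y in zip(a, b))

def sparse_dot(a, b):
    # Sparse: a and b are dicts mapping active index -> value.
    # Iterate the smaller signature and probe the other; cost scales
    # with the number of non-zeros, not the nominal dimensionality.
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)
```

With a few dozen active entries in a million-dimensional space, `sparse_dot` does a few dozen multiply-adds where `dense_dot` would do a million.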
| # | Primitive | What It Does | Why It Matters | Energy Win |
|---|---|---|---|---|
| 1 | Sparse Signatures | Represent the world | State without dense embeddings | Store / move less |
| 2 | Elevation-Shaped Hashing | Decide what lights up | Selective sparsity with control | Keep density low |
| 3 | Sparse Distances & Anomalies | Measure change / novelty | Always-on monitoring | Compute on non-zeros |
| 4 | Conformal Sparse Detection (USAD) | Decide "is this unusual?" | Lightweight safety layer | Always-on, low cost |
| 5 | Universal Saturation Law (USL) | Know when to stop | Trust vs cost sizing | Avoid over-build |
| 6 | Smart-Atom Router | Move sparse safely | Distributed world models | Fewer bytes + governed |
A sparse signature is a compact "fingerprint" of a situation. Instead of filling a vector with thousands of non-zero numbers, we store only a small set of important features in a very large space. Most entries are exactly zero.
Think of it like a night sky: mostly dark, with a few bright points that matter.
A compact fingerprint that represents a situation using only a small set of important features in a very large space. Text, images, audio, and numerical streams can all be encoded into a common sparse space.
World models carry state continuously. Dense embeddings make that state expensive to store, expensive to move, and expensive to compare. Sparse signatures make long-running state realistic: most of the world stays "off" until it matters.
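As a concrete picture, a sparse signature can be held as a small map from position to strength inside a nominally huge space. The dimensionality, feature labels, and byte accounting below are illustrative assumptions, not the product's actual layout:

```python
DIM = 2**20  # nominal dimensionality; almost every slot stays zero

# A sparse signature: only the active positions are stored.
signature = {
    104_729: 0.91,   # e.g. "engine temperature rising" (hypothetical feature)
    524_287: 0.47,   # e.g. "vibration band 3 active"
    918_081: 0.22,   # e.g. "ambient noise shift"
}

dense_cost = DIM * 4                    # bytes if every float32 slot were stored
sparse_cost = len(signature) * (8 + 4)  # bytes for (uint64 index, float32 value) pairs
print(sparse_cost, "bytes sparse vs", dense_cost, "bytes dense")
```

The state stays "off" by default: storing, moving, and comparing the signature all scale with the handful of active entries, not with `DIM`.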
If sparse signatures are a map, elevation-shaped hashing is how we decide which points appear — and how strong they are. We don't place features randomly; we shape importance so the system stays sparse while still capturing what matters.
Elevation is strength/importance: term strength in text, edge intensity in images, event strength in signals. This primitive shapes which features "light up" and how brightly, keeping sparsity high without losing structure.
World models fail in two ways: they either waste energy by activating too much, or they lose meaning by throwing away the wrong signals. Elevation-shaped hashing is the control that keeps sparsity high without losing structure.
To understand change, you need to compare "now" to "before". Our sparse distance and anomaly measures do that using only the few active positions — not the whole vector.
A set of comparison tools that measure how different "now" is from "before" — using only the small number of active positions, not every single value. This enables continuous monitoring without overwhelming compute costs.
A world model is only useful if it can monitor change continuously. Dense comparisons are too expensive to run all the time. Sparse distances keep the monitoring always-on without dragging energy costs up with it.
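One way to make this concrete (a generic cosine-style distance on sparse maps, our sketch rather than the product's actual measure): only positions active in both signatures contribute to the comparison.

```python
import math

def sparse_distance(a, b):
    """Cosine-style distance over active positions only.

    a, b: dicts of index -> value (sparse signatures). The loop runs
    over shared non-zeros, so cost tracks sparsity, not dimensionality.
    Returns 0.0 for identical signatures, 1.0 for disjoint ones.
    """
    shared = a.keys() & b.keys()
    dot = sum(a[i] * b[i] for i in shared)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)
```

Run continuously, a check like this costs a handful of operations per tick, which is what makes always-on monitoring affordable.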
This primitive answers: "Is this unusual enough that I should care?" — and it does it with a clear statistical control layer rather than a vague score.
A lightweight guardrail that can run everywhere the world model runs. It provides a clear yes/no answer with statistical rigour — not a vague anomaly score — so the system knows when something genuinely unusual has occurred.
If a world model runs everywhere, it needs a guardrail that can run everywhere too. USAD is designed to be always-on without turning safety into a second heavy system.
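The statistical control layer can be illustrated with the generic split-conformal recipe (a textbook construction; the real USAD layer is more elaborate than this sketch): compare a new anomaly score against a held-out calibration set and turn the rank into a p-value with a guaranteed false-alarm rate.

```python
def conformal_pvalue(calibration_scores, score):
    """Fraction of calibration scores at least as extreme as the new
    score (counting the new one itself). Under exchangeability this
    p-value is valid without distributional assumptions."""
    n = len(calibration_scores)
    ge = sum(1 for s in calibration_scores if s >= score)
    return (ge + 1) / (n + 1)

def is_unusual(calibration_scores, score, alpha=0.05):
    # A clear yes/no: false alarms occur at rate at most alpha.
    return conformal_pvalue(calibration_scores, score) <= alpha
```

The point is the shape of the answer: a calibrated yes/no at a chosen error rate, not an uncalibrated anomaly score that someone has to threshold by eye.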
USL is our law of scaling and agreement. It describes how agreement improves as you add effective dimension (or independent checks), and it gives a practical outcome: when to stop scaling so you don't waste energy.
A design tool that tells you: beyond a certain point, extra scale buys very little and costs a lot. Size for the trust you need, then stop. This is how we avoid the "scale until budgets run out" trap.
Most systems scale until budgets run out. World models need a more responsible approach: size for the trust you need, then stop. USL is a design tool for doing that deliberately.
Sparse isn't only about how we represent the world. It's also about how we move information through a system. In real deployments, energy is burned in data movement: storage, bandwidth, and network hops. The Smart-Atom Router is our transport layer for sparse payloads — move only what matters, not everything.
World models are distributed. If you represent the world sparsely but move it densely, you lose the benefit. The Smart-Atom Router keeps sparsity end-to-end: representation, comparison, monitoring, and transport, so the energy savings carry through the entire system.
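To see why sparse transport matters, here is a minimal wire-format sketch. The format (a uint32 count, then uint32 index plus float32 value per active entry) is made up for illustration and is not the Smart-Atom Router's actual protocol; it only shows that the payload scales with non-zeros, not with dimensionality.

```python
import struct

def pack_sparse(sig):
    # Serialize only the active (index, value) pairs.
    payload = struct.pack("<I", len(sig))
    for idx, val in sorted(sig.items()):
        payload += struct.pack("<If", idx, val)
    return payload

def unpack_sparse(payload):
    # Rebuild the signature from the sparse payload.
    return {idx: val for idx, val in struct.iter_unpack("<If", payload[4:])}
```

A three-entry signature in a million-dimensional space ships in a few dozen bytes instead of megabytes, and the same saving repeats at every network hop.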
Say less. Do more. Prove it.