I recommend reading this previous post, too - Compression is not Cognition

The artificial intelligence community is at a curious inflection point. Large language models have achieved capabilities that seemed impossible just years ago, yet their limitations are increasingly apparent. They hallucinate with confidence, struggle with novel reasoning tasks, and collapse under their own outputs when trained recursively. The standard response has been to scale further: more parameters, more data, more compute. But what if the bottleneck isn't scale? What if we've been optimizing for the wrong objective?

The prevailing approach assumes that efficiently compressing human knowledge will naturally produce intelligence. Training models to predict the next token across vast text corpora is expected to teach both language and the reasoning behind it. This "compression hypothesis" has yielded impressive results: modern language models can write, code, translate, and appear to plan. However, this fluency masks a fundamental fragility that scaling alone cannot address.

Consider what happens when we train these models. Written language is the final output of human reasoning, not the process that generated it. When a mathematician writes a proof, the published version is clean and linear. It presents axioms, derives theorems, and arrives at conclusions. It does not show the dead ends explored, the false starts abandoned, or the moment of insight that restructured the approach. The text is optimized for efficient communication between humans who already understand mathematical reasoning, not for teaching a system how to reason from scratch. It is a compressed artifact, a residue of thought rather than thought itself.

This distinction matters profoundly. When models train on this residue, they learn correlations in expression without internalizing the constraints that make those expressions valid. They become exceptionally good at predicting what comes next in sequences that look like reasoning, but they don't necessarily learn what reasoning is. The result is a system that can generate text with the statistical texture of intelligence while lacking the underlying machinery that would make that intelligence robust. 

Evidence for this limitation appears in predictable patterns. Models excel within the distribution of their training data but fracture when pushed beyond it. They can follow familiar reasoning chains but struggle to construct novel ones. They exhibit what might be called "persona collapse" when subjected to sustained contradiction or recursive self-reference. Their simulated identities destabilize because those identities are probability distributions rather than stable constraint structures. Most tellingly, when models are trained on their own outputs, performance degrades. The system enters a closed loop, amplifying its own biases and reducing the diversity of its outputs. This is not the behavior of a system that understands; it is the behavior of a sophisticated compression algorithm converging on its own approximations. 

The field's response to these limitations has been to add more modalities: vision, audio, and eventually embodiment. The reasoning goes that language alone is insufficient, so we need to ground models in richer sensory experience. This intuition is partially correct but incomplete. Perception provides access to the state, not the structure. A camera can capture every pixel of a falling object, but observing falls does not teach you gravity. That requires intervention, action, and consequence. Without agency and irreversibility, even multimodal data collapses into another static dataset with more dimensions. 

One example stands apart, perhaps: World Labs. World Labs, a spatial-intelligence AI company co-founded by Fei-Fei Li, develops Large World Models (LWMs) designed to perceive, generate, reason about, and interact with coherent three-dimensional environments, rather than solely predicting text or two-dimensional images. Its flagship model, Marble, creates persistent and editable 3D worlds from text, images, or videos. This approach gives AI systems a richer, more structured representation of physical space and interaction, marking a significant shift from language-based models toward those that incorporate the spatial and physical constraints essential for reasoning about the real world. Such world models address critiques of next-token compression by supporting systems that learn underlying constraints through environmental interaction, offering infrastructure for constraint discovery and the simulation of cause and effect that goes beyond what traditional language models provide.

This is where the current wave of "world models" and "spatial intelligence" efforts becomes instructive. Companies are building increasingly sophisticated three-dimensional representations of environments, generating coherent spaces that maintain consistency and obey apparent physics. These are impressive technical achievements and likely necessary infrastructure for what comes next. But infrastructure alone does not create intelligence. A photorealistic simulation is still just a dataset if the learning system is a passive observer or if the environment allows infinite retries at no cost. The question is not whether the world looks real, but whether it imposes genuine constraint pressure on the learner.

What would genuine constraint pressure look like? It would mean environments where actions have irreversible consequences, where shortcuts are penalized rather than rewarded, and where naive heuristics systematically fail. It would mean problems that cannot be decomposed into independent subproblems, forcing the learner to maintain global coherence. Most critically, it would mean shifting the optimization objective from maximizing reward or minimizing prediction error to reducing uncertainty about the constraints that govern the environment. 
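To make this concrete, here is a minimal sketch in Python of a toy environment that bakes in two of these properties: an irreversible transition and a penalized shortcut. The class name OneWayDoorWorld and its specific rules are illustrative assumptions of mine, not an existing benchmark.

```python
class OneWayDoorWorld:
    """A short corridor with a one-way door and a penalized shortcut cell.

    Stepping past the door is irreversible, and the tempting shortcut cell
    costs more than it pays. Both rules are invented for illustration.
    """

    def __init__(self, length=10, door=4, shortcut=6, goal=9):
        self.length, self.door, self.shortcut, self.goal = length, door, shortcut, goal
        self.reset()

    def reset(self):
        self.pos = 0
        self.through_door = False
        return self.pos

    def step(self, action):
        """action is -1 (left) or +1 (right); returns (state, reward, done)."""
        nxt = min(max(self.pos + action, 0), self.length - 1)
        if self.through_door and nxt < self.door:
            nxt = self.pos                      # irreversibility: no re-crossing the door
        if nxt >= self.door:
            self.through_door = True
        self.pos = nxt
        reward = -1.0                           # time cost
        if self.pos == self.shortcut:
            reward -= 5.0                       # the naive shortcut is penalized
        done = self.pos == self.goal
        if done:
            reward += 10.0
        return self.pos, reward, done


env = OneWayDoorWorld()
state, total, done = env.reset(), 0.0, False
while not done:
    state, reward, done = env.step(+1)          # naive "always go right" policy
    total += reward
print("naive policy return:", total)            # the greedy strategy pays the shortcut penalty
```

Even in a toy this small, the naive heuristic is systematically worse than a policy built on a correct model of which cells are costly and which transitions cannot be undone.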

This reorientation deserves careful unpacking because it represents not just a technical adjustment but a different conception of what learning should accomplish. Standard reinforcement learning optimizes for policies: given this environment and this reward function, what should I do? The constraint model is learned only instrumentally, as it helps maximize reward. Model-based approaches learn dynamics models, but these too are typically optimized for prediction accuracy rather than structural correctness. The system might learn to predict that clouds move in certain patterns without ever learning why, capturing the correlation without the causation. 

Inverse constraint discovery inverts this priority. The primary objective becomes inferring the latent rules, dependencies, and invariants that generate observed outcomes. In a simple grid world, this means not just learning a policy that navigates successfully, but also explicitly representing which walls are solid, which transitions are deterministic, and which actions have delayed effects. The system is rewarded for the accuracy of its inferred constraint structure, not just for completing the task. Success means learning what must be true, not just what to do. 
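Here is a minimal sketch of that grid-world version, with invented names (GridWorld, probe_constraints, constraint_score): the learner actively probes transitions and is then scored on the precision and recall of its inferred wall map against the environment's ground truth, not on reaching any goal.

```python
import random

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

class GridWorld:
    def __init__(self, size=5, walls=None, seed=0):
        rng = random.Random(seed)
        self.size = size
        # Ground-truth constraints: a set of blocked (cell, move) pairs.
        all_edges = [((x, y), m) for x in range(size) for y in range(size) for m in MOVES]
        self.walls = walls if walls is not None else set(rng.sample(all_edges, 20))

    def try_move(self, cell, move):
        """Attempt a move; return (resulting cell, whether the move succeeded)."""
        if (cell, move) in self.walls:
            return cell, False
        dx, dy = MOVES[move]
        nx, ny = cell[0] + dx, cell[1] + dy
        if not (0 <= nx < self.size and 0 <= ny < self.size):
            return cell, False
        return (nx, ny), True

def probe_constraints(env, trials=300, seed=1):
    """Probe the environment and record which (cell, move) pairs were blocked."""
    rng = random.Random(seed)
    inferred = set()
    for _ in range(trials):
        cell = (rng.randrange(env.size), rng.randrange(env.size))
        move = rng.choice(list(MOVES))
        _, moved = env.try_move(cell, move)
        if not moved:
            inferred.add((cell, move))
    return inferred

def constraint_score(inferred, env):
    """Score the inferred structure against ground truth (walls plus grid edges)."""
    truth = set(env.walls)
    for x in range(env.size):
        for y in range(env.size):
            for m, (dx, dy) in MOVES.items():
                if not (0 <= x + dx < env.size and 0 <= y + dy < env.size):
                    truth.add(((x, y), m))
    tp = len(inferred & truth)
    precision = tp / len(inferred) if inferred else 0.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall

env = GridWorld()
inferred = probe_constraints(env)
print("precision, recall of inferred constraints:", constraint_score(inferred, env))
```

Nothing in constraint_score mentions a reward function or a goal; the only thing being graded is how faithfully the learner's map of blocked transitions matches the structure that actually governs the world.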

The difference manifests most clearly in transfer. A policy learned for one reward function often fails catastrophically when rewards change. But constraints—the actual physics or logic of the environment—transfer across tasks. If you have correctly inferred that walls are solid and gravity pulls downward, that knowledge applies whether your goal is to reach the exit or collect treasure. This is why human intelligence transfers so well. We do not just learn behaviors; we build models of how the world works and apply them flexibly as our goals change. 
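A small standalone sketch of that transfer claim, under the same kind of illustrative assumptions: once the blocked transitions have been inferred, a single goal-independent planner serves any objective.

```python
from collections import deque

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
SIZE = 5
# Suppose these blocked (cell, move) pairs were inferred, as in the earlier sketch.
INFERRED_WALLS = {((1, 1), "right"), ((2, 1), "left"), ((3, 3), "up"), ((3, 2), "down")}

def neighbours(cell):
    """Successor cells allowed by the inferred constraint model."""
    for move, (dx, dy) in MOVES.items():
        if (cell, move) in INFERRED_WALLS:
            continue
        nx, ny = cell[0] + dx, cell[1] + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny)

def plan(start, goal):
    """Breadth-first search over the inferred constraint model."""
    frontier, parents = deque([start]), {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        for nxt in neighbours(cell):
            if nxt not in parents:
                parents[nxt] = cell
                frontier.append(nxt)
    return None

# The constraint model is goal-independent: swap the goal, keep the model.
print(plan((0, 0), (4, 4)))   # "reach the exit"
print(plan((0, 0), (2, 3)))   # "collect the treasure"
```

The knowledge that transfers is the wall map; the plans are cheap to recompute once the constraints are right.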

This framework helps explain why certain benchmarks are difficult even for sophisticated models. The Abstraction and Reasoning Corpus, for instance, presents visual puzzles that require inferring transformation rules from examples. These puzzles are designed to resist memorization and pattern matching, forcing genuine constraint discovery. Current models struggle not because the tasks require complex computation, but because they require something the models weren't trained to do: infer and represent explicit structural rules rather than implicit statistical associations.

How might we train systems differently? One approach involves co-evolutionary pressure. Imagine two systems: one generates environments or problems, the other attempts to solve them. The generator is rewarded not for creating difficult problems but for creating problems solvable only through correct constraint inference. This penalizes shortcuts and memorization. Over time, the generator would evolve a curriculum of increasing constraint complexity, and the solver would be forced to develop genuine inference machinery instead of pattern-matching heuristics.
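The following is a deliberately simplified sketch of that loop; the hidden linear rule, the two solvers, and the reward test are all illustrative assumptions. The generator earns credit only when a rule-inferring solver answers a held-out query that a memorizing solver gets wrong, and it ratchets difficulty upward in response.

```python
import random

rng = random.Random(0)

def generate_problem(difficulty):
    """Hidden rule: y = a*x + b. Emit a few (x, y) examples and a held-out query."""
    a, b = rng.randint(1, difficulty + 1), rng.randint(0, difficulty)
    examples = [(x, a * x + b) for x in range(3)]
    query = rng.randint(5, 5 + difficulty)
    return examples, query, a * query + b

def inferring_solver(examples, query):
    """Infers the latent rule (slope and intercept) from examples, then applies it."""
    (x0, y0), (x1, y1) = examples[0], examples[1]
    a = (y1 - y0) // (x1 - x0)
    b = y0 - a * x0
    return a * query + b

def memorizing_solver(examples, query):
    """Answers from the nearest memorized example; no rule is inferred."""
    nearest = min(examples, key=lambda ex: abs(ex[0] - query))
    return nearest[1]

difficulty = 1
for step in range(200):
    examples, query, answer = generate_problem(difficulty)
    inferred_ok = inferring_solver(examples, query) == answer
    memorized_ok = memorizing_solver(examples, query) == answer
    # Generator reward: the problem separates rule inference from memorization.
    if inferred_ok and not memorized_ok:
        difficulty += 1   # curriculum pressure: ratchet up constraint complexity
print("final curriculum difficulty:", difficulty)
```

In this toy the memorizer always fails on the held-out query, so difficulty climbs steadily; in a real setup the generator would have to search for problems that separate the two, which is exactly the pressure that penalizes shortcuts.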

The environments themselves need not be realistic. In fact, excessive realism may be counterproductive if it introduces irrelevant complexity. What matters is constraint structure: irreversibility, causal depth, and non-decomposability. These properties can be instantiated in abstract spaces like grid worlds or formal systems. The hypothesis, falsifiable and testable, is that training on such synthetic constraint spaces develops domain-general reasoning machinery, a kind of grammar of inference that can later be grounded in embodied experience to provide semantic content and value.

This view reconciles with embodied cognition research rather than contradicting it. Human intelligence depends on sensorimotor grounding and social interaction. But it may be possible to factor the problem: synthetic environments train the inference engine, the ability to discover and represent constraints, while embodied interaction trains what to apply that engine to, what matters, what has value. The abstract reasoning measured by IQ tests suggests that some component of intelligence generalizes across domains, independent of specific content knowledge.

The research program this implies would look markedly different from current practice. Less emphasis on dataset scale, more on interaction depth. Less focus on compressing existing knowledge, more on discovering new structure. Less passive observation, more active hypothesis testing. The objective functions would reward epistemic accuracy, how well your model captures the true constraints, rather than task performance alone. The architectures might integrate program synthesis, treating discovered constraints as executable code rather than implicit parameters.
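One way to picture the "constraints as executable code" idea, offered purely as an assumed sketch rather than a proposed architecture: candidate invariants are small executable predicates over observed transitions, retained only if they survive held-out interventions, and the model is scored on that retained set rather than on task reward. The toy dynamics and all names below are invented for illustration.

```python
import random

rng = random.Random(0)

def simulate(state, action):
    """Toy dynamics: velocity changes by the action, position never drops below a floor at 0."""
    pos, vel = state
    vel = vel + action
    pos = max(0, pos + vel)
    return (pos, vel)

def rollout(n):
    """Collect (state, action, next_state) transitions under random interventions."""
    data, state = [], (5, 0)
    for _ in range(n):
        action = rng.choice([-1, 0, 1])
        nxt = simulate(state, action)
        data.append((state, action, nxt))
        state = nxt
    return data

# Candidate constraints expressed as executable predicates over a transition.
CANDIDATES = {
    "position never negative": lambda s, a, n: n[0] >= 0,
    "velocity changes by at most 1": lambda s, a, n: abs(n[1] - s[1]) <= 1,
    "position never decreases": lambda s, a, n: n[0] >= s[0],   # false invariant
    "velocity is always zero": lambda s, a, n: n[1] == 0,       # false invariant
}

train, held_out = rollout(500), rollout(500)
# Keep only the predicates consistent with the training interactions.
kept = {name: pred for name, pred in CANDIDATES.items()
        if all(pred(s, a, n) for s, a, n in train)}
# Epistemic score: how well the retained constraint set holds under new interventions.
score = sum(all(pred(s, a, n) for s, a, n in held_out) for pred in kept.values())
print("retained constraints:", sorted(kept))
print("held-out score:", score, "of", len(kept))
```

The discovered constraints here are literally code that can be executed, inspected, and falsified, which is the property that makes them candidates for transfer rather than another set of opaque parameters.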

This raises immediate practical questions. How do we formalize "constraint complexity"? What makes one environment more reasoning-inducing than another? When does correlation actually reflect causation, and when does it mislead? These questions lack complete answers today, which is why they represent research opportunities rather than solved problems. Understanding why certain constraint structures force reasoning is as fundamental as understanding computational complexity: why some problems require planning while others reduce to lookup.

The paradigm also makes concrete, falsifiable predictions. Systems trained with constraint discovery as the primary objective should exhibit stronger transfer when goals change, greater robustness under distribution shift, and reduced degradation when trained on their own outputs compared to systems trained for imitation or reward maximization. Performance should scale with the diversity and depth of constraints encountered during training rather than with raw dataset size. If these predictions fail, the framework should be revised or abandoned.

What about current efforts in spatial intelligence and world modeling? These developments provide crucial infrastructure for representing and generating coherent three-dimensional spaces with consistent physics. But infrastructure is not destiny. The same technology that creates photorealistic environments for human game designers could also create reasoning gyms for AI agents, provided the training objective shifts from generation quality to constraint discovery. The question is whether these systems will be used as tools for human creativity or as training grounds for artificial agents learning to reason through interaction.

The path forward requires acknowledging that we may have been solving adjacent problems while missing the central one. Causal discovery in scientific contexts, program synthesis from specifications, active inference in robotics, and developmental learning in psychology all contribute to the puzzle. What is missing is a unified framework that places constraint discovery at the center of how we think about training general intelligence, not as a helpful side effect but as the primary objective from which other capabilities emerge.

This does not mean abandoning language models, world models, or embodied agents. It means recognizing their roles more clearly. Language provides symbols and abstractions, but not the reasoning machinery to manipulate them robustly. World models provide consistent representations, but not the pressure that forces genuine understanding. Embodiment provides grounding and value, but not necessarily the inference engine to discover abstract structure. Each component matters, but none is sufficient on its own.

The compression paradigm has taken us far, perhaps as far as it can. Models trained to predict the next token have learned to imitate human communication with remarkable fidelity. But imitation, no matter how sophisticated, eventually hits a ceiling. Discovery requires something different: environments that resist our priors, tasks that penalize shortcuts, and objectives that reward understanding structure rather than mimicking surface patterns. Intelligence does not emerge from compressing the past more efficiently. It emerges from discovering what must be true by acting under uncertainty, from building models that capture not just what happens but why it must happen.

Whether this reorientation succeeds remains an open question, one that can only be answered empirically. But the question is now well-formed enough to test. The field has the technical machinery, the computational resources, and increasingly the awareness that something fundamental is missing. What remains is the willingness to test a different hypothesis about what learning should optimize for, not how well we predict the shadows on the cave wall, but how accurately we infer the objects casting them.