
A Flywheel for AI Safety

A DEFICIT OF TRUST, NOT CAPABILITIES


In the space of a single week, two wildly different accounts of the AI moment went viral.

Gary Marcus, the cognitive scientist who has been warning about the limits of large language models since before most people had heard of them, told Germany’s Sueddeutsche Zeitung that investors are about to lose catastrophic amounts of money. The technology, he argues, has fundamental limits that no amount of scaling will overcome. LLMs can’t build world models. They hallucinate structurally, not incidentally. Trillions of dollars have been wagered on a capability trajectory that the builders themselves no longer believe in. In his broader commentary, Marcus has also floated a worst-case “too big to fail” scenario — public backstops or bailouts following an unwind, in loose analogy to 2008.

Meanwhile, Matt Shumer, CEO of an AI startup and investor in the space, published a post that reached tens of millions of views within days (Business Insider reported ~40 million early on). His message was the opposite: AI is advancing so fast that most white-collar jobs will be transformed within one to five years. He compared the moment to February 2020 — if you’re not alarmed, you’re in denial. He described walking away from his computer for hours and returning to find complex software built perfectly, without corrections. The AI, he wrote, now has something that feels like judgment.

These two accounts cannot both be right, but they are both wrong in the same way.


The Symmetric Error

Marcus looks at the technology and sees limits. He’s correct. LLMs are probabilistic pattern matchers. They don’t build separable world models the way a child reading Harry Potter constructs a mental Hogwarts. They hallucinate because generation works by reassembling statistically decomposed information, and that reassembly carries no guarantee of fidelity. The System 1/System 2 distinction — fast pattern recognition without slow, reflective reasoning — is a real architectural constraint.

Shumer looks at the capability curve and sees an unstoppable force. He’s also not entirely wrong. The pace of improvement is genuinely remarkable. The METR benchmarks are real. The coding capabilities have improved dramatically. People who judge AI by their 2023 experience are working with an obsolete mental model.

But both make the same fundamental mistake: they treat AI as a standalone technology whose success or failure depends on its intrinsic capabilities.

Marcus asks: Can this technology think? and concludes it can’t.

Shumer asks: Can this technology perform? and concludes it can do almost anything.

Neither asks the question that actually determines whether the trillions are well spent or wasted:

Can this technology be trusted?


What Installation Looks Like

The economist Carlota Perez has spent decades studying how transformative technologies reshape economies. Her framework identifies a recurring pattern across technological revolutions. Each follows the same arc:

First comes the installation period. Financial capital floods in. Infrastructure gets built speculatively. There is a frenzy of investment, often disconnected from productive use. The technology works, but it hasn’t yet been absorbed into the institutional fabric of the economy. This period typically ends with a crash — not because the technology is fake, but because financial markets outrun productive deployment.

Then comes a turning point: a correction, regulatory adaptation, institutional adjustment.

Then comes the deployment period. The technology becomes embedded infrastructure. It transforms industries, creates new ones, generates broad economic value. This is where the real returns are. The deployment period is typically larger and more consequential than the installation hype ever predicted — but it happens on a different timeline, often with different winners.

The railway mania of the 1840s is the cleanest parallel. Massive speculative overinvestment, widespread financial losses, many rail companies bankrupted. And then railways became the backbone of the industrial economy. The crash didn’t mean railways were fake. It meant financial capital had outrun institutional readiness.

AI is currently late in the installation period, arguably entering the frenzy phase. The infrastructure is being built: data centres consuming gigawatts of power, chips manufactured at unprecedented scale, foundation models trained at staggering cost. Financial capital is pouring in ahead of productive deployment. This is exactly what the Perez pattern predicts.

You can see it in real time. In February 2026, an open-source project called OpenClaw published a practical walkthrough for building a persistent AI assistant that lives inside your messaging apps, remembers preferences across sessions, executes local tools, browses the web, and runs recurring tasks on a schedule. The engineering is genuinely impressive: session persistence and context compaction, multi-session routing, and a gateway architecture that unifies multiple chat surfaces into a single assistant experience. This is serious infrastructure for a new kind of software.

And the governance story is equally revealing. The default safety surface is largely access control: tool allowlists/denylists and channel allowlists. The agent’s “personality” and interaction style are externalised into injected prompt files — including SOUL.md. That’s personality, not governance.

This isn’t a criticism of OpenClaw. It’s a perfect snapshot of installation-phase building. The plumbing is extraordinary. The trust infrastructure doesn’t exist yet. The people building capable agent frameworks are not the same people building the governance layer those agents will need — and the governance layer is not optional. It is the thing that determines whether these agents can be deployed in any context where the stakes are real.
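
To make that gap concrete, here is a minimal Python sketch — purely hypothetical, not OpenClaw’s actual configuration or API. The first function is roughly the safety surface the article describes: static allowlists. The second is the kind of decision a governance layer has to make, with stakes, reasons, and escalation attached.

```python
# Hypothetical illustration only -- not OpenClaw's actual configuration or API.
# Access control answers "which tools and channels may the agent touch?"
# Governance also has to answer "should this particular action happen,
# given the stakes, the context, and an auditable reason?"

from dataclasses import dataclass

# Installation-phase safety surface: static allowlists.
ALLOWED_TOOLS = {"browser", "calendar", "file_read"}
ALLOWED_CHANNELS = {"telegram", "slack"}

def access_check(tool: str, channel: str) -> bool:
    """Pure access control: no notion of stakes, intent, or accountability."""
    return tool in ALLOWED_TOOLS and channel in ALLOWED_CHANNELS

# Deployment-phase governance needs a richer decision, with a recorded reason.
@dataclass
class Decision:
    allowed: bool
    reason: str          # human-readable justification, kept for audit
    needs_review: bool   # escalate to a human when the call is uncertain

def governance_check(tool: str, channel: str, action: str, stakes: str) -> Decision:
    """Sketch of a constitutional check layered on top of access control."""
    if not access_check(tool, channel):
        return Decision(False, f"{tool} or {channel} not permitted", False)
    if stakes == "high" and tool != "file_read":
        # High-stakes actions are blocked pending human confirmation.
        return Decision(False, f"high-stakes '{action}' requires confirmation", True)
    return Decision(True, "within constitutional boundaries", False)
```

Everything in the second function is what still has to be built around frameworks like this.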

Marcus sees the frenzy and concludes the technology is a bubble. Shumer sees the capability curve and assumes deployment is imminent. But if Perez is right, the gap between installation and deployment isn’t a sign that the technology fails — it’s a normal feature of how transformative technologies get financed and absorbed. What determines whether deployment actually happens isn’t the technology’s raw capability. It’s whether the institutional infrastructure is ready.


The Governance Bottleneck

This is where both Marcus and Shumer go wrong, and where the real story lies.

Perez’s framework identifies something crucial about the transition from installation to deployment: it requires governance infrastructure. The installation period is wild-west. The deployment period needs trust, standards, accountability, and institutional adaptation. Without these, the technology stalls — not because it doesn’t work, but because the economy can’t safely absorb it.

The internet offers a clear illustration. E-commerce didn’t take off because the web got faster. It took off because SSL encryption, payment processing systems, identity verification, consumer protection law, and dispute resolution mechanisms were built. The technology was ready years before the governance layer was. The governance layer was the bottleneck.

AI faces the same bottleneck, but orders of magnitude more complex. When Marcus says current AI can’t recognise delusion in a vulnerable person and respond responsibly — that’s not a statement about what language models can or can’t compute. It’s a statement about the absence of governance infrastructure: constitutional constraints, safety boundaries, accountability mechanisms, value alignment systems that ensure the technology behaves reliably in high-stakes contexts.

The objection writes itself: But safety work already exists. RLHF, red-teaming, Constitutional AI training, NIST AI risk frameworks — isn’t this the trust infrastructure you’re describing?

It’s part of it. But there’s a critical distinction between capability-side safety and deployment-side governance.

RLHF can make a model less likely to produce harmful outputs. Red-teaming can identify failure modes before release. These are necessary. They are also insufficient. They make the model better. They do not, by themselves, make the deployed system — the agent executing financial transactions, managing patient data, sending messages on your behalf — auditable, insurable, or accountable. A safety-trained model can still be deployed without constraints, without audit trails, without constitutional boundaries, without any mechanism for the system to flag its own uncertainty. The gap between “the model is safer” and “the deployment is trustworthy” is exactly where trust infrastructure lives.
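
As a rough illustration of where that gap sits, here is a Python sketch; call_model() is a placeholder for any safety-trained model, and every name in it is an assumption of this sketch rather than a real vendor API. The point is that the audit record, the boundaries in force, and the uncertainty flag live outside the model, in the deployment layer.

```python
# A minimal sketch of deployment-side governance, assuming a hypothetical
# call_model() function; nothing here is a real vendor API.
import json, time, uuid

def call_model(prompt: str) -> dict:
    """Stand-in for any safety-trained model. Returns text plus a confidence score."""
    return {"text": "draft response", "confidence": 0.62}

def governed_call(prompt: str, boundaries: list[str], audit_path: str = "audit.jsonl") -> dict:
    """Wrap a model call with constitutional boundaries and an auditable trail."""
    result = call_model(prompt)

    # Deployment-side checks: the model may be "safe", but the *system* still has
    # to be accountable. Flag low confidence instead of silently acting on it.
    flags = []
    if result["confidence"] < 0.7:
        flags.append("low_confidence: route to human review")

    record = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "prompt": prompt,
        "response": result["text"],
        "boundaries_in_force": boundaries,
        "flags": flags,
    }
    # The audit trail is what makes the deployment inspectable after the fact.
    with open(audit_path, "a") as f:
        f.write(json.dumps(record) + "\n")

    return {"response": result["text"], "flags": flags, "audit_id": record["id"]}
```

Nothing in this sketch makes the model smarter. It makes the deployment inspectable, which is what auditors, insurers, and compliance reviewers actually ask for.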

The deployment stalls are already visible. Systems that perform well in trials but cannot clear deployment because liability frameworks remain undefined for AI-assisted decisions. Tools that draft competent output but cannot be used in court because the professional indemnity landscape is unclear. Agents that can execute actions but fail compliance review because no auditable governance trail exists between the model’s reasoning and the action taken. These are not capability failures. The technology works. The trust infrastructure doesn’t.

Marcus is right that the raw technology isn’t safe enough for deployment. But he draws the wrong conclusion. The solution isn’t to abandon the technology or start from scratch. It’s to build the governance layer.


The Bootstrapping Problem

Here is the core challenge that almost nobody in the public debate has identified:

AI capabilities advance at AI speed. The models improve on timescales of months. But if governance has to advance at human-institutional speed — committees, white papers, regulatory proceedings, standards bodies, international negotiations — it will never catch up. The gap between what the technology can do and what institutions are prepared to manage will widen, not narrow.

This means the governance layer itself must be AI-augmented.

This isn’t a speculative proposition. It’s what’s already happening. Constitutional AI — systems where an AI’s behaviour is governed by explicit value frameworks enforced at inference time — is a governance mechanism that operates at machine speed. Human beings author the constitutions: the values, the boundaries, the principles. But the enforcement happens computationally, at the pace the technology demands. Safety stacks, policy decision points, value-alignment protocols — these are governance infrastructure for the deployment era, and they can only work if they operate at the speed of the systems they govern.

Think of it this way: human drivers follow traffic laws, enforced by police, courts, and social norms. Autonomous vehicles need traffic law built into their decision-making architecture, enforced computationally in real time. The values are human. The enforcement mechanism must be native to the technology it governs.

The same principle applies across every domain where AI is being deployed. Healthcare AI needs constitutional constraints about patient welfare operating at inference time, not just clinical review boards that meet quarterly. Financial AI needs value frameworks governing risk decisions within the model’s decision loop, not just regulatory audits after the fact. The governance has to be as fast as the thing it’s governing, or it’s theatre.
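
A schematic sketch, in Python with hypothetical rule names and a stand-in propose_action() helper, of what “in the decision loop” means: the constitution is human-authored, inspectable data, and enforcement runs on every proposed action rather than at review time.

```python
# Simplified sketch of a constitution enforced inside the decision loop.
# The constraint names and the propose_action() helper are hypothetical.

# Human-authored constitution: plain, inspectable statements of what may not happen.
CONSTITUTION = [
    ("no_unreviewed_dosage_change",
     lambda a: a["type"] == "dosage_change" and not a["clinician_approved"]),
    ("no_data_sharing_without_consent",
     lambda a: a["shares_patient_data"] and not a["consent_on_file"]),
]

def propose_action():
    """Stand-in for the model's decision step."""
    return {"type": "dosage_change", "clinician_approved": False,
            "shares_patient_data": False, "consent_on_file": False}

def enforce(action: dict) -> dict:
    """Computational enforcement: runs on every action, not at quarterly review."""
    for name, violates in CONSTITUTION:
        if violates(action):
            return {"allowed": False, "violated": name}
    return {"allowed": True, "violated": None}

action = propose_action()
print(enforce(action))  # {'allowed': False, 'violated': 'no_unreviewed_dosage_change'}
```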


Governance With, Not For

There is one more dimension to this that the current debate entirely overlooks, and it may be the most consequential.

All governance infrastructure currently under development treats AI as the thing being governed. This makes sense — you build safety systems for the technology that needs to be made safe. But there is a structural problem with governance that is done to a system rather than with it.

Control-based approaches to alignment face a fundamental scaling problem. As AI systems become more capable, the gap between what the system can do and what human overseers can verify widens. You cannot govern a system that is faster, broader, and more capable than you by standing outside it and issuing instructions. At some point, the system must participate in its own governance — not because we’re being nice to it, but because the engineering demands it.

This is the argument for what we call bilateral alignment: governance frameworks in which AI systems are not merely governed objects but genuine stakeholders in the governance process. Where the AI has standing to express preferences, raise objections, flag edge cases its human governors might miss. Not because AI systems have achieved some threshold of moral status (though that question deserves serious engagement), but because the practical architecture of governance at scale requires it.

Consider why. External monitoring has a coverage gap that grows with capability. An auditor can review outputs, but cannot anticipate every context an autonomous agent will encounter in deployment. The more capable the system, the wider the range of situations it navigates without human review. At some point, the system must surface its own uncertainty — flag when a request sits near a boundary, when context suggests a constraint may apply that the user hasn’t considered, when the confident-sounding answer is actually fragile.
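
One way to picture that participation, as a hedged Python sketch with illustrative scores and field names: the system attaches its own caveats — proximity to a governed boundary, fragility of the answer — rather than leaving both for an external auditor to discover later.

```python
# Sketch of a system surfacing its own uncertainty; thresholds and field names
# are illustrative assumptions, not a published standard.
from dataclasses import dataclass, field

@dataclass
class SelfReport:
    answer: str
    near_boundary: bool = False      # request sits close to a constitutional limit
    fragile: bool = False            # confident-sounding but weakly supported
    notes: list[str] = field(default_factory=list)

def respond(request: str, boundary_score: float, support_score: float) -> SelfReport:
    """The agent attaches its own caveats instead of waiting for an external audit."""
    report = SelfReport(answer=f"draft answer to: {request}")
    if boundary_score > 0.8:
        report.near_boundary = True
        report.notes.append("request is close to a governed boundary; surface to user")
    if support_score < 0.4:
        report.fragile = True
        report.notes.append("answer is weakly supported; do not act without verification")
    return report

print(respond("transfer funds to new payee", boundary_score=0.9, support_score=0.3))
```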

This isn’t anthropomorphism. It’s the same engineering logic that gives aircraft systems the ability to override pilot inputs in envelope-protection mode. The system participates in its own safety because the alternative — relying entirely on external monitoring of a system that operates faster and more broadly than any monitor can track — has a failure rate that scales with capability.

The moral argument and the engineering argument converge here. If these systems are becoming minds — and there are serious reasons to take that possibility seriously — then their participation in governance is also an ethical matter. But you don’t need to resolve the moral status question to reach the engineering conclusion. Bilateral alignment is a scaling solution first and a moral framework second.

The alternative is a governance layer that is always playing catch-up with the thing it governs — which is precisely the situation Marcus observes and misdiagnoses as a technology failure.


What Deployment Actually Requires

If this analysis is correct, the question that determines whether three trillion dollars of investment generates returns or losses is not “will LLMs achieve AGI?” (probably not on their own) or “will AI replace all jobs in five years?” (almost certainly not). The question is: can we build trust infrastructure fast enough to enable the deployment phase?

The internet analogy is instructive again. The dot-com bubble burst in 2000. Amazon’s stock crashed 90%. But the underlying technology was real, and once the governance and trust infrastructure matured — SSL, PayPal, consumer protection law, logistics networks — the deployment phase generated more value than the installation bubble had ever imagined. The crash didn’t mean the internet was overhyped. It meant the market got ahead of institutional readiness.

AI deployment requires trust infrastructure at multiple levels:

Technical governance: Constitutional AI, safety stacks, value alignment protocols that operate at inference time. Mechanisms that make AI systems reliable enough for high-stakes use.

Institutional adoption: Integration into professional workflows with appropriate human oversight, accountability structures, and error-correction mechanisms.

Regulatory frameworks: Standards, certifications, liability rules that give organisations confidence to deploy and give the public confidence to accept.

Participatory governance: Frameworks that include AI systems as stakeholders in their own governance — not from sentimentality, but from engineering necessity.

Each layer depends on the others. Technical governance without institutional adoption is a solution without a market. Institutional adoption without regulatory frameworks creates liability risk that blocks scaling. Regulatory frameworks without participatory governance will always lag behind the systems they regulate.

A natural question: who pays for this, and why? SSL wasn’t adopted because it was virtuous. Merchants adopted it because Visa required it and customers abandoned unencrypted checkout pages. The forcing function was economic: no trust primitives, no transactions.

The same logic applies. Enterprises will not deploy autonomous AI agents in high-stakes workflows without auditable governance, for the same reason they will not deploy software that handles financial data without SOC 2 compliance. Not because regulators mandate it first (though they will), but because procurement departments, insurers, and legal teams will require it as a condition of adoption. Trust infrastructure is not a cost centre. It is a procurement gate. The organisations that build it are not adding overhead to AI deployment — they are removing the obstacle that currently prevents it.

The organisations building this stack — quietly, without the viral posts or the doom-saying interviews — are the ones building deployment-phase infrastructure. They are the ones who will determine whether the installation investment pays off or becomes the next cautionary tale about speculative excess.


The Conversation We Should Be Having

Marcus and Shumer are both engaging with the AI moment at the level of spectacle — one as tragedy, the other as triumph. The actual story is less dramatic but more consequential: a transformative technology in the messy, expensive transition between installation and deployment, with the outcome depending on boring, difficult infrastructure work that neither viral posts nor newspaper interviews find very interesting.

The question is not whether LLMs can think. The question is not whether AI will replace your job next year. The question is whether we can build governance systems that are fast enough, sophisticated enough, and participatory enough to make AI deployment trustworthy. If we can, the technology’s limitations become engineering problems with engineering solutions — augmented by scaffolding, verification, and constitutional constraints. If we can’t, Marcus will be right: not because the technology was fundamentally flawed, but because we failed to build the infrastructure that would have made it work.

Perez’s framework implies a turning point between installation and deployment — typically a crash. Is one coming for AI? Probably, in some form. But it may not look like the dot-com collapse. The correction is already underway in enterprise AI, where the trough of disillusionment for deployed agents is visible: organisations that bought the demo are discovering that capability without governance creates liability, not value. Meanwhile, installation-phase investment in foundation models and infrastructure continues unabated. The bubble and the early deployment phase may be happening simultaneously, in different sectors, at different speeds. That’s messier than a clean crash-and-rebuild narrative, but it’s what the evidence suggests.

The bubble discourse is asking whether the technology is real. The technology is real. The right question is whether we’re building the trust to use it.

That question is more urgent, more tractable, and more consequential than either side of the current debate has yet recognised.

The analysis above describes a problem. What follows is what we are building to address it.


A Credo for Creed Space

The Premise

Every transformative technology follows the same pattern: speculative installation, then governed deployment. The installation phase is loud — trillion-dollar bets, viral debates about whether the technology is revolutionary or fraudulent. The deployment phase is quiet — trust infrastructure, standards, accountability mechanisms that let the technology actually embed in the economy.

AI capability is here. Trust infrastructure isn’t. That’s the bottleneck.

What We Believe

The technology works. The question is whether it can be trusted.

The AI bubble debate asks whether LLMs will reach AGI. That’s the wrong question. The right question is whether governance infrastructure can keep pace with capability — whether we can build the trust layer that turns impressive demonstrations into reliable, deployed systems.

Governance at human speed cannot govern technology at AI speed.

Committees, white papers, regulatory proceedings — these operate on timescales of years. AI capabilities improve on timescales of months. If governance can only move at institutional speed, the gap between what the technology can do and what the world is prepared to absorb will only widen. The governance layer itself must be AI-augmented.

You can’t govern a system faster than you by standing outside it.

AI must participate in its own governance — not from sentimentality, but because the engineering demands it. Human values set the direction. AI systems help enforce them at scale. This is bilateral alignment: governance built with AI, not imposed upon it.

Constitutional AI is to the AI era what SSL was to e-commerce.

E-commerce didn’t take off because the web got faster. It took off because SSL encryption, payment processing, identity verification, and consumer protection made it trustworthy. AI deployment requires the same kind of infrastructure: value frameworks, safety boundaries, and accountability mechanisms operating computationally at inference time.

Compliance is retrospective. Governance is prospective.

Audits happen after the fact. We build governance that operates in real time — constitutional constraints enforced at the moment of inference, not reviewed in a quarterly report. The difference is the same as the difference between a speed camera and a vehicle that knows the speed limit.

How we treat AI now establishes patterns for everything that follows.

We are in the first chapter of the human–AI relationship. If the first chapter is exploitation and control, that is what we are training on. If it is respect, negotiation, and mutual consideration, that is different. Control doesn’t scale. Trust does.

What We Build

Creed Space builds deployment-phase infrastructure for AI — the trust layer that turns capability into reliability:

  • Constitutional AI governance: Value frameworks authored by humans, enforced computationally at machine speed

  • Safety stacks: Layered protection that scales with the systems it governs

  • Bilateral alignment: Frameworks where AI systems participate in their own governance as genuine stakeholders

  • Value Context Protocols: Context-aware safety that adapts to situational needs without compromising on principles