Here's what week two taught me about trusting an autonomous system. Every agent demo ends before week two. Week two is where you find out what your system actually costs — and whether it's telling you the truth about it. I want to start with a two-dollar mistake, because it's the most honest thing I can say about running autonomous agents, and almost nobody writes the honest version. One run of my board of directors — a panel of agents that deliberates on a hard question and votes — cost about two dollars. Once. That's the whole incident. It didn't crash the system. It didn't leak data. It cost two dollars instead of the four cents it should have, and it taught me more about building agent systems than most of the architecture did.

Because here's what two dollars represents: a single deliberation, a handful of agents thinking for a few minutes on the good models, ran up a bill that — annualized across a system that's supposed to do this kind of thing constantly, unattended, while I sleep — is a quiet financial wound. Not a catastrophe. A drip. And a drip is exactly the failure mode an autonomous system is built to hide from you, because the whole point is that you're not watching. Which is the whole problem with how we talk about these systems.

[Day nine is where I live]

You've seen the demo a hundred times. An autonomous agent does something impressive — books the trip, fixes the bug, files the PR — the video cuts, and the lights come up. What you never see is the same agent running on your own hardware, unattended, on day nine. Day nine is where I live. My agent harness runs 24/7 on a Mac mini, reads my messages, executes scheduled work, and spends real money against real APIs while I sleep. The failure modes on day nine are nothing like the ones in the demo. They're quieter and meaner. The two-dollar mistake was one of them — and it was the cheap kind, the kind that's cheap enough to be a lesson instead of a disaster. This article is about the contracts that I built so that I could actually leave this thing running with confidence.

[The four ways trust dies quietly]

1. Silent degradation. This is the killer. A subsystem doesn't crash — it just stops being connected. Some setter never got called at boot, so from then on it politely no-ops, invisibly, forever. The system looks healthy. It's just not doing one of the things you think it's doing. You find out three weeks later when you notice the thing never happened.

2. Alert fatigue. An alarm that fires too often is an alarm you mute. Once you've muted it, it might as well not exist — and the one time it fires for a real reason, you're not looking. Most monitoring dies this way: not from missing alerts, but from too many.

3. Dishonest accounting. This is the two-dollar mistake's evil twin. A budget gate that can't tell "this call cost zero dollars" from "I have no idea what this call cost" will eventually make a very wrong decision, quietly, because it treated unknown as free. Except this time you never even see the two dollars.

4. Incomplete stops. You typed /halt. The surfaces that remembered to implement halting stopped. The one subprocess three layers down that spawned its own children didn't get the message, and it's still running, still spending.

None of these show up in a demo. All of them show up in week two. So I designed for them directly, and gave each one a name and a contract.

[The mechanisms]

A wiring manifest, so nothing degrades silently. My harness declares its cross-subsystem connections — 28 of them — explicitly. At every boot, each one reports as active, pending, or failed. And failed is kept distinct from pending: "tried to wire and crashed" is a different state from "deferred by plan," and conflating them is exactly how you miss a real failure. A subsystem that gets invoked before its wire fired raises an error instead of no opping. The boot log reads like a pre-flight checklist, because that's what it is. If something isn't connected, the system tells me at startup — loudly — instead of letting me discover it weeks later.

A skip taxonomy, so the alarm stays trustworthy. When one of my ~19 scheduled services decides not to do work, it has to say why in a fixed vocabulary: missing a secret, missing config, not due yet, a dependency's down, the operator disabled it, or genuinely nothing to do. Six categories, machine-readable. The crucial detail: a skip resets the failure counter. A service that correctly decides "nothing to do today" is healthy, not failing — so it never inflates the count that drives alerts. The result is an alarm that only rings when something is actually wrong, which is the only kind of alarm you don't mute.

Four-state cost accounting, so the budget is honest. Here's where the two dollars comes back, because this is the mechanism it built. Every model call resolves to one of four states: measured, estimated, unknown, or not_applicable. Measured-zero is a real, confirmed zero. Unknown means a parser failed — and unknown is never silently turned into $0. The system checks this contract at boot with synthetic events and refuses to start if it's broken.

The fix the two-dollar run actually triggered was a fifty-fold cost reduction: swap the expensive frontier models on the specialist seats for a slate of cheap-frontier models, a different one per role so the reasoning actually differs at the model layer instead of just the prompt. Two dollars became four cents. The board got slower — fifteen to twenty-five minutes where it used to take five to ten — and I raised the timeout and decided slower-and-affordable beats fast-and-bleeding. That tradeoff, made in a config file at 50× savings, is more representative of real agent work than anything in the demo reel. Around it sit the rest of the guardrails: per-call caps, per-issue caps, per-day budgets, a per-tick ceiling so a single scheduled wake-up can't spiral, and a kill-switch in the decision logic that escalates to a human the moment a run blows its budget. None of that makes the agents smarter. All of it is the difference between a system you can run for a year and a science project you shut off after a week because the bill scared you.

And now the part that should make you uncomfortable, because it's the part most people building this won't admit: a large part of my system cannot currently measure its own cost. The specialist agents — the bulk of the roster — run on a subscription-billed local model. Subscription billing is opaque: there's no per-call dollar figure coming back, because you're not paying per call, you're paying a flat fee. So when the system asks "what did that operation cost?", the honest answer for those agents is unknown. Not zero. Unknown. And the single most important rule in the entire cost system is that those two are never, ever allowed to be confused. So the system fails closed on unknown. It refuses to charge a number it can't trust — it tags the operation, logs it, and surfaces the gap rather than papering over it.

I'm running a financial system that is honest about being partially blind. That's not a flaw I'm hiding — it's the design. The alternative isn't "a system that knows all its costs"; that system doesn't exist yet for this setup, because the pricing model for that backend literally isn't wired. The alternative is a system that pretends to know, coerces every unknown to zero, shows you a clean dashboard, and lies. I'd rather run the one that says "I don't know what that cost, and I've stopped rather than guess."

One halt contract, so stop means stop. Every autonomous surface in the system — services, the experiment loop, the issue factory, job search — checks the same two-method halt contract before starting work and while doing it. /halt propagates everywhere, including in-flight subprocess cancellation: terminate, wait, kill, and then kill the whole process group so any MCP servers or tool children a subprocess spawned die with it. A halt that only works on the surfaces that opted in isn't a halt. It's a suggestion. I designed mine to not be a suggestion.

[The ladder: autonomy is earned, not granted]

The mechanisms above make a running system trustworthy. But there's a prior question: how does a new autonomous behavior earn the right to run at all?

[My answer is a ladder, and nothing skips a rung]

dry-run ledger → shadow mode → soak harness → feature flag → production.

A new capability first just writes down what it would do, executing nothing. Then it runs for real with its output diverted and compared, never applied. Then it soaks — my issue-resolver runs in shadow for fourteen days under observation before anything flips. Then it goes behind a feature flag I control, reversible in one command. Only then does it touch production — and even there it's still budget-capped, halt-checked, and cost-accounted.

You don't decide to trust an autonomous system. You give it graduated chances to demonstrate it, with instruments running the whole time. My self-improvement loop — the part of the system that proposes changes to itself — lives at the bottom of this ladder permanently: it runs proposal-only, under a $2/day budget (yes, that number), with a forbidden-files list, and it never executes a change without me approving it. The system can help build itself, exactly as far as it has earned and not one rung further.

[What I'd tell you to steal]

If you're building something that runs unattended, the demo is lying to you about what matters. The hard part isn't the impressive action — it's the boring contracts that make the system safe to not watch:

Make degradation loud. Declare your wiring and report it at boot.
Make correct inaction free. A no-op should never look like a failure.
Make unknowns visible. Never coerce "I don't know" into a convenient default.
Make stop total. One halt contract, honored everywhere, kills the whole tree.
Make autonomy earned. A ladder with instruments beats a leap of faith.

Everyone wants AI agents to be about intelligence. The day-to-day reality is that they're about trust under uncertainty — and a huge amount of that uncertainty is financial. You are spending real money, continuously, through a system you've designed specifically so you don't have to watch it. The engineering that matters most is the engineering that lets you not watch without going broke and without being lied to about it. Trust isn't a feeling you bring to a system. It's a property you build into it — twenty-eight wires reported at boot, six ways to say "I correctly did nothing," four states of knowing what something cost, and one word that stops everything. The two-dollar mistake was cheap enough to be a lesson instead of a disaster, and I've spent a lot of effort since making sure every future version of it stays cheap: capped, gated, escalated, and above all honestly accounted — even when honest means admitting I can't see the number at all.

Nobody puts "and here's how I make sure it doesn't quietly bankrupt me, while flying half-blind on cost" in the launch video. But that's the work. That's most of the work. A system that won't tell you what it doesn't know about its own spending is one I trust less than any dashboard that's never once said "unknown." That's what week two actually requires. The demo never has to find out. A system you leave running does.

Here's what week two taught me about trusting an autonomous system. Every agent demo ends before week two. Week two is where you find out what your system actually costs — and whether it's telling you the truth about it. I want to start with a two-dollar mistake, because it's the most honest thing I can say about running autonomous agents, and almost nobody writes the honest version. One run of my board of directors — a panel of agents that deliberates on a hard question and votes — cost about two dollars. Once. That's the whole incident. It didn't crash the system. It didn't leak data. It cost two dollars instead of the four cents it should have, and it taught me more about building agent systems than most of the architecture did.

Because here's what two dollars represents: a single deliberation, a handful of agents thinking for a few minutes on the good models, ran up a bill that — annualized across a system that's supposed to do this kind of thing constantly, unattended, while I sleep — is a quiet financial wound. Not a catastrophe. A drip. And a drip is exactly the failure mode an autonomous system is built to hide from you, because the whole point is that you're not watching. Which is the whole problem with how we talk about these systems.

[Day nine is where I live]

You've seen the demo a hundred times. An autonomous agent does something impressive — books the trip, fixes the bug, files the PR — the video cuts, and the lights come up. What you never see is the same agent running on your own hardware, unattended, on day nine. Day nine is where I live. My agent harness runs 24/7 on a Mac mini, reads my messages, executes scheduled work, and spends real money against real APIs while I sleep. The failure modes on day nine are nothing like the ones in the demo. They're quieter and meaner. The two-dollar mistake was one of them — and it was the cheap kind, the kind that's cheap enough to be a lesson instead of a disaster. This article is about the contracts that I built so that I could actually leave this thing running with confidence.

[The four ways trust dies quietly]

1. Silent degradation. This is the killer. A subsystem doesn't crash — it just stops being connected. Some setter never got called at boot, so from then on it politely no-ops, invisibly, forever. The system looks healthy. It's just not doing one of the things you think it's doing. You find out three weeks later when you notice the thing never happened.

2. Alert fatigue. An alarm that fires too often is an alarm you mute. Once you've muted it, it might as well not exist — and the one time it fires for a real reason, you're not looking. Most monitoring dies this way: not from missing alerts, but from too many.

3. Dishonest accounting. This is the two-dollar mistake's evil twin. A budget gate that can't tell "this call cost zero dollars" from "I have no idea what this call cost" will eventually make a very wrong decision, quietly, because it treated unknown as free. Except this time you never even see the two dollars.

4. Incomplete stops. You typed /halt. The surfaces that remembered to implement halting stopped. The one subprocess three layers down that spawned its own children didn't get the message, and it's still running, still spending. None of these show up in a demo. All of them show up in week two. So I designed for them directly, and gave each one a name and a contract.

[The mechanisms]

A wiring manifest, so nothing degrades silently. My harness declares its cross-subsystem connections — 28 of them — explicitly. At every boot, each one reports as active, pending, or failed. And failed is kept distinct from pending: "tried to wire and crashed" is a different state from "deferred by plan," and conflating them is exactly how you miss a real failure. A subsystem that gets invoked before its wire fired raises an error instead of no opping. The boot log reads like a pre-flight checklist, because that's what it is. If something isn't connected, the system tells me at startup — loudly — instead of letting me discover it weeks later.

A skip taxonomy, so the alarm stays trustworthy. When one of my ~19 scheduled services decides not to do work, it has to say why in a fixed vocabulary: missing a secret, missing config, not due yet, a dependency's down, the operator disabled it, or genuinely nothing to do. Six categories, machine-readable. The crucial detail: a skip resets the failure counter. A service that correctly decides "nothing to do today" is healthy, not failing — so it never inflates the count that drives alerts. The result is an alarm that only rings when something is actually wrong, which is the only kind of alarm you don't mute.

Four-state cost accounting, so the budget is honest. Here's where the two dollars comes back, because this is the mechanism it built. Every model call resolves to one of four states: measured, estimated, unknown, or not_applicable. Measured-zero is a real, confirmed zero. Unknown means a parser failed — and unknown is never silently turned into $0. The system checks this contract at boot with synthetic events and refuses to start if it's broken.

The fix the two-dollar run actually triggered was a fifty-fold cost reduction: swap the expensive frontier models on the specialist seats for a slate of cheap-frontier models, a different one per role so the reasoning actually differs at the model layer instead of just the prompt. Two dollars became four cents. The board got slower — fifteen to twenty-five minutes where it used to take five to ten — and I raised the timeout and decided slower-and-affordable beats fast-and-bleeding. That tradeoff, made in a config file at 50× savings, is more representative of real agent work than anything in the demo reel. Around it sit the rest of the guardrails: per-call caps, per-issue caps, per-day budgets, a per-tick ceiling so a single scheduled wake-up can't spiral, and a kill-switch in the decision logic that escalates to a human the moment a run blows its budget. None of that makes the agents smarter. All of it is the difference between a system you can run for a year and a science project you shut off after a week because the bill scared you.

And now the part that should make you uncomfortable, because it's the part most people building this won't admit: a large part of my system cannot currently measure its own cost. The specialist agents — the bulk of the roster — run on a subscription-billed local model. Subscription billing is opaque: there's no per-call dollar figure coming back, because you're not paying per call, you're paying a flat fee. So when the system asks "what did that operation cost?", the honest answer for those agents is unknown. Not zero. Unknown. And the single most important rule in the entire cost system is that those two are never, ever allowed to be confused. So the system fails closed on unknown. It refuses to charge a number it can't trust — it tags the operation, logs it, and surfaces the gap rather than papering over it.

I'm running a financial system that is honest about being partially blind. That's not a flaw I'm hiding — it's the design. The alternative isn't "a system that knows all its costs"; that system doesn't exist yet for this setup, because the pricing model for that backend literally isn't wired. The alternative is a system that pretends to know, coerces every unknown to zero, shows you a clean dashboard, and lies. I'd rather run the one that says "I don't know what that cost, and I've stopped rather than guess."

One halt contract, so stop means stop. Every autonomous surface in the system — services, the experiment loop, the issue factory, job search — checks the same two-method halt contract before starting work and while doing it. /halt propagates everywhere, including in-flight subprocess cancellation: terminate, wait, kill, and then kill the whole process group so any MCP servers or tool children a subprocess spawned die with it. A halt that only works on the surfaces that opted in isn't a halt. It's a suggestion. I designed mine to not be a suggestion.

[The ladder: autonomy is earned,
not granted]

The mechanisms above make a running system trustworthy. But there's a prior question: how does a new autonomous behavior earn the right to run at all?

[My answer is a ladder,

and nothing skips a rung]

dry-run ledger → shadow mode → soak harness → feature flag → production.

A new capability first just writes down what it would do, executing nothing. Then it runs for real with its output diverted and compared, never applied. Then it soaks — my issue-resolver runs in shadow for fourteen days under observation before anything flips. Then it goes behind a feature flag I control, reversible in one command. Only then does it touch production — and even there it's still budget-capped, halt-checked, and cost-accounted.

You don't decide to trust an autonomous system. You give it graduated chances to demonstrate it, with instruments running the whole time. My self-improvement loop — the part of the system that proposes changes to itself — lives at the bottom of this ladder permanently: it runs proposal-only, under a $2/day budget (yes, that number), with a forbidden-files list, and it never executes a change without me approving it. The system can help build itself, exactly as far as it has earned and not one rung further.

[What I'd tell you to steal]

If you're building something that runs unattended, the demo is lying to you about what matters. The hard part isn't the impressive action — it's the boring contracts that make the system safe to not watch:

Make degradation loud. Declare your wiring and report it at boot.
Make correct inaction free. A no-op should never look like a failure.
Make unknowns visible. Never coerce "I don't know" into a convenient default.
Make stop total. One halt contract, honored everywhere, kills the whole tree.
Make autonomy earned. A ladder with instruments beats a leap of faith.

Everyone wants AI agents to be about intelligence. The day-to-day reality is that they're about trust under uncertainty — and a huge amount of that uncertainty is financial. You are spending real money, continuously, through a system you've designed specifically so you don't have to watch it. The engineering that matters most is the engineering that lets you not watch without going broke and without being lied to about it. Trust isn't a feeling you bring to a system. It's a property you build into it — twenty-eight wires reported at boot, six ways to say "I correctly did nothing," four states of knowing what something cost, and one word that stops everything. The two-dollar mistake was cheap enough to be a lesson instead of a disaster, and I've spent a lot of effort since making sure every future version of it stays cheap: capped, gated, escalated, and above all honestly accounted — even when honest means admitting I can't see the number at all.

Nobody puts "and here's how I make sure it doesn't quietly bankrupt me, while flying half-blind on cost" in the launch video. But that's the work. That's most of the work. A system that won't tell you what it doesn't know about its own spending is one I trust less than any dashboard that's never once said "unknown." That's what week two actually requires. The demo never has to find out. A system you leave running does.

Here's what week two taught me about trusting an autonomous system. Every agent demo ends before week two. Week two is where you find out what your system actually costs — and whether it's telling you the truth about it. I want to start with a two-dollar mistake, because it's the most honest thing I can say about running autonomous agents, and almost nobody writes the honest version. One run of my board of directors — a panel of agents that deliberates on a hard question and votes — cost about two dollars. Once. That's the whole incident. It didn't crash the system. It didn't leak data. It cost two dollars instead of the four cents it should have, and it taught me more about building agent systems than most of the architecture did.

Because here's what two dollars represents: a single deliberation, a handful of agents thinking for a few minutes on the good models, ran up a bill that — annualized across a system that's supposed to do this kind of thing constantly, unattended, while I sleep — is a quiet financial wound. Not a catastrophe. A drip. And a drip is exactly the failure mode an autonomous system is built to hide from you, because the whole point is that you're not watching. Which is the whole problem with how we talk about these systems.

[Day nine is where I live]

You've seen the demo a hundred times. An autonomous agent does something impressive — books the trip, fixes the bug, files the PR — the video cuts, and the lights come up. What you never see is the same agent running on your own hardware, unattended, on day nine. Day nine is where I live. My agent harness runs 24/7 on a Mac mini, reads my messages, executes scheduled work, and spends real money against real APIs while I sleep. The failure modes on day nine are nothing like the ones in the demo. They're quieter and meaner. The two-dollar mistake was one of them — and it was the cheap kind, the kind that's cheap enough to be a lesson instead of a disaster. This article is about the contracts that I built so that I could actually leave this thing running with confidence.

[The four ways trust dies quietly]

1. Silent degradation. This is the killer. A subsystem doesn't crash — it just stops being connected. Some setter never got called at boot, so from then on it politely no-ops, invisibly, forever. The system looks healthy. It's just not doing one of the things you think it's doing. You find out three weeks later when you notice the thing never happened.

2. Alert fatigue. An alarm that fires too often is an alarm you mute. Once you've muted it, it might as well not exist — and the one time it fires for a real reason, you're not looking. Most monitoring dies this way: not from missing alerts, but from too many.

3. Dishonest accounting. This is the two-dollar mistake's evil twin. A budget gate that can't tell "this call cost zero dollars" from "I have no idea what this call cost" will eventually make a very wrong decision, quietly, because it treated unknown as free. Except this time you never even see the two dollars.

4. Incomplete stops. You typed /halt. The surfaces that remembered to implement halting stopped. The one subprocess three layers down that spawned its own children didn't get the message, and it's still running, still spending. None of these show up in a demo. All of them show up in week two. So I designed for them directly, and gave each one a name and a contract.

[The mechanisms]

A wiring manifest, so nothing degrades silently. My harness declares its cross-subsystem connections — 28 of them — explicitly. At every boot, each one reports as active, pending, or failed. And failed is kept distinct from pending: "tried to wire and crashed" is a different state from "deferred by plan," and conflating them is exactly how you miss a real failure. A subsystem that gets invoked before its wire fired raises an error instead of no opping. The boot log reads like a pre-flight checklist, because that's what it is. If something isn't connected, the system tells me at startup — loudly — instead of letting me discover it weeks later.

A skip taxonomy, so the alarm stays trustworthy. When one of my ~19 scheduled services decides not to do work, it has to say why in a fixed vocabulary: missing a secret, missing config, not due yet, a dependency's down, the operator disabled it, or genuinely nothing to do. Six categories, machine-readable. The crucial detail: a skip resets the failure counter. A service that correctly decides "nothing to do today" is healthy, not failing — so it never inflates the count that drives alerts. The result is an alarm that only rings when something is actually wrong, which is the only kind of alarm you don't mute.

Four-state cost accounting, so the budget is honest. Here's where the two dollars comes back, because this is the mechanism it built. Every model call resolves to one of four states: measured, estimated, unknown, or not_applicable. Measured-zero is a real, confirmed zero. Unknown means a parser failed — and unknown is never silently turned into $0. The system checks this contract at boot with synthetic events and refuses to start if it's broken.

The fix the two-dollar run actually triggered was a fifty-fold cost reduction: swap the expensive frontier models on the specialist seats for a slate of cheap-frontier models, a different one per role so the reasoning actually differs at the model layer instead of just the prompt. Two dollars became four cents. The board got slower — fifteen to twenty-five minutes where it used to take five to ten — and I raised the timeout and decided slower-and-affordable beats fast-and-bleeding. That tradeoff, made in a config file at 50× savings, is more representative of real agent work than anything in the demo reel. Around it sit the rest of the guardrails: per-call caps, per-issue caps, per-day budgets, a per-tick ceiling so a single scheduled wake-up can't spiral, and a kill-switch in the decision logic that escalates to a human the moment a run blows its budget. None of that makes the agents smarter. All of it is the difference between a system you can run for a year and a science project you shut off after a week because the bill scared you.

And now the part that should make you uncomfortable, because it's the part most people building this won't admit: a large part of my system cannot currently measure its own cost. The specialist agents — the bulk of the roster — run on a subscription-billed local model. Subscription billing is opaque: there's no per-call dollar figure coming back, because you're not paying per call, you're paying a flat fee. So when the system asks "what did that operation cost?", the honest answer for those agents is unknown. Not zero. Unknown. And the single most important rule in the entire cost system is that those two are never, ever allowed to be confused. So the system fails closed on unknown. It refuses to charge a number it can't trust — it tags the operation, logs it, and surfaces the gap rather than papering over it.

I'm running a financial system that is honest about being partially blind. That's not a flaw I'm hiding — it's the design. The alternative isn't "a system that knows all its costs"; that system doesn't exist yet for this setup, because the pricing model for that backend literally isn't wired. The alternative is a system that pretends to know, coerces every unknown to zero, shows you a clean dashboard, and lies. I'd rather run the one that says "I don't know what that cost, and I've stopped rather than guess."

One halt contract, so stop means stop. Every autonomous surface in the system — services, the experiment loop, the issue factory, job search — checks the same two-method halt contract before starting work and while doing it. /halt propagates everywhere, including in-flight subprocess cancellation: terminate, wait, kill, and then kill the whole process group so any MCP servers or tool children a subprocess spawned die with it. A halt that only works on the surfaces that opted in isn't a halt. It's a suggestion. I designed mine to not be a suggestion.

[The ladder: autonomy is earned, not granted]

The mechanisms above make a running system trustworthy. But there's a prior question: how does a new autonomous behavior earn the right to run at all?

[My answer is a ladder, and nothing skips a rung]

dry-run ledger → shadow mode → soak harness → feature flag → production.

A new capability first just writes down what it would do, executing nothing. Then it runs for real with its output diverted and compared, never applied. Then it soaks — my issue-resolver runs in shadow for fourteen days under observation before anything flips. Then it goes behind a feature flag I control, reversible in one command. Only then does it touch production — and even there it's still budget-capped, halt-checked, and cost-accounted.

You don't decide to trust an autonomous system. You give it graduated chances to demonstrate it, with instruments running the whole time. My self-improvement loop — the part of the system that proposes changes to itself — lives at the bottom of this ladder permanently: it runs proposal-only, under a $2/day budget (yes, that number), with a forbidden-files list, and it never executes a change without me approving it. The system can help build itself, exactly as far as it has earned and not one rung further.

[What I'd tell you to steal]

If you're building something that runs unattended, the demo is lying to you about what matters. The hard part isn't the impressive action — it's the boring contracts that make the system safe to not watch:

Make degradation loud. Declare your wiring and report it at boot.
Make correct inaction free. A no-op should never look like a failure.
Make unknowns visible. Never coerce "I don't know" into a convenient default.
Make stop total. One halt contract, honored everywhere, kills the whole tree.
Make autonomy earned. A ladder with instruments beats a leap of faith.

Everyone wants AI agents to be about intelligence. The day-to-day reality is that they're about trust under uncertainty — and a huge amount of that uncertainty is financial. You are spending real money, continuously, through a system you've designed specifically so you don't have to watch it. The engineering that matters most is the engineering that lets you not watch without going broke and without being lied to about it. Trust isn't a feeling you bring to a system. It's a property you build into it — twenty-eight wires reported at boot, six ways to say "I correctly did nothing," four states of knowing what something cost, and one word that stops everything. The two-dollar mistake was cheap enough to be a lesson instead of a disaster, and I've spent a lot of effort since making sure every future version of it stays cheap: capped, gated, escalated, and above all honestly accounted — even when honest means admitting I can't see the number at all.

Nobody puts "and here's how I make sure it doesn't quietly bankrupt me, while flying half-blind on cost" in the launch video. But that's the work. That's most of the work. A system that won't tell you what it doesn't know about its own spending is one I trust less than any dashboard that's never once said "unknown." That's what week two actually requires. The demo never has to find out. A system you leave running does.

[The four ways trust dies quietly]

[The mechanisms]

[The ladder: autonomy is earned, not granted]

[What I'd tell you to steal]

[The four ways trust dies quietly]

[The mechanisms]

[The ladder: autonomy is earned,
not granted]

[What I'd tell you to steal]

[The four ways trust dies quietly]

[The mechanisms]

[The ladder: autonomy is earned, not granted]

[What I'd tell you to steal]

ARTICLES

ARTICLES

View all