Own the Running System
You learned to put an agent to work. This is the part that decides whether it still works in six months: maintaining the harness around it. You leave able to run a recurring Harness Pass on your own setup — and you will have run one, for real, on the system you built.
Part of The Install Track · certificate on completion
You leave holding work.
- Spot the two drifts: the world moving under your agent, and the model outgrowing the rails you built
- Map your harness into four surfaces and run the five checks to produce evidenced findings
- Decide cut / keep / verify-first — and defend keeping a low-usage item that is armed, not dead
- Install a recurring Harness Pass with a real trigger, logged so the next pass is faster
- Harness maintenance
- Verify-before-cut
- The five checks
- Maintenance cadence
There are 4 modules in this course.
4 modules · each ends in a worksheet that saves your answers · Built on the live maintenance pass we ran on our own agent system in June 2026, including the cut we almost got wrong.
07.1 The Two Drifts
<h3>Launch is not the finish line</h3><p>A harness lives in motion. Think of a sailboat: the weather changes, lines loosen, salt gets into everything. Setting sail is not the same as staying underway. Your agent runs — but 'it ran this morning' and 'it's still doing what you need it to do' are two different claims.</p><h3>Agents break in both directions</h3><p>We're used to software breaking when it gets <em>worse</em> — a dependency goes stale, a server hiccups. Agents also break when the model gets <em>better</em>. A new, more capable model can outgrow the rails you built for a clumsier one. The guardrail that stopped a weaker model from overreaching now stops a smarter model from doing exactly what you need. That is not a failure you can catch by watching logs.</p><h3>Two drifts, one diagnosis</h3><p><strong>World-drift</strong> is what happens when the environment your agent reads has moved on: an SOP that's no longer current, a tag convention that changed, a customer list the agent consults that hasn't been updated in two months. The agent fires on schedule. It produces nothing useful. The logs say it ran fine.</p><p><strong>Model-drift</strong> is what happens when the model underneath your agent has become more capable than the guardrails you wrote. Yesterday's “only summarize, don't infer” rule was the right constraint for a model that used to infer badly. Today it's a handcuff on a model that infers well — and the agent is now blocked from doing the thing you most need it to do.</p><h3>Dana's job-status agent</h3><p>Dana runs a weekly job-status report agent for her specialty contractor operation. The SOP it reads still names last quarter's crew leads — the world drifted and nobody updated the file. And the “don't infer, just list” rule she added when the model was unreliable? It now stops a capable model from flagging the obvious slipping job — the one a crew lead would catch in a Monday walkthrough and Dana's agent is now blind to. 
Both drifts are present. One looks alive in the logs.</p><p><em>From our own shop:</em> We found a job in our own system — task-triage — that had been firing every morning for eight days and crashing on its last line before doing anything. It looked alive in the logs. It produced nothing. That's world-drift you can't see from “did it run.”</p>
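<p>One way to make world-drift visible is to check the age of what the agent reads, rather than whether the job ran. The following is a minimal Python sketch, assuming you can look up a last-updated date for each input. The function name, file names, dates, and the 60-day threshold are all hypothetical and are not taken from our system.</p>

```python
from datetime import date, timedelta


def stale_sources(last_updated: dict[str, date], today: date,
                  max_age_days: int = 60) -> list[str]:
    """Return the names of inputs not updated within max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [name for name, updated in last_updated.items() if updated < cutoff]
    return sorted(stale, key=lambda name: last_updated[name])


# Hypothetical inputs for a weekly report agent.
inputs = {
    "crew_lead_sop.md": date(2026, 1, 12),
    "customer_list.csv": date(2026, 3, 30),
    "job_schedule.csv": date(2026, 6, 1),
}

# Anything listed here is a candidate world-drift finding to go and look at.
print(stale_sources(inputs, today=date(2026, 6, 8)))
```

<p>A stale date is a prompt to look, not proof of drift: a file can be old and still correct, or fresh and wrong.</p>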
- You will be able to name, for your OWN agent setup, one thing that broke because the world changed (a process or SOP drifted) and one thing that will break because the model got better (a guardrail you added that a smarter model now trips over).
Do this: List the parts of your agent setup and mark each one as current, or as written for a world or model that has since moved. Find one world-drift item and one model-drift item.
07.2 Read Your Harness
<h3>Four surfaces</h3><p>A harness has four surfaces, and drifts hide in all of them. <strong>Instructions</strong> are the SOPs, prompts, and rules you wrote — the things that tell the agent what to do and what not to do. <strong>Memory</strong> is what the system remembers between runs — the context it carries forward. <strong>Capability</strong> is its tools and how far their reach actually extends. <strong>Scheduling</strong> is what runs on its own, on what trigger, at what cadence. If you can't name all four for your own setup, you don't know what you're maintaining.</p><h3>Five checks</h3><p>Run these against any surface to surface a finding — a specific, evidenced claim about what you actually saw:</p><p><strong>Eats</strong> — what does it read, and is that current or stale? <strong>Reach</strong> — what can it touch, and is that too broad or too narrow now? <strong>Job</strong> — has its actual purpose quietly drifted from its stated one? <strong>Proof</strong> — can it show a linkable trail, or is it asserting without evidence? <strong>Value</strong> — does anyone use the output?</p><p>Five checks, not five questions to answer generically. Each one, on each surface, either surfaces a finding or clears the surface. A cleared surface is as useful as a finding — it narrows where the drift actually is.</p><h3>Dana runs the checks</h3><p>Dana maps her job-status agent across the four surfaces: the SOP file (Instructions), the crew-lead roster it reads at start of run (Memory), the scheduling tool connection (Capability), the Monday 6am trigger (Scheduling). Then she picks one surface — Instructions — and runs all five checks on it. The Eats check surfaces the finding: the SOP names crew leads from last quarter. That is a finding. It has a pointer. It is not a feeling.</p><p><em>From our own shop:</em> We ran exactly this across our own system's four surfaces. 
The memory surface alone turned up an index that had grown so large it was silently truncating at load — dropping context every session while looking fine.</p>
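<p>The four surfaces and five checks form a small grid, and it can help to track which cells you have actually run. The sketch below is one possible shape for that, not a tool we ship: the <code>Finding</code> class, the <code>unrun_checks</code> helper, and the pointer value are hypothetical. It refuses a finding that has no pointer.</p>

```python
from dataclasses import dataclass
from typing import Optional

SURFACES = ("Instructions", "Memory", "Capability", "Scheduling")
CHECKS = ("Eats", "Reach", "Job", "Proof", "Value")


@dataclass(frozen=True)
class Finding:
    surface: str
    check: str
    saw: str      # what you actually observed
    pointer: str  # file, line, run, or link that backs it up

    def __post_init__(self) -> None:
        if self.surface not in SURFACES or self.check not in CHECKS:
            raise ValueError("unknown surface or check")
        if not self.pointer.strip():
            raise ValueError("a finding needs a pointer, not a feeling")


# (surface, check) -> a Finding, or None when the check cleared the surface.
Results = dict[tuple[str, str], Optional[Finding]]


def unrun_checks(results: Results, surface: str) -> list[str]:
    """List the checks not yet recorded for a surface, in the standard order."""
    return [check for check in CHECKS if (surface, check) not in results]


results: Results = {}
# Dana's Eats finding; the file name used as the pointer is hypothetical.
results[("Instructions", "Eats")] = Finding(
    "Instructions", "Eats",
    "SOP names crew leads from last quarter",
    "crew_lead_sop.md",
)

print(unrun_checks(results, "Instructions"))
```

<p>Recording a cleared check as <code>None</code> keeps "checked and fine" distinct from "never looked".</p>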
- You will be able to map your own harness into its four surfaces and run the five checks against one surface to produce a specific, evidenced finding.
Do this: Pick ONE of your four surfaces. Run all five checks on it. Write one evidenced finding: what you actually saw, with a pointer, not a feeling.
07.3 The Delete Instinct
<h3>Add vs. ask what to remove</h3><p>The beginner instinct is to add: more rules, more tools, more memory. The maintainer's instinct is different. It asks what to remove. Vercel made its agent better by deleting 80% of its tools — not by building more. A harness that accumulates without pruning becomes a system nobody can read and nobody trusts.</p><h3>A cut is a claim</h3><p>But a cut is not a reflex. A cut is a claim: this thing is dead. Prove it. Dead by a documented decision — someone chose to stop doing that thing. Dead by data — usage telemetry shows zero invocations and the absence is explained. Not just quiet. Not just unused this month. Dead.</p><p>Here is the trap that makes this hard: your harness encodes an old picture of the world. So it will confidently tell you what to cut — and it will be wrong about the thing you're about to need. Low usage is not death. It is a prompt to investigate. What changed in the business that made this thing quiet? Is the answer “nothing” or “actually, something's coming”?</p><h3>Dana's compliance checklist</h3><p>Dana's job-status agent has a compliance-checklist step that almost never fires. She adds it to the cut list. Then she remembers: the bigger contract she's bidding next quarter requires exactly that checklist. The agent isn't dead. It's armed. The cut would have amputated live muscle, and nobody would have noticed until the bid was already in.</p><p><em>From our own shop:</em> Our own audit flagged ~31 compliance skills as dead — zero uses, ever. We almost cut them. They were armed for an enterprise move the system's own instructions didn't yet know about. The operator caught it. The fix wasn't to cut — it was to write down the missing fact. A stale harness will hand you a confident, wrong cut list.</p>
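<p>The cut / keep / verify-first decision can be written down as a short rule. This is a sketch under our own assumptions, not a definitive procedure: the field names, the invocation count, and the order of the tests (a known upcoming need wins over everything else) are choices made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass
class CutCandidate:
    name: str
    invocations: int                   # from telemetry, not from memory
    retired_by_decision: bool = False  # someone documented a choice to stop
    absence_explained: bool = False    # you know why usage is zero
    upcoming_need: str = ""            # a business change it may be waiting for


def decide(candidate: CutCandidate) -> str:
    """Return 'keep', 'cut', or 'verify-first' for a cut candidate."""
    if candidate.upcoming_need:
        return "keep"  # armed, not dead
    if candidate.retired_by_decision:
        return "cut"   # dead by decision
    if candidate.invocations == 0 and candidate.absence_explained:
        return "cut"   # dead by data
    return "verify-first"  # just quiet: investigate before touching it


# Dana's compliance-checklist step; the invocation count is hypothetical.
checklist = CutCandidate(
    name="compliance_checklist",
    invocations=1,
    upcoming_need="required by next quarter's contract bid",
)
print(decide(checklist))
```

<p>The rule is only as good as the <code>upcoming_need</code> field, which has to come from a person who knows where the business is going. The harness cannot fill it in for you.</p>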
- You will be able to take a 'cut candidate' from your harness and decide cut / keep / verify-first — and to say why a low-usage item might be exactly the thing to keep.
Do this: Take one thing in your harness you'd be tempted to delete for being unused. Before cutting, decide: dead by a DECISION, dead by DATA, or just quiet? If it's just quiet, what changed in your business that it might be waiting for?
07.4 The Maintenance Ritual
<h3>Maintenance is a voyage, not a launch</h3><p>The mature question is not “can I get this boat off the dock?” It is “can I keep this boat afloat over the trip?” Those are different questions, and they require different habits. A harness that gets checked once at launch and never again is not a maintained system — it is a running bet that nothing has drifted.</p><h3>The Harness Pass</h3><p>A Harness Pass is the five checks run on a cadence. Quarterly is a floor; a named trigger fires one off-cycle. Model upgrade: run a pass. A process changes: run a pass. A new crew lead joins or a crew lead leaves: run a pass on the Instructions surface at minimum. The cadence without the trigger is not enough — harnesses actually break in between scheduled passes, and the trigger is what catches the off-cycle drift before it costs you.</p><h3>The Maintenance Log</h3><p>One row per finding: <em>surface · finding with pointer · keep/prune/repair decision · verified-by</em>. The log does two things. It creates a paper trail so the next pass can start from what the last one found rather than re-discovering the same thing. And it makes every cut defensible — “we pruned this on this date, for this reason, verified by this run.”</p><h3>Telemetry beats memory</h3><p>Measure what actually fires before you cut. A tally of which tools and agents invoke each capability tells you where the usage actually is — not where you assume it is. Memory of how often something runs is wrong more often than it is right. The number is the finding; the intuition is the starting hypothesis.</p><p>This module hands off to the gate. You're about to run a real pass — not a drill, not a worked example. Your system, your four surfaces, your five checks, your log.</p><p><em>From our own shop:</em> We built a nightly tally of which skills and agents actually fire, so our next cut runs on data, not memory. 
The first run showed only 102 of 190 skills had ever fired — but, per the last module, that's a list to investigate, not a kill list.</p>
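<p>Both the tally and the log are small enough to sketch. The Python below assumes you can export a flat list of invocation names and that a CSV file is an acceptable log. The skill names, the example row, and the decision to store the pointer in its own column are hypothetical, and this is not the tally we run ourselves.</p>

```python
import csv
import io
from collections import Counter
from typing import TextIO

LOG_COLUMNS = ["surface", "finding", "pointer", "decision", "verified_by"]
DECISIONS = {"keep", "prune", "repair"}


def never_fired(installed: list[str], invocation_log: list[str]) -> list[str]:
    """Installed items with zero recorded invocations: a list to investigate."""
    counts = Counter(invocation_log)
    return sorted(name for name in installed if counts[name] == 0)


def write_log(rows: list[dict[str, str]], out: TextIO) -> None:
    """Write Maintenance Log rows as CSV, rejecting unknown decisions."""
    writer = csv.DictWriter(out, fieldnames=LOG_COLUMNS)
    writer.writeheader()
    for row in rows:
        if row["decision"] not in DECISIONS:
            raise ValueError(f"decision must be one of {sorted(DECISIONS)}")
        writer.writerow(row)


# Hypothetical telemetry export.
installed = ["job_status_report", "compliance_checklist", "crew_roster_sync"]
fired = ["job_status_report", "job_status_report", "crew_roster_sync"]
print(never_fired(installed, fired))

# Hypothetical log row for an item that turned out to be armed, not dead.
buffer = io.StringIO()
write_log([{
    "surface": "Capability",
    "finding": "zero uses, but needed for an upcoming contract",
    "pointer": "nightly tally",
    "decision": "keep",
    "verified_by": "operator review",
}], buffer)
print(buffer.getvalue())
```

<p>Keeping the tally and the log as separate steps matters: the tally proposes, and the log records what a person decided after looking.</p>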
- You will be able to turn the five checks into a recurring Harness Pass with a named cadence and triggers, and keep a Maintenance Log that makes the next pass faster.
Do this: Write your Harness Pass cadence in one line: when you'll run it and which event triggers an off-cycle pass. Then start your Maintenance Log with the findings you already have.
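A cadence line like that can also be expressed as a check you could run on a schedule. This is a rough Python sketch under stated assumptions: "quarterly" is approximated as 91 days, and the trigger names and the pass_due function are hypothetical labels for the events named in this module.

```python
from datetime import date, timedelta

QUARTER = timedelta(days=91)  # assumption: "quarterly" approximated as 91 days
TRIGGERS = {"model_upgrade", "process_change", "crew_lead_change"}


def pass_due(last_pass: date, today: date, events_since: set[str]) -> tuple[bool, str]:
    """Say whether a Harness Pass is due, and why."""
    fired = sorted(events_since & TRIGGERS)
    if fired:
        # A named trigger fires a pass even if the quarter is not up.
        return True, "off-cycle: " + ", ".join(fired)
    if today - last_pass >= QUARTER:
        return True, "scheduled: quarterly floor reached"
    return False, "not due"


# Two weeks after the last pass, but the model was upgraded in between.
print(pass_due(date(2026, 6, 15), date(2026, 7, 1), {"model_upgrade"}))
```

The triggers are checked before the calendar on purpose, so an off-cycle event is reported as the reason for the pass even when a scheduled pass is also due.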
Michael Sebastian
I install AI for operators. The Lab is where the method is taught, and where my clients onboard. This course is that method.
Asked, answered.
Do I need a technical background?
No. The track is written for operators, not engineers. If you run a business or a role and you’re honest about where you actually are, you have the prerequisites.
How long does this course take?
Lessons are short on purpose — one idea per screen. Most people finish a course in two or three sittings. The worksheets take longer, because they’re real work on your real business. That’s the point.
What does the $497 option add?
Our eyes on your audit. $497 is the Working Intelligence Audit course plus our written review — we read your submissions and send a one-page response: what your scores say, and what to install first. It’s the bridge between self-serve and working with us directly.
Start with the free course — this one unlocks on the way.
This course is $397 on its own. All five together are the Track: $1,197, where buying them one at a time runs $1,985. Course 06 unlocks this one. The chain matters: each course feeds the next its raw material.
Live-class attendees: your $100 credit applies. Or enter the Lab directly.


