I Wrote 25 Rules to Fix My AI. Compliance Dropped to 7%.

Last week I added another rule to my AI operating manual to fix a problem the previous rules had created. It made things worse. Then I did the math. Writer's 2026 Enterprise AI Report says 54% of C-suite leaders admit AI adoption is "tearing their company apart." Most leaders read that stat as a tooling problem. It isn't. It's a compliance-decay problem, and the math is unforgiving.

For the past three weeks I've been building a working partnership with an AI, not chatting with it but partnering with it across dozens of sessions on real work. The system runs on an operating manual: documents that tell the AI how to behave, what to verify before writing, when to ask before acting. By the third week, that manual had grown to twenty-five principles, each one earned by a specific failure.

The AI ignored an instruction. I wrote a rule. The AI optimized away a verification step. I wrote a rule. The AI agreed too quickly with my self-doubt in a coaching session. I wrote a rule. Every rule made sense on its own.

Then last week the system broke. Not loudly. The AI was reading the manual. The manual said exactly what to do. The AI did something else. I reached for the same fix I'd reached for two dozen times before: add another rule.

This time I stopped and did the math.

Across fifty-three working sessions I'd watched soft-language rules, the ones phrased as "prefer" or "consider" or "try to," land at roughly 90% per-instance compliance under context pressure. That's not a published benchmark. It's the curve my own data fell on. The math from there is unforgiving: if each rule is followed 90% of the time, the odds that a single response follows all of them drop with every rule you add. One rule, 90%. Six rules in a sequence is 0.9^6 ≈ 53%. Twenty-five rules is 0.9^25 ≈ 7%. Thirty rules is 0.9^30 ≈ 4%.
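If you want to check the curve yourself, the whole thing fits in a few lines of Python. The 90% per-rule figure is the assumption doing all the work here; it's what my own session logs suggested, not a published number, and the sketch treats each rule as an independent coin flip, which real sessions only approximate.

```python
# Compound compliance: the chance a single response follows ALL n rules,
# assuming each soft-language rule lands independently at roughly 90%.
PER_RULE = 0.90  # rough per-instance rate from my own session logs, not a benchmark

def full_compliance(n_rules: int, per_rule: float = PER_RULE) -> float:
    """Probability that one response satisfies every one of n_rules."""
    return per_rule ** n_rules

for n in (1, 6, 15, 25, 30):
    print(f"{n:>2} rules: {full_compliance(n):.0%} chance a response follows all of them")

# Prints:
#  1 rules: 90% chance a response follows all of them
#  6 rules: 53% chance a response follows all of them
# 15 rules: 21% chance a response follows all of them
# 25 rules: 7% chance a response follows all of them
# 30 rules: 4% chance a response follows all of them
```

Swap in your own per-rule estimate; the shape of the curve doesn't change, only how fast you hit the floor.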

The more rules I added, the lower the odds that any single response followed all of them. The fix I kept reaching for was the disease.

This isn't unique to me. It's the shape of every AI governance policy I've seen in the wild. A team rolls out a Copilot deployment. A policy gets written. Then somebody's report comes out wrong, and a rule gets added. Then somebody's prompt produces a hallucinated source, and a rule gets added. By the end of the quarter there are 30 rules in a policy doc nobody reads, sitting on a SharePoint nobody opens, and the AI is producing the same kinds of errors it produced before the policy existed.

This is the implementation gap right inside the governance layer.

Lane two (not the people building AI, but the people making it actually work inside organizations) has a math problem the field hasn't named yet. Adding another rule when a rule fails feels like good governance. It looks like good governance on a slide. It is mathematically the opposite of good governance, and you can prove it from a single morning's session log.

Two things cause soft-language rules to compound badly.

The first is rationalization. Words like "prefer" and "consider" leave room for the AI to read a rule as judgment-permitted. Under pressure, soft language gets interpreted as "this is a suggestion, not a constraint." The AI takes a path that doesn't follow the rule. The rule didn't fail. The language did.

The second is instruction weight. The AI is not a human reading a 30-page policy and selectively recalling the relevant section at the right moment. It processes all 30 rules every time it responds. That processing cost is real, and it compounds with every rule you add.

Stanford's Digital Economy Lab studied 51 successful enterprise AI deployments and found that the organizations getting real value had reached what they called "strategic integration." AI adoption became an organizational discipline rather than a project someone launched. Their finding: the difference was never the model. Read that finding through the math, and a different picture emerges. The orgs that got value were the ones that figured out compliance doesn't add up rule by rule; it multiplies down. The orgs that didn't kept stacking rules and getting worse compliance.

The fix is architectural, not additive. I'll get into the specific architectural moves over the next three editions of this newsletter. For this week, the move that matters is the frame shift. When an AI rule fails, the question is never "what rule do I add?" The question is "what's the per-instance compliance on the rules I already have, and is this failure the natural consequence of the math?"

The governance conversation right now is about more: more guardrails, more review layers, more policy on top of more policy. The math says we should be having a different conversation. Fewer rules, but mandatory where mandatory matters.

What This Means in Practice

If you run an AI policy or workflow at your organization, here are three questions worth running against it this week.

How many rules are in your AI policy? Most teams I talk to can't answer this without counting. The number is your N. If it's 6, the odds that any single output follows all of them under context pressure are around 53%. If it's 15, you're at 21%. If it's 30, you're at 4%. None of those are acceptable, and all of them are the natural consequence of the math, not a sign your team is incompetent.

How many of those rules use soft language? Open the policy. Count instances of "prefer," "consider," "try to," "should generally," "where possible," "as appropriate." Each one is a rule the AI is permitted to rationalize away. Some of those rules should stay soft, like judgment-permitted style guidance or tone. Most shouldn't. The data integrity rules, the verification rules, the citation rules need to be mandatory or they're decoration. If you'd rather not count by hand, there's a short script for this after the third question.

Which rules were added in response to a specific failure? Pull the change history. Every rule added after a failure is a candidate for being the additive fix that's now contributing to the next failure. Not all of them are. But until you've audited them, you don't know how much compliance debt you're carrying.
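For the second question, the counting doesn't have to be manual. Here's a rough sketch; it assumes your policy lives in a plain-text file (policy.txt is a placeholder, not a real path) and that each rule sits on its own line, which won't be true of every document, so treat the output as a starting point for the audit rather than the audit itself.

```python
import re
from pathlib import Path

# Soft-language markers that leave the AI room to rationalize a rule away.
SOFT_MARKERS = [
    "prefer", "consider", "try to", "should generally",
    "where possible", "as appropriate",
]

def audit_policy(path: str) -> dict:
    """Rough audit: treats every non-empty line as one rule (an assumption),
    then flags the lines that contain at least one soft-language marker."""
    lines = [ln.strip() for ln in Path(path).read_text(encoding="utf-8").splitlines() if ln.strip()]
    soft = [
        ln for ln in lines
        if any(re.search(rf"\b{re.escape(m)}\b", ln, re.IGNORECASE) for m in SOFT_MARKERS)
    ]
    return {
        "total_rules": len(lines),
        "soft_rules": len(soft),
        "soft_share": round(len(soft) / max(len(lines), 1), 2),
    }

if __name__ == "__main__":
    # "policy.txt" is a placeholder; point it at your own exported policy doc.
    print(audit_policy("policy.txt"))
```

The line-per-rule assumption is the weak spot; if your policy is written in paragraphs, the total will drift, but the soft-marker count still tells you where the rationalization room is.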

The shift these questions force is from policy quality to policy math. Most AI governance discussions ask whether the rules are right. The math says the harder question is whether you have too many rules to actually enforce, and which language tier each one belongs on.

One Thing to Do This Week

Open your team's AI policy document. Count the total number of rules. Count the soft-language ones. Mark which ones were added in response to a specific failure. Send those three numbers to whoever owns the policy with one subject line: "Compliance math on our AI policy — should we talk?" Walk into that meeting with the math and the audit. Next week's edition gets into what to do with the audit once you have it.

This is Part 1 of a four-edition series on Compliance Decay: why AI governance policies stop working, and the architectural moves that fix them. Part 2 (May 19) covers the soft-vs-mandatory language tier. Part 3 (May 26) covers structural gates instead of behavioral rules. Part 4 (June 2) covers values documents that travel with the work.

The Implementation Lane is a weekly newsletter about making AI work inside real organizations. Written by Amanda Crawford, an AI Implementation Specialist who builds systems in the gap between configuration and engineering. If someone forwarded this to you, subscribe here.
