{"id":255,"date":"2026-06-28T15:56:44","date_gmt":"2026-06-28T14:56:44","guid":{"rendered":"https:\/\/knowtech.waszmann.com\/?p=255"},"modified":"2026-06-28T16:01:58","modified_gmt":"2026-06-28T15:01:58","slug":"agentic-ai-authroing-delegating-to-agents-without-outsourcing-accountability","status":"publish","type":"post","link":"https:\/\/knowtech.waszmann.com\/?p=255&lang=en","title":{"rendered":"Agentic AI Authoring: Delegating to Agents Without Outsourcing Accountability"},"content":{"rendered":"<h2>The 2026 shift isn&#8217;t from prompting to partnership. It&#8217;s from conversation to delegation \u2014 and delegation comes with strings the vendors aren&#8217;t talking about.<\/h2>\n<p>Imagine a 40-person consultancy that wires up an automated assistant to read incoming support emails, pull relevant context from the company wiki, and draft replies. To save time, the team flips a switch from &#8220;draft for approval&#8221; to &#8220;send automatically above a confidence threshold.&#8221;<\/p>\n<p>Three weeks later, a client calls. They received an email promising a refund the company never agreed to give. The agent, having generalized from a few past resolutions, decided this was the right answer. The email went out under a real person&#8217;s name. The client has already forwarded it to their legal team.<\/p>\n<h3>Who is responsible?<\/h3>\n<p>The agent has no legal personhood. The vendor&#8217;s terms of service almost certainly exclude liability for output. The team member whose signature appeared on the message didn&#8217;t write it. The manager who flipped the switch didn&#8217;t see this particular email. The agent did the thing, but everyone else is left holding the consequences.<\/p>\n<p>In my last article, I argued that the new models got smarter but not more honest. That asymmetry was uncomfortable when they only talked. 
It becomes load-bearing now that they act.<\/p>\n<h3>From Conversation to Delegation<\/h3>\n<p>The real shift in 2026 is not what most AI-in-the-workplace articles claim. Models did not become partners. They did not develop judgment, accountability, or stakes. What changed is that they can now do things: browse, click, run code, edit files, send messages, modify state in systems you depend on.<\/p>\n<p><strong>That&#8217;s not partnership. Partners share judgment and consequences. What you actually have is a highly capable delegate: you authorize, it executes, you own the outcome.<\/strong><\/p>\n<p>The distinction matters because the law, the contract, and the org chart already know what to call this. When an agent acts using your API key, your OAuth token, your credentials, your account, or your domain, you are the principal. The agent is acting in your name. The legal frameworks are catching up to this reality. In California, AB 316 (Civil Code \u00a71714.46), effective January 1, 2026, prohibits defendants who &#8220;developed, modified, or used&#8221; an AI system from arguing that the AI autonomously caused the harm. In the EU, the new Product Liability Directive, which Member States must transpose into national law by December 9, 2026, extends strict liability to AI systems as &#8220;products&#8221;, with rebuttable presumptions that lower the burden of proof for claimants. The direction is consistent: accountability follows authorization. <strong>Whoever deployed it owns it.<\/strong><\/p>\n<p>This isn&#8217;t a thought experiment. Browser agents, coding agents, spreadsheet agents, and MCP-connected tool agents are in production at small and medium businesses right now. Most of those businesses have not thought hard about what that means.<\/p>\n<h3>The Verification Asymmetry<\/h3>\n<p>When an LLM only talks, verification is cheap. You read the output. 
You catch the mistake or you don&#8217;t, but the cost of missing one is bounded \u2014 usually wasted time, occasionally a bad decision based on bad information.<\/p>\n<p>When an LLM acts, verification has to move <em>upstream<\/em> of the action. Once committed, state changes propagate. Some you can roll back. Some you can&#8217;t.<\/p>\n<h3>A rough taxonomy worth carrying in your head:<\/h3>\n<ol>\n<li><strong>Reversible<\/strong>: drafts, internal queries, reads, sandbox writes. Cost of a wrong action is roughly zero.<\/li>\n<li><strong>Hard to reverse<\/strong>: sent emails, posted messages, calendar invites, most outbound API calls. You can apologize, retract, follow up \u2014 but the recipient already saw it.<\/li>\n<li><strong>Effectively irreversible<\/strong>: deletions without backup, financial transactions, accepted terms of service, public statements, anything where another human or system has already acted on the output.<\/li>\n<\/ol>\n<p>The 2025 reflex of &#8220;let the model run, then read what it produced&#8221; does not survive contact with the second and third categories. By the time you read it, the action has already happened.<\/p>\n<p>This is the structural reason &#8220;trust the agent more&#8221; is the wrong advice for 2026. The question is not how much to trust. The question is <em>where<\/em> in the workflow trust gets verified, and for irreversible actions, that has to be before the action, not after.<\/p>\n<h3>The New Guardrail Stack<\/h3>\n<p>What actually works, in roughly the order most small and medium businesses should adopt it, mirrors established DevSecOps practice, the Microsoft Security Development Lifecycle (SDL) being the canonical example. None of these are new ideas in IT; what is new is that SMEs now operate systems that require them.<\/p>\n<ol>\n<li><strong>Scope limits:<\/strong> What the agent cannot touch is more important than what it can. 
Use the least-privilege principle from information security: read-only credentials where reads suffice, restricted folders and channels, OAuth scopes pared down to the actual task. An agent that <em>can<\/em> delete the shared drive will eventually try to, given enough turns and a confused prompt.<\/li>\n<li><strong>Cost and resource caps:<\/strong> A runaway ReAct loop or a malfunctioning agent can retry the same task indefinitely and silently burn through your budget: API spend, compute, third-party service fees. Hard limits per turn, per session, and per day let agents fail safely instead of failing expensively. This is the agentic equivalent of a circuit breaker.<\/li>\n<li><strong>Confirmation patterns for irreversible actions:<\/strong> Anything in the third category of the reversibility taxonomy should require an explicit human approval step, ideally with a clear summary of what is about to happen and what cannot be undone. Two-step flows, in which the agent prepares the action and a human commits it, are unfashionable, but they are why aviation and surgery still work.<\/li>\n<li><strong>Dry-run by default:<\/strong> For unfamiliar tasks, have the agent describe what it would do before doing it. This catches many misinterpretations at almost no cost. It also surfaces hidden assumptions the prompt did not specify.<\/li>\n<li><strong>Reversibility-graded friction:<\/strong> Do not apply the same approval flow to a draft email as to a wire transfer. Match the friction to the blast radius. Otherwise people either turn off the friction because most actions are low-stakes, or they burn out approving everything.<\/li>\n<li><strong>Audit trail:<\/strong> Logs, version control, immutable history, dated snapshots. The question is not whether something will go wrong but how long it takes you to notice. Three minutes of agent misbehaviour is recoverable. Three weeks is a project. 
Three months is a regulatory event.<\/li>\n<li><strong>Sandbox where possible:<\/strong> Test accounts, staging environments, separate workspaces, scoped containers. The cost of a sandbox is almost always lower than the cost of one wrong action in production.<\/li>\n<\/ol>\n<p>The novelty in 2026 is not the practices themselves. It is that small and medium businesses now operate systems requiring them, without the infrastructure teams that grew up around them in larger organizations.<\/p>\n<h2>What This Means for Knowledge Workers and SMEs<\/h2>\n<p>The skill that distinguishes good agentic AI users in 2026 is not prompting. It is <strong>authorization design<\/strong>: deciding what the agent is allowed to do, under what conditions, with what verification, in whose name.<\/p>\n<h3>Four questions to ask before delegating any task to an agent:<\/h3>\n<p><strong>1. What is the worst outcome if this goes wrong?<\/strong><br \/>\n<strong>2. Is it reversible? In what timeframe? At what cost?<\/strong><br \/>\n<strong>3. Whose name is on the action when it happens?<\/strong><br \/>\n<strong>4. What is my detection latency \u2014 how long before I notice something is wrong?<\/strong><\/p>\n<p>For small and medium businesses, this is knowledge management work, not IT work. The agent is operating on your processes, your client relationships, your data, your reputation. The architecture decisions (what the agent can touch, who confirms what, where the audit trail lives) are decisions about how your business operates, not just how your tools are configured.<\/p>\n<p>If your company is rolling out agentic AI without explicit answers to those four questions for each use case, you are not deploying technology. You are placing a bet, blind, on a behaviour you have not characterized.<\/p>\n<h3>The Honesty Throughline<\/h3>\n<p>In the last article, I argued that smarter models are not more honest. That meant verification was still your job when the model talked. 
<strong>Agentic models extend the argument: verification can no longer happen after the fact, because the act has already happened.<\/strong><\/p>\n<p>So structural verification, the guardrail stack above, becomes the new honesty mechanism. Not because the model has become trustworthy, but because you have built an environment in which its untrustworthy moments cost less than its trustworthy ones save.<\/p>\n<p>This is what <strong>calibrated trust<\/strong> actually means in practice. It is not &#8220;trust more&#8221;. It is &#8220;design a system in which the consequences of misplaced trust are bounded, observable, and recoverable&#8221;.<\/p>\n<h3>The Closer<\/h3>\n<p><strong>The AI is not your partner. It is your highly capable delegate. You are still the principal, and in 2026, that distinction is beginning to matter in ways the vendors are in no hurry to explain.<\/strong><\/p>\n<p><strong>Authorize accordingly.<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The 2026 shift isn&#8217;t from prompting to partnership. It&#8217;s from conversation to delegation \u2014 and delegation comes with strings the vendors aren&#8217;t talking about. Imagine a 40-person consultancy that wires up an automated assistant to read incoming support emails, pull relevant context from the company wiki, and draft replies. 
To save time, the team flips &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[102,55,68],"tags":[35,104,57,59,63],"class_list":["post-255","post","type-post","status-publish","format-standard","hentry","category-agentic-ai","category-ai-en","category-ai-in-practice","tag-advice","tag-agenticai-en","tag-ai-en","tag-gpt-en","tag-llm-en"],"_links":{"self":[{"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=\/wp\/v2\/posts\/255","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=255"}],"version-history":[{"count":2,"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=\/wp\/v2\/posts\/255\/revisions"}],"predecessor-version":[{"id":262,"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=\/wp\/v2\/posts\/255\/revisions\/262"}],"wp:attachment":[{"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=255"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=255"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/knowtech.waszmann.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=255"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}