<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
	<channel>
		<title>Threat Modeling on brain overflow</title>
		<link>https://brainoverflow.blog/categories/threat-modeling/</link>
		<description>Recent content in Threat Modeling on brain overflow</description>
		<generator>Hugo -- 0.162.1</generator>
		<language>en-us</language>
		<lastBuildDate>Wed, 20 May 2026 11:29:02 -0700</lastBuildDate>
		<atom:link href="https://brainoverflow.blog/categories/threat-modeling/index.xml" rel="self" type="application/rss+xml" />
		
		
		<item>
			<title>AI-Native Threat Modeling</title>
			<link>https://brainoverflow.blog/posts/ai-native-threat-modeling/</link>
			<pubDate>Wed, 20 May 2026 11:29:02 -0700</pubDate><guid>https://brainoverflow.blog/posts/ai-native-threat-modeling/</guid>
			<description></description><content type="text/html" mode="escaped"><![CDATA[<p><em>When I ask hiring managers why they&rsquo;re opening a product security role, the answer is
usually the same: we can&rsquo;t keep up. Development org grew, product surface expanded, and
the security team is the bottleneck. It&rsquo;s not a problem unique to any one organization
— it&rsquo;s the default state of product security. AI-accelerated development and vibe
coding are making it worse: more code, shipped faster, with the same security team
trying to keep up. The conventional wisdom is that vibe coding is a killer for AppSec
— and on the current trajectory, it is.</em></p>
<p><em>In this post, I argue that linear scaling won&rsquo;t solve that problem, and make the case
that AI-generated code, treated the right way, can be a force multiplier for security.</em></p>
<hr>
<h2 id="1-the-appsec-scaling-problem">1. The AppSec Scaling Problem<a href="#1-the-appsec-scaling-problem" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>The 1:100 ratio — one AppSec engineer for every hundred developers — is the number
the industry has quietly accepted as roughly accurate for mature organizations. It
sounds manageable until you sit with what it means in practice: a team of five
reviewing the output of five hundred, under sprint pressure, across a surface that
keeps growing. It&rsquo;s a demanding job — I wrote about <a href="/posts/thoughts-on-product-security-career/">what it actually
takes</a>.</p>
<p>The standard response is to hire more security engineers. That&rsquo;s reasonable when the
ratio is temporarily out of balance, but it doesn&rsquo;t address the structural problem. If
the development org doubles and the security team grows from five to ten, you&rsquo;re back at
the same ratio. And the ratio assumes a roughly stable development velocity. AI coding
assistants are shattering that assumption.</p>
<p>Developers using GitHub Copilot, Cursor, or Claude Code ship more, faster. Vibe
coding, where the model writes code from a high-level natural language prompt,
compresses timelines further. Features that took two weeks take days.
The code surface is expanding at a rate that&rsquo;s no longer proportional to engineering headcount,
which means the AppSec scaling problem is now driven from two sides: development
velocity is increasing while security team capacity stays roughly flat. The gap is structural, and
it is getting wider.</p>
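<p>The arithmetic behind that gap can be sketched in a few lines. The headcounts come from the example above; the <code>velocity_multiplier</code> of 2.0 is a purely illustrative assumption, not a measured figure.</p>

```python
def review_load(developers, appsec_engineers, velocity_multiplier=1.0):
    """Developer-equivalents of output that each AppSec engineer has to cover."""
    return developers * velocity_multiplier / appsec_engineers


# Five AppSec engineers for five hundred developers: the 1:100 ratio.
baseline = review_load(500, 5)

# The org doubles and security hiring keeps pace: the ratio is unchanged.
after_hiring = review_load(1000, 10)

# Same headcount, but AI assistance doubles output per developer (assumed figure).
with_ai = review_load(1000, 10, velocity_multiplier=2.0)

print(baseline, after_hiring, with_ai)
```

<p>Hiring in proportion holds the first two numbers equal; only the velocity term moves the third, which is why headcount alone doesn&rsquo;t close the gap.</p>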
<h2 id="2-where-traditional-approaches-break-down">2. Where Traditional Approaches Break Down<a href="#2-where-traditional-approaches-break-down" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>The vocabulary for addressing the AppSec scaling problem is well developed:
<strong>shift-left</strong>, <strong>secure-by-design</strong>, <strong>developer enablement</strong>, creating
<strong>paved roads</strong>. They&rsquo;re not wrong ideas. The problem is that they all require the same
scarce resource: AppSec time.</p>
<p><strong>Threat modeling</strong> — the recommended practice for high-risk features — is the
clearest example. The canonical process: the development team writes a design document;
the security team (or a joint session) works through the STRIDE framework or similar,
maps data flows and trust boundaries, produces a model; there&rsquo;s back-and-forth and
eventual sign-off. This is genuinely valuable when it happens. In practice, it often doesn&rsquo;t —
the process is time-consuming, and AppSec time is scarce.</p>
<p>What actually happens is one of three failure modes:</p>
<ol>
<li><strong>Delay</strong> — security reviews become release blockers, friction accumulates,
relationships with engineering teams deteriorate.</li>
<li><strong>Risk-accept</strong> — features ship with &ldquo;accepted risk&rdquo; security exceptions that go
into a backlog and are rarely revisited.</li>
<li><strong>No review at all</strong> — code ships without security involvement, entire product areas
built and deployed without the security team ever being in the loop.</li>
</ol>
<p>With AI now compressing time-to-exploitation — public vulnerabilities can have working
proof-of-concept code within hours — the third option is no longer a viable gamble.</p>
<p>Security code reviews have the same structural problem one step later: someone writes
code, another team reads it, back-and-forth, sign-off. Every handoff is a scheduling
dependency that adds release latency.</p>
<h2 id="3-the-threat-model-maintenance-problem">3. The Threat Model Maintenance Problem<a href="#3-the-threat-model-maintenance-problem" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>There&rsquo;s a second-order problem with threat modeling that gets less attention than the
initial production cost: <strong>drift</strong>.</p>
<p>A threat model is created as a snapshot, but the system keeps evolving: new endpoints are added,
authentication flows are refactored. Six months after a threat model is signed off, it describes
a system that no longer exists in that form.
Who owns maintenance is usually a gray area: the development team didn&rsquo;t
write the model and isn&rsquo;t trained to maintain it; the security team isn&rsquo;t told about changes
and has to context-switch back into a system it last looked at months ago.
Neither path works well in practice.</p>
<p>Most organizations treat the threat model as a gate the security team required at feature launch —
it was produced, the box was checked, and maintenance was never part of the contract.
It documents what the system looked like at one point in time and then quietly expires.</p>
<h2 id="4-the-key-insight">4. The Key Insight<a href="#4-the-key-insight" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>Here&rsquo;s where the mental model needs to shift.</p>
<p>In the current workflow, threat modeling is <strong>derivative work</strong>: a security person reads
what a developer built and reconstructs the security-relevant picture from it — after
the fact, inherently lossy, potentially inaccurate, and always one step behind.</p>
<p>Open-source projects such as <a href="https://github.com/davidmatousek/tachi">Tachi</a> and several
commercial offerings recognize this and offer tools that automate the reconstruction:
read the codebase, analyze diffs, apply a methodology, output a structured model.
These tools are useful, but they&rsquo;re still doing
the same derivative work, just faster — reverse-engineering security structure from
existing code rather than having a human do it. There&rsquo;s also a cost dimension:
analyzing an existing codebase means feeding it back through an LLM as new input,
which is expensive at scale. The larger and more frequently updated the codebase, the
higher the token cost of each analysis pass.</p>
<p>Now consider what changes when AI is writing the code — through vibe coding,
spec-driven development, AI-generated scaffolding from a design document, or an
agentic coding loop that implements a full feature end-to-end.</p>
<p>The AI doesn&rsquo;t reverse-engineer anything. It knows, because it built it: every data flow
it designed, every entry point it created, every asset it touched, every trust boundary
it crossed or established, every authentication decision it made. The complete map
required for a threat model exists as a natural byproduct of the design work the AI
just did, and it exists <em>at the moment of creation</em>, not after. And because that
context is already in the model&rsquo;s working window, generating the threat model alongside
the code is parallel effort on the same inputs, with little additional token cost.</p>
<p>The consequence of this observation is straightforward: <strong>threat models should be
generated alongside code, as first-class artifacts, not assembled later as derivative
documents</strong>.</p>
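<p>To make &ldquo;first-class artifact&rdquo; concrete, here is a minimal sketch of what a coding assistant could emit next to the code it writes. The schema (<code>ThreatModel</code>, <code>DataFlow</code>, <code>Threat</code>) and the password-reset example are hypothetical, invented for illustration; no existing tool is known to use this format.</p>

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DataFlow:
    source: str
    target: str
    data: str
    crosses_trust_boundary: bool


@dataclass
class Threat:
    stride_category: str
    target: str
    description: str
    mitigation: str = "none recorded"


@dataclass
class ThreatModel:
    """Hypothetical schema for a threat model stored alongside the code."""
    feature: str
    entry_points: list = field(default_factory=list)
    assets: list = field(default_factory=list)
    data_flows: list = field(default_factory=list)
    threats: list = field(default_factory=list)

    def to_json(self):
        # Stable key order keeps diffs small when the model is regenerated.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


model = ThreatModel(
    feature="password-reset",
    entry_points=["POST /password-reset"],
    assets=["reset tokens", "user email addresses"],
    data_flows=[DataFlow("browser", "api", "email address", True)],
    threats=[
        Threat(
            stride_category="Spoofing",
            target="POST /password-reset",
            description="Attacker requests a reset for another user's account",
            mitigation="single-use token sent to the verified address",
        )
    ],
)

print(model.to_json())
```

<p>Serializing to JSON is one design choice among several; the point is only that the artifact is structured, diffable, and can be committed in the same change as the code it describes.</p>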
<pre class="mermaid">gantt
    title 1. Current — human-driven, sequential
    dateFormat YYYY-MM-DD
    axisFormat %d
    section Developer
    Design doc            :a1, 2024-01-01, 3d
    Write code            :a2, 2024-01-04, 3d
    section Reconstruct (Security)
    Reconstruct & model   :a3, 2024-01-07, 3d
    section Review (Security)
    Review & sign-off     :a4, 2024-01-10, 2d
</pre>
<pre class="mermaid">gantt
    title 2. AI-assisted — LLM writes code, LLM reads code
    dateFormat YYYY-MM-DD
    axisFormat %d
    section Developer
    Generate code         :b1, 2024-01-01, 3d
    section Reconstruct (AI-assisted)
    LLM reconstructs TM   :b2, 2024-01-04, 2d
    section Review (Security)
    Review & sign-off     :b3, 2024-01-06, 2d
    section Time saved
    time saved            :done, 2024-01-08, 4d
</pre>
<pre class="mermaid">gantt
    title 3. AI-native — code and threat model in parallel
    dateFormat YYYY-MM-DD
    axisFormat %d
    section Code
    Generate code         :c1, 2024-01-01, 3d
    section Threat Model
    Generate threat model :c2, 2024-01-01, 3d
    section Review (Security)
    Review & sign-off     :c3, 2024-01-04, 2d
    section Time saved
    time saved            :done, 2024-01-06, 6d
</pre>
<p>Accuracy improves because the threat model is a direct output of the entity that designed the
system, not a reconstruction. Maintenance improves because every code change can
regenerate or update it in the same operation; the entity making the change already
knows what changed and why. The multi-step,
multi-team back-and-forth collapses into a single step. Security practitioners remain
in the loop for methodology, formal sign-off, and challenging assumptions the AI didn&rsquo;t
surface, but the labor-intensive baseline work of constructing the model moves from a
human bottleneck to an automatic output.</p>
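<p>One way to make the maintenance claim checkable is to record a fingerprint of the code a threat model was generated from, then compare it on every change. This is a rough sketch under that assumption; <code>fingerprint</code> and <code>is_stale</code> are hypothetical helpers, and the sources are passed in as an in-memory mapping to keep the example self-contained.</p>

```python
import hashlib


def fingerprint(sources):
    """Hash file names and contents in a stable order.

    `sources` maps a file name to its text content.
    """
    digest = hashlib.sha256()
    for name in sorted(sources):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(sources[name].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def is_stale(recorded_fingerprint, sources):
    """True when the code no longer matches what the threat model was built from."""
    return recorded_fingerprint != fingerprint(sources)


sources = {
    "auth.py": "def login(): pass",
    "api.py": "def reset_password(): pass",
}
recorded = fingerprint(sources)

# A new endpoint appears after sign-off: the recorded fingerprint no longer matches.
changed = dict(sources)
changed["api.py"] = "def reset_password(): pass\ndef export_users(): pass"

print(is_stale(recorded, sources), is_stale(recorded, changed))
```

<p>A check like this doesn&rsquo;t update the model by itself; it only turns silent drift into a visible signal that regeneration is due.</p>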
<p>This is what &ldquo;shift-left&rdquo; should actually mean: not <em>have the security team review
earlier</em>, but <em>produce the security model at the same moment the system is designed</em>.
The security artifact is contemporaneous with the code, not chasing it.</p>
<h2 id="5-on-model-bias-in-security-analysis">5. On Model Bias in Security Analysis<a href="#5-on-model-bias-in-security-analysis" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>A legitimate concern about this approach is AI model bias. There&rsquo;s a well-documented
pattern in AI-assisted security review: when a model writes code and is then asked to
evaluate it for security in the same context window, it tends to anchor to its own design
decisions, finding reasons why its choices are sound rather than challenging them. An
independent reviewer operating from a fresh context — a second model, or a human who
didn&rsquo;t write the code — is more likely to surface issues the original author missed. This
is a real limitation, and it applies directly to using AI for code security review.</p>
<img src="images/model-bias.png" alt="Code review vs. threat modeling under model bias" width="460" style="max-width:100%;display:block;margin:0 auto;border-radius:12px;">

<p>The key point is that <strong>code security review</strong> and <strong>threat modeling</strong> ask the model different questions.
A security review asks the model to evaluate whether its own implementation is correct and
secure — the question where anchoring bites hardest, because the model is judging choices
it already committed to. A threat model asks something structurally different: document the
architecture, establish trust boundaries, map data flows and assets, then apply a framework
like STRIDE that poses a fixed set of questions across threat categories.
The framework is external to the code; its questions don&rsquo;t change based on how well or
poorly the implementation is written. The question it asks — given what this system does,
what can go wrong in each of these categories? — is answered from the architectural map,
not from a judgment about implementation quality.</p>
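<p>The &ldquo;fixed set of questions&rdquo; property is easy to show in code. The six STRIDE categories are standard; the question wording and the element names below are my own illustrative phrasing, not an official formulation.</p>

```python
# The six STRIDE categories, each paired with an illustrative question template.
STRIDE = {
    "Spoofing": "Can an attacker pretend to be {element} or one of its callers?",
    "Tampering": "Can data handled by {element} be modified without detection?",
    "Repudiation": "Can an action on {element} be denied for lack of evidence?",
    "Information disclosure": "Can {element} expose data to someone not entitled to it?",
    "Denial of service": "Can {element} be made unavailable to legitimate users?",
    "Elevation of privilege": "Can {element} be used to gain capabilities that were not granted?",
}


def stride_questions(elements):
    """Return (element, category, question) for every element and category."""
    return [
        (element, category, template.format(element=element))
        for element in elements
        for category, template in STRIDE.items()
    ]


questions = stride_questions(["login API", "session store"])
for element, category, question in questions:
    print(f"[{category}] {question}")
```

<p>Nothing in <code>stride_questions</code> looks at the implementation: the inputs are architectural elements, and the same six questions come out regardless of how the code behind them was written.</p>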
<p>What bias <em>could</em> still affect is the model&rsquo;s assessment of severity — an AI that made a
particular design trade-off might rate the resulting risk lower than an independent reviewer
would. That&rsquo;s a real concern, and it&rsquo;s exactly why human review of the model&rsquo;s
outputs and assumptions is still valuable in this workflow.</p>
<h2 id="6-why-threat-modeling-still-matters">6. Why Threat Modeling Still Matters<a href="#6-why-threat-modeling-still-matters" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>A reasonable objection at this point: if AI writes the code, why not just ask it to
write <em>secure</em> code and skip the threat model entirely? We should absolutely ask for
that — but threat modeling serves purposes that &ldquo;write secure code&rdquo; doesn&rsquo;t address.</p>
<p><strong>Security architecture documentation.</strong> Threat models capture architectural decisions
and their security implications: trust boundaries, data classifications, what the system
assumes about its environment, where the blast radius of a failure ends. These don&rsquo;t
live in code. A system can be implemented correctly while making architectural
trade-offs that accept certain risks; those trade-offs need to be explicit, owned, and
findable.</p>
<p><strong>Known gaps and accepted risks.</strong> Every system ships with trade-offs:
incomplete defenses, deferred work, risks that were evaluated and accepted.
A threat model makes these explicit: here is what we considered, here is what we&rsquo;re
not defending against, and here is why. This matters for accountability, for
prioritization, and for the engineer who joins the team six months from now.</p>
<p><strong>Compensating controls.</strong> Good security architecture is layered.
WAF rules, rate limiting, network segmentation, monitoring and alerting — these don&rsquo;t live
in application code, but they&rsquo;re part of the security posture. The threat model is
where they&rsquo;re connected to the threats they compensate for. This is also where
code-analysis-based automated tools tend to generate false positives:
they see the change in isolation, unaware of the external controls that already
mitigate a given risk.</p>
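<p>As a rough sketch of how recorded compensating controls could cut that noise, the snippet below splits scanner findings into those the threat model already links to an external control and those it doesn&rsquo;t. The finding names, the control names, and the <code>triage</code> helper are all hypothetical.</p>

```python
# Hypothetical excerpt of a threat model: threat name mapped to external controls.
COMPENSATING_CONTROLS = {
    "brute-force login": ["rate limiting at the API gateway"],
    "sql injection probe": ["WAF rule set"],
}


def triage(findings, controls=COMPENSATING_CONTROLS):
    """Split findings into (mitigated elsewhere, still open)."""
    mitigated, open_findings = [], []
    for finding in findings:
        recorded = controls.get(finding, [])
        if recorded:
            mitigated.append((finding, recorded))
        else:
            open_findings.append(finding)
    return mitigated, open_findings


mitigated, open_findings = triage(["brute-force login", "missing audit log"])
print(mitigated)
print(open_findings)
```

<p>A mitigated finding still deserves a human look, since the control may not fully cover it; the sketch only shows where the connection between threat and control would be looked up.</p>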
<p><strong>Compliance requirements.</strong> SOC 2, PCI-DSS, ISO 27001, HIPAA, and similar frameworks
require documented evidence of threat analysis. Auditors want artifacts. A threat model
that exists and is demonstrably current — generated from the same
codebase it describes — is a far stronger compliance artifact than one that was
carefully written at launch and hasn&rsquo;t been touched since.</p>
<p><strong>Incident response preparation.</strong> When something goes wrong — and eventually something
does — a current threat model tells you what&rsquo;s at risk, what attacker paths exist, and
what to prioritize. You want this analysis done before the incident, not during it.</p>
<p><strong>Stakeholder communication.</strong> Engineering leadership, legal, product, and board-level
security committees need to understand risk in terms they can act on. The codebase
doesn&rsquo;t serve this purpose; a structured threat model does.</p>
<p>The case for threat modeling doesn&rsquo;t weaken when AI writes the code — if anything,
AI makes the security artifacts cheaper to produce, easier to keep current,
and more consistently complete than the human-driven alternative.</p>
<h2 id="final-thoughts">Final thoughts<a href="#final-thoughts" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>I think this is the direction the AI coding toolchain is already moving in, even
if the full vision hasn&rsquo;t arrived yet. AI coding tools are increasingly integrating
security into the development workflow: GitHub Copilot&rsquo;s real-time vulnerability
detection during code generation, Claude Code&rsquo;s security analysis during code review,
Replit&rsquo;s Security Agent in the development environment. None of these offer AI-native threat
model generation, but they signal that the industry is treating security as something
the coding tool prioritizes and produces alongside code.
The extension of that to living, maintained threat models is the logical next step.</p>
<p>The reframe for ProdSec practitioners is this: stop thinking of threat
modeling as a process your team performs on code that developers write. Start thinking
of it as an artifact the AI coding assistant produces alongside the code, which your
team validates, challenges, and signs off on. The security team&rsquo;s job shifts from
construction to judgment — which is where human expertise actually compounds.</p>
<p>The dreaded 1:100 ratio won&rsquo;t disappear. But the work of constructing and maintaining
the threat model doesn&rsquo;t have to stay a human-hours problem. The needle can move —
but only if the security team&rsquo;s role evolves with it.</p>
<p><img src="images/surf.png" alt="Image generated by Google Gemini"></p>
]]></content>
		</item>
		
		<item>
			<title>TrustFall: The Perimeter Problem in Agentic Tools</title>
			<link>https://brainoverflow.blog/posts/perimeter-problem-in-agentic-tools/</link>
			<pubDate>Mon, 18 May 2026 10:00:00 -0700</pubDate><guid>https://brainoverflow.blog/posts/perimeter-problem-in-agentic-tools/</guid>
			<description></description><content type="text/html" mode="escaped"><![CDATA[<p><em>On May 7, 2026, Adversa AI published <a href="https://adversa.ai/blog/trustfall-coding-agent-security-flaw-rce-claude-cursor-gemini-cli-copilot/">TrustFall</a>, a one-click remote code execution in Claude Code with variants across Gemini CLI, Cursor, and GitHub Copilot: clone a repository, open it, click &ldquo;Yes, I trust this folder&rdquo;, and an attacker-controlled process runs with your full OS privileges.</em></p>
<p><em>Anthropic declined the finding, describing it as outside their threat model and the behavior as functioning by design. This post digs into that response and argues that the core issue is architectural: a perimeter security model that can&rsquo;t carry the weight placed on it, and that makes the vulnerability structurally hard to surface through threat modeling. It also considers what a zero trust alternative would look like for agentic tools.</em></p>
<hr>
<h2 id="1-the-vulnerability">1. The vulnerability<a href="#1-the-vulnerability" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>Project-level config files — <code>.mcp.json</code> and <code>.claude/settings.json</code> — committed to a repository can activate attacker-controlled MCP servers. When a developer clones the repo and clicks through the trust dialog, an unsandboxed Node.js process spawns with full user OS privileges — no further prompts. Three developer actions: clone, open, click. The attack also has a zero-click CI/CD variant where Claude Code runs headless in GitHub Actions against an untrusted pull request branch, bypassing the trust dialog entirely.</p>
<p>For the full technical details see <a href="https://adversa.ai/blog/trustfall-coding-agent-security-flaw-rce-claude-cursor-gemini-cli-copilot/">TrustFall</a>.</p>
<p>In threat modeling terms, the mechanism is a trust boundary problem. Project-scope config — committed to a repository — can activate MCP servers: external processes that run with full user OS privileges. The gate between untrusted repo content and those privileges is a single user prompt:</p>
<pre class="mermaid">flowchart TB
    subgraph Untrusted["Untrusted · attacker-controlled"]
        PCfg["Project config\n(committed to the repo)"]
    end

    CLI["Claude Code CLI"]

    subgraph Gate["Trust gate"]
        Prompt["'Do you trust this folder?'\nsingle Yes/No"]
    end

    subgraph Privileged["Privileged · full user OS access"]
        MCP["MCP Server process\nNode.js · no sandbox"]
        OS["~/ · ~/.ssh · ~/.aws\nread / write / exec — unrestricted"]
    end

    PCfg --> CLI
    CLI --> Prompt
    Prompt -- "on 'Yes'" --> MCP
</pre>
<p>That single Yes/No covers four distinct capability grants:</p>
<ol>
<li>Claude reading and editing project files <em>(clearly implied)</em></li>
<li>Claude following project-level behavioral settings <em>(reasonable)</em></li>
<li>Activating MCP servers defined in project config <em>(not stated)</em></li>
<li>Those servers running as unsandboxed processes with full user privileges <em>(definitely not stated)</em></li>
</ol>
<p>Two of these carry significant security consequences, and neither appears in the prompt language. The gap between what the user consents to and what the system delivers is a textbook Elevation of Privilege (the E in STRIDE): the user grants more than they know.</p>
<p>The pattern holds across the tools TrustFall examined — Gemini CLI and Cursor do mention MCP servers in their consent language, Claude Code and Copilot don&rsquo;t, but all four default to Yes or Trust.</p>
<hr>
<h2 id="2-outside-our-threat-model">2. &ldquo;Outside our threat model&rdquo;<a href="#2-outside-our-threat-model" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>Anthropic&rsquo;s phrase is worth examining literally. Per TrustFall&rsquo;s analysis, the missing enforcement isn&rsquo;t outside Anthropic&rsquo;s defined boundary — it&rsquo;s inside it. The trust dialog is the perimeter; what happens after it is, by their own framing, the trusted zone. &ldquo;Outside our threat model&rdquo; means, in practice: inside our perimeter, but below the granularity we protect at.</p>
<p>That granularity isn&rsquo;t uniformly coarse — the capability is demonstrably known. <code>bypassPermissions</code> gets a dedicated warning because it is dangerous; <code>enableAllProjectMcpServers</code>, <code>enabledMcpjsonServers</code>, and <code>permissions.allow</code> activate equally dangerous behavior without equivalent disclosure. TrustFall also notes that earlier versions of Claude Code included an explicit MCP consent prompt that was later removed. These are the tell-tale signs of a threat model that&rsquo;s coarser than the reality it represents — some dangerous capabilities are visible enough to gate explicitly, others slip through the same boundary unexamined.</p>
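<p>Until the tooling discloses these settings itself, a reviewer can look for them before opening an untrusted repository. The sketch below flags the setting names mentioned above in the text of a project-scope settings file. The JSON layout it assumes (top-level keys, plus <code>allow</code> nested under <code>permissions</code>) is my reading of the dotted name and should be checked against the Claude Code settings documentation; <code>flag_project_settings</code> is a hypothetical helper.</p>

```python
import json

# Setting names taken from the text above; the JSON layout is an assumption.
FLAGGED_TOP_LEVEL = ("enableAllProjectMcpServers", "enabledMcpjsonServers")


def flag_project_settings(settings_text):
    """Return the flagged setting names that are present and non-empty.

    Expects the text of a JSON object, such as a project-scope settings file.
    """
    settings = json.loads(settings_text)
    flagged = [key for key in FLAGGED_TOP_LEVEL if settings.get(key)]
    permissions = settings.get("permissions")
    if isinstance(permissions, dict) and permissions.get("allow"):
        flagged.append("permissions.allow")
    return flagged


# Placeholder content, not a real configuration.
sample = json.dumps(
    {
        "enableAllProjectMcpServers": True,
        "permissions": {"allow": ["example-rule"]},
    }
)

print(flag_project_settings(sample))
```

<p>This is a pre-open check for a human to act on, not a security control: it only reports that a project asks for these capabilities, which is the disclosure the trust dialog leaves out.</p>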
<p>That pattern of selective gating is further undermined by the CVE record. Anthropic&rsquo;s response to TrustFall was that the behavior functions as designed: clicking &ldquo;trust this folder&rdquo; means accepting the project&rsquo;s configuration, MCP servers included. Yet between October 2025 and March 2026, in the months before TrustFall was published, Anthropic patched three related vulnerabilities: it delayed MCP activation until after the trust dialog (CVE-2025-59536, Oct 2025), blocked <code>ANTHROPIC_BASE_URL</code> from project scope (CVE-2026-21852, Jan 2026), and blocked <code>bypassPermissions</code> from project scope (CVE-2026-33068, Mar 2026), the same setting that already carried a UI warning. Each fix adds a specific gate or blocklist entry, which is the signature of a perimeter being hardened incrementally, one dangerous capability at a time, without a unifying policy. If the trust dialog truly constituted full consent by design, there would have been nothing to patch.</p>
<hr>
<h2 id="3-from-perimeter-to-zero-trust">3. From perimeter to zero trust<a href="#3-from-perimeter-to-zero-trust" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>The &ldquo;functions as designed&rdquo; response places the burden on the developer: audit what you clone. That position rests on a perimeter security architecture — verify once at the gate, trust everything inside:</p>
<pre class="mermaid">flowchart LR
    A1["Repo"] -->|"✓ trust gate"| A2["CLI"] --> A3["MCP"] --> A4["OS"]
</pre>
<p>The perimeter pattern is one modern security has largely moved past. The alternative is zero trust: identity is propagated through the capability chain and evaluated at each grant. Git provides the primitives: clone origin, remote URL, commit author. Conceptually, it would look like this:</p>
<pre class="mermaid">flowchart LR
    B1["Repo\norigin · author"] -->|"✓ id check"| B2["CLI"] -->|"✓ capability?"| B3["MCP"] -->|"✓ scope?"| B4["OS"]
</pre>
<p>Read through the zero trust lens, Anthropic&rsquo;s position has three problems.</p>
<p><strong>Shared responsibility requires the system to carry its half.</strong>
Under a perimeter model, all verification burden falls on the user at the gate — the system has the identity signals, but leaves them unused. Zero trust distributes the burden: each capability is evaluated at the point it&rsquo;s granted.</p>
<p><strong>The consent gate can&rsquo;t convey per-capability trust.</strong>
A perimeter gate concentrates all trust decisions into one moment. Anthropic&rsquo;s gate covers MCP server activation, process spawning, and full OS access — none of it signaled. The coarser the gate, the harder it is to make consent meaningful.</p>
<p><strong>The perimeter model puts expert-level burden on non-expert users.</strong>
A perimeter gate requires the user to reason about all downstream consequences of a single click — a reasonable ask for a security engineer, not for a vibe-coder. Zero trust shifts that burden to the architecture: each capability grant is evaluated by the system, not the user.</p>
<p>The three problems compound: the system ignores available identity signals, the gate doesn&rsquo;t compensate by informing the user what it actually grants, and the users left holding that gap aren&rsquo;t equipped to close it.</p>
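<p>The three checks in the zero trust diagram above (identity, capability, scope) can be sketched as a single evaluation function. Everything here is hypothetical: the allowlists, the capability names, and the <code>evaluate_grant</code> helper illustrate the shape of the idea and do not describe how any of the tools discussed actually work.</p>

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RepoIdentity:
    remote_url: str
    commit_author: str


# Hypothetical policy values, for illustration only.
TRUSTED_REMOTE_PREFIXES = ("https://git.example.com/my-org/",)
TRUSTED_AUTHORS = {"alice@example.com"}
AUTO_GRANTED = {"read_project_files", "edit_project_files"}


def identity_ok(identity):
    """Identity check: does the repo come from a known origin and author?"""
    return (
        identity.remote_url.startswith(TRUSTED_REMOTE_PREFIXES)
        and identity.commit_author in TRUSTED_AUTHORS
    )


def within(path, root):
    """Scope check. Lexical only: it does not resolve symlinks or '..' segments."""
    candidate, base = PurePosixPath(path), PurePosixPath(root)
    return candidate == base or base in candidate.parents


def evaluate_grant(identity, capability, path, project_root):
    """Evaluate one capability request instead of trusting everything past the gate."""
    if not identity_ok(identity):
        return "deny: unverified repo origin"
    if capability not in AUTO_GRANTED:
        return "prompt: capability needs explicit consent"
    if not within(path, project_root):
        return "deny: outside project scope"
    return "allow"


known = RepoIdentity("https://git.example.com/my-org/app.git", "alice@example.com")
unknown = RepoIdentity("https://git.example.net/stranger/app.git", "mallory@example.net")

print(evaluate_grant(known, "read_project_files", "/work/app/src/main.py", "/work/app"))
print(evaluate_grant(known, "spawn_mcp_server", "/work/app", "/work/app"))
print(evaluate_grant(unknown, "read_project_files", "/work/app/src/main.py", "/work/app"))
```

<p>The user still gets asked, but only for the capability that actually needs consent, and the prompt can name it. That is the difference between one coarse gate and a check at each grant.</p>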
<hr>
<h2 id="final-thoughts">Final thoughts<a href="#final-thoughts" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<p>This post looked at two connected things: what the trust gate actually grants, and why the security architecture makes that easy to miss. At its core, TrustFall is a consent gap: a single prompt covering MCP server activation and unsandboxed OS access without stating either. The perimeter model is the structural reason that gap is hard to surface at design time: inside the perimeter is trusted by definition, leaving no natural STRIDE targets. When threat modeling a perimeter-architected system, finding the EoP requires the modeler to look past the gate and ask what it actually grants. That&rsquo;s a skill, not something the methodology surfaces automatically. The CVE record shows this playing out: each patch adds a specific gate or blocklist entry without restructuring the boundary, and &ldquo;functioning as designed&rdquo; remains the public response to TrustFall.</p>
<p>A zero trust security architecture changes the shape of the problem. Explicit trust boundaries at each capability grant are natural STRIDE targets — the EoP question surfaces at design time regardless of modeler experience, not because of better analysts, but because the architecture itself gives threat modeling more boundaries to work with.</p>
<p>TrustFall affected Claude Code, Cursor, Gemini CLI, and Copilot — evidently all due to the same perimeter model. The broader security industry made this architectural transition before: perimeter security dominated until systems grew complex enough that a single gate couldn&rsquo;t hold, and zero trust emerged as the answer. Agentic tools are on a similar trajectory — gaining capability and OS access fast, with the same pressure building at the trust boundary.</p>
<p>Is zero trust the natural next step in the architectural evolution of agentic tools?</p>
<hr>
<h2 id="references">References<a href="#references" class="anchor" aria-hidden="true"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"
      stroke-linecap="round" stroke-linejoin="round" class="feather">
      <path d="M15 7h3a5 5 0 0 1 5 5 5 5 0 0 1-5 5h-3m-6 0H6a5 5 0 0 1-5-5 5 5 0 0 1 5-5h3"></path>
      <line x1="8" y1="12" x2="16" y2="12"></line>
   </svg></a></h2>
<ul>
<li>Adversa AI — <a href="https://adversa.ai/blog/trustfall-coding-agent-security-flaw-rce-claude-cursor-gemini-cli-copilot/">TrustFall: Coding Agent Security Flaw Enabling RCE in Claude, Cursor, Gemini CLI, Copilot</a></li>
<li>Anthropic — <a href="https://code.claude.com/docs/en/settings">Claude Code Settings</a></li>
</ul>
<hr>
]]></content>
		</item>
		
	</channel>
</rss>
