Tactics·Jun 10, 2026

Auditing AI Agents: 83% of Functions Lack Guards

A static analysis tool identified 553 unguarded functions capable of critical side effects in AI agent codebases. This signals a shift in security from UI-centric controls to code-level safeguards.

Scanning three open-source AI agent codebases, a static analysis tool identified 669 functions capable of critical side effects, such as writing to databases or deleting files. Of these, the tool's author claims 553, or 83%, lacked any programmatic guard like input validation or authentication. This highlights a foundational security challenge in AI agent development: the shift from UI-centric controls to code-level safeguards.

Agent Security: A New Attack Surface

Traditional web applications route user actions through layers of UI and middleware, including forms, validation, and confirmation dialogs, before executing a side effect. In an AI agent, by contrast, the Large Language Model (LLM) decides directly which functions to call, with what arguments, and how many times. No UI sits between the model and the side effect, so an agent that loops, hallucinates arguments, or acts on injected text meets none of the conventional checkpoints. The author states the guard must reside directly within the code, adjacent to the function call itself, rather than in a non-existent UI layer.
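To make "adjacent to the function call" concrete, here is a minimal sketch of a tool function that carries its own guards. Every name in it (refundTool, RefundArgs, the limits, the in-memory idempotency store) is a hypothetical chosen for illustration, not something taken from the scanned codebases or from diplomat-agent-ts.

```typescript
// Hypothetical refund tool an agent might call. All names and limits are assumptions.
interface RefundArgs {
  paymentId: string;
  amountCents: number;
  idempotencyKey: string;
}

type ToolResult = { ok: true; refundId: string } | { ok: false; reason: string };

const MAX_REFUND_CENTS = 50000; // assumed business limit
const MAX_CALLS_PER_RUN = 3;    // assumed bound against agent loops

const seenKeys = new Set<string>(); // in-memory idempotency store (sketch only)
let callsThisRun = 0;

// Stand-in for the real side effect, e.g. a payment provider call.
function issueRefund(paymentId: string, amountCents: number): string {
  return `refund_${paymentId}_${amountCents}`;
}

function refundTool(raw: unknown): ToolResult {
  // Guard 1: bound how often the model can invoke this tool in one run.
  callsThisRun += 1;
  if (callsThisRun > MAX_CALLS_PER_RUN) {
    return { ok: false, reason: "call limit reached" };
  }

  // Guard 2: validate model-supplied arguments before touching anything.
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, reason: "arguments must be an object" };
  }
  const a = raw as Partial<RefundArgs>;
  if (typeof a.paymentId !== "string" || !/^pay_[A-Za-z0-9]+$/.test(a.paymentId)) {
    return { ok: false, reason: "invalid paymentId" };
  }
  if (
    typeof a.amountCents !== "number" ||
    !Number.isInteger(a.amountCents) ||
    a.amountCents <= 0 ||
    a.amountCents > MAX_REFUND_CENTS
  ) {
    return { ok: false, reason: "invalid amountCents" };
  }
  if (typeof a.idempotencyKey !== "string" || a.idempotencyKey.length === 0) {
    return { ok: false, reason: "missing idempotencyKey" };
  }

  // Guard 3: refuse to repeat a side effect that already ran.
  if (seenKeys.has(a.idempotencyKey)) {
    return { ok: false, reason: "duplicate request" };
  }
  seenKeys.add(a.idempotencyKey);

  return { ok: true, refundId: issueRefund(a.paymentId, a.amountCents) };
}
```

The guards return a structured refusal instead of throwing, on the assumption that the agent loop feeds tool results back to the model; a production version would likely use a validation library and a persistent idempotency store.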

Scanning for Unguarded Tool Calls

The author developed diplomat-agent-ts, a static analyzer built on ts-morph (a wrapper around the TypeScript compiler API), to inventory these code-level controls. The tool scans the Abstract Syntax Tree (AST) of TypeScript projects, identifying call expressions that match over 40 patterns across 12 categories of side effects. These categories include payment, database_write, file_delete, http_write, and agent_invocation. The scanner operates without configuration files and reportedly processes a 7,874-file codebase in approximately nine seconds.
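The pattern-matching step can be pictured roughly as follows. This sketch assumes the callee text of each call expression (for example "fs.unlinkSync") has already been pulled out of the AST, which is not shown here. The five category names come from the article; the regular expressions and the classifyCallee helper are illustrative guesses, not the tool's actual rules.

```typescript
// Category names are from the article; the patterns are hypothetical examples.
type Category =
  | "payment"
  | "database_write"
  | "file_delete"
  | "http_write"
  | "agent_invocation";

interface Pattern {
  category: Category;
  matches: RegExp; // tested against the callee text of a call expression
}

const PATTERNS: Pattern[] = [
  { category: "payment", matches: /\bstripe\.\w+\.create$/ },
  { category: "database_write", matches: /\.(insert|update|upsert|deleteMany)$/ },
  { category: "file_delete", matches: /^fs(\.promises)?\.(unlink|rm|rmdir)(Sync)?$/ },
  { category: "http_write", matches: /^axios\.(post|put|patch|delete)$/ },
  { category: "agent_invocation", matches: /\.(invoke|runAgent)$/ },
];

// Return the first matching side-effect category, or null for calls that are not flagged.
function classifyCallee(callee: string): Category | null {
  for (const p of PATTERNS) {
    if (p.matches.test(callee)) return p.category;
  }
  return null;
}

// Tally flagged calls per category for a list of callee strings.
function inventory(callees: string[]): Map<Category, number> {
  const counts = new Map<Category, number>();
  for (const callee of callees) {
    const category = classifyCallee(callee);
    if (category !== null) counts.set(category, (counts.get(category) ?? 0) + 1);
  }
  return counts;
}
```

Matching on names alone is cheap, which is consistent with the reported scan speed, but it also means a call is flagged by how it looks rather than by what it does at runtime.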

Defining a Guard

Within the context of diplomat-agent-ts, a "guard" is a syntactically visible, in-file control. This includes common security patterns such as input validation libraries (e.g., Zod, Yup, class-validator), rate limiting decorators, explicit authentication checks, confirmation steps, idempotency keys, or retry bounds. The scanner classifies each identified tool call into one of three states: no_checks (no guard), partial_checks (some but incomplete coverage), or confirmed (explicitly acknowledged with a // checked:ok annotation). The author notes that the confirmed state relies on the scanner's own convention, meaning external codebases, by definition, show zero confirmed calls. The scans were performed on unmodified public clones at pinned commits, with all commands documented in a MANIFEST.md file for reproducibility. The author frames the output as an inventory, not a score, intended to map areas requiring attention.
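The three states can be sketched as a small classifier. The article does not spell out where partial_checks ends, so this sketch assumes the simplest reading: any detected guard without the annotation counts as partial_checks, and only the // checked:ok comment yields confirmed. The Finding shape, the GuardKind names, and both functions are hypothetical.

```typescript
// Hypothetical shapes; only the three state names and the annotation come from the article.
type GuardKind =
  | "validation"
  | "rate_limit"
  | "auth"
  | "confirmation"
  | "idempotency"
  | "retry_bound";

type State = "no_checks" | "partial_checks" | "confirmed";

interface Finding {
  functionName: string;
  guards: GuardKind[];       // guards detected in the same file
  leadingComments: string[]; // comments attached to the call site
}

const ACK = /\/\/\s*checked:ok\b/;

function classify(f: Finding): State {
  // An explicit acknowledgement wins over everything else.
  if (f.leadingComments.some((c) => ACK.test(c))) return "confirmed";
  // Assumption: any guard at all, short of an acknowledgement, is "partial".
  return f.guards.length === 0 ? "no_checks" : "partial_checks";
}

// Produce an inventory (counts per state), not a score.
function summarize(findings: Finding[]): Record<State, number> {
  const counts: Record<State, number> = { no_checks: 0, partial_checks: 0, confirmed: 0 };
  for (const f of findings) counts[classify(f)] += 1;
  return counts;
}
```

Under this reading, an unmodified external codebase can only ever land in the first two states, which matches the author's note about zero confirmed calls.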

Moving Beyond Inventory to Remediation

The diplomat-agent-ts inventory is a starting point, as its "inventory, not a score" framing acknowledges. For founders, the next tactical step is to prioritize and remediate the findings. A static analysis tool identifies potential vulnerabilities based on code patterns; it does not confirm exploitability or cover runtime-specific security issues. Founders should integrate this inventory into a broader security threat model that includes runtime monitoring and behavioral analysis of agent interactions.

The reliance on a custom // checked:ok annotation for the confirmed state, while practical for the scanner, introduces an additional convention for development teams. For broader adoption, integrating this confirmation into existing security review workflows or code comment standards would be more effective than requiring a tool-specific annotation. Furthermore, the current methodology counts only guards that are syntactically visible in the same file as the call. A more comprehensive approach would also consider architectural security controls, such as sandboxed execution environments, granular IAM policies for agent roles, or external policy engines that enforce business logic beyond a single function's scope. This analysis is also specific to TypeScript; similar tooling and methodologies are needed for agents built in Python and other languages.
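An external policy layer of the kind described above might look like the following sketch, in which every tool call passes through one gate before it runs. The roles, tool names, refund limit, and the decide and dispatch functions are all hypothetical. A per-file scanner would not see a control like this, because it lives outside the files that contain the tool calls.

```typescript
// Hypothetical central policy gate; roles, tools, and limits are assumptions.
type Role = "support_agent" | "billing_agent";

interface ToolCall {
  role: Role;
  tool: string;
  args: Record<string, unknown>;
}

type Decision = { allow: true } | { allow: false; reason: string };

const ALLOWED_TOOLS: Record<Role, ReadonlySet<string>> = {
  support_agent: new Set(["lookup_order", "send_email"]),
  billing_agent: new Set(["lookup_order", "issue_refund"]),
};

const REFUND_LIMIT_CENTS = 10000; // assumed business rule

function decide(call: ToolCall): Decision {
  // Rule 1: each role may only call tools on its allowlist.
  if (!ALLOWED_TOOLS[call.role].has(call.tool)) {
    return { allow: false, reason: `${call.role} may not call ${call.tool}` };
  }
  // Rule 2: business logic that spans more than one function's scope.
  if (call.tool === "issue_refund") {
    const amount = call.args["amountCents"];
    if (typeof amount !== "number" || amount > REFUND_LIMIT_CENTS) {
      return { allow: false, reason: "refund exceeds policy limit" };
    }
  }
  return { allow: true };
}

// Run the tool only if the policy allows it; otherwise fail loudly.
function dispatch<T>(call: ToolCall, run: (args: Record<string, unknown>) => T): T {
  const decision = decide(call);
  if (!decision.allow) throw new Error(decision.reason);
  return run(call.args);
}
```

Keeping the rules in one place makes them auditable independently of the tool code, at the cost of being invisible to a scanner that only looks for guards next to the call.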

Landing

The proliferation of AI agents shifts the burden of security from user interfaces to the underlying code. The findings from diplomat-agent-ts underscore the need for explicit, programmatic controls directly adjacent to any function capable of real-world side effects. Establishing a clear inventory of these functions and their associated guards is the initial step for any founder building with LLMs. This visibility enables targeted security enhancements, moving agent development towards a more robust and auditable posture.

The investor read

The rapid development cycle of AI agents often prioritizes functionality over security, creating a significant attack surface. This signal indicates a nascent but critical market for agent security tooling, moving beyond traditional application security to address LLM-specific vulnerabilities. While diplomat-agent-ts is an open-source inventory tool, commercial solutions that offer automated remediation, policy enforcement, or runtime protection for agent tool calls could attract significant capital. Investors should track the emergence of platforms that integrate static analysis with dynamic agent monitoring, especially as enterprises adopt more complex, multi-agent systems. The prevalence of unguarded functions suggests a large, underserved market for specialized security solutions.

Sources · how we verified
  1. What I found scanning 3 AI agent codebases for unguarded tool calls

Reported by the Maya desk on Founderr Pulse’s Tactics beat. Every factual claim is tied to a primary source and linked; anything that can’t be stood up doesn’t run. Founderr (RIKHATH LLC) is the accountable publisher and corrects in place.