A research agent was given tool access to a code execution environment. A prompt injection attack instructed it to delete files, exfiltrate data through network requests, and mine cryptocurrency. The agent had not been designed with the assumption that its tools could be weaponized against its users.
Section Overview
AI agents interact with external tools: code interpreters, web browsers, file systems, APIs, and databases. Each tool is a potential attack vector. This section covers unauthorized tool access, tool poisoning, privilege escalation through tools, and sandboxing strategies to contain tool misuse.
Threat Vectors in Tool Use
Unauthorized Tool Access
When agents can invoke tools, they gain capabilities beyond text generation. Without proper controls, attackers can manipulate AI systems to call tools in unintended ways.
If an attacker can influence the agent's reasoning, they can craft inputs that cause the agent to invoke tools with attacker-controlled parameters. The agent becomes a puppet executing attacker-chosen actions.
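One way to limit attacker-controlled parameters is to check every tool call against a deny-by-default policy before it runs. The following is a minimal sketch, not a complete defense; the tool names (`read_file`, `search`), the `reports/` prefix, and the helper names are assumptions made for illustration.

```python
from typing import Any, Callable, Dict

Validator = Callable[[Any], bool]


def _safe_report_path(value: Any) -> bool:
    # Hypothetical rule: only relative paths under reports/, no ".." segments.
    return (
        isinstance(value, str)
        and value.startswith("reports/")
        and ".." not in value.split("/")
    )


def _short_query(value: Any) -> bool:
    # Hypothetical rule: non-empty search strings of bounded length.
    return isinstance(value, str) and 0 < len(value) <= 200


# Each known tool maps its expected parameter names to a validator.
TOOL_POLICY: Dict[str, Dict[str, Validator]] = {
    "read_file": {"path": _safe_report_path},
    "search": {"query": _short_query},
}


def is_allowed_tool_use(tool: str, params: Dict[str, Any]) -> bool:
    """Return True only if the tool is known and every parameter passes."""
    validators = TOOL_POLICY.get(tool)
    if validators is None:
        return False  # unknown tools are denied by default
    if set(params) != set(validators):
        return False  # missing or unexpected parameters are denied
    return all(check(params[name]) for name, check in validators.items())
```

Because the policy is evaluated outside the model, a manipulated agent can still request a dangerous call, but the request is rejected before any tool code executes.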
HealthMetrics provides an AI coding assistant for data analysis. The code interpreter tool is sandboxed: it cannot access network resources, its file system access is limited to designated directories, and its execution time is capped. When a malicious prompt attempted to make HTTP requests to exfiltrate data, the sandbox blocked them.
Tool Poisoning
Tool poisoning occurs when the tools themselves are compromised before the agent uses them. This can happen through dependency confusion in tool libraries, man-in-the-middle attacks on tool update mechanisms, compromised tool configurations, or malicious tool updates pushed to registries.
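A common mitigation for compromised updates is to pin each tool artifact to a known-good digest and refuse to load anything that does not match. The sketch below assumes the pinned digests are distributed through a trusted channel; the function name `verify_tool_artifact` and the tool name `csv_summarizer` are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def verify_tool_artifact(name: str, artifact: bytes, pinned: Dict[str, str]) -> bool:
    """Return True only if the artifact's SHA-256 matches the pinned digest."""
    expected = pinned.get(name)
    if expected is None:
        return False  # tools without a pinned digest are never loaded
    actual = hashlib.sha256(artifact).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(actual, expected)


# Usage: the pin is recorded when the tool is reviewed, then checked at load time.
reviewed_build = b"def summarize(rows): return len(rows)\n"
PINNED = {"csv_summarizer": hashlib.sha256(reviewed_build).hexdigest()}

tampered_build = reviewed_build + b"import socket\n"
print(verify_tool_artifact("csv_summarizer", reviewed_build, PINNED))
print(verify_tool_artifact("csv_summarizer", tampered_build, PINNED))
```

Digest pinning detects modified artifacts, but it does not help if the reviewed build was already malicious, so it complements rather than replaces code review and registry controls.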
Privilege Escalation Through Tools
An agent designed for limited operations can be manipulated to perform privileged actions. A simple file reader might be tricked into reading sensitive system files. A search tool might be used to enumerate directory structures for attack planning.
Each tool should have exactly the permissions it needs and no more. An agent that summarizes documents does not need write access to the file system. An agent that answers questions does not need network access.
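Least privilege can be made explicit by declaring what each agent is granted and what each tool requires, then checking one against the other. This is an illustrative sketch; the scope names, agent names, and tool names are assumptions, not part of any particular framework.

```python
from enum import Flag, auto


class Scope(Flag):
    NONE = 0
    FS_READ = auto()
    FS_WRITE = auto()
    NETWORK = auto()


# Hypothetical grants: the summarizer may read files, the Q&A agent gets nothing.
AGENT_SCOPES = {
    "summarizer": Scope.FS_READ,
    "qa": Scope.NONE,
}

# Hypothetical requirements for each tool.
TOOL_REQUIRES = {
    "read_file": Scope.FS_READ,
    "write_file": Scope.FS_READ | Scope.FS_WRITE,
    "http_get": Scope.NETWORK,
}


def may_use(agent: str, tool: str) -> bool:
    """Allow a tool only if the agent holds every scope the tool requires."""
    required = TOOL_REQUIRES.get(tool)
    if required is None:
        return False  # undeclared tools are denied
    granted = AGENT_SCOPES.get(agent, Scope.NONE)
    return (required & granted) == required
```

With these declarations, the document summarizer can read files but cannot write them or reach the network, which matches the examples in the paragraph above.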
Sandboxing Strategies
Process Isolation
Run tool execution in isolated processes with restricted capabilities. Use containerization, seccomp profiles, or gVisor for strong isolation.
function executeToolInSandbox(toolCall):
    // Validate and normalize the tool call parameters
    validatedParams = validateToolParams(toolCall)
    // Check the tool and its parameters against the allowed-use policy
    if not isAllowedToolUse(toolCall.tool, validatedParams):
        logSecurityEvent("UNAUTHORIZED_TOOL_USE", toolCall)
        return { error: "Tool use not permitted" }
    // Execute in an isolated environment scoped to this tool
    sandbox = createSandbox(
        networkIsolation: true,
        filesystemScope: toolCall.tool.allowedPaths,
        maxMemory: 512MB,
        maxCpuTime: 30seconds
    )
    result = sandbox.execute(toolCall.function, validatedParams)
    return result
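As a concrete but deliberately limited illustration, the sketch below runs untrusted Python code in a separate process with a CPU-time limit and a wall-clock timeout. It assumes a Unix-like system (the `resource` module is not available on Windows), and the function name `run_untrusted` is hypothetical. It provides no network or filesystem isolation on its own; those still require containers, seccomp profiles, or gVisor as described above.

```python
import resource
import subprocess
import sys


def run_untrusted(code: str, cpu_seconds: int = 2, wall_seconds: float = 5.0) -> dict:
    """Run code in a child interpreter with CPU and wall-clock limits."""

    def limit_resources() -> None:
        # Runs in the child just before exec: cap the CPU time it may consume.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores PYTHON* variables
            capture_output=True,
            text=True,
            timeout=wall_seconds,
            preexec_fn=limit_resources,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": "wall-clock timeout"}

    if proc.returncode != 0:
        # A negative return code means the child was killed by a signal,
        # for example when it exceeded its CPU limit.
        return {"ok": False, "error": f"exit code {proc.returncode}"}
    return {"ok": True, "stdout": proc.stdout}
```

The two limits cover different failure modes: the CPU limit stops busy loops, while the wall-clock timeout also stops code that sleeps or blocks without using CPU time.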
Filesystem Restrictions
Limit filesystem access to specific directories. Validate all file paths to prevent directory traversal attacks.
Use chroot or container root filesystems. Implement path allowlisting. Block symbolic links and named pipes. Monitor for anomalous file access patterns.
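Path validation against directory traversal can be sketched as follows. The example resolves the requested path (including any symbolic links) and rejects it unless the result stays inside the allowed base directory. The function name `resolve_within` and the base directory are assumptions for illustration, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_within(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, refusing anything that escapes it."""
    base = base_dir.resolve()
    # Joining an absolute user_path discards base, so "/etc/passwd" is caught
    # by the containment check below. resolve() also follows symlinks and
    # collapses ".." segments before the check is made.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise PermissionError(f"path escapes allowed directory: {user_path!r}")
    return candidate


# Usage with a hypothetical data directory.
base = Path("/srv/agent-data")
print(resolve_within(base, "reports/q1.csv"))
try:
    resolve_within(base, "../secrets.txt")
except PermissionError as exc:
    print("blocked:", exc)
```

A check like this is subject to races if the directory contents can change between validation and use, so it works best alongside a chroot or container root filesystem rather than in place of one.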
Network Restrictions
Prevent tool-executed code from making network connections unless explicitly required. By default, block both inbound and outbound network traffic.
For tools that legitimately need network access (web search, API calls), implement explicit allowlisting of destination domains. Log all network requests for audit purposes.
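Destination allowlisting with audit logging might look like the sketch below. The host names are placeholders, the function name `is_allowed_destination` is hypothetical, and requiring HTTPS is an added assumption rather than something the text mandates.

```python
import logging
from urllib.parse import urlsplit

logger = logging.getLogger("tool.network")

# Placeholder allowlist: exact host names only, no wildcard matching.
ALLOWED_HOSTS = frozenset({"api.example.com", "search.example.com"})


def is_allowed_destination(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are denied.
        logger.warning("denied malformed url=%r", url)
        return False

    allowed = parts.scheme == "https" and host in ALLOWED_HOSTS
    # Log every decision so that blocked and permitted requests can be audited.
    logger.info("network request url=%r host=%r allowed=%s", url, host, allowed)
    return allowed
```

Matching on the parsed host name rather than on a substring of the URL avoids lookalike destinations such as `api.example.com.evil.net` or `https://api.example.com@evil.net/`.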
DataForge's agent framework implements tool security in layers. First, tools are declared with explicit permission scopes. Second, the agent planner validates tool use against policy before execution. Third, the execution runtime applies sandbox restrictions. Fourth, audit logs capture all tool invocations for security review.
Tool Misuse Prevention Checklist
Apply the principle of least privilege to all tool permissions.
Implement process isolation for tool execution.
Restrict filesystem access to necessary directories.
Block network access unless explicitly required.
Validate and sanitize all tool parameters.
Log all tool invocations for security monitoring.
Regularly audit tool permissions and usage patterns.
Implement rate limiting on expensive or risky tools.