
reviews for agents handling high-risk tasks helps ensure that only vetted and approved agent logic
reaches production, preventing deliberately malicious or inadvertently vulnerable agents from being
deployed.
● Role Containerization or Function-as-a-Service (FaaS): To limit the potential blast radius should an
agent instance be compromised, each agent or distinct agent role should operate within a strongly
isolated environment. Utilizing containerization technologies (like Docker with Kubernetes) or FaaS
platforms (like AWS Lambda or Google Cloud Functions) allows for strict resource limits (CPU,
memory, network), network segmentation policies, and ephemeral execution contexts. This
enforces the Principle of Least Privilege at the infrastructure level, preventing a compromised
agent from easily accessing resources or impacting other agents or system components beyond its
designated scope.
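As an illustration of the isolation controls described above, the following minimal sketch builds a `docker run` command line that applies CPU, memory, process, and network limits to a single agent role. The function name `sandboxed_run_args`, the image name, and the default limits are hypothetical; suitable values depend on the workload, and a Kubernetes or FaaS deployment would express the same constraints in its own configuration format.

```python
from typing import List


def sandboxed_run_args(
    image: str,
    role: str,
    cpus: float = 0.5,
    memory: str = "256m",
    network: str = "none",
) -> List[str]:
    """Build a `docker run` argv that confines one agent role.

    All defaults are illustrative assumptions, not recommended values.
    """
    return [
        "docker", "run",
        "--rm",                                 # ephemeral: remove the container on exit
        "--name", f"agent-{role}",
        "--cpus", str(cpus),                    # CPU limit
        "--memory", memory,                     # memory limit
        "--pids-limit", "64",                   # cap the number of processes
        "--network", network,                   # "none" disables networking entirely
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image and role names, printed rather than executed.
    print(" ".join(sandboxed_run_args("example/agent:latest", "summarizer")))
```

An agent role that needs to reach specific tools would swap `none` for a dedicated, policy-restricted network rather than the default bridge.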
● API Access Control, Rate-Limits, and API Gateways: Agents rely heavily on external tools accessed
via APIs and on other integration points, such as an MCP Server or a Google A2A Agent Server
exposed via AgentCard, making these interaction points critical security boundaries. An API/Agent Gateway
should mediate all tool invocations, enforcing strict authentication (verifying the agent's identity)
and authorization (confirming the agent has permission to call that specific tool with those
parameters). Implementing granular rate limiting per agent or per tool prevents denial-of-service
attacks (intentional or accidental) and controls costs associated with API usage. The gateway also
serves as a vital point for logging and auditing all tool interactions.
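The gateway behavior described above can be sketched roughly as follows. This is a simplified illustration, not a production design: it assumes the agent's identity has already been authenticated upstream, checks authorization against a per-agent tool allowlist only (parameter-level checks are omitted), and applies a token-bucket rate limit per agent and tool. The names `ToolGateway`, `TokenBucket`, and `GatewayDenied` are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


class GatewayDenied(Exception):
    """Raised when the gateway refuses a tool invocation."""


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size
    refill_per_sec: float  # sustained calls per second
    tokens: float
    last: float

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


@dataclass
class ToolGateway:
    permissions: Dict[str, Set[str]]         # agent_id -> allowed tool names
    tools: Dict[str, Callable[..., object]]  # tool name -> implementation
    capacity: float = 2.0
    refill_per_sec: float = 1.0
    clock: Callable[[], float] = time.monotonic
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)
    _buckets: Dict[Tuple[str, str], TokenBucket] = field(default_factory=dict)

    def invoke(self, agent_id: str, tool: str, **params: object) -> object:
        now = self.clock()
        # Authorization: the agent must be explicitly allowed to call this tool.
        if tool not in self.permissions.get(agent_id, set()):
            self._deny(agent_id, tool, "unauthorized")
        # Rate limiting: one bucket per (agent, tool) pair.
        bucket = self._buckets.setdefault(
            (agent_id, tool),
            TokenBucket(self.capacity, self.refill_per_sec, self.capacity, now),
        )
        if not bucket.allow(now):
            self._deny(agent_id, tool, "rate_limited")
        # Every decision, allowed or denied, is recorded for auditing.
        self.audit_log.append((agent_id, tool, "allowed"))
        return self.tools[tool](**params)

    def _deny(self, agent_id: str, tool: str, reason: str) -> None:
        self.audit_log.append((agent_id, tool, reason))
        raise GatewayDenied(f"{agent_id} -> {tool}: {reason}")
```

Routing every call through `invoke` keeps the allow/deny decision and the audit record in one place, which is the main reason to mediate tool access through a gateway rather than letting agents call tools directly.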
● Alerting on Anomalous Behavior: Continuous monitoring and alerting are essential for detecting
potential compromises or malfunctions. Systems should baseline normal agent behavior (e.g.,
typical tool usage frequency and sequences, resource consumption) and trigger alerts upon
significant deviations. Particular attention should be paid to unexpected or unauthorized tool use
attempts, sudden shifts in the agent's apparent goal or task execution logic (which might indicate
successful injection or manipulation), or anomalous patterns in data access or external
communication, enabling rapid investigation and response.
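One simple way to approximate the baselining described above is to record per-window tool-call counts and flag windows that deviate sharply from the historical mean. The sketch below is illustrative only: the function names, the z-score threshold of 3.0, and the `min_std` floor are assumptions, and a real deployment would likely also track call sequences, resource consumption, and data-access patterns.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple

Baseline = Dict[str, Tuple[float, float]]  # tool -> (mean, std) of calls per window


def build_baseline(history: Sequence[Dict[str, int]]) -> Baseline:
    """Summarize past windows of per-tool call counts."""
    tools = {tool for window in history for tool in window}
    baseline: Baseline = {}
    for tool in tools:
        counts = [window.get(tool, 0) for window in history]
        baseline[tool] = (mean(counts), pstdev(counts))
    return baseline


def find_anomalies(
    baseline: Baseline,
    window: Dict[str, int],
    z_threshold: float = 3.0,  # assumed threshold; tune per deployment
    min_std: float = 1.0,      # floor so near-constant tools do not over-alert
) -> List[str]:
    """Return alert messages for one observation window."""
    alerts: List[str] = []
    for tool, count in sorted(window.items()):
        if tool not in baseline:
            # A tool never seen during baselining is always worth a look.
            alerts.append(f"unexpected tool: {tool}")
            continue
        mu, sigma = baseline[tool]
        z = (count - mu) / max(sigma, min_std)
        if z > z_threshold:
            alerts.append(f"usage spike: {tool} (z={z:.1f})")
    return alerts


if __name__ == "__main__":
    history = [
        {"search": 10, "email": 1},
        {"search": 12, "email": 0},
        {"search": 11, "email": 1},
    ]
    baseline = build_baseline(history)
    print(find_anomalies(baseline, {"search": 30, "shell": 1}))
```

Alerts like these are a starting point for investigation rather than proof of compromise, since legitimate workload changes can also shift the baseline.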
● Human Oversight in High-Stakes Environments: Full autonomy is often inappropriate, and many tool
invocations need human approval (for example, a large financial transfer or a change affecting a
human subject). Implement robust human-in-the-loop workflows wherever critical decisions, high-risk
actions, or significant deviations from expected behavior require explicit human review and approval
before execution. Oversight is also key when untrusted data influences an agent with access to
highly privileged tools (confidential information, integrity-sensitive systems) and when agents
operate in domains with significant consequences (e.g., finance, healthcare, education, or critical
infrastructure); in these cases, consider Human-Over-The-Loop (continuous supervision), see 3.3.1
Monitoring. Adaptive trust mechanisms can dynamically adjust the level of
required oversight based on the agent's performance, context, and the risk associated with the