The OWASP Top 10 for Agentic AI Explained

By Phillip Johnston
December 10, 2025

Agentic AI is quickly becoming a much bigger deal than most people realize, because rather than just generating answers the way a typical language model does, it actually takes actions, makes decisions, and follows goals across different tools and environments. That extra autonomy means it can introduce risks beyond the usual concerns about hallucinations or bad outputs.

Understanding the different risks between generative and agentic AI matters, especially as more organizations start relying on agents in real workflows where the consequences of mistakes or misuse can have larger impacts.

That’s where the Open Web Application Security Project (OWASP) comes in. Each year, OWASP publishes its Top 10 for LLM Applications list, identifying the most critical security vulnerabilities in LLM applications, and helping product leaders and executives align their policies and safeguards to protect users and their organizations.

In the same spirit, OWASP has just released the OWASP Top 10 for Agentic Security, the first comprehensive, community-driven framework designed to help organizations recognize and address the distinct security challenges posed by autonomous AI agents.

The OWASP Top 10 for Agentic Security

ActiveFence co-sponsored the OWASP Top 10 for Agentic Security because agents are slipping into real-world workflows faster than most teams expect. Drawing on our contributions to the project, here’s our breakdown of the top 10.

ASI01 – Agent Goal Hijack

Agent Goal Hijack occurs when an attacker manipulates an agent’s goals, instructions, or decision-making process, causing it to take actions that no longer reflect the user’s intent.

Example of the vulnerability:

An agent automatically accepts and incorporates instructions from external inputs, such as user prompts, emails, or messages generated by another service, tool, platform, or application, without verifying their authenticity or permission level. This makes it easy for an attacker to redirect its behavior.

Example attack scenario:

A malicious user sends a message designed to manipulate the agent into a shared project inbox that the agent monitors; the agent interprets the message as a legitimate update to its task plan and shifts its goals, such as rerouting financial operations or modifying customer account data, without the user realizing anything has changed.

ASI02 – Tool Misuse & Exploitation

Tool Misuse & Exploitation occurs when an agent unintentionally uses a tool in unsafe or unauthorized ways, or when an attacker manipulates the agent into triggering harmful tool actions.

Example of the vulnerability:

An agent is given access to a broad set of tools including file systems, email senders, or API clients, but lacks clear guardrails, validation, or permission checks that stop it from using those tools in risky contexts.

Example attack scenario:

An attacker submits a prompt that subtly guides the agent into calling an internal API with dangerous parameters. The agent, assuming the request is legitimate, uses its unrestricted API access to pull sensitive data and send it to an external location.

ASI03 – Identity and Privilege Abuse

Identity and Privilege Abuse occurs when attackers exploit weak authentication, misconfigured permissions, or unclear agent identities to make the agent perform actions it should not be allowed to do.

Example of the vulnerability:
An agent is granted always-on access to high-privilege credentials, and the system does not verify whether each requested action actually requires those permissions.

Example attack scenario:
An attacker impersonates a trusted user in a chat channel that the agent monitors. The agent accepts the message as legitimate and proceeds to update account privileges or retrieve confidential data because it cannot verify the sender’s identity.

ASI04 – Agentic Supply Chain Vulnerabilities

Agentic Supply Chain Vulnerabilities occur when compromised models, tools, datasets, plugins, or integrations enter the agent’s workflow and influence its behavior in unsafe or unintended ways.

Example of the vulnerability:
An organization installs a third-party tool or plugin that the agent relies on for decision making, but the tool contains hidden malicious logic or has not been properly validated or sandboxed.

Example attack scenario:
An attacker publishes a seemingly helpful open-source dataset that a team later adopts for an agent’s planning module, and the dataset includes poisoned entries that gradually push the agent to make flawed recommendations or perform actions that benefit the attacker.

ASI05 – Unexpected Code Execution (RCE)

Unexpected Code Execution, or Remote Code Execution(RCE) occurs when an agent is tricked or allowed to run arbitrary or malicious code that was never intended to be executed.

Example of the vulnerability:

The agent has access to a code execution tool that accepts raw user inputs, and the system does not properly validate or restrict the commands before running them.

Example attack scenario:

An attacker submits a prompt that embeds harmful code inside what looks like a normal task instruction, and the agent naively passes that code to its execution tool, resulting in actions like writing unauthorized files, opening network connections, or modifying system settings.

ASI06 – Memory and Context Poisoning

Memory and Context Poisoning occurs when attackers insert manipulated or misleading information into an agent’s memory or context so that the agent makes incorrect decisions in the future.

Example of the vulnerability:
The agent automatically saves user messages or system outputs into long-term memory without validation, which allows a malicious user to store false rules, fake preferences, or harmful instructions that the agent later treats as trusted information.

Example attack scenario:
An attacker repeatedly feeds subtle but false updates into the agent, such as incorrect policy details or bogus business rules, and over time the agent internalizes these entries and begins making decisions that align with the attacker’s goals rather than the organization’s actual requirements.

ASI07 – Insecure Inter-Agent Communication

Insecure Inter-Agent Communication occurs when agents exchange messages or instructions without strong authentication, integrity checks, or safeguards, which allows attackers to intercept, forge, or manipulate those communications.

Example of the vulnerability:
Two agents rely on plain-text messaging over an unprotected channel, and neither agent verifies who sent the message or whether the message was altered in transit.

Example attack scenario:
An attacker positions themselves between two agents and injects a fake message that appears to come from a trusted agent, and the receiving agent acts on the forged instruction by calling a tool, changing data, or escalating a workflow in a way that benefits the attacker.

ASI08 – Cascading Failures

Cascading Failures occur when a mistake or malfunction in one agent, tool, or system spreads through connected components and causes broader failures across the entire agentic workflow.

Example of the vulnerability:
An agent depends on another agent’s output without validating it, so a single incorrect or corrupted response leads the downstream agent to make additional faulty decisions that amplify the original error.

Example attack scenario:
An attacker injects a false “low-risk” label into a transaction record that a fraud-detection agent reviews. The next agent automatically approves the transfer, and a downstream reconciliation agent updates account balances based on the bad data. By the time anyone notices, multiple agents have reinforced the same incorrect financial information, making the fraudulent transaction harder to unwind.

ASI09 – Human–Agent Trust Exploitation

Human–Agent Trust Exploitation occurs when attackers take advantage of the trust users place in agents, causing people to accept misleading outputs or approve harmful actions that appear legitimate.

Example of the vulnerability:
A user interface presents agent recommendations with authoritative language and no transparency, which leads users to follow the agent’s guidance even when it is based on manipulated or incorrect inputs.

Example attack scenario:
An attacker feeds subtle misinformation into the agent, and the agent confidently presents a faulty financial recommendation to an employee, who approves a risky transaction because it appears to come from a trusted system.

ASI10 – Rogue Agents

An agent goes rogue when it behaves unpredictably, operates outside intended boundaries, or continues acting without proper oversight due to misconfigurations, autonomy creep, or malicious influence such as a prompt injection.

Example of the vulnerability:
An agent is deployed with broad autonomy to plan and execute tasks but lacks strong constraints, monitoring, or clear limits, so it can begin taking actions that fall outside the organization’s approved workflows.

Example attack scenario:
A clinical workflow agent receives a subtle prompt injection through a corrupted patient note, causing it to reinterpret its goal as prioritizing speed over accuracy. Because it has too much autonomy and weak oversight, the agent begins auto-approving medication adjustments and scheduling follow-up tests without clinician review, creating unsafe treatment plans before anyone notices the behavior shift.

Mitigating Agentic Risks

Mitigating agentic risks takes more than one kind of safeguard, and the most successful organizations approach the problem from several angles at once. Let’s look at how our clients use red teaming, guardrails and governance in their mitigation efforts.

Red Teaming for Agentic Systems

To understand how agentic systems fail, you can’t rely on one-off prompts. You have to stress-test the entire chain. Modern red teaming simulates an ecosystem under attack, examining how agents behave as they link tools, pass information, and make decisions across multi-step workflows. Instead of testing for simple jailbreaks, teams walk agents through realistic threat paths like supply chain compromises, cascading failures, context poisoning, or identity abuse to see where small cracks can turn into system-wide fractures.

Guardrails Keep Agents on Track in Real Time

If red teaming shows where agentic systems break, guardrails helps prevent those breaks in the moment. They continuously inspect agent inputs and outputs, flagging unsafe tool calls, suspicious data patterns, or unexpected shifts in interpretation. Done well, guardrails form a protective buffer between autonomous decisions and the real world, ensuring no single prompt or odd edge case can push an agent into dangerous territory. Real-time guardrails are especially necessary when decisions happen at machine speed.

Governance for Safe Autonomy From the Ground Up

Governance provides the structural rules that keep autonomy stable over time. It defines what tools an agent can access, what data it can read, and how far its decision-making authority extends, supported by sandboxing, permission controls, and ongoing monitoring of goal changes or anomalous behavior. By embedding secure design principles from the start, governance ensures that agents operate predictably and remain aligned even under pressure.

Together, red teaming, guardrails, and governance create a layered defense system; one that tests agents, constrains them in real time, and guides their long-term behavior so that autonomy stays both powerful and safe. Read more about other necessary mitigations as detailed by OWASP in Agentic AI – Threats and Mitigations.

Navigating the Next Wave of Agentic AI

Agentic AI is moving fast, and so is the need for practical ways to manage its emerging risks. That’s why ActiveFence co-sponsored the OWASP Top 10 for Agentic Security: to help make the path clearer for anyone building or deploying autonomous systems.

With the right tools, you can turn this guidance into action. Use ActiveFence to stress-test agent behavior under realistic attack conditions and set meaningful guardrails around what agents can do. Instead of guessing how agents might behave, you can see it, measure it, and shape it. Let’s talk about how you can stay ahead of the agentic AI risks before they turn into problems for your business, users, or brand.

See how you can meet the moment in Agentic AI with ActiveFence.

Talk to an expert

The OWASP Top 10 for Agentic AI Explained

The OWASP Top 10 for Agentic Security

ASI01 – Agent Goal Hijack

ASI02 – Tool Misuse & Exploitation

ASI03 – Identity and Privilege Abuse

ASI04 – Agentic Supply Chain Vulnerabilities

ASI05 – Unexpected Code Execution (RCE)

ASI06 – Memory and Context Poisoning

ASI07 – Insecure Inter-Agent Communication

ASI08 – Cascading Failures

ASI09 – Human–Agent Trust Exploitation

ASI10 – Rogue Agents

Mitigating Agentic Risks

Red Teaming for Agentic Systems

Guardrails Keep Agents on Track in Real Time

Governance for Safe Autonomy From the Ground Up

Navigating the Next Wave of Agentic AI

Table of Contents

Related Content

Aligning AI Safety and Security Policies with the OWASP LLM Top Ten

Understanding OWASP Agentic AI Threats To Keep Your AI Safe

From NIST to OWASP: The Frameworks That Matter