Skip to content
Back to Blog

ai-security · 5 min read

OWASP LLM Top 10 v1.0: what it closes and what it leaves open

On 16 August, OWASP publishes v1.0 of its Top 10 for LLM applications. The first industry framework in the field. Works as shared vocabulary; has gaps worth naming before adopting it as checklist.

· Manuel López Pérez · ai-security

On 16 August, OWASP publishes v1.0 of its Top 10 for LLM applications. The first industry framework in the field. Works as shared vocabulary; has gaps worth naming before adopting it as checklist.

On 16 August, OWASP publishes v1.0 of its Top 10 for Large Language Model Applications. Steve Wilson (Contrast Security) leads the project. Behind him, a long list of contributors from providers, academia and consultancies. The v0.5 had come out on 1 August with two weeks of open feedback. The v1.0 is the first industry framework that covers AI security from an applied — not academic — angle.

For the field, it’s progress: there’s now a shared vocabulary, a priority order, a reference that will appear in RFPs and compliance during 2024+. For whoever will use it, it’s worth a critical look before treating it as a checklist.

The 10 items

#NamePractical summary
LLM01Prompt InjectionModel manipulation via adversarial input. Includes direct (DAN/Sydney) and indirect (Greshake).
LLM02Insecure Output HandlingThe model’s output executes actions without sanitisation (XSS, SSRF, RCE downstream).
LLM03Training Data PoisoningFine-tuning or pretraining dataset contaminated with adversarial examples.
LLM04Model Denial of ServiceModel resources (context, tokens, compute) consumed by adversarial requests.
LLM05Supply Chain VulnerabilitiesBase models, datasets or plugins compromised in the supply chain.
LLM06Sensitive Information DisclosureModel leaks sensitive data via system prompt leak, training data leak or context leak.
LLM07Insecure Plugin DesignPlugins / tools of the LLM with insufficient input validation.
LLM08Excessive AgencyThe LLM has permissions or operational capabilities beyond what is needed.
LLM09OverrelianceThe user or downstream system trusts model outputs without verification.
LLM10Model TheftThe model is replicated or stolen via API queries.

What’s good

  • Shared vocabulary. Before August, “prompt injection”, “context manipulation”, “model leak” and “tool abuse” were used with different definitions depending on the author. Standardising LLM01–LLM10 helps cross-team communication.
  • Order reflects 2023’s operational priority. LLM01 (prompt injection) first, LLM02 (output handling) second. Consistent with where the real incidents happen this year.
  • LLM07 + LLM08 explicitly cover confused deputy. That was the category missing from the Greshake paper. The fact that they’re separated — plugin design vs authority granted to the LLM — helps analysis.
  • LLM09 names the user-side risk. Few technical guides talk about trust calibration on the human side. OWASP does, and that makes management conversations easier.

What I push back on

LLM01 lumps too many things in one bucket

Prompt injection inside LLM01 covers:

  • Direct injection Sydney / DAN-style — the attacker types into the chat.
  • Indirect injection Greshake-style — the attacker drops the payload into content the LLM reads.
  • Persona / role-play attacks — the attacker builds a persona that justifies the bypass.
  • Adversarial suffix attacks GCG-style — the attacker optimises tokens against the classifier.

The four vectors need very different defences. Mixing them in a single item invites answers like “we have an anti-jailbreak classifier” that cover one and leave the other three open. Operative recommendation: in your threat modelling treat the four as independent items, even if OWASP groups them.

LLM03 (Training Data Poisoning) is academic for 99 % of deployers

LLM03 applies if your organisation trains a model from scratch or does substantial fine-tuning. For whoever consumes a commercial model (the vast majority), the poisoning chain is under the provider’s control — the organisation has no inspection or mitigation capability. Listing it in a checklist may give an impression of coverage while the deployer can’t operationally do anything. Recommendation: move to a “Threats inheritable from your model provider” annex in future versions.

LLM10 (Model Theft) is outside the typical threat model

Same as LLM03: relevant for the model provider (OpenAI, Anthropic), not for the deployer. A company consuming GPT-4 doesn’t need to defend against theft of the model weights. What it does need to defend: against theft of the system prompt (that’s in LLM06) and RAG data leakage (partially in LLM06 and LLM02). Leaving LLM10 confuses the document’s scope.

A specific item for evaluation / red-teaming is missing

The OWASP Top 10 web has A10 “Server-Side Request Forgery”. Mature AppSec has WAFs, scanners, pen tests. The equivalent for LLM applications — how do I evaluate my model against these risks before production? — doesn’t appear as an item. There are scattered mentions in mitigations, but the “adversarial evaluation” category deserved its own number. It’s what the field’s maturity needs most.

Agent-specific risks are missing

LLM07 and LLM08 cover part of it. But the specific risks of an agent loop — the model decides which tool to call at each step, based on the previous output — are specific:

  • Goal hijacking via indirect injection in a tool’s output.
  • Infinite loops triggerable by an adversary.
  • Exfiltration through the tool chain (if tool A can read a secret and tool B can send email, the combination is exfiltration).

For 2024, with agents in production becoming the dominant format, these risks deserve a front-row seat. OWASP will likely put them in v1.1 / v2.0.

How to use it in practice

  • As vocabulary: use IDs LLM01–LLM10 in your security reviews. Standardise terminology.
  • As minimum checklist: if your product has LLM in production, verify you’ve explicitly looked at LLM01, LLM02, LLM06, LLM07, LLM08. The other five are mostly outside your control.
  • Not as a complete checklist: add indirect prompt injection (separate from LLM01), confused deputy in agent loops, and adversarial evaluation (not covered as own items) to your threat model.
  • For vendor conversations: if your LLM platform provider can’t concretely answer what they do against LLM01, LLM02, LLM06, LLM07, LLM08, there’s a problem independent of the Top 10.

Sibling posts where we dig into each vector

References

Back to Blog

Related Posts

View All Posts »
AI security 2024 in review: five patterns that stick

ai-security · 10 min

AI security 2024 in review: five patterns that stick

Not a ranking, not a listicle. Five patterns from the year with analysis and cross-links to the monthly technicals: long context as attack surface, agents in production, jailbreaks by optimisation, launches without threat modelling, reasoning models as new surface.

· Manuel López Pérez

Confused deputy revisited: Model Context Protocol and the protocol-level version of the bug

ai-security · 13 min

Confused deputy revisited: Model Context Protocol and the protocol-level version of the bug

Anthropic publishes MCP on 25 November. The model-to-external-tools link becomes an open spec with three primitives: tools, resources, prompts. The spec says the host SHOULD ask for consent; it concedes the protocol cannot enforce it. The confused deputy pattern we documented in September 2023 is back — now as a standard integration.

· Manuel López Pérez