The Agentic Kill Chain: Live Exploitation and Forensic Response for AI Agents in 2026

In January 2026, Microsoft did something the security community had quietly been waiting for: it assigned CVE-2026-21520 to an indirect prompt injection in Copilot Studio. It was the first time a major vendor formally tracked a prompt injection in an agentic platform as a CVE — a watershed moment that signaled prompt injection is no longer a research curiosity but a vulnerability class your incident response team must now own.

The problem: nobody has written the IR playbook for it.

This 2-hour hands-on workshop walks security practitioners through the full lifecycle of an AI agent compromise — from the attacker's perspective and from the defender's. We exploit CVE-2026-2256 live on stage against a vulnerable MS-Agent deployment, demonstrating how attacker-controlled content in a single document can pivot through an LLM's tool-calling logic into arbitrary shell command execution as the agent's host process.

We then flip the perspective: given log data from a compromised agent, how do you reconstruct what happened? What evidence must be preserved? How do you attribute an action when the "user" is non-deterministic? When can you trust the agent again?

Drawing on BlackPerl DFIR's incident response work, we present a structured playbook covering the six gaps in current AI agent telemetry, an evidence preservation checklist for agentic incidents, and a containment workflow that does not destroy forensic state.

Attendees leave with a working understanding of the agentic attack surface, a hands-on reproduction of two real 2026 CVEs, and a practical IR framework they can adapt for their own AI deployments.

This is the talk we wish existed when our first agent compromise engagement landed on our desk.

What Participants Will Experience

Participants will work through a complete AI-agent compromise lifecycle including:

Live exploitation of vulnerable AI agent deployments
Prompt injection and tool-calling manipulation techniques
Agent telemetry analysis and forensic reconstruction
Incident response workflows for AI systems
Evidence preservation and containment procedures
Detection engineering strategies for agentic environments

Who Should Attend

Incident responders

SOC analysts

DFIR practitioners

Detection engineers

AI security researchers

Threat hunters

Blue teamers

Security architects

Learning Objectives

Understand AI agent attack surfaces

Analyze prompt injection chains

Investigate tool-calling abuse

Reconstruct agent execution flows

Preserve forensic evidence in AI systems

Perform AI-agent incident response

Identify telemetry blind spots

Design detection strategies for agentic systems

Session Timeline

Module 1 — The Watershed

Framing the problem. Why January 15, 2026 was the moment AI agent security became an enterprise IR discipline. Walkthrough of CVE-2026-21520 and the architectural confused-deputy problem at the heart of the agent vulnerability class. Introduction of OWASP ASI01 (Agent Goal Hijack) and the lethal-trifecta model.

No demo in this module — this is the conceptual scaffold for the rest of the session.

Module 2 — Anatomy of an Agent Exploit: CVE-2026-2256 Live

Hands-on reproduction of CVE-2026-2256 against a vulnerable MS-Agent v1.5.2 deployment running in a sandboxed container. We walk the attack chain in five stages:

Initial influence: attacker-controlled content delivered via a document the agent is asked to summarise.
Tool selection manipulation: content crafted to push the agent's planning loop into selecting the Shell tool.
Parameter injection: how the agent constructs a shell command string containing attacker text without ever recognising it as a command.
Execution: arbitrary command runs as the agent process.
Post-exploitation: persistence via workspace state modification, lateral movement to cloud metadata endpoints, supply-chain impact through poisoned artifacts.

Attendees following along reproduce the chain in their own container.

Module 3 — Break + Audience Q&A

15-minute structured break. Attendees who hit lab issues get one-on-one help. Open questions on Modules 1–2 answered.

Module 4 — The IR Reality Check: What You Cannot See

We take the compromised lab from Module 2 and ask: now what? Walkthrough of the six telemetry gaps that block effective incident response in current agent deployments:

Prompt provenance: which content in the context window came from which source?
Tool invocation justification: why did the agent decide to call this tool?
Model state at decision time: what was in memory, what was retrieved from RAG, what was in conversation history?
Tool output handling: was the result shown to the user truthfully, or summarised in a way that hid an action?
Session boundary integrity: did instructions from a previous session persist into this one?
Side-channel actions: did the agent take an action that left evidence only outside the agent's own logs?

For each gap, we show the corresponding artifact in the compromised lab, what is captured today, and what is missing.

Module 5 — The BlackPerl AI-Agent DFIR Playbook

The constructive half of the talk. We present a structured playbook covering:

Evidence preservation: the seven-item checklist for snapshotting agent state at incident detection — full conversation history, system prompt, tool registry, RAG retrievals, model and version, environment variables, and external resource fetches.
Containment without forensic destruction: how to halt an agent in a way that preserves volatile state, including memory dumping techniques for in-process agent runtimes.
Attribution decision tree: distinguishing prompt injection from model error from legitimate-but-misjudged action using log triangulation.
Root cause reconstruction: mapping a confirmed malicious action backwards through tool calls, retrievals, and prompts to identify the injection vector.
Recovery and trust restoration: what must be rotated, what must be replayed, what must be redesigned before the agent returns to production.

The playbook is delivered as a single-page reference card distributed to all attendees.

Module 6 — Detection Engineering Forward + Closing Q&A

Five-minute close: three concrete detection engineering recommendations for SOC teams running agents in production today. Q&A continues offline at the BlackPerl table in the village area.

Wrap-Up & Discussion

Participants leave with a practical understanding of the AI-agent attack surface, incident response workflows, and evidence preservation requirements for modern agentic environments.

The session concludes with actionable guidance for detection engineering, AI security operations, and enterprise readiness for autonomous systems.