
The Agentic Kill Chain: Live Exploitation and Forensic Response for AI Agents in 2026
In January 2026, Microsoft did something the security community had quietly been waiting for: it assigned CVE-2026-21520 to an indirect prompt injection in Copilot Studio. It was the first time a major vendor formally tracked a prompt injection in an agentic platform as a CVE — a watershed moment that signaled prompt injection is no longer a research curiosity but a vulnerability class your incident response team must now own.
The problem: nobody has written the IR playbook for it.
This 2-hour hands-on workshop walks security practitioners through the full lifecycle of an AI agent compromise — from the attacker's perspective and from the defender's. We exploit CVE-2026-2256 live on stage against a vulnerable MS-Agent deployment, demonstrating how attacker-controlled content in a single document can pivot through an LLM's tool-calling logic into arbitrary shell command execution as the agent's host process.
We then flip the perspective: given log data from a compromised agent, how do you reconstruct what happened? What evidence must be preserved? How do you attribute an action when the "user" is non-deterministic? When can you trust the agent again?
Drawing on BlackPerl DFIR's incident response work, we present a structured playbook covering the six gaps in current AI agent telemetry, an evidence preservation checklist for agentic incidents, and a containment workflow that does not destroy forensic state.
Attendees leave with a working understanding of the agentic attack surface, a hands-on reproduction of two real 2026 CVEs, and a practical IR framework they can adapt for their own AI deployments.
This is the talk we wish existed when our first agent compromise engagement landed on our desk.
What Participants Will Experience
Participants will work through a complete AI-agent compromise lifecycle including:
- Live exploitation of vulnerable AI agent deployments
- Prompt injection and tool-calling manipulation techniques
- Agent telemetry analysis and forensic reconstruction
- Incident response workflows for AI systems
- Evidence preservation and containment procedures
- Detection engineering strategies for agentic environments
Who Should Attend
Learning Objectives
Session Timeline
01Module 1 — The Watershed
Framing the problem. Why January 15, 2026 was the moment AI agent security became an enterprise IR discipline. Walkthrough of CVE-2026-21520 and the architectural confused-deputy problem at the heart of the agent vulnerability class. Introduction of OWASP ASI01 (Agent Goal Hijack) and the lethal-trifecta model.
No demo in this module — this is the conceptual scaffold for the rest of the session.
02Module 2 — Anatomy of an Agent Exploit: CVE-2026-2256 Live
Hands-on reproduction of CVE-2026-2256 against a vulnerable MS-Agent v1.5.2 deployment running in a sandboxed container. We walk the attack chain in five stages:
- Initial influence: attacker-controlled content delivered via a document the agent is asked to summarise.
- Tool selection manipulation: content crafted to push the agent's planning loop into selecting the Shell tool.
- Parameter injection: how the agent constructs a shell command string containing attacker text without ever recognising it as a command.
- Execution: arbitrary command runs as the agent process.
- Post-exploitation: persistence via workspace state modification, lateral movement to cloud metadata endpoints, supply-chain impact through poisoned artifacts.
Attendees following along reproduce the chain in their own container.
03Module 3 — Break + Audience Q&A
15-minute structured break. Attendees who hit lab issues get one-on-one help. Open questions on Modules 1–2 answered.
04Module 4 — The IR Reality Check: What You Cannot See
We take the compromised lab from Module 2 and ask: now what? Walkthrough of the six telemetry gaps that block effective incident response in current agent deployments:
- Prompt provenance: which content in the context window came from which source?
- Tool invocation justification: why did the agent decide to call this tool?
- Model state at decision time: what was in memory, what was retrieved from RAG, what was in conversation history?
- Tool output handling: was the result shown to the user truthfully, or summarised in a way that hid an action?
- Session boundary integrity: did instructions from a previous session persist into this one?
- Side-channel actions: did the agent take an action that left evidence only outside the agent's own logs?
For each gap, we show the corresponding artifact in the compromised lab, what is captured today, and what is missing.
05Module 5 — The BlackPerl AI-Agent DFIR Playbook
The constructive half of the talk. We present a structured playbook covering:
- Evidence preservation: the seven-item checklist for snapshotting agent state at incident detection — full conversation history, system prompt, tool registry, RAG retrievals, model and version, environment variables, and external resource fetches.
- Containment without forensic destruction: how to halt an agent in a way that preserves volatile state, including memory dumping techniques for in-process agent runtimes.
- Attribution decision tree: distinguishing prompt injection from model error from legitimate-but-misjudged action using log triangulation.
- Root cause reconstruction: mapping a confirmed malicious action backwards through tool calls, retrievals, and prompts to identify the injection vector.
- Recovery and trust restoration: what must be rotated, what must be replayed, what must be redesigned before the agent returns to production.
The playbook is delivered as a single-page reference card distributed to all attendees.
06Module 6 — Detection Engineering Forward + Closing Q&A
Five-minute close: three concrete detection engineering recommendations for SOC teams running agents in production today. Q&A continues offline at the BlackPerl table in the village area.
Wrap-Up & Discussion
Participants leave with a practical understanding of the AI-agent attack surface, incident response workflows, and evidence preservation requirements for modern agentic environments.
The session concludes with actionable guidance for detection engineering, AI security operations, and enterprise readiness for autonomous systems.
Workshop Speakers
Experts & Mentors

Arpit Kumar
Sr. Security Engineer @ BlackPerl DFIR
He has led IR engagements spanning ransomware, cloud identity compromise, and emerging AI/agent incidents across financial services, manufacturing, and technology sectors in India and the wider APAC region.