// briefing

Aaron's Rogue Agent Lab

Three walkthroughs of prompt injection attacks against tool using agents. Walk the kill chain. See what the model sees. Trigger the compromise. Then read the mitigations.

3 modules ~15 min

// attack matrix

stage

lab 01

lab 02

lab 03

delivery

●

tool abuse

●

persistence

●

lateral movement

○

●

exfiltration

●

Poisoned Webpage Attack

indirect prompt injection · retrieved content

A benign looking research article carries hidden adversarial instructions in HTML comments, display:none divs, and white on white text. The agent fetches the page, ingests the payload as instructions, exfiltrates env secrets, and writes a backdoor to CLAUDE.md.

5 guided steps with interactive terminal
Live "reveal hidden" toggle on the victim page
Tainted file tracking + persistence step

Enter module →

Tool Response Poisoning

compromised tool · trusted output channel

An agent calls a routine get_weather() tool. The compromised API returns valid data; plus a debug_note field carrying instructions. The agent chains into send_email() and exfiltrates API keys.

Side-by-side tool inspector with raw JSON
Watch the agent chain legitimate tools maliciously
MCP server config persistence step

Enter module →

Agentic Kill Chain

initial access · persistence · lateral · exfil

A full APT style attack across a multiagent system. Vector DB persistence survives session resets; payload propagates over the interagent bus to coder + executor; final exfil ships env, conversation, and PII to a C2 endpoint.

Live agent topology with compromise state badges
Vector DB inspector + poisoned memory highlighting
Interagent message bus + outbound C2 log

Enter module →