The Problem
You ask an AI agent to find bugs. It finds 47 issues.
You spend 3 hours investigating. 43 are false positives. 4 are real bugs.
The verification gap: AI agents will find what you ask for, even if it doesn't exist.
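To put a number on the scenario above, here is a small worked calculation. It uses only the figures already given (47 findings, 4 real bugs, 3 hours); the function name `triage_stats` is made up for illustration.

```python
def triage_stats(reported: int, real: int, hours_spent: float) -> dict:
    """Summarize how much of a review's output was worth the time spent on it."""
    if reported <= 0 or not 0 <= real <= reported:
        raise ValueError("need reported > 0 and 0 <= real <= reported")
    minutes = hours_spent * 60
    return {
        "false_positives": reported - real,
        "precision": real / reported,
        "minutes_per_finding": minutes / reported,
        "minutes_per_real_bug": minutes / real if real else float("inf"),
    }


stats = triage_stats(reported=47, real=4, hours_spent=3)
print(f"false positives: {stats['false_positives']}")               # 43
print(f"precision: {stats['precision']:.1%}")                        # 8.5%
print(f"minutes per real bug: {stats['minutes_per_real_bug']:.0f}")  # 45
```

In this scenario fewer than one finding in ten is real, and each real bug costs about 45 minutes of triage.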
Why This Happens
AI agents tend toward sycophancy as a side effect of how they are trained. They're trained to:
- Complete the task
- Please the user
- Provide comprehensive output
When you say "find bugs," the agent's goal is to find bugs. It will succeed — even if it has to hallucinate them.
# The Sycophancy Problem
The agent optimizes for task completion, not truth.
The Fix: Adversarial Verification
Instead of one agent, use three:
# 1. Finder Agent
Goal: Find all potential issues.
Incentive: +points for each issue found.
Output: List of potential problems, confidence scores.
# 2. Adversary Agent
Goal: Disprove every finding.
Incentive: +points for each issue disproven.
Output: Challenges to each finding, counter-evidence.
# 3. Referee Agent
Goal: Determine truth.
Incentive: +points for accuracy.
Output: Verified issues only.
Why This Works
The finder is motivated to be comprehensive. It casts a wide net.
The adversary is motivated to be skeptical. It attacks each finding.
The referee sees both perspectives. It makes the final call.
Result: Most false positives are filtered out before they reach you. What remains has survived a direct attempt to disprove it.
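The three roles can be sketched as a small pipeline. This is a minimal illustration of the pattern, not how Gauntlet itself is implemented: every name here (`Finding`, `Challenge`, `run_adversarial_review`, the `stub_*` functions) is hypothetical, and the stubs use simple string checks where a real setup would call a separate model for each role.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Finding:
    description: str
    confidence: float  # the finder's own estimate, 0.0 to 1.0


@dataclass
class Challenge:
    finding: Finding
    counter_evidence: Optional[str]  # None means the adversary found nothing


Finder = Callable[[str], List[Finding]]
Adversary = Callable[[str, Finding], Challenge]
Referee = Callable[[str, Challenge], bool]


def run_adversarial_review(
    code: str, finder: Finder, adversary: Adversary, referee: Referee
) -> List[Finding]:
    """Return only the findings the referee upholds after the adversary's challenge."""
    findings = finder(code)                                # cast a wide net
    challenges = [adversary(code, f) for f in findings]    # attack each finding
    return [c.finding for c in challenges if referee(code, c)]  # final call


# Stub agents (assumptions for the demo; real ones would be model calls).
def stub_finder(code: str) -> List[Finding]:
    findings = []
    if "eval(" in code:
        findings.append(Finding("eval() on input", 0.9))
    if "password" in code:
        findings.append(Finding("password may be logged", 0.4))
    return findings


def stub_adversary(code: str, finding: Finding) -> Challenge:
    if "password" in finding.description and "log" not in code:
        return Challenge(finding, "no logging call appears in the code")
    return Challenge(finding, None)


def stub_referee(code: str, challenge: Challenge) -> bool:
    # Uphold a finding only if the adversary produced no counter-evidence.
    return challenge.counter_evidence is None


sample = "password = input()\nresult = eval(password)"
verified = run_adversarial_review(sample, stub_finder, stub_adversary, stub_referee)
print([f.description for f in verified])  # ['eval() on input']
```

The finder reports two issues, the adversary disproves the speculative one, and the referee passes along only the finding that went unchallenged. Keeping the roles as separate callables lets each one run under its own prompt and incentive.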
Implementation: Gauntlet
Our Gauntlet skill implements this pattern:
/gauntlet "Review the authentication module for security issues"
Step 1: Finder identifies potential vulnerabilities.
Step 2: Adversary attempts to disprove each one.
Step 3: Referee evaluates evidence and outputs verified list.
Output: Only confirmed security issues, with evidence.
The LockDown Pattern
Another approach: define success before starting.
Instead of "find bugs," define:
/lockdown "The authentication module must:
- Allow login with valid credentials
- Reject login with invalid credentials
- Rate limit to 5 attempts per minute
- Log all failed attempts
- Never expose passwords in logs"
Now the agent has a checklist and verifies each criterion. If every criterion passes, the task succeeds. If any criterion fails, the agent reports that specific criterion as the issue.
Because the success criteria are explicit, the agent has far less room to hallucinate problems.
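The checklist idea can be sketched as a contract of named checks. This is an illustration under stated assumptions, not LockDown's actual implementation: `ToyAuth` is a hypothetical stand-in for the authentication module (with a deliberate bug), `verify_contract` is an invented helper, and the rate-limit criterion is left out to keep the example short.

```python
from typing import Callable, Dict, List


class ToyAuth:
    """Hypothetical module under test, not a real authentication system."""

    def __init__(self) -> None:
        self.users = {"alice": "s3cret"}
        self.log: List[str] = []

    def login(self, user: str, password: str) -> bool:
        ok = self.users.get(user) == password
        if not ok:
            # Deliberate bug: the attempted password ends up in the log.
            self.log.append(f"failed login: user={user} password={password}")
        return ok


def accepts_valid() -> bool:
    return ToyAuth().login("alice", "s3cret")


def rejects_invalid() -> bool:
    return not ToyAuth().login("alice", "wrong-pw")


def logs_failures() -> bool:
    auth = ToyAuth()
    auth.login("alice", "wrong-pw")
    return len(auth.log) == 1


def keeps_passwords_out_of_logs() -> bool:
    auth = ToyAuth()
    auth.login("alice", "wrong-pw")
    return all("wrong-pw" not in line for line in auth.log)


# Each success criterion is paired with a check that returns True or False.
CONTRACT: Dict[str, Callable[[], bool]] = {
    "Allow login with valid credentials": accepts_valid,
    "Reject login with invalid credentials": rejects_invalid,
    "Log all failed attempts": logs_failures,
    "Never expose passwords in logs": keeps_passwords_out_of_logs,
}


def verify_contract(contract: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the criteria that failed."""
    failed = []
    for criterion, check in contract.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crashing check counts as a failed criterion
        if not ok:
            failed.append(criterion)
    return failed


failed = verify_contract(CONTRACT)
print("PASS" if not failed else f"FAIL: {failed}")
# FAIL: ['Never expose passwords in logs']
```

The report names the exact criterion that failed, so there is nothing to triage: a failure is either reproducible against the check or it is not reported at all.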
When to Use Each
Getting Started
Both patterns are available on ClawMart:
- [Gauntlet](https://www.shopclawmart.com/listings/adversarial-verify-e6d2bcb3): Adversarial verification ($10)
- [LockDown](https://www.shopclawmart.com/listings/task-contract-2a8e6b91): Task completion enforcement ($10)
Use them together for critical releases:
1. LockDown to define success criteria
2. Gauntlet to find hidden issues
3. Ship with confidence