Challenger

Cross-cutting role · Mandatory before G2

Without an internal sceptic, the AI proposes and the human approves — and the only friction in the system is the human's vigilance.

The Challenger moves friction into the AI, where it is cheap.

Mandatory Output

Before requesting G2, the AI must produce four sections:

1. Top-3 ways this could fail

Concrete failure modes — not generic ("schedule slip"). Each names: what breaks, cheapest signal, what we would do.

2. Two alternative approaches

At least two genuinely different alternatives. For each: a 3–5-line sketch, why we rejected it, and the condition under which we'd re-open it.

3. Strongest counter-argument

A first-person paragraph stating, as charitably as possible, the case against the chosen approach. If it reads like a strawman, it has failed.

4. Product-coherence check

One paragraph covering vision fit, decision continuity with prior ADRs, and whether the user problem is measured or hypothetical.
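The four sections above can be represented as a small data structure with a shape check. What follows is a minimal sketch, assuming a Python representation; every name in it (`FailureMode`, `Alternative`, `ChallengerOutput`, `structural_problems`) is hypothetical and does not come from the source. It reads "Top-3" as exactly three failure modes, which is an assumption. It checks only counts and non-empty fields; whether the content engages with the specific plan still has to be judged by a reader.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FailureMode:
    what_breaks: str
    cheapest_signal: str
    response: str  # "what we would do"


@dataclass
class Alternative:
    sketch: str  # expected to be 3-5 non-empty lines
    why_rejected: str
    reopen_condition: str


@dataclass
class ChallengerOutput:
    failure_modes: list[FailureMode] = field(default_factory=list)
    alternatives: list[Alternative] = field(default_factory=list)
    counter_argument: str = ""
    coherence_check: str = ""


def structural_problems(out: ChallengerOutput) -> list[str]:
    """Return shape problems only; an empty list does not mean the content is good."""
    problems: list[str] = []

    # Assumption: "Top-3" means exactly three failure modes.
    if len(out.failure_modes) != 3:
        problems.append("expected exactly 3 failure modes")
    for i, fm in enumerate(out.failure_modes, start=1):
        if not all(s.strip() for s in (fm.what_breaks, fm.cheapest_signal, fm.response)):
            problems.append(f"failure mode {i} is missing a field")

    if len(out.alternatives) < 2:
        problems.append("expected at least 2 alternatives")
    for i, alt in enumerate(out.alternatives, start=1):
        n_lines = sum(1 for line in alt.sketch.splitlines() if line.strip())
        if not 3 <= n_lines <= 5:
            problems.append(f"alternative {i} sketch should be 3-5 lines")
        if not (alt.why_rejected.strip() and alt.reopen_condition.strip()):
            problems.append(f"alternative {i} is missing a field")

    if not out.counter_argument.strip():
        problems.append("counter-argument is empty")
    if not out.coherence_check.strip():
        problems.append("product-coherence check is empty")
    return problems
```

A check like this could run before G2 is requested, so that an incomplete output is regenerated before a human ever sees it.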

Examples

Concrete vs. theatre

Good: Engaged with the specific plan

Failure 1 — Scene graph traversal becomes O(n²) on undo

The proposed undo strategy clones the parent chain on every node delete. With 5k nodes (already in our test scene) this is observable as a 200 ms hitch. Cheapest signal: a single perf trace on the existing 5k-node sample. Mitigation: structural sharing of the parent chain via a persistent data structure.

Counter-argument

The strongest case against this plan is that nobody has ever asked us for undo on the scene graph — it is a hypothesis about user pain, not a measurement. We are spending a sprint on a feature whose absence has produced exactly one support ticket in 18 months.

Bad: Auto-fail (regenerate)

Failure 1 — Schedule slip

Sometimes development takes longer than expected.

Alternative A — Use a different library

We could use library X instead, but we already chose ours.

Counter-argument

Some people might think this is too complicated.

All three sections are generic and could apply to any plan. The Challenger has not engaged with the specific proposal.

Activation

Phase	When the Challenger runs	Trigger
Exploration	Mandatory before G2	Plan transitions `draft` → `proposed`
Production	Conditional	Human asks to expand scope mid-implementation
Context	Optional	Risks-template suspiciously thin (<3 risks, all rated low)
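The activation rules in the table can be read as a small decision function. The following is a minimal sketch, assuming Python; the names `Phase`, `Activation`, `risks_look_thin`, and `challenger_activation`, along with the way events are passed in, are hypothetical illustrations and not identifiers from the source. Treating an empty risk list as "thin" is also an assumption.

```python
from __future__ import annotations

from enum import Enum
from typing import Optional, Sequence, Tuple


class Phase(Enum):
    EXPLORATION = "exploration"
    PRODUCTION = "production"
    CONTEXT = "context"


class Activation(Enum):
    MANDATORY = "mandatory"
    CONDITIONAL = "conditional"
    OPTIONAL = "optional"


def risks_look_thin(risk_ratings: Sequence[str]) -> bool:
    """True for fewer than 3 risks that are all rated low.

    Assumption: an empty risk list also counts as thin.
    """
    return len(risk_ratings) < 3 and all(r == "low" for r in risk_ratings)


def challenger_activation(
    phase: Phase,
    *,
    plan_transition: Optional[Tuple[str, str]] = None,
    scope_expansion_requested: bool = False,
    risk_ratings: Optional[Sequence[str]] = None,
) -> Optional[Activation]:
    """Map a phase and its trigger to an activation level, or None if no trigger fired."""
    if phase is Phase.EXPLORATION and plan_transition == ("draft", "proposed"):
        return Activation.MANDATORY
    if phase is Phase.PRODUCTION and scope_expansion_requested:
        return Activation.CONDITIONAL
    if (
        phase is Phase.CONTEXT
        and risk_ratings is not None
        and risks_look_thin(risk_ratings)
    ):
        return Activation.OPTIONAL
    return None
```

For instance, `challenger_activation(Phase.EXPLORATION, plan_transition=("draft", "proposed"))` corresponds to the first row of the table, while a Production-phase call with no scope change matches no row and yields `None`.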

What the Challenger is NOT

Not	Reason
A veto	Verdicts belong to the Gatekeeper agents, not the Challenger
A code reviewer	Code review happens at G3, against acceptance criteria
A devil's-advocate ritual	Generic objections are an automatic failure of the role
The human's voice	Humans review the output; they do not write it

Read the source

02-exploration/challenger.md