chevp-ai-framework

Cross-cutting role · Mandatory before G2

Challenger

Without an internal sceptic, the AI proposes and the human approves — and the only friction in the system is the human's vigilance.

The Challenger moves friction into the AI, where it is cheap.

Mandatory Output

Before requesting G2, the AI must produce four sections:

1. Top-3 ways this could fail

Concrete failure modes — not generic ("schedule slip"). Each names: what breaks, cheapest signal, what we would do.

2. Two alternative approaches

≥2 genuinely different alternatives. For each: 3–5-line sketch, why we rejected it, the condition under which we'd re-open it.

3. Strongest counter-argument

A first-person paragraph stating, as charitably as possible, the case against the chosen approach. If it reads like a strawman, it has failed.

4. Product-coherence check

One paragraph covering vision fit, decision continuity with prior ADRs, and whether the user problem is measured or hypothetical.
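The shape of the four sections can be written down as a small data structure. The sketch below is illustrative only: `FailureMode`, `Alternative`, `ChallengerOutput`, and `structural_problems` are hypothetical names, not part of the framework, and reading "Top-3" as "exactly three" is an assumption. It checks only that the required fields are present; whether the content is concrete or generic still needs a reader.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    what_breaks: str
    cheapest_signal: str
    response: str  # "what we would do"


@dataclass
class Alternative:
    sketch: str  # expected to be 3-5 non-empty lines
    why_rejected: str
    reopen_condition: str


@dataclass
class ChallengerOutput:
    failure_modes: list[FailureMode]
    alternatives: list[Alternative]
    counter_argument: str
    coherence_check: str


def structural_problems(out: ChallengerOutput) -> list[str]:
    """Return shape violations; an empty list means the shape is complete."""
    problems: list[str] = []

    # Assumption: "Top-3" means exactly three failure modes.
    if len(out.failure_modes) != 3:
        problems.append(
            f"expected exactly 3 failure modes, got {len(out.failure_modes)}"
        )
    for i, fm in enumerate(out.failure_modes, 1):
        for name in ("what_breaks", "cheapest_signal", "response"):
            if not getattr(fm, name).strip():
                problems.append(f"failure {i}: missing {name}")

    if len(out.alternatives) < 2:
        problems.append(
            f"expected at least 2 alternatives, got {len(out.alternatives)}"
        )
    for i, alt in enumerate(out.alternatives, 1):
        lines = sum(1 for line in alt.sketch.splitlines() if line.strip())
        if not 3 <= lines <= 5:
            problems.append(f"alternative {i}: sketch has {lines} lines, expected 3-5")
        for name in ("why_rejected", "reopen_condition"):
            if not getattr(alt, name).strip():
                problems.append(f"alternative {i}: missing {name}")

    if not out.counter_argument.strip():
        problems.append("missing counter-argument")
    if not out.coherence_check.strip():
        problems.append("missing product-coherence check")
    return problems
```

A check like this could run before a G2 request is raised, so that incomplete output is regenerated before a human spends time on it.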

Examples

Concrete vs. theatre

Good: Engaged with the specific plan

Failure 1 — Scene graph traversal becomes O(n²) on undo

The proposed undo strategy clones the parent chain on every node delete. With 5k nodes (already in our test scene) this is observable as a 200ms hitch. Cheapest signal: a single perf trace on the existing 5k-node sample. Mitigation: structural sharing of the parent chain via persistent data structure.

Counter-argument

The strongest case against this plan is that nobody has ever asked us for undo on the scene graph — it is a hypothesis about user pain, not a measurement. We are spending a sprint on a feature whose absence has produced exactly one support ticket in 18 months.
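To make the mitigation named in Failure 1 concrete, here is a minimal sketch of structural sharing, assuming the parent chain can be modelled as an immutable linked list. `Chain`, `extend`, and `ancestors` are hypothetical names for illustration; they are not taken from the framework or from the plan being challenged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chain:
    """Immutable link in a parent chain; the tail is shared, never copied."""
    node_id: str
    parent: Optional["Chain"] = None


def extend(parent: Optional[Chain], node_id: str) -> Chain:
    # O(1): allocates one new link that points at the existing chain.
    return Chain(node_id, parent)


def ancestors(chain: Optional[Chain]) -> list[str]:
    # Walks from the node up to the root, collecting ids.
    ids: list[str] = []
    while chain is not None:
        ids.append(chain.node_id)
        chain = chain.parent
    return ids


root = Chain("root")
group = extend(root, "group")
leaf_a = extend(group, "a")
leaf_b = extend(group, "b")

# An undo record for deleting "a" only needs a reference to its chain.
# Nothing is cloned, because the links can never be mutated afterwards.
undo_stack = [leaf_a]
```

Both leaves point at the same `group` link, so recording a delete costs one reference instead of a copy proportional to the node's depth.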

Bad: Auto-fail, regenerate

Failure 1 — Schedule slip

Sometimes development takes longer than expected.

Alternative A — Use a different library

We could use library X instead, but we already chose ours.

Counter-argument

Some people might think this is too complicated.

All three sections are generic and could apply to any plan. The Challenger has not engaged with the specific proposal.

Activation

Phase        When the Challenger runs   Trigger
Exploration  Mandatory before G2        Plan transitions draft → proposed
Production   Conditional                Human asks to expand scope mid-implementation
Context      Optional                   Risks-template suspiciously thin (<3 risks, all rated low)
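The activation rules can be read as a small decision function. This is a sketch under stated assumptions: `Phase`, `Situation`, `risks_look_thin`, and `challenger_status` are hypothetical names, and "suspiciously thin" is interpreted here as fewer than three risks with every one rated low, which is one possible reading of the Context row.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    EXPLORATION = "Exploration"
    PRODUCTION = "Production"
    CONTEXT = "Context"


@dataclass
class Situation:
    phase: Phase
    plan_transition: Optional[tuple[str, str]] = None  # (from_state, to_state)
    scope_expansion_requested: bool = False
    risk_ratings: Optional[list[str]] = None  # e.g. ["low", "medium"]


def risks_look_thin(ratings: list[str]) -> bool:
    # Assumption: "thin" means fewer than 3 risks AND all of them rated low.
    return len(ratings) < 3 and all(r == "low" for r in ratings)


def challenger_status(s: Situation) -> Optional[str]:
    """Return 'mandatory', 'conditional', 'optional', or None if no trigger fired."""
    if s.phase is Phase.EXPLORATION and s.plan_transition == ("draft", "proposed"):
        return "mandatory"
    if s.phase is Phase.PRODUCTION and s.scope_expansion_requested:
        return "conditional"
    if (
        s.phase is Phase.CONTEXT
        and s.risk_ratings is not None
        and risks_look_thin(s.risk_ratings)
    ):
        return "optional"
    return None
```

For example, `challenger_status(Situation(Phase.EXPLORATION, plan_transition=("draft", "proposed")))` returns `"mandatory"`, while a Production-phase situation with no scope change returns `None`.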

What the Challenger is NOT

Not                         Reason
A veto                      Verdicts belong to the Gatekeeper agents, not the Challenger
A code reviewer             Code review happens at G3, against acceptance criteria
A devil's-advocate ritual   Generic objections are an automatic failure of the role
The human's voice           Humans review the output; they do not write it
Read the source

02-exploration/challenger.md