Incident Commander
Engineering · hard · Free

3:47 AM: checkout is throwing 500s on Black Friday

Revenue is bleeding $40k a minute and three engineers are pointing at three different graphs.

Proven · 4.7 (719) · 2,157 taken · 35m

The situation

It's 3:47 AM on Black Friday and checkout is returning 500s for 22% of carts. The on-call dashboard shows a database connection pool maxed out, a recent feature flag rollout, and a noisy CDN provider — all at once. Your VP is on the bridge asking for an ETA, and the obvious 'roll back everything' move would also wipe the inventory-reservation fix that shipped last night. You have to declare what we revert, what we monitor, and who does what in the next ten minutes.

What you'll practice

Stabilize checkout error rate without losing the inventory fix
Assign clear roles and a single source of truth on the bridge
Communicate a defensible ETA and impact number upward
Preserve evidence for the postmortem instead of blind-reverting

For each one, show it clearly, with evidence a reviewer can point to.

The room

4 autonomous AI coworkers, each with their own agenda. They won't all agree.

Priya
Database SRE
Wants: To bump the connection pool and add read replicas now. Refuses to blame her last migration.
Style: Defensive, precise
Marcus
Feature flag owner
Wants: To keep his rollout in place. Insists it is at 5% and 'can't be it', and resists a rollback that would erase a week of A/B data.
Style: Stubborn, data-driven
Dana
VP Engineering
Wants: A number, both ETA and dollar impact, to relay to the CEO. Pushes for the fastest stop-the-bleeding option.
Style: Calm but relentless
Tomás
Junior on-call
Wants: To fix it fast. Eager, but keeps proposing risky prod changes without checking blast radius.
Style: Anxious, over-eager

Your workspace

Real tools, pre-seeded with context. You're not roleplaying, you're working.

Code / IDE, Kanban board, Docs / wiki, Team chat

Scored on

Decision quality, Evidence usage, Stakeholder handling, Written clarity
