Why most postmortems are theater
- 73% of engineering teams conduct postmortems after major incidents
- Fewer than 30% have a structured process for tracking whether action items are completed
- The average time from postmortem action item identification to ticket creation is 4.3 days
- After 4.3 days, the engineer who owned the fix has lost most of the context on what the fix entailed
The anatomy of a blameless postmortem: five required sections
1. Timeline of events. What happened, in order, with timestamps. This is a factual sequence, not a narrative. Include the detection event, escalation events, and resolution event. This section should be reconstructable from CI logs, deployment history, and Slack timestamps, not from memory.
2. Contributing conditions. What system, process, or tooling conditions made this incident possible? The question is not who made a mistake but what made a mistake inevitable. Missing alerts? No staging parity? A test suite that doesn't cover this code path? List them without assigning fault.
3. Impact summary. Duration, scope, and severity. Who was affected? What was degraded? What was the business cost (downtime minutes × affected users, or engineering hours spent on response)?
4. Specific, ticketed action items. Every action item must have a ticket number before the postmortem meeting ends. Not "someone should improve test coverage" but "JIRA-1234: Add integration tests for payment webhook handler, assigned to Sarah, due next sprint." If it doesn't have a ticket, it doesn't exist.
5. Verification criteria. How will you know the action item actually prevents recurrence? "Better tests" is not a verification criterion. "CI pass rate on payment webhook handler tests >95% for 30 days" is. Write the success condition before the meeting ends.
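The five sections above can be treated as a record with required fields. The following is a minimal sketch, not a prescribed schema: the class names, field names, and the `problems` check are all hypothetical, chosen only to show how "no ticket, no action item" and "no verification criterion" could be enforced before a postmortem is closed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionItem:
    ticket_id: str               # e.g. "JIRA-1234"; an empty string means no ticket yet
    summary: str
    owner: str
    verification_criterion: str  # the measurable success condition


@dataclass
class Postmortem:
    timeline: List[str] = field(default_factory=list)                 # timestamped events
    contributing_conditions: List[str] = field(default_factory=list)  # conditions, not culprits
    downtime_minutes: float = 0.0
    affected_users: int = 0
    action_items: List[ActionItem] = field(default_factory=list)

    def impact_user_minutes(self) -> float:
        """Downtime minutes x affected users, one of the cost proxies named above."""
        return self.downtime_minutes * self.affected_users

    def problems(self) -> List[str]:
        """Return the reasons this postmortem is not ready to close (empty list = ready)."""
        issues = []
        if not self.timeline:
            issues.append("timeline is empty")
        if not self.contributing_conditions:
            issues.append("no contributing conditions listed")
        for item in self.action_items:
            if not item.ticket_id.strip():
                issues.append(f"action item without ticket: {item.summary}")
            if not item.verification_criterion.strip():
                issues.append(f"action item without verification criterion: {item.summary}")
        return issues
```

An action item entered as "improve test coverage" with no ticket and no success condition would produce two entries in `problems()`, which is a cue to fix it before the meeting ends.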
How to auto-generate postmortem tickets from CI signals
When CI Intelligence detects a production deployment failure, the following should happen automatically rather than sit on an engineer's to-do list:
- A structured ticket is created in Jira or Linear, pre-populated with the CI run link, the failing test suite, the commit SHA, and the engineer who merged
- The ticket is tagged as an incident ticket with severity classification
- The EM is notified with the ticket link — not the raw CI failure link
- If the incident crosses a severity threshold, a Slack notification fires with the ticket context included
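The automation described above might look roughly like the sketch below. It is tracker-agnostic and makes no real Jira, Linear, or Slack API calls. The `CIFailure` type, the payload field names, the severity scale, and the threshold value are all assumptions made for illustration. In practice the payload would be mapped onto whatever fields your tracker expects.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CIFailure:
    run_url: str
    failing_suite: str
    commit_sha: str
    merged_by: str
    severity: int  # assumed scale: 1 is the most severe


# Assumption: severities 1 and 2 are serious enough to also notify Slack.
SLACK_SEVERITY_THRESHOLD = 2


def build_ticket(failure: CIFailure) -> Dict[str, object]:
    """Build a tracker-agnostic ticket payload from a CI failure event."""
    return {
        "title": f"[Incident] Deployment failure: {failure.failing_suite}",
        "labels": ["incident", f"sev{failure.severity}"],
        "ci_run_url": failure.run_url,
        "failing_suite": failure.failing_suite,
        "commit_sha": failure.commit_sha,
        "merged_by": failure.merged_by,
    }


def plan_notifications(failure: CIFailure, ticket_url: str) -> List[Dict[str, str]]:
    """Decide who is notified; every message links the ticket, not the raw CI run."""
    notifications = [{"target": "engineering_manager", "link": ticket_url}]
    if failure.severity <= SLACK_SEVERITY_THRESHOLD:
        ticket = build_ticket(failure)
        notifications.append(
            {"target": "slack", "link": ticket_url, "context": str(ticket["title"])}
        )
    return notifications
```

Keeping payload construction separate from notification planning means the severity rule can change without touching the ticket fields.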
Linking deployment failures to postmortem history
The postmortem follow-through checklist
During the incident
- Incident ticket created automatically from CI failure (don't rely on manual creation)
- Timeline begins populating from CI logs and deployment history in real time
- EM notified with structured context, not raw failure dump
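One way to populate the timeline from machine records rather than memory is to merge timestamped events from each source into a single ordered sequence. This is a minimal sketch: the `Event` tuple shape and the source labels are hypothetical, and it assumes all timestamps are already in the same time zone.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

# (timestamp, source, description); the shape is an assumption for this sketch.
Event = Tuple[datetime, str, str]


def build_timeline(*sources: Iterable[Event]) -> List[str]:
    """Merge events from several sources into one chronologically ordered timeline."""
    merged = sorted(
        (event for source in sources for event in source),
        key=lambda event: event[0],
    )
    return [f"{ts.isoformat()} [{source}] {text}" for ts, source, text in merged]
```

Calling `build_timeline(ci_events, deploy_events, chat_events)` with three lists of events returns one list of formatted lines, oldest first, ready to paste into the timeline section.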
Within 24 hours of resolution
- Postmortem meeting scheduled with all contributing parties
- Impact summary drafted (duration, affected users, estimated cost)
- Contributing conditions listed — minimum three, maximum seven
During the postmortem meeting
- Every action item receives a ticket number before the meeting ends
- Every ticket has an owner and a target sprint
- Every ticket has a verification criterion (how you'll know it worked)
- No blame language — conditions, not culprits
Within 30 days
- All P1 action item tickets resolved or explicitly rescheduled with justification
- Verification criteria checked — did the fix work?
- Postmortem linked to CI failure event in your signal system for future reference
- Runbook updated if the incident revealed a missing response procedure
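The 30-day verification check can be made mechanical when the criterion is a pass rate, such as the ">95% for 30 days" example given earlier. The sketch below assumes CI results are available as (date, passed) pairs. The function name, the input shape, and the default values are illustrative, not part of any particular CI system.

```python
from datetime import date, timedelta
from typing import List, Tuple


def pass_rate_met(
    runs: List[Tuple[date, bool]],
    today: date,
    window_days: int = 30,
    threshold: float = 0.95,
) -> bool:
    """Return True if the pass rate over the trailing window is strictly above the threshold."""
    window_start = today - timedelta(days=window_days)
    recent = [passed for day, passed in runs if window_start < day <= today]
    if not recent:
        # No runs in the window is treated as "not verified", not as success.
        return False
    return sum(recent) / len(recent) > threshold
```

Treating an empty window as a failure is a deliberate choice here: a test suite that never ran has not shown that the fix works.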
