Sprint estimation is the ritual that every engineering team performs and almost
none perfects. The failure mode is predictable: the team estimates based on
planned work and ignores the unplanned overhead that will consume 20–40% of
the sprint. CI failures, production incidents, review bottlenecks, stale PRs —
none of these appear on the sprint board, but all of them eat into the sprint's
effective capacity. Until you measure them, you can't account for them.
The real reason sprint estimates are wrong
The standard diagnosis for estimation failure is story points: teams use them
inconsistently, velocity isn't stable, point inflation happens over time.
Story points are genuinely problematic. But they're not the primary cause of
sprint misses in most teams.
The primary cause is the unplanned interrupt rate: the percentage
of sprint capacity consumed by work that wasn't on the board when planning happened.
For a typical 10-person engineering team, the interrupt breakdown per sprint looks like this:
- CI failures requiring investigation: 2–4 hours per engineer per sprint (varies by pipeline health)
- Production incidents: 4–8 hours per sprint for the team when they occur (1–2 incidents per sprint is typical for teams without strong CI gates)
- PR review requests outside planned work: 3–6 hours per engineer per sprint
- Dependency unblocking: 2–4 hours per engineer per sprint (waiting on or unblocking other teams)
- Context switches from stale PR escalations: 1–2 hours per sprint
Sum these up: the average engineering team is losing 30–40% of their
sprint capacity to unplanned interrupts that don't appear anywhere in
their sprint planning estimates. They plan 100% of capacity for planned work,
then wonder why 60–70% of it gets done.
The solution is not to estimate better. It's to measure your interrupt rate
and plan capacity accordingly — then work to reduce the interrupt rate over time.
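To make the arithmetic behind the breakdown concrete, here is a minimal sketch that adds up the ranges listed above for a 10-person team. Several inputs are assumptions rather than figures from this article: the stale-PR hours are treated as per engineer, the incident hours as a team total per sprint, and each engineer is assumed to have 40 working hours in the sprint. The names `PER_ENGINEER_HOURS`, `TEAM_INCIDENT_HOURS`, and `interrupt_share` are illustrative.

```python
# Per-engineer interrupt ranges (low, high) in hours per sprint, from the list above.
# Assumption: the stale-PR figure is per engineer (the list does not say).
PER_ENGINEER_HOURS = {
    "ci_failures": (2, 4),
    "review_requests": (3, 6),
    "dependency_unblocking": (2, 4),
    "stale_pr_context_switches": (1, 2),
}
# Assumption: incident hours are a team-wide total per sprint.
TEAM_INCIDENT_HOURS = (4, 8)


def interrupt_share(team_size: int, hours_per_engineer: float) -> tuple[float, float]:
    """Return the (low, high) share of team capacity lost to interrupts."""
    capacity = team_size * hours_per_engineer
    low = team_size * sum(lo for lo, _ in PER_ENGINEER_HOURS.values()) + TEAM_INCIDENT_HOURS[0]
    high = team_size * sum(hi for _, hi in PER_ENGINEER_HOURS.values()) + TEAM_INCIDENT_HOURS[1]
    return low / capacity, high / capacity


# Assumption: 40 working hours per engineer per sprint.
low, high = interrupt_share(team_size=10, hours_per_engineer=40)
print(f"{low:.0%} to {high:.0%} of capacity")
```

Under these assumptions the share comes out at roughly 21% to 42%. The result scales inversely with sprint length, so substitute your own team size and sprint hours before drawing conclusions.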
How to measure your team's true interrupt rate
You can't plan around an interrupt rate you haven't measured. The first step is
establishing a baseline over 3–4 sprints. You don't need perfect precision —
you need a directionally accurate picture of where the unplanned time is going.
The four interrupt categories to track:
- CI failure response time. How many CI failures occurred in the sprint, and what was the total engineering time spent investigating them? A failed CI run that gets fixed in 15 minutes is different from one that requires 2 hours of root cause analysis. Track count and time separately.
- Production incident time. Total hours the team spent on production incidents — from first alert to resolution. Include both the primary responder time and the secondary time of people who were pulled in to assist.
- Review queue overflow. How many PRs per engineer per sprint arrived as review requests outside their planned sprint work? This is the "teammates asking for reviews" interrupt that's easy to overlook because it feels like normal work — but it's not in the sprint estimate.
- Coordination overhead. Time spent unblocking dependencies, attending cross-team syncs that weren't in the sprint plan, or investigating shared infrastructure issues. This is the hardest to measure but often the largest category in multi-squad organizations.
After 3–4 sprints of tracking, calculate your interrupt rate: total interrupt
hours ÷ total available sprint hours. A rate of 25–35% is typical for teams
without strong CI automation. A rate above 40% indicates a structural problem —
the team is spending nearly half their time on work that doesn't appear in planning.
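The calculation above can be sketched in a few lines. This is one possible shape, not a prescribed tool: the `SprintLog` record, its field names, and the sample hours are all hypothetical, and the band labels simply restate the thresholds from the text.

```python
from dataclasses import dataclass


@dataclass
class SprintLog:
    """Interrupt hours recorded for one sprint (field names are illustrative)."""
    available_hours: float
    ci_hours: float
    incident_hours: float
    review_hours: float
    coordination_hours: float

    @property
    def interrupt_hours(self) -> float:
        return (self.ci_hours + self.incident_hours
                + self.review_hours + self.coordination_hours)


def interrupt_rate(sprints: list[SprintLog]) -> float:
    """Total interrupt hours divided by total available hours across sprints."""
    total_available = sum(s.available_hours for s in sprints)
    if total_available <= 0:
        raise ValueError("available hours must be positive")
    return sum(s.interrupt_hours for s in sprints) / total_available


def classify(rate: float) -> str:
    """Map a rate onto the bands named in the text (25-35% typical, >40% structural)."""
    if rate > 0.40:
        return "structural problem"
    if rate > 0.35:
        return "above the typical range"
    if rate >= 0.25:
        return "typical for teams without strong CI automation"
    return "below the typical range"


# Made-up sample data for three sprints of a team with 800 available hours each.
history = [
    SprintLog(800, ci_hours=60, incident_hours=40, review_hours=90, coordination_hours=50),
    SprintLog(800, ci_hours=50, incident_hours=0, review_hours=100, coordination_hours=60),
    SprintLog(800, ci_hours=70, incident_hours=30, review_hours=80, coordination_hours=60),
]
rate = interrupt_rate(history)
print(f"interrupt rate: {rate:.1%} ({classify(rate)})")
```

Pooling hours across sprints, rather than averaging per-sprint percentages, keeps one unusually short sprint from skewing the baseline.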
The four metrics that make sprint capacity predictable
Once you know your interrupt rate, you can plan around it. But the deeper goal
is to reduce it — which requires tracking the metrics that predict interrupt
rate changes before the sprint starts.
- 1. CI pass rate (rolling 30-day average). Your CI pass rate is your single best predictor of CI-related interrupt time next sprint. A team at 95% CI pass rate has predictable, low interrupt time from CI. A team at 82% CI pass rate will spend significantly more sprint capacity on CI failure investigation than their estimates account for. Track the trend, not just the current number.
- 2. Open production incidents at sprint start. Any production incident that's open when sprint planning happens will consume unplanned capacity during the sprint. "We'll finish the sprint work and get to the incident" is how incidents stay open for three sprints. Count open incidents at planning time and factor in at least 4 hours per open incident as reserved sprint capacity.
- 3. PR review lag (90th percentile). If 90% of PRs get reviewed within 4 hours, review queue interrupts are a minor overhead. If 90th percentile review lag is 2+ days, engineers are being pulled out of planned work to review PRs that sat too long. High review lag compresses into interrupt bursts — people suddenly needing 4 reviews in one day rather than steady-state one per day.
- 4. Stale PR count at sprint start. Every stale PR (>3 days without review activity) at sprint start represents pending context reload for the author and pending review time for a senior engineer. A sprint starting with 8 stale PRs will resolve most of them in the first 2 days — burning capacity that wasn't planned for.
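The four metrics above can each be computed from data most teams already have. The sketch below assumes you can export CI run outcomes, review lags in hours, and the timestamp of the latest review activity per open PR; the function names and sample values are hypothetical, and the 90th percentile uses the nearest-rank method, which is one of several common definitions.

```python
import math
from datetime import datetime, timedelta


def ci_pass_rate(runs: list[bool]) -> float:
    """Fraction of CI runs that passed; pass in the runs from the last 30 days."""
    if not runs:
        raise ValueError("no CI runs supplied")
    return sum(runs) / len(runs)


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def stale_pr_count(last_review_activity: list[datetime], now: datetime,
                   threshold: timedelta = timedelta(days=3)) -> int:
    """Count PRs whose latest review activity is more than `threshold` old."""
    return sum(1 for ts in last_review_activity if now - ts > threshold)


def reserved_incident_hours(open_incidents: int, hours_each: float = 4.0) -> float:
    """Reserve at least 4 hours per open incident, as suggested above."""
    return open_incidents * hours_each


# Made-up sample inputs.
runs = [True] * 41 + [False] * 9
review_lags_hours = [1, 2, 2, 3, 3, 4, 5, 8, 30, 52]
now = datetime(2024, 1, 15, 9, 0)
activity = [datetime(2024, 1, 14, 9, 0), datetime(2024, 1, 11, 9, 0),
            datetime(2024, 1, 12, 9, 0), datetime(2024, 1, 8, 9, 0)]

pass_rate = ci_pass_rate(runs)
p90_lag = percentile_nearest_rank(review_lags_hours, 90)
stale = stale_pr_count(activity, now)
reserve = reserved_incident_hours(open_incidents=2)
print(pass_rate, p90_lag, stale, reserve)
```

Running the same functions at every sprint start gives you the trend line for each metric, which matters more than any single reading.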
Building a feedback loop between sprint outcomes and future estimates
The most important practice in improving estimation accuracy is the sprint
capacity retrospective — a dedicated 15-minute review at sprint close that
answers one question: where did the time actually go?
Structure it around four buckets:
- Planned work completed: story points or tickets finished
- Planned work not completed: what was estimated but not delivered, and why
- Unplanned work completed: CI failures, incidents, reviews, coordination — estimate hours
- Process improvement impact: did last sprint's process changes reduce interrupt time?
After 3–4 sprints of this retrospective, two things become clear: your interrupt
rate baseline (with enough confidence to factor it into planning), and whether
the process changes you're making are actually reducing it.
Teams that do this consistently for 2 quarters typically improve sprint estimate
accuracy by 35–45%. Not because they got better at estimating features — but
because they stopped underestimating the overhead that was always there.
The sprint planning calibration checklist
Before sprint planning
- Pull the rolling 30-day CI pass rate and its trend over the last 2 weeks; size the expected CI interrupt time accordingly
- Count open production incidents — reserve capacity for each
- Count stale PRs — estimate review time to clear the backlog in sprint week 1
- Check previous sprint interrupt rate — use as the baseline overhead deduction
During sprint planning
- Explicitly name planned capacity as: total available hours × (1 - interrupt rate)
- Reserve a "buffer pool" of hours (10–15%) for unplanned interrupts — make it visible, not hidden
- If CI pass rate is below 90%, add explicit CI remediation work to the sprint board (not just buffer)
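The planning-time steps above reduce to a small calculation. This is a sketch under stated assumptions: the checklist does not say what the 10–15% buffer is a percentage of, so the code takes it as a fraction of total available hours and reports it alongside planned capacity without subtracting one from the other. `CapacityPlan` and `plan_capacity` are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class CapacityPlan:
    planned_hours: float        # hours to fill with planned work
    buffer_hours: float         # visible reserve for unplanned interrupts
    needs_ci_remediation: bool  # True when CI pass rate is below 90%


def plan_capacity(total_hours: float, interrupt_rate: float,
                  ci_pass_rate: float, buffer_fraction: float = 0.10) -> CapacityPlan:
    """Apply the planning formula: total available hours x (1 - interrupt rate)."""
    if not 0 <= interrupt_rate < 1:
        raise ValueError("interrupt_rate must be in [0, 1)")
    planned = total_hours * (1 - interrupt_rate)
    # Assumption: the buffer is a fraction of total hours, shown as its own line item.
    buffer = total_hours * buffer_fraction
    return CapacityPlan(planned, buffer, ci_pass_rate < 0.90)


# Made-up inputs: 800 team hours, 30% measured interrupt rate, 82% CI pass rate.
plan = plan_capacity(total_hours=800, interrupt_rate=0.30, ci_pass_rate=0.82)
print(round(plan.planned_hours), round(plan.buffer_hours), plan.needs_ci_remediation)
```

Keeping the buffer as an explicit field mirrors the advice to make it visible on the board rather than padding individual estimates.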
At sprint close
- Record actual interrupt hours by category (CI, incidents, reviews, coordination)
- Compare to the buffer pool — did actual interrupts exceed the reserve?
- Update the interrupt rate baseline for next sprint planning
- Identify the highest-interrupt category and add one process improvement action to the next sprint
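The sprint-close steps can be sketched the same way. The checklist does not specify how to update the baseline, so this example assumes a simple average over the most recent three sprints; `close_sprint`, its parameters, and the sample numbers are hypothetical.

```python
def close_sprint(actual_hours: dict[str, float], buffer_hours: float,
                 available_hours: float, past_rates: list[float],
                 window: int = 3) -> dict:
    """Summarize a finished sprint for the next planning session."""
    total = sum(actual_hours.values())
    rate = total / available_hours
    # Assumption: the baseline is the mean rate over the last `window` sprints.
    recent = (past_rates + [rate])[-window:]
    return {
        "interrupt_hours": total,
        "exceeded_buffer": total > buffer_hours,
        "sprint_rate": rate,
        "new_baseline": sum(recent) / len(recent),
        "top_category": max(actual_hours, key=actual_hours.get),
    }


# Made-up actuals for one sprint, recorded by category.
summary = close_sprint(
    actual_hours={"ci": 70, "incidents": 30, "reviews": 90, "coordination": 50},
    buffer_hours=80,
    available_hours=800,
    past_rates=[0.25, 0.35],
)
print(summary["top_category"], summary["exceeded_buffer"])
```

The `top_category` value is the natural candidate for the one process improvement action carried into the next sprint.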
The teams that plan most accurately aren't clairvoyant — they're instrumented.
They know their interrupt rate because they measured it. They know their CI pass
rate because they track it weekly. They know their stale PR count because it's
on their dashboard, not hiding in a GitHub list they haven't checked this week.
Estimation accuracy is a data quality problem. Fix the data, and the estimates follow.
Before your next sprint planning, get a baseline on the two metrics that matter most:
- CI Health Score Calculator — score your pipeline failure rate and recovery time, the primary source of unplanned sprint interrupts
- PR Cycle Time Calculator — measure your review lag and merge gap, which drive the review queue overflow that inflates your interrupt rate