What to Do After an Incident Resolves
The incident is resolved. You deployed the fix. Errors are back to baseline. The Slack channel has gone quiet. Someone is already closing tabs and grabbing lunch.
For most teams, this is where the incident ends. You post a "resolved" update, move on, and treat the whole thing as done. It isn't. The 15 minutes after resolution are a surprisingly underweighted part of the incident arc. Customers who sat through the problem are still watching. The update you write here is the one they'll remember — more than any of the updates posted during the incident. And the hygiene you apply to the status page now is what makes your track record look professional in aggregate over time.
Here's what actually needs to happen.
The resolution update most teams write is useless
You've probably seen this exact update a hundred times:
"The issue has been resolved. Thanks for your patience."
It technically closes the incident. It communicates nothing. No customer reads that and comes away with more information than they had five minutes before.
The resolution update is your last chance to tell the story of what happened. You have the full timeline in front of you — root cause, fix, impact. Putting none of that into writing means every future incident starts from zero trust, because customers have no evidence you actually understood what you fixed.
A better template:
"This incident has been resolved as of 14:20 UTC. Between 13:42 and 14:20, some customers couldn't complete checkout because our payment system stopped accepting new requests when it hit an internal capacity limit. We've raised that limit, added automatic alerts before it can happen again, and are monitoring closely. A full write-up will be published within 48 hours."
That's not long. It's written in plain language a non-technical customer can follow — "capacity limit" rather than "connection pool exhaustion," "our payment system" rather than naming an internal service. It tells a customer the three things they want to know: how long they were affected, what broke, and what's next. The detailed version, with root-cause technical depth, belongs in the post-mortem, where engineers who want it can read it. The status page resolution update is for everyone.
Rule of thumb: the update should make sense to someone who doesn't know what a database is. If a reader needs to Google a term to understand what happened, the update is written for the wrong audience.
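If your team generates status updates from structured incident data, the template above can be sketched as a small format function. This is an illustrative sketch — the function name and fields are assumptions, not any real status page tool's API:

```python
# Hypothetical helper: assemble a resolution update from plain-language
# pieces. Field names are illustrative assumptions.
def resolution_update(start, end, impact, fix_summary, writeup_hours=48):
    return (
        f"This incident has been resolved as of {end}. "
        f"Between {start} and {end}, {impact}. "
        f"{fix_summary} "
        f"A full write-up will be published within {writeup_hours} hours."
    )

# Usage, mirroring the checkout example above:
update = resolution_update(
    start="13:42 UTC",
    end="14:20 UTC",
    impact=("some customers couldn't complete checkout because our payment "
            "system stopped accepting new requests when it hit an internal "
            "capacity limit"),
    fix_summary=("We've raised that limit, added automatic alerts before it "
                 "can happen again, and are monitoring closely."),
)
```

The point of structuring it this way is that the `impact` and `fix_summary` fields force someone to actually write those sentences — the template can't be filled in with "Thanks for your patience" alone.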
Leave the incident visible
The temptation after a messy incident is to hide it. Mark it resolved, collapse it, get it off the main page as fast as possible. Small teams especially feel this pull — nobody wants the current state of their service defined by the last thing that went wrong.
Resist. A status page is a trust-over-time instrument, not a "right now" display. The customers most likely to evaluate your reliability are the ones checking whether you handle incidents well when they happen. A clean status page that shows no incidents reads as either a dishonest page or a brand-new service. A page that shows handled incidents — with clear titles, reasonable durations, and thoughtful resolution updates — reads as a team that operates under real conditions and does so competently.
How long to leave things visible:
- Minor incidents: visible on the main page for ~48 hours, then in the incident history indefinitely.
- Major incidents: visible on the main page for 5–7 days. The longer window lets customers who checked in mid-incident come back and see how it concluded.
- Critical / data-related: stay on the main page until you've published a public post-mortem (or for a week, whichever comes first).
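The windows above are simple enough to encode as policy. A minimal sketch, assuming hypothetical severity names — nothing here comes from a real tool's API:

```python
from datetime import timedelta

# The main-page visibility windows above, as a lookup. Severity labels
# and the helper function are illustrative assumptions.
MAIN_PAGE_WINDOW = {
    "minor": timedelta(hours=48),
    "major": timedelta(days=7),
    "critical": timedelta(days=7),  # upper bound; see below
}

def main_page_window(severity, postmortem_published=False):
    """How long a resolved incident stays on the main page before moving
    to the permanent incident history."""
    # Critical incidents can leave the main page once the public
    # post-mortem is live, or after a week, whichever comes first.
    if severity == "critical" and postmortem_published:
        return timedelta(0)
    return MAIN_PAGE_WINDOW.get(severity, timedelta(hours=48))
```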
Nothing should ever be deleted. Your incident history is the receipt. Customers evaluating vendors regularly scroll back months to see what kinds of problems you've handled and how.
Close the loop — selectively
Not every incident needs a follow-up beyond the resolution update. But some do, and skipping follow-up on the ones that need it is where trust gets lost.
Thresholds for when to reach out individually to affected customers:
- The incident directly prevented a customer from completing a transaction (payment, checkout, signup).
- The incident lasted longer than ~30 minutes and affected the core product surface.
- A customer sent you a support ticket about it during the incident.
- Any data was affected — delayed, temporarily incorrect, or unavailable.
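The thresholds above can be written down as a predicate so the decision isn't re-litigated every time. This is a sketch with assumed field names, not any incident tool's schema:

```python
from dataclasses import dataclass
from datetime import timedelta

# Illustrative incident record; every field name is an assumption.
@dataclass
class Incident:
    blocked_transactions: bool     # payment, checkout, signup blocked
    duration: timedelta
    core_surface_affected: bool
    support_tickets_received: int  # tickets opened during the incident
    data_affected: bool            # delayed, incorrect, or unavailable

def needs_individual_outreach(inc):
    """Apply the outreach thresholds: any one is enough."""
    return (
        inc.blocked_transactions
        or (inc.duration > timedelta(minutes=30) and inc.core_surface_affected)
        or inc.support_tickets_received > 0
        or inc.data_affected
    )
```

Note the thresholds are OR'd, not AND'd — a five-minute incident that blocked payments still warrants a note.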
A short individual email within a day or two — not an apology form-letter, an actual human note acknowledging what they experienced and what was fixed — lands differently than any status page update can. It's also where customers often reply with additional context or feedback, which is how you learn things your internal monitoring missed.
For high-severity incidents, a public post-mortem within a few days cements that the team is taking it seriously without making every customer ask. You don't need one for every incident — the threshold for "this warrants a public write-up" is roughly "the incident lasted long enough that customers noticed and remembered." A 15-minute blip rarely qualifies. A four-hour degradation affecting logins does.
The internal cleanup that customers eventually feel
The post-resolution work that matters most to customers is the work you do on your own systems, not your status page. Customers can't see it directly, but they feel it the next time something similar happens and it doesn't:
- The specific failure has monitoring now. If this was a silent failure mode you only caught via a customer report, that monitoring gap is your highest-priority fix.
- The runbook is updated. The next person oncall when this recurs should find a documented response, not re-derive it.
- The action items are committed, not aspirational. "We're going to look into better monitoring" is noise. "We've added a synthetic probe for the checkout endpoint and paged oncall at the 2-minute error threshold" is real.
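For concreteness, a synthetic probe of the kind that example action item describes can be as simple as the sketch below. The URL is a placeholder; a real setup would run this on a schedule and page oncall once failures persist past a threshold (e.g. two minutes of consecutive errors), rather than on a single miss:

```python
import urllib.request

def probe_ok(url, timeout=5.0):
    """Return True if the endpoint answers with a 2xx status.

    Any exception (timeout, connection refused, DNS failure) counts
    as a probe failure rather than propagating.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except Exception:
        return False
```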
Each of those is how repeat incidents get shorter over time. Customers notice when your second and third outages of a similar type run a fifth the duration of the first — they may not have the language for it, but they feel it. That pattern of shrinking recurrence is one of the quieter ways reliability actually gets built.
The ghost "resolved" update
The worst post-resolution pattern — and the most common one — is the silent close. The incident gets marked resolved in your internal dashboard. The status page shows the green checkmark. No final update gets written, or the update is a one-liner with nothing in it.
From the customer side, this reads terribly. They sat through three hours of "investigating" and "identified" updates. They were told a fix was rolling out. And then... nothing. Or a two-word "it's fixed." The investigation narrative you were building during the incident ends without a conclusion.
The fix is simple: write the resolution update before you close anything. Treat it as the one update you spend actual time on. Even if you wrote terse during-incident updates because you were heads-down on the fix, the resolution update is where you catch up and tell the complete story.
The checklist
Before marking an incident fully done, run through this:
- Resolution update posted with timestamp, impact summary, root cause, and what's next.
- Affected customers individually contacted (if thresholds met).
- Post-mortem scheduled for the public side (if warranted).
- Monitoring added for the specific failure mode.
- Runbook updated with the response pattern.
- Follow-up ticket or action filed for the prevention work.
- Status page incident left visible for the appropriate window based on severity.
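One way to keep this checklist from living only in someone's head is to make it data. A minimal sketch — the item names mirror the list above and are otherwise illustrative:

```python
# The closeout checklist as an ordered list, so a close can be blocked
# (or at least flagged) while items remain open. Names are illustrative.
CLOSEOUT_ITEMS = [
    "resolution_update_posted",
    "affected_customers_contacted",  # if thresholds met
    "public_postmortem_scheduled",   # if warranted
    "monitoring_added",
    "runbook_updated",
    "prevention_ticket_filed",
    "incident_left_visible",
]

def remaining_items(done):
    """Return checklist items still open, in checklist order."""
    return [item for item in CLOSEOUT_ITEMS if item not in done]
```

A close script (or a bot in the incident channel) can then refuse to mark the incident fully done while `remaining_items` is non-empty, or post the open items as a reminder.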
None of those take long individually. Skipping them all is what makes an otherwise well-handled incident feel like it just petered out.
The compounding effect
Teams that do post-resolution work well don't necessarily have fewer incidents — they have incidents that age better. Six months later, when a prospective customer scrolls through the incident history, the incidents read as things that were handled, explained, and prevented from recurring. Teams that skip this beat end up with incident histories that look like a random log of problems with no narrative thread.
That's the quiet compounding benefit. Every resolution update you write well adds to a track record. Every one you phone in subtracts from it. Over a year of incidents — even for a small, reliable service — the difference between the two tracks is how your status page reads when someone is evaluating whether to trust you.
The incident isn't over when it's fixed. It's over when you've told the story of how you fixed it, and made sure the same one doesn't happen the same way twice.
PageCalm is a status page tool with an AI writer for the moments you don't have time to wordsmith. Try it free — no credit card required.