How to Handle an Outage That Isn't Yours
A support ticket comes in: a customer can't complete checkout. You open your dashboard. Everything looks green. You open your payment processor's status page. It's yellow. Somewhere on the other side of a few network hops, someone else's incident has become yours.
This is the reality of running a modern SaaS. A typical product sits on top of a CDN, an edge compute platform, a database provider, an auth provider, an email sender, a payment processor, an analytics pipeline, and six or seven other "we'd never build this ourselves" services. Any one of them having a bad day can make your product look broken to your customers. And when that happens, the customer doesn't care where the problem actually lives. They bought from you. You have to say something.
Here's how to say it well.
The distinction that matters
There's a difference between whose fault an incident is and whose problem it is. Those two almost never coincide in third-party-caused incidents.
Fault is a question about which codebase is misbehaving. In a third-party incident, that's your vendor's.
Problem is a question about who customers expect a credible response from. That's yours. Always. Even when zero lines of your code are implicated.
Most of the bad communication during third-party outages comes from teams conflating these two. They see that the fault is elsewhere and conclude that the problem is also elsewhere — that the correct move is to wait quietly for the vendor to fix it and never mention it on their own status page. That's a mistake. Your customers are your customers. A silent "we're waiting on someone else" is indistinguishable from a silent "we don't know what's happening."
When to open an incident for a third-party problem
Not every vendor wobble warrants a public incident on your side. A useful filter: does this vendor's issue change what your customers experience in your product?
- Your payment processor is degraded → checkout is broken for your customers → open an incident.
- Your auth provider is failing some requests → some customers can't log in → open an incident.
- Your CDN is serving stale content in one region → customers in that region see old data → open an incident.
- Your analytics pipeline is behind → nothing customer-visible is affected → internal note, not an incident.
- Your billing provider's dashboard is slow but their API is fine → customers unaffected → internal note, not an incident.
The test isn't "is our vendor having an incident." It's "is a customer, right now, seeing something that looks broken in our product."
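The filter above can be written down as a tiny triage function. This is a minimal sketch, not a prescribed implementation: the `VendorIssue` type, its field names, and the `triage` function are hypothetical names made up for illustration. The one design choice that matters is that the decision keys on the customer-visible symptom, not on the vendor's own status.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VendorIssue:
    # Hypothetical record of a vendor problem, for illustration only.
    vendor_type: str                 # e.g. "payment processor"
    customer_symptom: Optional[str]  # what a customer sees in YOUR product, or None


def triage(issue: VendorIssue) -> str:
    """Return "incident" if customers see something broken, else "internal note"."""
    if issue.customer_symptom:
        return "incident"
    return "internal note"


examples = [
    VendorIssue("payment processor", "checkout is failing"),
    VendorIssue("auth provider", "some customers can't log in"),
    VendorIssue("analytics pipeline", None),  # behind, but nothing customer-visible
]

for issue in examples:
    print(f"{issue.vendor_type}: {triage(issue)}")
```

Note that the vendor's status page never appears as an input. If the symptom field is empty, the issue stays internal no matter how red the vendor's page is.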
The wording pattern
The update has to do four things at once: acknowledge the impact, name the dependency honestly, set expectations, and not sound like you're deflecting. A template that works:
"Some customers may be experiencing [specific symptom]. The underlying cause is an issue with our [vendor type — e.g., payment processor], which is also tracking the incident on their status page: [link]. We're monitoring and will update as things change."
That pattern is deliberately boring. Every clause is doing work:
- "Some customers may be experiencing [specific symptom]" — concrete impact from the customer's point of view. Not "elevated error rates" (meaningless to most readers). The actual thing they can't do.
- "The underlying cause is an issue with our [vendor type]" — vendor type, not vendor name in the first breath. "Our payment processor" is more useful to a reader than "Stripe" — they know what a payment processor does, they might not know what Stripe is. You can name the vendor further down if helpful.
- "Which is also tracking the incident on their status page: [link]" — link out. This is the honest move. Customers who want the gory detail can follow the link. Customers who don't can stay.
- "We're monitoring and will update as things change" — sets expectation that you're on it, without promising a fix you can't deliver.
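If your team drafts updates from code or a chat command, the template can live in one small function so nobody improvises under pressure. The sketch below assumes a hypothetical function name (`third_party_update`) and a placeholder status URL. It only fills in the wording shown above and refuses to produce an update with a missing piece.

```python
def third_party_update(symptom: str, vendor_type: str, status_url: str) -> str:
    """Fill in the third-party incident template.

    All three arguments are required: an update with a blank symptom or a
    missing link loses exactly the clauses that make the template useful.
    """
    for name, value in (
        ("symptom", symptom),
        ("vendor_type", vendor_type),
        ("status_url", status_url),
    ):
        if not value.strip():
            raise ValueError(f"{name} must not be empty")

    return (
        f"Some customers may be experiencing {symptom}. "
        f"The underlying cause is an issue with our {vendor_type}, "
        f"which is also tracking the incident on their status page: {status_url}. "
        "We're monitoring and will update as things change."
    )


# Example usage; the URL is a placeholder, not a real vendor status page.
print(third_party_update(
    symptom="failed payments at checkout",
    vendor_type="payment processor",
    status_url="https://status.example.com",
))
```

The function takes a vendor type rather than a vendor name on purpose, matching the advice above about what goes in the first sentence.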
When the vendor hasn't acknowledged it yet
This is the hard case. Your metrics say something is wrong. Your investigation points at a vendor. But the vendor's status page is still green, and you don't have official confirmation.
The temptation is to wait — for the vendor to post, for your team to be 100% sure, for the picture to clarify. That waiting is the same mistake as silence during any ambiguous incident. Your customers are affected now. They don't care about the confirmation handshake.
The move is to post, but scope your language to what you can actually prove:
"Some customers may be experiencing [symptom]. We're investigating, and the issue appears to be upstream of our systems. We'll update as we learn more."
"Appears to be upstream" is honest. You're telling readers what you believe without claiming certainty you don't have. Update again when the vendor confirms — or when it turns out not to be them after all.
What not to do
There are three common failure modes worth naming:
Pure blame. "Stripe is broken, nothing we can do, wait for them to fix it." This reads as shirking even when it's technically accurate. Your customers don't have a relationship with your vendor. You do. Owning the communication — even when you're not owning the fault — is the minimum bar.
Pretending you don't have a dependency. "We're experiencing an issue with our payment system" is fine if you built the payment system. If you didn't, and the cause is your vendor, dodging that fact is worse than naming it. Readers can tell, and they will be less charitable next time.
The performative apology. "We are deeply sorry for the inconvenience caused by this unforeseen event." That sentence contains zero information. Customers don't want apologies during incidents. They want to know what's broken, what you know, and what you're doing. Skip the apology theater and tell them something useful.
After it's resolved
When the third-party issue clears and your product is back to normal, the resolution update can be short:
"The underlying [vendor type] issue has been resolved and all services are operating normally. See [vendor's status page link] for their details on what happened."
That's it. You don't need to write a post-mortem about someone else's infrastructure. The vendor will, if the incident was significant enough. Customers who want that level of detail can follow the link. Your incident is closed.
The harder question, after the dust settles
There's a conversation worth having internally — not on the status page — after a third-party outage. If your product was completely offline for two hours because one vendor had a bad afternoon, that's a business-architecture question, not a communication one. The communication question is solved by what's above. The architecture question is:
- Is there a fallback for this dependency? A cached mode, a read-only mode, a secondary provider?
- Is the blast radius appropriate? Should a failure in an analytics vendor have any customer-visible effect at all?
- Are the critical dependencies actually the critical ones, or has scope crept over time?
Those questions don't belong in public. But they're the ones that determine whether your next vendor-caused outage is a two-minute degradation or another two hours of full-product downtime. The communication template is the band-aid. The dependency audit is the cure.
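The audit itself can start as a short script over an inventory you maintain by hand. The following is a rough sketch under stated assumptions: the `Dependency` fields, the `audit` function, and the sample inventory are all hypothetical, and the flags are whatever your team decides they are. It checks two of the questions above: critical dependencies with no fallback, and non-critical dependencies whose failure is still customer-visible.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Dependency:
    # Hypothetical inventory entry; every field is a judgment call by your team.
    name: str
    critical: bool                      # should the product need this to function?
    has_fallback: bool                  # cached mode, read-only mode, secondary provider
    customer_visible_on_failure: bool   # does an outage here show up in the product?


def audit(deps: List[Dependency]) -> Dict[str, List[str]]:
    """Return findings per dependency; dependencies with no findings are omitted."""
    findings: Dict[str, List[str]] = {}
    for dep in deps:
        notes: List[str] = []
        if dep.critical and not dep.has_fallback:
            notes.append("critical with no fallback")
        if not dep.critical and dep.customer_visible_on_failure:
            notes.append("non-critical but customer-visible on failure")
        if notes:
            findings[dep.name] = notes
    return findings


# Sample inventory, invented for illustration.
inventory = [
    Dependency("payment processor", critical=True, has_fallback=False,
               customer_visible_on_failure=True),
    Dependency("analytics pipeline", critical=False, has_fallback=False,
               customer_visible_on_failure=True),
    Dependency("email sender", critical=False, has_fallback=True,
               customer_visible_on_failure=False),
]

for name, notes in audit(inventory).items():
    print(f"{name}: {'; '.join(notes)}")
```

The third question, whether scope has crept, shows up indirectly: any entry where the `critical` flag surprises the team is worth a conversation.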
The reframe
Your customers aren't infrastructure tourists. They don't know — and shouldn't need to know — that your login page depends on an auth provider, that your dashboard is rendered on an edge compute platform, that your checkout flow traverses four vendors before it reaches a bank. They bought a product. When that product isn't working, they want a credible response from the company they paid.
Credibility in these moments comes from one thing: showing that you understand what's happening and you're treating it as yours to communicate about, even when it isn't yours to fix. The vendor will fix their part. You own the part your customers actually see, which is: the update on your status page, posted by you, acknowledging the problem in their language, with a useful link and a realistic expectation.
Do that consistently and customers trust you more after third-party outages than before. Teams that don't (the ones that go silent, or blame, or post hollow corporate prose) lose trust every time it happens.
Ownership of the message is free. Spend it.