Blast Radius
What breaks when a service goes down? We map cascade failures across 43 critical infrastructure providers, tracking how an outage in one service ripples through the developer tool ecosystem.
Each entry shows real outage history, affected dependents, and practical mitigation strategies. Use this to identify single points of failure in your stack before they become incidents.
DNS Root Servers
Infrastructure (100 dependents)
If this goes down →
Every domain resolution on the internet; All SaaS services; All cloud providers; Email delivery
Last major outage: Nov 2015 — DDoS attack on root servers
Impact: Some root servers unreachable during sustained DDoS. Redundancy prevented visible impact for most users.
Mitigation: Anycast and redundancy make root server failure extremely unlikely to affect end users. Your DNS resolver cache is the real protection.
npm (as dependency)
Package Registry (95 dependents)
If this goes down →
React (react, react-dom); Prisma (@prisma/client); Drizzle (drizzle-orm); Every Node.js project
Last major outage: Ongoing — supply chain attacks
Impact: Compromised packages (event-stream 2018, ua-parser-js 2021, colors 2022) affected millions of projects downstream.
Mitigation: Pin exact versions. Use lockfiles. Run npm audit in CI. Consider Socket.dev for supply chain monitoring. Never install packages without checking maintainer reputation.
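As a rough illustration of "pin exact versions", the sketch below flags dependency entries that are ranges rather than exact versions. The helper name `findUnpinned` and the sample manifest are hypothetical, and the check is a simple pattern match on version strings. It complements a committed lockfile and `npm audit`; it does not replace them.

```typescript
// Hypothetical helper: list dependencies whose version is a range, not an exact pin.
type Manifest = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

// An exact pin looks like "18.3.1" or "1.0.0-beta.2". Ranges and tags such as
// "^18.3.1", "~4.17.0", ">=2", "*", or "latest" are reported.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

function findUnpinned(manifest: Manifest): string[] {
  const unpinned: string[] = [];
  const groups = [manifest.dependencies ?? {}, manifest.devDependencies ?? {}];
  for (const group of groups) {
    for (const pkg of Object.keys(group)) {
      if (!EXACT_VERSION.test(group[pkg])) {
        unpinned.push(`${pkg}@${group[pkg]}`);
      }
    }
  }
  return unpinned;
}

// Illustrative manifest only; the version numbers are placeholders.
const sampleManifest: Manifest = {
  dependencies: { react: "18.3.1", "drizzle-orm": "^0.30.0" },
  devDependencies: { typescript: "~5.4.0" },
};
console.log(findUnpinned(sampleManifest));
```

Anything that is not a plain semver string (git URLs, workspace references, dist-tags) is also reported, which is usually what you want for a CI gate.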
AWS
Cloud (89 dependents)
If this goes down →
Vercel (hosting on AWS); Supabase (runs on AWS); PlanetScale (AWS infra); Stripe (partial); Datadog (data pipeline)
Last major outage: Jun 2023 — us-east-1
Impact: Took down Vercel, Supabase, multiple SaaS for 4+ hours
Mitigation: Multi-region + multi-cloud fallback for critical paths
React (npm package)
Framework (88 dependents)
If this goes down →
Next.js; Remix; Gatsby; Every React component library; Every React-based SaaS
Last major outage: N/A — no outage, but version breaks
Impact: React 19 breaking changes affected thousands of libraries. Not an outage but a cascading compatibility crisis.
Mitigation: Pin React version. Test upgrades in isolation. Major React versions break ecosystem libraries for 3-6 months after release.
Cloudflare
CDN/DNS (74 dependents)
If this goes down →
Discord (CDN); Canva (CDN); Shopify (edge); Any site using CF DNS/proxy
Last major outage: Jun 2022 — global BGP
Impact: 19 data centers down, widespread 500 errors for ~1 hour
Mitigation: DNS failover to a secondary provider (for example Route 53 or NS1; Dyn's DNS service has since been retired)
AWS S3
Storage (72 dependents)
If this goes down →
Static asset hosting; User file uploads; Database backups; Terraform state files; Docker registry storage
Last major outage: Feb 2017 — us-east-1 S3 outage
Impact: Websites, APIs, and services across industries failed. AWS could not even update its own status dashboard, because the dashboard depended on S3.
Mitigation: Multi-region replication for critical data. Use Cloudflare R2 as failover. Never store your monitoring dashboards on the same infra you're monitoring.
Docker Hub
Container Registry (62 dependents)
If this goes down →
Every CI/CD pipeline pulling base images; Docker Compose local dev setups; Kubernetes deployments pulling images; Railway/Fly.io builds using Docker images
Last major outage: Nov 2023 — rate limiting tightened
Impact: CI/CD pipelines globally started failing with 429 errors. Anonymous pulls are limited to 100 per 6 hours per IP, so shared CI runners hit the limit immediately.
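Mitigation: Mirror critical images to GitHub Container Registry or ECR. Avoid forcing a pull on every CI job: Docker's --pull policy accepts always, missing, or never (the Kubernetes equivalent of missing is imagePullPolicy: IfNotPresent). Cache base images in your registry.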
Stripe
Payments (52 dependents)
If this goes down →
Vercel (billing); Railway (billing); Lemon Squeezy (processing); Any checkout flow
Last major outage: Nov 2023 — API degradation
Impact: Failed payments, broken checkouts for ~2 hours globally
Mitigation: Queue failed charges for retry; show 'processing' state to users
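One way to picture "queue failed charges for retry" is an exponential-backoff queue. The sketch below is in-memory and purely illustrative: `RetryQueue`, `PendingCharge`, and the backoff parameters are hypothetical names and values, not part of any payment provider's SDK. A real implementation would persist the queue and send an idempotency key with each retry so a charge is never applied twice.

```typescript
// Hypothetical in-memory retry queue; a production system would persist entries.
type PendingCharge = { chargeId: string; attempts: number; nextAttemptAt: number };

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelayMs(attempts: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempts));
}

class RetryQueue {
  private items: PendingCharge[] = [];

  // Record a failed charge and schedule its next attempt.
  enqueue(chargeId: string, now: number, attempts = 0): void {
    this.items.push({ chargeId, attempts, nextAttemptAt: now + backoffDelayMs(attempts) });
  }

  // Remove and return the charges whose retry time has arrived.
  takeDue(now: number): PendingCharge[] {
    const due = this.items.filter((c) => c.nextAttemptAt <= now);
    this.items = this.items.filter((c) => c.nextAttemptAt > now);
    return due;
  }

  // Call when a retry fails again: re-queue with a longer delay.
  requeue(charge: PendingCharge, now: number): void {
    this.enqueue(charge.chargeId, now, charge.attempts + 1);
  }

  size(): number {
    return this.items.length;
  }
}

// Timestamps are plain numbers (milliseconds) so the logic is easy to test.
const chargeQueue = new RetryQueue();
chargeQueue.enqueue("charge-example-1", 0);
console.log(chargeQueue.takeDue(500).length, chargeQueue.takeDue(1000).length);
```

While a charge sits in the queue, the UI can show the "processing" state the mitigation mentions instead of an error.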
GitHub
Code Hosting (48 dependents)
If this goes down →
Vercel (deploys); Netlify (deploys); Railway (deploys); All CI/CD pipelines; Dependabot
Last major outage: Mar 2024 — Actions outage
Impact: Blocked all CI/CD, no deploys for 3+ hours
Mitigation: Git mirrors (GitLab/Gitea); deploy from local as fallback
npm Registry
Package Registry (45 dependents)
If this goes down →
Every npm install in CI/CD; Vercel/Netlify/Railway builds; Docker image builds; Local dev environments
Last major outage: Apr 2024 — registry slowdown
Impact: Slow/failed builds across all platforms for 2+ hours
Mitigation: Private registry mirror (Verdaccio/Artifactory); commit lockfiles and install with npm ci in CI (or --frozen-lockfile with Yarn or pnpm)
Tailwind CSS (CDN/npm)
CSS Framework (44 dependents)
If this goes down →
Next.js apps using Tailwind; Component libraries (shadcn/ui); Every Tailwind-based project
Last major outage: N/A — Tailwind v4 migration pain
Impact: Tailwind v3 → v4 migration broke class names, config format, and plugin API. Not an outage but widespread breakage.
Mitigation: Pin Tailwind version. Test v4 migration on a branch. The @tailwindcss/upgrade codemod handles most cases.
Google Workspace
Productivity (42 dependents)
If this goes down →
Company email (Gmail); SSO via Google OAuth; Google Drive/Docs collaboration; Calendar and Meet for meetings
Last major outage: Aug 2024 — Google OAuth issues
Impact: Google SSO login broken for 1+ hour. All apps using 'Sign in with Google' affected. Gmail delivery delayed.
Mitigation: Support multiple auth methods (email/password + Google OAuth). Don't make Google SSO the only login path. Cache auth sessions aggressively.
Let's Encrypt
TLS/PKI (41 dependents)
If this goes down →
Auto-SSL on Vercel/Netlify/Railway; Any site using ACME cert provisioning; Custom domain HTTPS; Internal services with LE certs
Last major outage: Jan 2022 — OCSP responder issues
Impact: Certificate renewals blocked for several hours; expired certs caused browser warnings
Mitigation: Set cert renewal well before expiry (30+ days); consider ZeroSSL as ACME fallback
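The "renew well before expiry" rule reduces to a date comparison. The sketch below assumes you already have the certificate's notAfter date from your own tooling; `daysUntilExpiry` and `shouldRenew` are hypothetical helper names, and the 30-day threshold is the figure from the mitigation above.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days remaining until the certificate's notAfter date (negative if expired).
function daysUntilExpiry(notAfter: Date, now: Date): number {
  return Math.floor((notAfter.getTime() - now.getTime()) / MS_PER_DAY);
}

// Renew once the remaining lifetime drops to the threshold.
function shouldRenew(notAfter: Date, now: Date, thresholdDays = 30): boolean {
  return daysUntilExpiry(notAfter, now) <= thresholdDays;
}

// Example dates are illustrative only.
const certNotAfter = new Date("2024-03-31T00:00:00Z");
console.log(shouldRenew(certNotAfter, new Date("2024-02-15T00:00:00Z"))); // 45 days left
console.log(shouldRenew(certNotAfter, new Date("2024-03-05T00:00:00Z"))); // 26 days left
```

Renewing at 30 days leaves about a month of retries if the CA is unavailable, which is the point of the margin.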
Google Cloud
Cloud (38 dependents)
If this goes down →
Firebase (fully hosted); Supabase Edge Functions (some regions); BigQuery consumers; GKE clusters
Last major outage: Apr 2024 — Cloud Console
Impact: Firebase console down, deploys blocked for 3 hours
Mitigation: Multi-cloud strategy; avoid single-provider dependency for critical workloads
Slack
Communication (35 dependents)
If this goes down →
Team communication for most startups; CI/CD alert notifications; On-call escalations via Slack; Bot-based workflows (Slackbots, Zapier)
Last major outage: Feb 2024 — intermittent connectivity
Impact: Messages delayed or undeliverable for 2+ hours. Teams lost coordination during incidents.
Mitigation: Redundant alerting via SMS/email for on-call. Discord or Telegram as emergency backup channel. Never route critical alerts only through Slack.
OpenAI
AI API (34 dependents)
If this goes down →
ChatGPT wrappers; AI features in SaaS products; Coding assistants; Content generation tools
Last major outage: Nov 2023 — API outage
Impact: All AI-powered features failed globally for 2+ hours
Mitigation: Fallback to Anthropic/Google; use OpenRouter for automatic failover
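A provider fallback can be as small as an ordered list tried in sequence. The sketch below makes no real API calls: `Provider` and `completeWithFailover` are hypothetical names, and each provider is a stub function you would replace with a call to your own client code.

```typescript
// A provider is any async function that takes a prompt and returns text.
type Provider = { name: string; complete: (prompt: string) => Promise<string> };

// Try providers in order; return the first successful answer and who served it.
async function completeWithFailover(
  providers: Provider[],
  prompt: string
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const text = await p.complete(prompt);
      return { provider: p.name, text };
    } catch (err) {
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Stub providers for illustration: the first always fails, the second echoes.
const primaryStub: Provider = {
  name: "primary",
  complete: async () => {
    throw new Error("503 from upstream");
  },
};
const backupStub: Provider = {
  name: "backup",
  complete: async (prompt) => `echo: ${prompt}`,
};
completeWithFailover([primaryStub, backupStub], "hello").then((r) =>
  console.log(r.provider, r.text)
);
```

Prompts and output formats differ between models, so a fallback path is only useful if it is exercised regularly rather than discovered during an outage.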
Vercel
Hosting (31 dependents)
If this goes down →
Next.js apps globally; Edge functions; Vercel Analytics; Serverless API routes
Last major outage: Feb 2024 — build failures
Impact: No deploys for 2+ hours, existing sites served stale
Mitigation: Keep last-good build cached; consider self-hosted Next.js fallback
Vercel Edge Network
CDN/Edge (31 dependents)
If this goes down →
Next.js apps globally; Edge Middleware (auth, redirects); Image optimization pipeline; Serverless API routes
Last major outage: Sep 2024 — edge function failures
Impact: Edge Middleware returned 500 errors globally for 45 min. Auth-gated pages became inaccessible. Fallback to origin failed for ISR pages.
Mitigation: Avoid critical business logic in Edge Middleware. Keep a self-hosted fallback deployment. Use Cloudflare as CDN layer in front of Vercel for caching.
Cloudflare Workers
Edge Compute (31 dependents)
If this goes down →
Hono apps deployed to Workers; Edge auth middleware; API proxies and rate limiters; Static site functions
Last major outage: Nov 2024 — Workers runtime errors
Impact: Workers returning 500 errors for 30+ min in multiple regions. Edge middleware, API routes, and auth checks all failed simultaneously.
Mitigation: Don't put critical auth logic in edge-only workers with no fallback. Have origin server handle auth if edge is down. Multi-region deployment with failover.
Azure
Cloud (29 dependents)
If this goes down →
GitHub (runs on Azure); GitHub Actions runners; Microsoft 365 SSO integrations; OpenAI API (hosted on Azure)
Last major outage: Jan 2023 — Azure AD outage
Impact: Microsoft 365 login broken globally, Teams down for 3+ hours
Mitigation: Azure AD SSO failures cascade to GitHub and Teams; maintain local admin accounts as fallback
Auth0 / Okta
Auth (28 dependents)
If this goes down →
Login flows for thousands of apps; SSO for enterprises; MFA verification
Last major outage: Oct 2023 — Okta breach + outage
Impact: Users locked out of apps for hours; breach affected support portal
Mitigation: Cache JWT verification keys; allow existing sessions to continue during outage
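Caching verification keys so that existing sessions survive an outage could look roughly like this. `CachedKeySet` and the injected `fetchKeys` function are hypothetical; in practice `fetchKeys` would call your provider's JWKS endpoint. The key idea is "serve the stale copy if the refresh fails".

```typescript
// Minimal shape for a key set; real JWKS entries carry more fields.
type KeySet = { keys: { kid: string }[] };

class CachedKeySet {
  private cached: KeySet | null = null;
  private fetchedAt = 0;
  private fetchKeys: () => Promise<KeySet>;
  private maxAgeMs: number;

  constructor(fetchKeys: () => Promise<KeySet>, maxAgeMs: number) {
    this.fetchKeys = fetchKeys;
    this.maxAgeMs = maxAgeMs;
  }

  // `now` is passed in (milliseconds) so the behaviour is easy to test.
  async get(now: number): Promise<KeySet> {
    if (this.cached !== null && now - this.fetchedAt < this.maxAgeMs) {
      return this.cached; // still fresh, no network call
    }
    try {
      const keys = await this.fetchKeys();
      this.cached = keys;
      this.fetchedAt = now;
      return keys;
    } catch (err) {
      // Provider is down: fall back to the stale copy so existing tokens keep verifying.
      if (this.cached !== null) return this.cached;
      throw err; // nothing cached yet, so there is nothing to fall back to
    }
  }
}
```

The trade-off is that stale keys delay key rotation and revocation, so it is worth capping how long a stale copy may be served.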
Auth0
Auth (28 dependents)
If this goes down →
Login flows for thousands of B2B apps; Enterprise SSO chains; MFA via Auth0 Actions; Custom domain auth pages
Last major outage: Oct 2023 — linked to Okta breach
Impact: Support portal compromised; customer tenant token generation degraded for ~2 hours
Mitigation: Cache auth tokens aggressively; test degraded-mode experience where login is unavailable
Vercel AI SDK
AI Framework (28 dependents)
If this goes down →
Next.js AI chatbots using useChat(); AI-powered SaaS features; Streaming UI components; Multi-model routing apps
Last major outage: N/A — breaking changes in v4
Impact: AI SDK v3 → v4 renamed core functions and changed streaming API. Apps using useChat, useCompletion needed rewrites.
Mitigation: Pin AI SDK version. Abstract provider selection behind your own wrapper. Don't couple UI directly to SDK streaming primitives.
Anthropic API
AI API (26 dependents)
If this goes down →
Claude-powered SaaS features; Coding assistants (Cursor, Claude Code); RAG pipelines using Claude; Content generation tools
Last major outage: Jan 2025 — API degradation
Impact: Elevated error rates and latency for 3+ hours. All Claude-dependent features failed or degraded.
Mitigation: Fallback to OpenAI/Groq for non-critical queries. Use OpenRouter for automatic model failover. Cache frequent responses.
Fastly
CDN (24 dependents)
If this goes down →
Reddit (CDN); GitHub Pages (CDN); Twitch (video delivery); Financial Times; Guardian
Last major outage: Jun 2021 — global CDN outage
Impact: A large share of the internet was unreachable for ~1 hour after a customer configuration change triggered a latent software bug
Mitigation: Multi-CDN strategy; Cloudflare + Fastly cover different failure modes
Firebase
BaaS (22 dependents)
If this goes down →
Firebase Auth users; Firestore-backed apps; Push notifications (FCM)
Last major outage: Jan 2024 — Firestore latency spike
Impact: Apps using Firestore saw 10s+ query times for 1 hour
Mitigation: Client-side cache; offline-first architecture
Cursor / AI Coding Tools
Developer Tools (22 dependents)
If this goes down →
Developer productivity; Code review workflows; Refactoring velocity; Knowledge transfer
Last major outage: Mar 2025 — Anthropic API disruption
Impact: Cursor became unusable for AI features during Claude API outage. Developers who relied on AI for code completion were significantly slowed.
Mitigation: Don't build workflows that can't function without AI. Maintain ability to code without AI assistants. Use tools that support multiple LLM backends.
Datadog
Monitoring (21 dependents)
If this goes down →
Alert pipelines for engineering teams; PagerDuty triggers via Datadog; SLO tracking; Log aggregation pipelines
Last major outage: Mar 2023 — ingestion lag
Impact: Metrics delayed 30+ min; on-call teams flying blind for several hours
Mitigation: Datadog outage doesn't break your app, but you lose observability; keep basic uptime checks on a separate provider
SendGrid
Email (20 dependents)
If this goes down →
Transactional emails (password resets, receipts); Marketing email campaigns; Email verification flows; Webhook-driven notification systems
Last major outage: Nov 2023 — delivery delays
Impact: Transactional emails delayed 2+ hours globally. Password reset flows broken. Some emails never delivered.
Mitigation: Queue critical emails through a fallback provider (SES, Postmark). Never rely on a single email provider for password resets. Show 'email may be delayed' messaging during known issues.
DigitalOcean
Cloud (19 dependents)
If this goes down →
Apps deployed on DO Droplets; Managed Postgres/Redis customers; DO Spaces CDN users; Kubernetes clusters on DOKS
Last major outage: Feb 2023 — NYC3 region outage
Impact: NYC3 region unavailable for 4+ hours; Spaces CDN degraded
Mitigation: Use multi-region deployments; Spaces is not a CDN replacement — front with Cloudflare
Sentry
Monitoring (19 dependents)
If this goes down →
Error tracking across all apps; Performance monitoring; Release tracking
Last major outage: May 2024 — ingestion delays
Impact: Errors not reported for ~1 hour; no app impact for end users
Mitigation: Sentry is observability — its outage doesn't break your app
Supabase
BaaS (18 dependents)
If this goes down →
Apps using Supabase Auth; Realtime subscribers; Edge Functions
Last major outage: Dec 2023 — dashboard outage
Impact: Dashboard down but APIs mostly worked; 30 min recovery
Mitigation: Direct Postgres connection as fallback; self-host option available
OpenRouter
AI Proxy (18 dependents)
If this goes down →
Multi-model AI apps; LLM fallback chains; Cost-optimized AI routing; Development/testing across models
Last major outage: Feb 2025 — routing degradation
Impact: Model routing errors for ~1 hour. Apps depending on OpenRouter for failover ironically had no failover for OpenRouter itself.
Mitigation: Use OpenRouter for convenience but maintain direct API keys for your top 2-3 models. If your failover provider fails, you need a failover for the failover.
Hetzner
Cloud/VPS (17 dependents)
If this goes down →
Self-hosted apps (Coolify, CapRover); Mastodon/Fediverse instances; European SaaS backends; Dev/staging environments
Last major outage: Dec 2023 — Nuremberg DC network issues
Impact: VMs unreachable in NBG1 for 2+ hours. Self-hosted services across Europe went dark.
Mitigation: Multi-DC deployment (Falkenstein + Helsinki). Automated failover for critical services. Hetzner is cheap but SLA is 99.9%, not 99.99%.
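To make the gap between 99.9% and 99.99% concrete, the arithmetic can be worked out directly. The sketch below only converts an availability percentage into allowed downtime per year; it says nothing about any provider's actual SLA terms, and `allowedDowntimeHours` is a hypothetical helper name.

```typescript
const HOURS_PER_YEAR = 365 * 24; // 8760, ignoring leap years

// Maximum downtime per year, in hours, that still meets a given availability percentage.
function allowedDowntimeHours(availabilityPercent: number): number {
  return HOURS_PER_YEAR * (1 - availabilityPercent / 100);
}

// 99.9% allows roughly 8.76 hours per year.
console.log(allowedDowntimeHours(99.9).toFixed(2));
// 99.99% allows roughly 52.6 minutes per year.
console.log((allowedDowntimeHours(99.99) * 60).toFixed(1));
```

One extra nine is a tenfold difference in permitted downtime, which is why a single 2+ hour incident fits inside a 99.9% year but not a 99.99% one.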
PagerDuty
Incident Management (16 dependents)
If this goes down →
On-call alerting for all integrated services; Escalation policies; Incident timelines
Last major outage: Aug 2023 — notification delays
Impact: Alert notifications delayed 15-30 minutes; incidents missed by on-call teams
Mitigation: Redundant alerting via OpsGenie or direct SMS; never rely on single alert path
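Redundant alerting means sending every page over more than one path and treating the alert as lost only if all paths fail. The sketch below is provider-agnostic: `AlertChannel` and `sendAlert` are hypothetical names, and each channel's `send` function stands in for whatever SMS, email, or paging client you use.

```typescript
type AlertChannel = { name: string; send: (message: string) => Promise<void> };

// Send the alert on every channel in parallel; one failing channel must not stop the others.
// Returns the names of the channels that accepted the message.
async function sendAlert(channels: AlertChannel[], message: string): Promise<string[]> {
  const delivered: string[] = [];
  const attempts = channels.map(async (ch) => {
    try {
      await ch.send(message);
      delivered.push(ch.name);
    } catch {
      // Swallowed here for brevity; a real system would log the failure.
    }
  });
  await Promise.all(attempts);
  if (delivered.length === 0) {
    throw new Error("Alert could not be delivered on any channel");
  }
  return delivered;
}
```

Unlike a failover chain, this fans out to all channels at once, so a delayed (rather than failed) primary path does not hold up the backup.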
Clerk
Auth (16 dependents)
If this goes down →
Login/signup flows; Session management; Organization features; Webhook-driven user sync
Last major outage: May 2024 — API latency spike
Impact: Auth API responses slowed to 5s+. Login flows timed out. Apps with strict middleware checks became unusable.
Mitigation: Cache session tokens client-side. Set generous timeouts on auth middleware. Have a maintenance page ready for auth provider outages.
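A timeout around the auth check, with a maintenance page as the fallback, might be sketched as follows. `withTimeout` and `checkAuth` are hypothetical helpers, and `verifySession` stands in for whatever call your auth provider's SDK exposes; nothing here is a specific provider's API.

```typescript
// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}

// Hypothetical middleware step: verify the session, and on timeout or provider
// error route to a maintenance page instead of hanging the request.
async function checkAuth(
  verifySession: () => Promise<boolean>,
  timeoutMs: number
): Promise<"allow" | "deny" | "maintenance"> {
  try {
    const ok = await withTimeout(verifySession(), timeoutMs);
    return ok ? "allow" : "deny";
  } catch {
    return "maintenance";
  }
}
```

Note that the fallback never returns "allow": an auth outage degrades to a maintenance page, not to unauthenticated access.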
MongoDB Atlas
Database (15 dependents)
If this goes down →
Atlas-hosted app databases; Atlas Search consumers; Atlas Data API users; Atlas Triggers + Functions
Last major outage: Sep 2023 — us-east-1 replica lag
Impact: Read replicas fell behind by minutes; stale data served for ~45 min
Mitigation: Set appropriate readPreference; design for eventual consistency or use primary reads for critical paths
Twilio
Communications (15 dependents)
If this goes down →
SMS verification (Auth0, Clerk); Phone-based 2FA; Notification systems
Last major outage: Sep 2023 — US SMS delays
Impact: OTP codes delayed 10+ minutes, login flows broken
Mitigation: Multiple SMS providers; offer email-based 2FA as fallback
Redis Cloud
Cache/Queue (14 dependents)
If this goes down →
Session stores; Rate limiting layers; Queue-backed workers (BullMQ); Real-time leaderboards
Last major outage: Nov 2023 — multi-region connectivity
Impact: Cross-region replication stalled; some clusters unreachable for ~1 hour
Mitigation: Graceful degradation without cache; fallback to DB queries on cache miss during outage
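Graceful degradation here means treating a cache outage like a cache miss. The sketch below is a generic read-through helper: `KeyValueCache`, `readThrough`, and `loadFromDb` are hypothetical names, and the cache interface is a stand-in for whichever Redis client you use.

```typescript
type KeyValueCache = {
  get: (key: string) => Promise<string | null>;
  set: (key: string, value: string) => Promise<void>;
};

// Read-through lookup that falls back to the database when the cache is unreachable.
async function readThrough(
  cache: KeyValueCache,
  key: string,
  loadFromDb: (key: string) => Promise<string>
): Promise<string> {
  try {
    const hit = await cache.get(key);
    if (hit !== null) return hit;
  } catch {
    // Cache unreachable: fall through to the database.
  }
  const value = await loadFromDb(key);
  try {
    await cache.set(key, value);
  } catch {
    // Best effort only; failing to repopulate the cache should not fail the request.
  }
  return value;
}
```

During a long cache outage every read lands on the database, so this pattern works best alongside load shedding or rate limits sized for uncached traffic.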
PostHog
Analytics (14 dependents)
If this goes down →
Product analytics dashboards; Feature flags (PostHog flags); Session recordings; A/B experiments
Last major outage: Jun 2024 — ingestion delays
Impact: Events delayed 30+ min. Feature flags continued to work (cached). Analytics dashboards showed stale data.
Mitigation: PostHog outage doesn't break your app if you handle SDK errors gracefully. Feature flags have local evaluation fallback. Don't gate critical UX on analytics availability.
Neon
Database (12 dependents)
If this goes down →
Serverless Postgres apps; Vercel Postgres (powered by Neon); Branching-heavy dev workflows
Last major outage: Mar 2024 — cold start issues
Impact: Slow first-connection times for ~2 hours
Mitigation: Connection pooling; keep-alive queries during low traffic
Expo (EAS)
Mobile Build (11 dependents)
If this goes down →
React Native app builds; OTA updates for mobile apps; App store submissions via EAS Submit
Last major outage: Apr 2024 — EAS Build queue congestion
Impact: Build queue times exceeded 2 hours. Hotfix deploys blocked for mobile apps.
Mitigation: Keep local build capability as fallback (eas build --local). Don't rely solely on EAS for urgent hotfixes.
Resend
Email (8 dependents)
If this goes down →
Transactional emails; Password resets; Welcome emails
Last major outage: Jan 2024 — delivery delays
Impact: Emails delayed 30+ min; password resets affected
Mitigation: Queue emails; fallback to SES or Postmark for critical paths