ChatGPT goes down. A lot. And every time it does, millions of people search the same phrase: "is ChatGPT down right now?" The frustration is real — but so is the technical explanation, and it is more interesting than you might expect.
AI inference is fundamentally different from serving a webpage. Understanding why explains not just ChatGPT outages, but every AI service outage you will ever experience.
Why AI services are harder to keep online than normal websites
A normal website serves files. A CDN caches them. You add more servers. Scaling is well-understood.
An AI inference endpoint is different. Every request requires a large language model — often with billions of parameters — to run on specialized GPU hardware. GPUs are expensive, scarce, and cannot be spun up in seconds the way virtual machines can. When demand spikes unexpectedly, there is no quick way to add capacity. You either queue requests (causing slowdowns) or reject them (causing errors).
This is why ChatGPT shows "at capacity" messages. It is not a bug. It is the honest consequence of finite GPU infrastructure meeting unpredictable demand spikes — often caused by a viral tweet, a new product launch, or a school deadline.
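The queue-or-reject tradeoff described above can be sketched with a bounded queue. This is a toy model, not OpenAI's actual scheduler: the slot and depth numbers are invented for illustration.

```python
import queue

# Simulated GPU fleet: a fixed number of slots that inference requests
# must occupy. A bounded queue absorbs small spikes (slowdowns); anything
# beyond it is rejected outright (the "at capacity" error / HTTP 429).
GPU_SLOTS = 2      # requests served concurrently (illustrative number)
QUEUE_DEPTH = 3    # extra requests we are willing to hold

waiting = queue.Queue(maxsize=GPU_SLOTS + QUEUE_DEPTH)

def admit(request_id: str) -> str:
    """Admit a request if there is room; otherwise reject it."""
    try:
        waiting.put_nowait(request_id)
        return "queued"      # will be slow if the queue is deep
    except queue.Full:
        return "rejected"    # the user sees "at capacity"

results = [admit(f"req-{i}") for i in range(8)]
print(results)  # first 5 queued, remaining 3 rejected
```

The point of the sketch: capacity is fixed on the timescale of a traffic spike, so the only levers are queue depth (latency) and rejection (errors).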
The three types of ChatGPT outages
**Capacity overload** — The most common. Request volume exceeds GPU availability. New conversations queue, slow down, or get rejected. Usually resolves within 30–90 minutes as OpenAI scales up or demand drops.
**Model switchover disruptions** — When OpenAI deploys a new model version (GPT-4o, GPT-4 Turbo, etc.), the transition can cause brief instability as traffic is rerouted between model versions. These are usually short but can cause confusing errors where some requests succeed and others fail.
**Infrastructure failures** — Less common but more severe. A failure in Azure's GPU clusters (OpenAI runs on Microsoft Azure), a networking issue, or a database problem affecting conversation history. These can last hours and affect features unevenly — some users lose chat history while still being able to generate responses.
How to tell which type is happening
**Capacity issues:** You see "ChatGPT is at capacity" or very slow response times. API requests return 429 (Too Many Requests). Status page shows "High traffic." Resolution: wait 20–60 minutes or try at off-peak hours (typically 6–9 AM UTC).
**Model/deployment issues:** Some requests work, others fail. You may see one conversation succeed while another returns an error on the same account. Certain features (plugins, image generation, code interpreter) fail while basic chat works. Resolution: wait — usually resolves in under an hour.
**Infrastructure failures:** Nothing works. Login fails, existing conversations don't load, API returns 500 or 503. OpenAI's status page shows "Investigating" or "Identified." Resolution: could take 1–4 hours. Watch their status page.
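The triage above can be expressed as a small helper, paired with the polite way to retry a 429. Everything here is an assumption layered on the status codes just described: `classify_outage` and `backoff_delays` are hypothetical names, not part of any OpenAI SDK.

```python
import random

def classify_outage(status_code: int, partial_failures: bool = False) -> str:
    """Rough triage of an API error, following the symptoms above.

    `partial_failures` means some requests on the same account still
    succeed -- a hint of a model/deployment issue rather than a full outage.
    """
    if status_code == 429:
        return "capacity"          # too many requests: wait and retry
    if status_code in (500, 502, 503):
        return "deployment" if partial_failures else "infrastructure"
    return "unknown"

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0):
    """Exponential backoff with jitter, so retries don't worsen the spike."""
    for attempt in range(attempts):
        yield min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)

print(classify_outage(429))                          # capacity
print(classify_outage(503, partial_failures=True))   # deployment
print(classify_outage(503))                          # infrastructure
```

Jittered backoff matters during capacity incidents: if every client retries on the same schedule, the retries themselves arrive as a synchronized spike.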
The alternatives when ChatGPT is down
The healthy move is to have alternatives already in your workflow, rather than scrambling for them mid-outage.
**Claude (Anthropic)** — Runs on separate infrastructure. When ChatGPT is down, Claude is typically unaffected. claude.ai is the direct route.
**Gemini (Google)** — Google's infrastructure is separate from Microsoft/Azure. Gemini Advanced is a legitimate ChatGPT substitute for most tasks.
**Perplexity** — Runs on a mix of models but on its own infrastructure. Often available when OpenAI's endpoints are overloaded.
The key insight: diversify your AI tools. Depending on a single provider for critical work creates a single point of failure that you have no control over.
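That diversification can be wired directly into code as a failover chain. The provider functions below are stubs: in practice each would wrap the real SDK call (openai, anthropic, google-generativeai), and the names here are assumptions for illustration.

```python
from typing import Callable

def ask_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real code should catch provider-specific errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Demo with stubbed providers: the first is "down", the second answers.
def openai_stub(prompt: str) -> str:
    raise TimeoutError("503 Service Unavailable")

def claude_stub(prompt: str) -> str:
    return f"(Claude) answer to: {prompt}"

provider, answer = ask_with_fallback(
    "hello", [("openai", openai_stub), ("claude", claude_stub)]
)
print(provider, answer)
```

Because the providers run on separate infrastructure, as noted above, a correlated failure across the whole chain is far less likely than any single provider going down.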
When will ChatGPT come back?
The honest answer: check the live status on WebsiteDown. But based on historical patterns:
**Capacity issues:** 20–90 minutes
**Model deployment disruptions:** 30–120 minutes
**Infrastructure failures:** 1–6 hours (rarely longer)
OpenAI has gotten better at handling capacity issues since 2022. Full infrastructure failures affecting all users are now rare — but not impossible. The 2023 and 2024 outages that left users locked out of accounts for hours were genuine infrastructure incidents, not just traffic spikes.