Beginning in early August 2025, users reported a noticeable decline in the response quality of Claude AI. By the end of the month, the issue had grown significantly worse, eventually drawing Anthropic's attention and prompting an investigation. The company has now published a detailed blog post explaining the causes behind these problems.
In the post, Anthropic repeatedly emphasized that this was not a case of intentionally degrading performance. The company stated unequivocally that it would never reduce model quality in response to demand, time constraints, or server load. Instead, the problems stemmed entirely from bugs in its AI infrastructure rather than from deliberate intervention.
Issue 1: Context Window Routing Error
On August 5, 2025, approximately 0.8% of Claude Sonnet 4 requests were mistakenly routed to servers configured for an upcoming 1M-token context window.
On August 29, an unexpected change to the load balancer caused a greater share of requests to be misrouted to these servers. By August 31, the problem had worsened significantly, with nearly 16% of requests affected, leading to a clear decline in response quality.
Because Claude AI servers use sticky routing, once a user’s request was incorrectly handled by the misconfigured server, subsequent requests were likely to be routed there as well. This meant that the most affected users consistently experienced the sharpest drop in quality.
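Sticky routing typically works by hashing a stable session identifier to pick a server, so the same user deterministically lands on the same machine. The sketch below is a hypothetical illustration of that mechanism (the server names and `route` function are invented for this example, not Anthropic's actual implementation); it shows why one misconfigured server in the pool would repeatedly affect the same users.

```python
import hashlib

def route(session_id: str, servers: list[str]) -> str:
    """Session-affinity ("sticky") routing: hash the session id and
    map it to a fixed server in the pool."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# Hypothetical pool in which one server carries a bad configuration.
pool = ["server-a", "server-b", "server-c-misconfigured"]

# The same session always resolves to the same server, so a user who
# lands on the misconfigured one keeps hitting it on every request.
first = route("user-42", pool)
assert all(route("user-42", pool) == first for _ in range(100))
```

This determinism is exactly why the quality drop was concentrated: affected users saw it consistently, while most users never saw it at all.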
Anthropic deployed a fix on September 4 to correct the routing logic, ensuring that future requests would be assigned to the proper servers.
Issue 2: Output Corruption
Beginning August 25, a misconfigured update was deployed on Claude API TPU servers. This error disrupted token generation, producing anomalies such as Chinese or Thai characters in English responses, or obvious syntax errors in code generation.
The issue affected Claude Opus 4/4.1 requests between August 25 and 28, and Claude Sonnet 4 requests between August 25 and September 2. Third-party platforms were not impacted.
On September 2, Anthropic rolled back the faulty change to resolve the issue. Additionally, the company introduced new deployment tests designed to detect abnormal character output, helping to prevent a recurrence.
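A deployment test of the kind described might scan responses to English prompts for characters from unexpected scripts. The sketch below is an assumed illustration of that idea (the function name, script list, and approach are this article's invention, not Anthropic's published test), using Unicode character names to flag stray CJK or Thai output.

```python
import unicodedata

# Scripts we would not expect in a response to an English-language prompt.
UNEXPECTED_SCRIPTS = ("CJK", "THAI", "HANGUL", "HIRAGANA", "KATAKANA")

def has_unexpected_script(text: str) -> bool:
    """Return True if any character's Unicode name indicates a script
    that should not appear in English output."""
    for ch in text:
        if ch.isascii():
            continue
        name = unicodedata.name(ch, "")
        if any(script in name for script in UNEXPECTED_SCRIPTS):
            return True
    return False

assert not has_unexpected_script("A plain English sentence.")
assert has_unexpected_script("An answer with 汉字 mixed in.")
```

Checks like this are cheap to run against sample generations after each deployment, catching this class of corruption before it reaches users.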
Issue 3: XLA:TPU Miscompilation
On August 25, Anthropic deployed a code improvement designed to enhance Claude’s token selection during text generation. However, the change inadvertently triggered a latent bug in the XLA:TPU compiler, affecting Claude Haiku 3.5 requests.
Anthropic noted that this issue may also have impacted a subset of Claude API requests using Sonnet 4 and Opus 3, though third-party platforms were unaffected.
The solution was to roll back the optimization, which resolved the issue for Haiku 3.5. Following additional user reports, Anthropic also reverted changes for Opus 3, and, out of caution, for Sonnet 4—even though Sonnet 4 had not been directly affected.