
Anthropic has unveiled its latest generation of AI models, the Claude 4 series, comprising the flagship Claude Opus 4 and the performance-optimized Claude Sonnet 4. Both models excel at programming and at sustained execution of complex tasks, and Anthropic positions them as industry-leading AI assistants designed to rival OpenAI’s ChatGPT and Google’s Gemini.
Claude Opus 4 is Anthropic’s most powerful model to date and is particularly strong at software engineering. According to the company’s official blog, Opus 4 scored 72.5% on SWE-bench and 43.2% on Terminal-bench, outperforming its predecessors as well as Google’s Gemini 2.5 Pro.
A distinctive advantage of Opus 4 is its “Extended Thinking” capability, which lets the model pause during an intricate task, retrieve additional data from search engines or external tools, and then resume where it left off. This allows Opus 4 to handle elaborate workflows that require thousands of steps over several hours. Examples range from code debugging and problem decomposition to running vintage video games such as Pokémon Red by accessing files and navigating custom guides.
Claude Sonnet 4 is a more compact model, but it delivers a substantial performance leap over its predecessor, Sonnet 3.7, particularly in instruction following and coding tasks. Anthropic said that Sonnet 4 powers GitHub’s next-generation Copilot coding assistant. As the default model for Claude’s free-tier chatbot, Sonnet 4 is also well placed for widespread adoption.
Both models also share several new capabilities:
- Parallel Tool Utilization: Both Opus 4 and Sonnet 4 can use multiple tools at the same time, switching fluidly between reasoning and search to improve efficiency.
- Memory System: By accessing external files, the models can store and retrieve key information, sparing users from repeating the same inputs.
- Thought Summarization: Rather than exposing a verbose step-by-step trace, Claude 4 uses an auxiliary AI model to generate concise “thought summaries” that distill thousands of task steps into digestible overviews, making the model’s decision-making process easier for users to follow.
Anthropic also noted significant algorithmic improvements in Claude 4 that mitigate tendencies to “shortcut” tasks or generate fabricated answers, thereby enhancing output reliability and transparency.
Pricing for the two models is as follows:
- Claude Sonnet 4: Balances performance and cost at $3 per million input tokens and $15 per million output tokens, making it a good fit for developers and general users.
- Claude Opus 4: As the premium-tier model, it costs more ($15 per million input tokens and $75 per million output tokens), but its capacity for complex tasks makes it a strong choice for professionals and enterprise clients.
Both models offer a 50% discount for batch processing, further reducing the cost for large-scale deployments.
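To make these rates concrete, here is a minimal sketch of a cost estimator built only from the prices and the 50% batch discount quoted above. The model labels `sonnet-4` and `opus-4`, the `Pricing` class, and the `estimate_cost` function are hypothetical names chosen for illustration; they are not API model identifiers or part of any Anthropic SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per million tokens."""
    input_per_mtok: float
    output_per_mtok: float


# Rates as quoted in the article. The dictionary keys are illustrative
# labels (an assumption), not official model identifiers.
PRICES = {
    "sonnet-4": Pricing(input_per_mtok=3.0, output_per_mtok=15.0),
    "opus-4": Pricing(input_per_mtok=15.0, output_per_mtok=75.0),
}

# The article states a 50% discount for batch processing.
BATCH_DISCOUNT = 0.5


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Return the estimated cost in US dollars for one workload."""
    price = PRICES[model]
    cost = (input_tokens / 1_000_000) * price.input_per_mtok
    cost += (output_tokens / 1_000_000) * price.output_per_mtok
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    for name in PRICES:
        standard = estimate_cost(name, 2_000_000, 500_000)
        batched = estimate_cost(name, 2_000_000, 500_000, batch=True)
        print(f"{name}: ${standard:.2f} standard, ${batched:.2f} batch")
```

For a workload of 2 million input tokens and 500,000 output tokens, this works out to $13.50 on Sonnet 4 and $67.50 on Opus 4, or $33.75 on Opus 4 with the batch discount applied.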
Anthropic’s pricing reflects a strategy of attracting a wide range of users, from individual developers to large enterprises: Sonnet 4 is available on the free tier, while Opus 4 is included in the paid Claude Pro, Max, Team, and Enterprise plans.
Despite Claude 4’s strong performance in programming and extended task execution, its context window remains capped at 200K tokens. That is well short of Google Gemini 2.5 Pro’s 1 million tokens (with plans to support 2 million) and OpenAI’s GPT-4.1, which also supports 1 million tokens. This limitation may pose challenges for very large projects, particularly those involving vast codebases or lengthy documents.
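As a rough illustration of what these limits mean in practice, the sketch below checks which of the context windows cited above could hold a given prompt. The window sizes come from the article; the four-characters-per-token ratio is only a common rule of thumb for English text (an assumption, since real tokenizers vary), and the model labels and function names are hypothetical.

```python
# Context window sizes in tokens, as cited in the article.
# Keys are illustrative labels (an assumption), not API model identifiers.
CONTEXT_WINDOWS = {
    "claude-4": 200_000,
    "gemini-2.5-pro": 1_000_000,
    "gpt-4.1": 1_000_000,
}

# Assumption: roughly 4 characters per token for English text.
# Real tokenizers differ, so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def models_that_fit(prompt_tokens: int,
                    reserved_output_tokens: int = 0) -> list[str]:
    """List the models whose window holds the prompt plus reserved output."""
    needed = prompt_tokens + reserved_output_tokens
    return [name for name, window in CONTEXT_WINDOWS.items()
            if needed <= window]


if __name__ == "__main__":
    # A hypothetical 1.2 million character codebase dump.
    tokens = estimate_tokens("x" * 1_200_000)
    print(tokens, models_that_fit(tokens, reserved_output_tokens=8_000))
```

Under these assumptions, a 1.2 million character codebase comes to about 300,000 estimated tokens, which exceeds the 200K window and would need to be split or summarized before being sent to Claude 4.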