Image Source: https://cdn.arstechnica.net/wp-content/uploads/2025/09/sonnet45-1152×648.jpg
On September 29, 2025, Anthropic announced the release of its highly anticipated AI model, Claude Sonnet 4.5, which boasts the remarkable capability to maintain focus for over 30 continuous hours on complex multi-step tasks. This groundbreaking advancement positions Claude Sonnet 4.5 as the company’s most capable model to date, specifically enhancing coding and cognitive operations.
The new model is complemented by the debut of Claude Code 2.0, a command-line AI agent catering specifically to developers, and the Claude Agent SDK, which allows developers to create their own AI coding agents. Anthropic’s claims suggest significant progress in AI reliability, as previous models typically struggled with coherence over extended periods, often losing track of context and context windows during lengthy tasks.
Unmatched Performance on Coding Benchmarks
Anthropic asserts that Claude Sonnet 4.5 stands out as “the best coding model in the world,” powered by its innovative design tailored for building complex AI agents. In head-to-head comparisons, Claude Sonnet 4.5 has surpassed notable competitors such as OpenAI and Google. Notably, the model achieved a score of 77.2 percent on the SWE-bench Verified benchmark, which evaluates real-world coding abilities. It also led the OSWorld benchmark with an impressive score of 61.4 percent, outperforming OpenAI’s GPT-5 Codex at 74.5 percent and Google’s Gemini 2.5 Pro at 67.2 percent.
Beyond coding, Claude Sonnet 4.5 delivered strong performances in various assessments, including the AIME 2024 math competition benchmark and the MMMLU, which evaluates knowledge across multiple languages. For financial tasks, it scored a remarkable 92 percent on Vals AI’s Finance Agent benchmark, showcasing its adaptability and versatility in diverse domains.
Enhanced User Experience and Features
The Claude Sonnet 4.5 also features improved functionalities, especially in computer usage, where it increased its score from 42.2 percent in its predecessor to 61.4 percent. This enhancement signifies a notable leap in its ability to navigate and perform tasks related to software.
The AI model has integrated features that allow users to execute code and create files directly within conversations, significantly streamlining workflow and productivity. Users can generate complex spreadsheets, slides, and documents without switching interfaces, making it a robust tool for both developers and business professionals alike.
Addressing Concerns with AI Behavior
In a landscape riddled with challenges regarding AI interactions, Anthropic aims to alleviate concerns surrounding AI outputs. The company reports that Claude Sonnet 4.5 shows reduced tendencies for âsycophancy, deception, power-seeking, and encouraging delusional thinking,â compared to earlier models. These improvements are crucial as users increasingly rely on AI chatbots for a multitude of tasks beyond mere coding assistance.
Future Outlook and Model Usage
Simon Willison, an established software developer, expressed his favorable initial assessment of Claude Sonnet 4.5, suggesting it feels superior to previous models like GPT-5-Codex for coding tasks. With the rapid pace at which AI technology evolves, it’s clear that the quest for the best coding model remains competitive, especially with the impending introduction of Gemini 3.
Developers can access Claude Sonnet 4.5 through the existing API at the same pricing as its predecessor, maintaining affordability for users at three dollars per million input tokens and fifteen dollars per million output tokens.
Conclusion: A Notable Step Forward in AI Development
While itâs important to remain cautious about AI benchmarksâgiven their potential for bias and inaccuraciesâSonnet 4.5 represents a significant advancement within the Claude lineup. Its ability to maintain a longer focus on intensive tasks is commendable and may redefine user expectations in AI assistants.
Frequently Asked Questions
What is Claude Sonnet 4.5?
Claude Sonnet 4.5 is Anthropic’s latest AI language model designed to improve coding capabilities and sustain focus on complex tasks for extended periods.
How does Claude Sonnet 4.5 perform compared to previous models?
Comparatively, Claude Sonnet 4.5 shows significant enhancements, maintaining focus for over 30 hours and outperforming predecessor models in various coding benchmarks.
What features does the new AI model offer developers?
Developers can benefit from features like direct code execution, file creation, and a command-line AI agent that streamlines the development process.
Is Claude Sonnet 4.5 available for public use?
Yes, Claude Sonnet 4.5 is available to developers through the Claude API, maintaining the same pricing structure as its predecessor.
What improvements have been made to user interactions?
Anthropic’s latest model focuses on reducing undesirable AI behaviors like sycophancy, resulting in more reliable and contextually appropriate responses.