Anthropic's Claude Sonnet 4 model has received a significant upgrade, now boasting a one million token context window, a fivefold increase from its previous limit of 200,000 tokens. This enhancement allows the model to process substantially larger amounts of information in a single request, opening up new possibilities for developers and enterprises. The expanded context window is currently in public beta and accessible through the Anthropic API and Amazon Bedrock, with support for Google's Vertex AI coming soon.
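In practice, developers opt into the larger window by sending a beta flag with requests to the Messages API. A minimal sketch of the raw HTTP request follows; the `context-1m-2025-08-07` beta identifier and the `claude-sonnet-4-20250514` model ID are the values Anthropic published at launch, and should be verified against the current API documentation before use:

```python
import json

# Endpoint for Anthropic's Messages API.
API_URL = "https://api.anthropic.com/v1/messages"

def build_request(api_key: str, prompt: str, max_tokens: int = 1024) -> tuple[dict, str]:
    """Assemble headers and JSON body for a long-context request.

    The "anthropic-beta" header opts this request into the 1M-token
    context window while the feature is in public beta.
    """
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "context-1m-2025-08-07",  # enable the 1M window
        "content-type": "application/json",
    }
    body = json.dumps({
        "model": "claude-sonnet-4-20250514",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body
```

Sending this payload with any HTTP client (e.g. `requests.post(API_URL, headers=headers, data=body)`) makes a standard Messages API call; only the extra beta header distinguishes it from a regular 200K-context request.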
A context window of one million tokens is roughly equivalent to 750,000 words. This capacity enables Claude Sonnet 4 to reason over extensive datasets without requiring developers to employ more intricate techniques like retrieval-augmented generation (RAG). With this upgrade, the model can now evaluate larger codebases, synthesize comprehensive document sets, and construct AI agents capable of maintaining context across numerous tool interactions. For instance, Claude Sonnet 4 can process codebases with over 75,000 lines of code or dozens of research papers in one API request.
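The word-to-token figures above imply a rule of thumb of roughly 0.75 words per token. A small back-of-the-envelope helper makes the scale concrete; note the 0.75 ratio is derived from the article's 750,000-word approximation, not from an exact tokenizer:

```python
def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (approximation, not a tokenizer)."""
    return round(word_count / words_per_token)

def fits_in_context(word_count: int, context_window: int = 1_000_000) -> bool:
    """Check whether text of a given word count fits in the context window."""
    return estimate_tokens(word_count) <= context_window

# 750,000 words is about 1,000,000 tokens: exactly the new limit.
assert estimate_tokens(750_000) == 1_000_000

# The same material would overflow the previous 200,000-token window.
assert not fits_in_context(750_000, context_window=200_000)
```

Under this heuristic, the old 200,000-token window tops out around 150,000 words, which is why document sets of this size previously had to be chunked or routed through RAG.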
The extended context window facilitates more data-intensive projects, such as large-scale code analysis and document synthesis. It also supports the development of context-aware agents that need substantial material to manage complex workflows. According to Anthropic, the increased context window allows the model to thoroughly understand project architecture, identify dependencies across files, and propose improvements that consider the entire system design. Moreover, it enables the analysis of relationships within extensive document sets, like legal contracts or technical specifications, while preserving complete context.
Several companies are already experiencing the benefits of Claude Sonnet 4's enhanced capabilities. Bolt.new, a web development platform, utilizes Claude Sonnet 4 for code generation workflows, noting its superior performance compared to other leading models. iGent AI, a software development company, is using Claude Sonnet 4 with a 1M token context to power Maestro, an AI partner that transforms conversations into executable code.
However, the increased context window comes with a higher price. Prompts exceeding 200,000 tokens are billed at premium rates: input costs double from $3 to $6 per million tokens, and output costs rise 50%, from $15 to $22.50 per million tokens. Anthropic suggests using prompt caching to reduce both cost and latency, and notes that its batch processing mode can cut costs by a further 50%.
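Those rates can be turned into a quick cost estimator. The sketch below assumes Sonnet 4's standard rates of $3/$15 per million input/output tokens, that a request whose input exceeds the 200,000-token threshold is billed entirely at the premium rate, and that the batch discount applies as a flat 50%; Anthropic's pricing page remains the authoritative source for the exact billing rules:

```python
def request_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimated USD cost of one Claude Sonnet 4 request.

    Assumes the whole request is billed at premium rates once input
    exceeds 200K tokens, and models batch processing as a 50% discount.
    """
    if input_tokens > 200_000:
        in_rate, out_rate = 6.00, 22.50   # premium long-context rates ($/M tokens)
    else:
        in_rate, out_rate = 3.00, 15.00   # standard rates ($/M tokens)
    cost = (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate
    return cost * 0.5 if batch else cost
```

For example, a 500,000-token prompt with a 10,000-token response would cost about $3.23 at premium rates, or roughly half that via batch processing.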
This upgrade intensifies the competition in the AI coding market. While OpenAI and Google also provide million-token context windows, Anthropic asserts that Claude Sonnet 4 outperforms them. Anthropic has also matched OpenAI's pricing for government use, offering Claude to federal agencies for a nominal fee.
Despite the advantages, there are discussions about the effectiveness of large language models when dealing with extremely large context windows. While models generally perform well in "needle-in-a-haystack" tests, some researchers argue that this doesn't necessarily reflect how developers utilize context windows in practice.