Claude Sonnet 4 has been upgraded and now supports a context window of up to 1 million tokens, but only when it is used via the API. This could change in the future.
This is 5x the previous limit of 200,000 tokens. It also means that Claude can now keep more than 75,000 lines of code, or even hundreds of documents, in context in a single session.
Previously, you had to submit details to Claude in small chunks, which meant Claude would lose earlier context once it hit the limit. With a context window of up to 1 million tokens, you can build better apps, and Claude can keep more of your code in context than ever.
It is worth noting that the 1 million token context window is limited to Sonnet 4. Opus 4.1 still has the old limit, likely because it’s a more expensive model.
Only the API gets the 1 million token context limit
The new context limit is rolling out via the Anthropic API for customers with Tier 4 and custom rate limits, with broader availability to follow over the coming weeks.
“Long context is also available in Amazon Bedrock and is coming soon to Google Cloud’s Vertex AI,” Anthropic noted.
“With 1M tokens you can: load entire codebases with all dependencies, analyze hundreds of documents at once, and build agents that maintain context across hundreds of tool calls. Pricing adjusts for prompts over 200K tokens, but prompt caching can reduce costs and latency.”
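To make the thresholds concrete, here is a minimal Python sketch that estimates whether a body of text would fit in the standard or the long context window. The 4-characters-per-token ratio is a rough assumption rather than an Anthropic figure, and the function names are hypothetical; an accurate count would need the model's actual tokenizer.

```python
# Rough sketch: estimate which context tier a prompt would fall into.
# CHARS_PER_TOKEN is an assumed heuristic, not an Anthropic-published value.

CHARS_PER_TOKEN = 4              # assumption: roughly 4 characters per token
STANDARD_LIMIT = 200_000         # previous limit, and the pricing threshold in the quote
LONG_CONTEXT_LIMIT = 1_000_000   # new Sonnet 4 limit via the API


def estimate_tokens(text: str) -> int:
    """Return a crude token estimate based on character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def classify_prompt(token_count: int) -> str:
    """Label a prompt size as 'standard', 'long-context', or 'too-large'."""
    if token_count <= STANDARD_LIMIT:
        return "standard"
    if token_count <= LONG_CONTEXT_LIMIT:
        return "long-context"
    return "too-large"


if __name__ == "__main__":
    # Hypothetical example: a codebase totalling 3,000,000 characters.
    tokens = estimate_tokens("x" * 3_000_000)
    print(tokens, classify_prompt(tokens))
```

Under these assumptions, a 3,000,000-character codebase comes out at about 750,000 tokens, which would need the long context window and would fall under the adjusted pricing for prompts over 200K tokens.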
Claude’s mobile and web apps will be getting the 1 million token context limit at some point in the future.




