A New Claude 3.5 Sonnet and Claude 3.5 Haiku
Generative AI
Boonyawee Sirimaya
2
min read
October 25, 2024

A New Claude 3.5 Sonnet and Claude 3.5 Haiku

In a significant update, the latest iteration of Claude 3.5 introduces two exciting advancements: an upgraded version of Claude 3.5 Sonnet and the all-new Claude 3.5 Haiku. These new models are designed to bring a host of improvements, particularly in coding, making them even more effective in delivering AI-powered solutions. Alongside these, Anthropic has also launched a groundbreaking feature in public beta—computer use—allowing Claude to navigate and interact with computers in ways similar to humans.

Enhanced Claude 3.5 Sonnet: Boosting AI Capabilities

Claude 3.5 Sonnet builds on its predecessor with enhanced performance, particularly excelling in coding tasks. It has shown notable improvements on several industry benchmarks, establishing itself as a leading AI model for complex, multi-step tasks. One area of focus is agentic coding, where it has achieved a substantial performance increase, boosting its score on SWE-bench Verified from 33.4% to 49.0%. These advancements position it ahead of competing models like OpenAI’s o1-preview and specialized agentic coding systems.

The updated Sonnet also excels in agentic tool-use benchmarks, significantly improving in the retail and airline sectors. Despite these upgrades, the model remains priced the same and delivers results with no added latency. Early feedback from clients such as GitLab and Cognition highlights its strengths in planning, problem-solving, and coding, positioning it as a top choice for developers looking for advanced AI tools.

Comparison table showing performance metrics across AI models including Claude 3.5 Sonnet (new), Claude 3.5 Haiku, and other models, with scores for various tasks like reasoning, coding, and math.
Performance comparisons across AI models.

Claude 3.5 Haiku: Affordability Meets High Performance

Anthropic has also unveiled Claude 3.5 Haiku, a new model that rivals its predecessor, Claude 3 Opus, on many intelligence benchmarks, while maintaining similar speed and cost. It shines in coding, scoring 40.6% on SWE-bench Verified, surpassing previous generations and even outperforming the original Claude 3.5 Sonnet in some aspects. With its low latency and enhanced instruction-following abilities, Claude 3.5 Haiku is an excellent fit for customer-facing applications and tasks that require managing large data sets, such as inventory tracking or personalized user experiences.

Claude 3.5 Haiku will be available later this month via Anthropic’s API and platforms like Amazon Bedrock and Google Cloud’s Vertex AI.

A New Frontier: Computer Use in Public Beta

Anthropic is breaking new ground by teaching Claude how to use computers like a human would—navigating interfaces, moving cursors, and typing text. This capability is still in its early stages and can be error-prone at times, but the potential for automating repetitive tasks or performing complex workflows is immense.

Asana, Canva, DoorDash, and other companies are already experimenting with this feature, using Claude 3.5 Sonnet’s computer navigation to perform multi-step processes, such as evaluating apps or managing web-based workflows. On OSWorld's AI benchmark for computer navigation, Claude 3.5 Sonnet outperformed other models, showcasing its promise in this innovative space.

While the current capabilities may present some limitations—such as challenges with actions like scrolling or zooming—Anthropic is focused on refining the model's abilities. Developers are encouraged to experiment with low-risk tasks as feedback is gathered to enhance the feature further. Safety remains a priority, and steps are in place to prevent misuse, such as fraud or misinformation.

Looking Forward

As these innovations evolve, the potential for AI models to perform more intricate tasks will continue to expand. The latest developments with Claude 3.5 Sonnet, Haiku, and computer use will undoubtedly provide new ways for businesses to streamline operations, automate workflows, and drive productivity.

This is just the beginning for Claude’s capabilities, and Anthropic is eager to learn from early users to refine and enhance its features. Be sure to explore these exciting new tools and share your insights to shape the future of AI-powered solutions.

Consult with our experts at Amity Solutions for additional information here