Artificial intelligence and machine learning breakthroughs, models, and research.
26 articles

OpenAI released GPT-5.5 on Thursday, just six weeks after GPT-5.4. Codenamed "Spud" internally, it's reportedly the company's first fully retrained LLM since GPT-4.5, and the pitch is agentic work — hand it a messy, multi-step task and trust it to finish. On Terminal-Bench 2.0 it scores 82.7%, a clear jump over GPT-5.4's 75.1% and well ahead of Claude Opus 4.7 at 69.4%.



Anthropic's new flagship AI model ships today with better software engineering, a new xhigh effort tier, and unchanged pricing.

Anthropic's newest flagship Claude model ships with better agentic coding, a new effort tier, and a visible shadow from the unreleased Mythos Preview.

Alibaba's HappyHorse-1.0 debuted anonymously, shot to the top of global video AI benchmarks, and was confirmed as Alibaba's best-performing video model to date.

From Anthropic's Claude Mythos to OpenAI's Codex Security, AI agents are discovering critical vulnerabilities faster and at a scale no human team can match.


Anthropic's Claude Code now ships a native /loop command that schedules recurring prompts on a cron-based interval, giving developers a built-in autonomous monitoring layer without any external tooling.

Anthropic's new $100M cybersecurity alliance gives vetted partners exclusive access to Claude Mythos Preview, an AI model that can find and exploit software vulnerabilities better than most humans.

A new Epoch AI and Ipsos survey of 2,000 U.S. adults finds AI is actively displacing work tasks for 20% of full-time employees, with replacement outpacing new job creation.

The two tech giants are aligning their hardware and software roadmaps to challenge Nvidia's dominance in the data center.