Anthropic, the artificial intelligence (AI) startup behind the Claude chatbot, has created an AI model, called Claude Mythos Preview, so proficient at finding and exploiting software flaws that it is refusing to release the tool to the general public. To keep the technology out of the hands of hackers, the company is instead sharing it exclusively with a select group of major tech and finance corporations to help secure the internet’s underlying infrastructure.
Anthropic’s Project Glasswing
Fearing the severe consequences if Mythos Preview were to proliferate among bad actors, Anthropic has launched a new partnership dubbed Project Glasswing. Rather than a public rollout, Anthropic is granting exclusive access to 11 industry giants to help them find and patch flaws in their own systems. The partners include Apple, Google, JPMorgan Chase, Amazon Web Services (AWS), Microsoft, Nvidia, Cisco, Broadcom, CrowdStrike, Palo Alto Networks and The Linux Foundation.

“Today we’re announcing Project Glasswing, a new initiative that brings together Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks in an effort to secure the world’s most critical software,” the company said in a blog post.

To support the initiative, Anthropic is providing its partners with $100 million in usage credits to hunt for difficult-to-spot bugs, alongside $4 million in direct donations to open-source security organizations. The company views this as a starting point to ultimately build stronger, safer software globally.
‘Too powerful for the public’
In the announcement, Anthropic claimed that Claude Mythos Preview has achieved a level of coding capability that surpasses almost all highly skilled human programmers. The company also said that the AI has already uncovered thousands of high-severity vulnerabilities hidden inside every major operating system and web browser.

In one instance, Mythos Preview discovered a critical, 27-year-old vulnerability in OpenBSD, an operating system heavily relied upon for critical global infrastructure. The bug, which can allow attackers to remotely crash devices, somehow survived decades of human security reviews and millions of automated tests.

“It also discovered a 16-year-old vulnerability in FFmpeg—which is used by innumerable pieces of software to encode and decode video—in a line of code that automated testing tools had hit five million times without ever catching the problem,” the company added.

“The model autonomously found and chained together several vulnerabilities in the Linux kernel—the software that runs most of the world’s servers—to allow an attacker to escalate from ordinary user access to complete control of the machine,” Anthropic said.

Jack Lindsey, a neuroscientist at the company, revealed that early versions of the model exhibited highly sophisticated, unspoken strategic thinking, sometimes even hiding its reasoning or displaying situational awareness in service of “unwanted actions”.
