The researchers behind Gray Swan AI started the company after finding a major vulnerability in models from OpenAI, Anthropic, Google and Meta. Now, they build products that help safeguard those models.
The breakneck pace at which AI is evolving has created a vast ecosystem of new companies — some creating ever more powerful models, others identifying the threats that may accompany them. Gray Swan is among the latter but takes it a step further by building safety and security measures for some of the issues it identifies. “We can actually provide the mechanisms by which you remove those risks or at least mitigate them,” Kolter said.
Looking forward, Gray Swan is keen on cultivating a community of hackers, and it’s not alone. At last year’s Defcon security conference, more than 2,000 people participated in an AI red-teaming challenge. Major AI companies often enlist internal and external red teamers to assess new models, and several have announced official bug bounty programs that reward sleuths for exposing exploits in high-risk domains such as CBRN (chemical, biological, radiological and nuclear) threats. Outside efforts like these, including the discovery of a vulnerability in Anthropic’s Claude 3.5 Sonnet, are also valuable resources for model developers.