Cleanlab founders Curtis Northcutt, Anish Athalye and Jonas Mueller are hoping to solve the data problem of "garbage in, garbage out." The startup, based on a popular open-source project for fixing data problems in AI models, now counts cloud heavyweight Databricks as an investor and partner.
“The reality is that every single solution that’s data-driven — and the world has never been more data-driven — is going to be affected by the quality of the data,” said Northcutt, who ran into the problem in stints at Amazon, Google, Meta and Microsoft. “It was ridiculous that there was no solution for this, no company filling the gap.”
Cleanlab is a young startup, but its underpinnings date back to 2013, when Northcutt — the son of three generations of mailmen in rural Kentucky — graduated from Vanderbilt and began a PhD program in computer science at MIT. While there, he built a cheating detection system for validating online course certificates used by the university and Harvard.
While teams at big companies like Chase and Tesla have used the open-source version, cleanlab, for years, Cleanlab’s paying customers are much newer. One tech giant, which Northcutt said he couldn’t name, is already paying $600,000 per year to improve the data behind both its core product analytics and its AI models, the CEO claimed.