Probably raises $9 million to build a more reliable type of AI

As LLMs have become more powerful, hallucinations have proven stubbornly difficult to avoid. Bugs appear in even the smartest models, and while there are ways to detect them, the industry is still figuring out the best way to do so.

Probably, which just raised $9 million in seed funding from Andreessen Horowitz, is trying to create a more rigorous way to detect those errors.

As founder Peter Elias puts it, the company’s goal is to prevent hallucinations and simple factual errors from reaching the user, and to achieve the kind of 99.99% accuracy that is common in deterministic systems but much more difficult to achieve with AI. It turns out that bringing LLMs to that level of precision requires rethinking many of the basic assumptions of AI engineering.

Probably’s first product is a data science tool, built to produce quick answers from complex data sets. Each result comes with a citation and an audit trail of how it was developed, an increasingly common practice among AI tools.

But preventing errors from creeping into those summaries required an elaborate harness system that Elias describes as a “data science mechanical suit.” First-pass responses from the LLM are compared to a deterministic validation system, which returns any results that do not match the data set. Crucially, the LLM has been trained against the validator and the entire system is optimized for fast and accurate responses, the company said.
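The validation step described above can be illustrated with a minimal sketch. This is not Probably's implementation, which the company has not published. It assumes a hypothetical claim format in which the model asserts the value of a simple aggregate over one column, and every name here (Claim, validate, split_claims) is invented for illustration. The point is only that a deterministic check recomputes each figure from the data and returns whatever does not match.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable

# Hypothetical claim format (an assumption, not Probably's schema):
# the model asserts that an aggregate over one column has a given value.
@dataclass(frozen=True)
class Claim:
    column: str
    aggregate: str  # one of the keys in AGGREGATES
    value: float

# Deterministic functions the validator is allowed to recompute.
AGGREGATES: dict[str, Callable[[list[float]], float]] = {
    "mean": mean,
    "sum": sum,
    "min": min,
    "max": max,
    "count": len,
}

def validate(claim: Claim, rows: list[dict[str, float]], tolerance: float = 1e-9) -> bool:
    """Recompute the claimed figure from the data and compare it to the claim."""
    aggregate = AGGREGATES.get(claim.aggregate)
    if aggregate is None:
        return False  # unknown operation: cannot be verified, so reject
    values = [row[claim.column] for row in rows if claim.column in row]
    if not values:
        return False  # unknown or empty column: reject
    return abs(aggregate(values) - claim.value) <= tolerance

def split_claims(claims: list[Claim], rows: list[dict[str, float]]) -> tuple[list[Claim], list[Claim]]:
    """Separate claims that match the data from those to send back to the model."""
    accepted: list[Claim] = []
    rejected: list[Claim] = []
    for claim in claims:
        (accepted if validate(claim, rows) else rejected).append(claim)
    return accepted, rejected

# Example: one correct claim and one that does not match the data set.
rows = [{"revenue": 100.0}, {"revenue": 200.0}, {"revenue": 300.0}]
claims = [Claim("revenue", "mean", 200.0), Claim("revenue", "sum", 650.0)]
accepted, rejected = split_claims(claims, rows)
```

In this sketch the rejected list is what would go back to the model for another attempt, so only figures that match the data would reach the user.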

“What we learned when building this was that the better the engineering of the harness, the weaker the model can be,” Elias says. “If you can refine the context enough, the model doesn’t have to work hard to get it right. Basically, it’s an exercise in reducing ambiguity.”

That allows Probably’s data science tool to run on significantly smaller AI models. Elias says the current version runs on a model that is “four classes weaker than frontier models,” meaning it can run on local hardware (i.e., a desktop computer instead of a data center), which eliminates much of the token cost associated with using AI.

It’s a welcome idea at a time when token costs are rising and many customers are reevaluating their AI budgets. And Elias’ idea does not end with data science: the same engine can be extended to cover use cases such as accounting or medical services or, as Elias says, “any precision-sensitive use case.”

“I think it’s really interesting that the big AI labs haven’t even tried to do this,” Elias says. “They are incentivized not to do it, because they make money the more times the model has to be corrected.”
