
Probably raises $9M to build a reliability layer for LLMs, targeting 99.99% accuracy

  • by Keshav Aggarwal
  • 2026-06-16
Data science workstation with a monitor displaying a verified analytics dashboard and a small local AI processing device on a desk.

Large language models have become remarkably powerful, but their tendency to generate confident-sounding falsehoods — known as hallucinations — remains a persistent and costly problem. While the industry has experimented with various error-catching techniques, a new startup called Probably believes it has found a more rigorous solution. The company announced today that it has raised $9 million in seed funding from Andreessen Horowitz to bring its approach to market.

Building a ‘mech suit’ for data science

Probably’s first product is a data science tool designed to produce quick, verifiable answers from complex datasets. Each result comes with a citation and an audit trail, a practice that is becoming more common among AI-powered analytics tools. But the core innovation lies in what founder Peter Elias describes as a “data science mech suit” — an elaborate harness system that prevents errors from ever reaching the user.
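To make the idea of a cited, auditable result concrete, here is a minimal Python sketch. It is not Probably's implementation, which the company has not published. The names `AuditStep`, `CitedAnswer`, and `mean_with_citation` are hypothetical, and the dataset is a toy example invented for illustration.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass(frozen=True)
class AuditStep:
    """One recorded step in how an answer was produced (hypothetical schema)."""
    action: str
    detail: str


@dataclass
class CitedAnswer:
    """An answer plus the evidence needed to re-check it (hypothetical schema)."""
    value: float
    source_column: str
    source_rows: list[int]  # indices of the rows the value was computed from
    audit_trail: list[AuditStep] = field(default_factory=list)


def mean_with_citation(rows: list[dict], column: str) -> CitedAnswer:
    """Compute a column mean and record which rows and steps produced it."""
    # Keep only rows where the column has a value, remembering their indices.
    used = [i for i, row in enumerate(rows) if row.get(column) is not None]
    value = fmean(rows[i][column] for i in used)
    trail = [
        AuditStep("filter", f"kept {len(used)} of {len(rows)} rows with a non-null '{column}'"),
        AuditStep("aggregate", f"arithmetic mean of '{column}' over the kept rows"),
    ]
    return CitedAnswer(value, column, used, trail)


# Toy dataset, invented for illustration only.
sales = [
    {"region": "north", "revenue": 120.0},
    {"region": "south", "revenue": None},
    {"region": "east", "revenue": 80.0},
]
answer = mean_with_citation(sales, "revenue")
print(answer.value, answer.source_rows)
```

Because the answer carries the row indices and the steps taken, a reader or a second program can recompute the figure from the same rows instead of trusting it.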

The system works by having the LLM generate a first-pass answer, which is then checked against a deterministic validator. Any result that does not match the dataset is bounced back. Crucially, the LLM has been trained specifically to work with this validator, and the entire system is optimized for both speed and accuracy. Elias noted that this approach allows the system to run on significantly smaller AI models, reducing token costs substantially.
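The generate-then-validate loop described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not Probably's code. The functions `validate`, `answer_with_validation`, and `make_stub_model` are hypothetical, the "model" is a stand-in callable rather than a real LLM, and the validator simply recomputes a column total from the dataset.

```python
from typing import Callable, Optional

Dataset = list[dict]
# A "model" here is any callable that takes a question plus optional feedback
# from a rejected attempt and returns a proposed numeric answer.
Model = Callable[[str, Optional[str]], float]


def validate(dataset: Dataset, column: str, proposed: float) -> Optional[str]:
    """Deterministic check: recompute the column total and compare.

    Returns None when the proposal matches the data, else a feedback message.
    """
    expected = sum(row[column] for row in dataset)
    if abs(proposed - expected) < 1e-9:
        return None
    return f"total of '{column}' does not match the dataset; recompute"


def answer_with_validation(
    model: Model, dataset: Dataset, column: str, max_attempts: int = 3
) -> float:
    """Ask the model, bounce back any answer the validator rejects."""
    question = f"What is the total of {column}?"
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        proposed = model(question, feedback)
        feedback = validate(dataset, column, proposed)
        if feedback is None:
            return proposed  # only validated answers reach the caller
    raise ValueError("no proposal passed validation")


def make_stub_model(guesses: list[float]) -> Model:
    """Stand-in for an LLM: replays a fixed sequence of guesses."""
    remaining = iter(guesses)

    def stub(question: str, feedback: Optional[str]) -> float:
        return next(remaining)

    return stub


# Toy dataset, invented for illustration only.
orders = [{"amount": 10.0}, {"amount": 15.5}, {"amount": 4.5}]
model = make_stub_model([31.0, 30.0])  # first guess is wrong, second is right
result = answer_with_validation(model, orders, "amount")
print(result)
```

The design point is that the check is ordinary deterministic code, so a wrong first guess is rejected before the caller sees it, and the loop fails loudly instead of returning an unverified number.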

Smaller models, lower costs, higher accuracy

One of the most striking findings from Probably’s development process is that the quality of the harness engineering can compensate for the power of the underlying model. “What we learned building this was that the better your harness engineering is, the weaker the model can be,” Elias said. “If you can refine the context enough, the model does not have to work very hard to do the right thing. Basically, it’s an exercise in reducing ambiguity.”

This allows Probably’s tool to run on a model that is “four classes weaker than the frontier models,” meaning it can operate on local hardware — a desktop computer rather than a data center. This dramatically reduces token costs at a time when many enterprises are reassessing their AI budgets amid rising expenses.

Implications for precision-sensitive industries

While the initial product is focused on data science, Elias sees the same engine being extended to other “precision-sensitive use cases,” including accounting and medical services. The approach is notable because it does not require a more powerful LLM, but rather a more disciplined system around it. Elias pointed out that the largest AI labs have not pursued this path, suggesting that their business models may not incentivize reducing the number of corrections a user must make.

“I think it’s really interesting that the big AI labs have not even attempted to do this,” Elias said. “They’re incentivized not to, because they make money the more times you have to correct the model.”

Conclusion

Probably’s approach represents a shift in thinking about LLM reliability: instead of trying to build a perfect model, it focuses on building a perfect system around a good enough model. With $9 million in seed funding and a clear focus on verifiability, the company is positioning itself as a key player in the growing market for trustworthy enterprise AI. The challenge now will be proving that its deterministic validation layer can scale across different industries without introducing new bottlenecks.

FAQs

Q1: What does Probably’s product do?
It is a data science tool that uses a combination of an LLM and a deterministic validator to produce accurate, cited answers from complex datasets, with an audit trail for each result.

Q2: How does Probably reduce hallucinations?
It uses a deterministic validator that checks every LLM-generated answer against the original dataset and rejects any result that does not match. The LLM is trained to work with this validator, and the whole system is optimized for both speed and accuracy.

Q3: Why can Probably use smaller models?
Because the harness engineering around the model reduces ambiguity, allowing a less powerful model to produce accurate results. This also allows the system to run on local hardware, cutting token costs significantly.

Disclaimer: The information provided is not trading advice. Bitcoinworld.co.in holds no liability for any investments made based on the information provided on this page. We strongly recommend independent research and/or consultation with a qualified professional before making any investment decisions.

Tags: AI, Andreessen Horowitz, Funding, hallucinations, LLMs

Keshav Aggarwal

Co-Founder
Keshav Aggarwal is the Co-Founder & CEO of BitcoinWorld, a Google News-indexed publication covering crypto, AI, and forex markets since 2020. A blockchain investor and trader with over six years in the digital-asset space, he built one of India's most active crypto investor communities and has guided thousands of retail participants through their first investments in the asset class. At BitcoinWorld, he sets editorial direction across the newsroom and reports on the business of crypto, AI, and Web3, tracking the funding rounds, product launches, and regulatory shifts shaping the future of finance and frontier technology.