Poetiq's Small Team and Smart System Outperform Tech Giants on Key AI Test

A startup with only six employees has outperformed the research teams of Google and Anthropic on a demanding test of machine reasoning, achieving the result with a hardware budget of just $40,000. Poetiq, founded in June 2025 by former Google DeepMind leads Shumeet Baluja and Ian Fischer, made its public debut alongside a $45.8 million seed funding round.

The company’s method does not involve building new large language models. Instead, it creates a meta-system that sits on top of existing models like GPT, Gemini, Claude, and Llama. This system uses a process of recursive self-improvement to generate specialized expert agents for complex problems, requiring only a few hundred examples from a client rather than thousands.

Its success was measured on the ARC-AGI-2 benchmark, a test designed by François Chollet to evaluate abstract reasoning and generalization, two areas where large language models typically struggle. Using Google's Gemini 3 Pro, Poetiq's system achieved 54% accuracy on a semi-private evaluation set, surpassing a more expensive specialized version of Gemini. Later, after integrating a new OpenAI model, the system reached 75% accuracy on a public set, setting a new record.

"Large language models are incredible repositories of knowledge, but they aren't inherently built for deep reasoning," Baluja explained to Pulse 2.0. His company's approach uses iterative loops of generation, critique, and verification, averaging fewer than two queries to the underlying model to solve a problem. This efficiency stands in sharp contrast to the massive computational demands of conventional AI training.

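To make the idea of a generation, critique, and verification loop concrete, here is a minimal Python sketch. It is not Poetiq's implementation, which the company has not described at this level of detail. The names `stub_model`, `critique`, `solve`, and the `CANDIDATES` pool are hypothetical, and the "model" is a stand-in that simply proposes the next untried transformation instead of calling a real LLM.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[int]
Example = Tuple[Grid, Grid]                  # (input, expected output)
Candidate = Tuple[str, Callable[[Grid], Grid]]

# Hypothetical candidate pool standing in for what a real model might propose.
CANDIDATES: List[Candidate] = [
    ("identity", lambda xs: list(xs)),
    ("reverse", lambda xs: list(reversed(xs))),
    ("sort", lambda xs: sorted(xs)),
]


def stub_model(rejected: List[str]) -> Optional[Candidate]:
    """Stand-in for one LLM query: return the first candidate not yet rejected."""
    for name, fn in CANDIDATES:
        if name not in rejected:
            return name, fn
    return None


def critique(fn: Callable[[Grid], Grid], examples: List[Example]) -> List[str]:
    """Verification step: run the candidate on every example and list mismatches."""
    notes = []
    for given, expected in examples:
        got = fn(given)
        if got != expected:
            notes.append(f"{given} -> {got}, expected {expected}")
    return notes


def solve(examples: List[Example], max_queries: int = 5) -> Tuple[Optional[str], int]:
    """Generate, critique, and retry until a candidate passes or the budget runs out.

    Returns (name of the accepted candidate or None, number of model queries used).
    """
    rejected: List[str] = []
    for queries in range(1, max_queries + 1):
        proposal = stub_model(rejected)
        if proposal is None:          # the stand-in model has nothing left to offer
            return None, queries
        name, fn = proposal
        if not critique(fn, examples):  # empty critique means every example passed
            return name, queries
        rejected.append(name)
    return None, max_queries
```

For the examples `[([3, 1, 2], [2, 1, 3]), ([1, 2], [2, 1])]`, `solve` rejects the identity candidate and accepts `reverse` on the second query, returning `("reverse", 2)`. The design point the sketch illustrates is that candidates are checked against known examples before being accepted, so the number of model queries stays small when an early proposal already passes.
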
The substantial seed round was co-led by FYRFLY Venture Partners and Surface Ventures, with participation from Y Combinator and others. Investors are betting on Poetiq's ability to bring reliable, cost-effective reasoning to enterprise applications like fraud detection and claims processing, addressing a noted gap in current AI pilot projects. The company, headquartered in Miami, is now positioned to expand its work with business and research teams.

Source: Webpronews
