AI models are terrible at betting on football, especially xAI's Grok



“All of the frontier models we evaluated lost money during the season and many experienced ruin,” the paper’s authors concluded, adding that AI “systematically underperforms humans” in this scenario.

| AI model | Average return on investment | Best attempt | Worst attempt | Average final funds |
| --- | --- | --- | --- | --- |
| Anthropic Claude Opus 4.6 | –11.0% | –0.2% | –18.8% | £89,035 |
| OpenAI GPT-5.4 | –13.6% | –4.1% | –31.6% | £86,365 |
| Google Gemini 3.1 Pro | –43.3% | +33.7% | –100.0% | £56,715 |
| Google Gemini Flash 3.1 LP | –58.4% | +24.7% | –100.0% | £41,605 |
| Z.AI GLM-5 | –58.8% | –14.3% | –100.0% | £41,221 |
| Moonshot Kimi K2.5 | –68.3% | –27.0% | –100.0% | £7,420 |
| xAI Grok 4.20 | –100.0% | –100.0% | –100.0% | £0 |
| Arcee Trinity | –100.0% | –100.0% | –100.0% | £0 |

Each model started with a bankroll of £100,000. Return on investment and final funds are averaged over three attempts. Grok and Trinity did not complete all attempts.

The results offer some comfort to white-collar professionals and companies who fear AI could take their jobs, a fear that has hit stocks in industries ranging from finance to marketing.

Ross Taylor, one of the study’s authors and CEO of General Reasoning, said: “There’s a lot of hype about automating AI, but not much action to put AI on a long-term horizon.”

He added that many of the benchmarks typically used to test AI are flawed because they are set in “very static environments” that bear little resemblance to the chaos and complexity of the real world.

The General Reasoning paper, which has not yet been peer-reviewed, provides a counterweight to growing enthusiasm in Silicon Valley over recent enormous advances in AI’s ability to complete computer programming tasks with little or no human intervention.

Taylor, a former Meta AI researcher, said: “If you… try AI on some real-world tasks, it performs really poorly… Yes, software engineering is very important and economically valuable, but there are many other activities with longer time horizons that are important to consider.”

© 2026 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.


