News
2d · Opinion
Gadget on MSN: Why you can’t trust Grok 4’s benchmarks
On paper, the AI platform created by Elon Musk’s xAI shoots the lights out, but it’s a different matter in practice, writes ...
Grok 4 is a huge leap from Grok 3, but how good is it compared to other models in the market, such as Gemini 2.5 Pro? We now have answers, thanks to new independent benchmarks.
Last week, Elon Musk’s xAI released the long-awaited Grok 4. And from our perspective, it likely marked the moment AI ...
Grok 4 by xAI was released on July 9, and it's surged ahead of competitors like DeepSeek and Claude at LMArena, a leaderboard ...
Grok 4 AI model launches with major upgrades; Elon Musk predicts it could invent new technologies and discover physics by 2026.
Overfitting to benchmarks is a common problem for all AI companies. xAI's Grok 4 has some problems with prompt ...
Record benchmarks of xAI's new powerful Grok 4 model are overshadowed by an antisemitic bug, built-in bias, and corporate ...
If these leaked Grok 4 benchmarks are correct (95 on AIME, 88 on GPQA, 75 on SWE-bench), then xAI has the most powerful model on the market. The GPQA and SWE-bench rankings for Grok 4 Code will also ...
And Grok 4 appears as the top-performing publicly available model on the leaderboards for the Abstraction and Reasoning Corpus, or ARC-AGI-1, and its second edition, ARC-AGI-2—benchmarks that ...
Elon Musk's newly launched AI chatbot, Grok 4, seemed to reference Musk's posts on social media before answering ...