In AI Chess Showdown, ChatGPT Outplays Grok; Gemini Finishes Third

Grok 4 led the field early but was ultimately defeated by ChatGPT.

Aug 12, 2025

OpenAI’s ChatGPT o3 model defeated Elon Musk’s xAI model Grok 4 in the final of a Kaggle-hosted tournament that set out to find the strongest chess-playing large language model (LLM). The event, held over three days, pitted general-purpose LLMs from several companies against each other rather than specialised chess engines.

Elon Musk downplayed the defeat, saying Grok’s earlier strong results were a “side effect”.

Eight models took part, including entries from OpenAI, xAI, Google, Anthropic and the Chinese developers DeepSeek and Moonshot AI. Games followed standard chess rules, but every entrant was a multi-purpose LLM that is not specifically optimised for chess. According to BBC coverage of the event, Google’s Gemini finished third after beating another OpenAI entry.

Grok 4 led early in the competition but faltered in the final match against o3. Commentators and observers highlighted multiple tactical errors by Grok 4, including repeated queen losses, which swung the match in o3’s favour. Chess.com writer Pedro Pinhata said: “Up until the semi finals, it seemed like nothing would be able to stop Grok 4,” but added that Grok’s play “collapsed under pressure” on the last day. Grandmaster Hikaru Nakamura, who commentated live, noted: “Grok made so many mistakes in these games, but OpenAI did not.”