Grok 4.1 Thinking Steals #1 Spot on LMArena, Surpassing Google Gemini 2.5 Pro

Do Son November 19, 2025 2 minutes read

Elon Musk’s artificial intelligence company, xAI, has launched a surprise offensive with the quiet release of its new Grok 4.1 model series. The update arrives in two variants—the standard Grok 4.1 and the deep-reasoning-enabled Grok 4.1 Thinking—both of which are now freely available to users.

On the LMArena leaderboard, Grok 4.1 Thinking debuted at the very top with an Elo score of 1483, while the standard, non-reasoning Grok 4.1 took second place.

Notably, Google’s previously strong Gemini 2.5 Pro has now slipped to third, trailing the leading Grok 4.1 Thinking by a full 31 points—an unmistakable sign of the pressure mounting ahead of Google’s forthcoming Gemini 3.0 release.
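To put the 31-point gap in perspective, here is a minimal sketch that converts a rating difference into an expected head-to-head win rate. It assumes the textbook Elo logistic formula with a 400-point scale; LMArena's actual rating methodology may differ in detail, so treat the result as a rough illustration rather than an official figure. The function name `expected_win_rate` is hypothetical.

```python
def expected_win_rate(rating_gap: float) -> float:
    """Expected score of the higher-rated model under the standard Elo formula.

    Assumption: a 400-point logistic scale, as in classic Elo.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


leader_score = 1483   # Grok 4.1 Thinking, as reported in the article
gap = 31              # reported lead over Gemini 2.5 Pro

# Score implied for Gemini 2.5 Pro by the two reported numbers.
implied_gemini_score = leader_score - gap

win_rate = expected_win_rate(gap)
print(f"Implied Gemini 2.5 Pro score: {implied_gemini_score}")
print(f"Expected win rate for the leader: {win_rate:.1%}")
```

Under these assumptions, a 31-point lead corresponds to the leader being preferred in roughly 54% of direct comparisons, a real but modest edge.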

The new models also show marked improvement in creative writing. According to Creative Writing v3 benchmark results, both Grok 4.1 Thinking and Grok 4.1 rank just beneath OpenAI’s GPT-5.1, surpassing formidable competitors including OpenAI’s o3, Claude Sonnet 4.5, and Kimi K2 Instruct.

Beyond leaderboard performance and creative writing, xAI has also improved the models’ accuracy. According to the reported data, Grok 4.1 reduces factual error rates by roughly 70% compared with the previous Grok 4 Fast. The hallucination rate has likewise dropped from 12.09% to 4.22%, strengthening the system’s practicality and reliability.
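The hallucination figures can be checked with simple arithmetic. The sketch below computes the absolute and relative change from the two rates quoted above; the helper name `relative_reduction` is hypothetical. Note that the "roughly 70%" figure is stated for factual error rates, a separate metric, so it need not match the reduction computed here.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a rate fell, relative to its starting value."""
    return (before - after) / before


before_pct = 12.09  # hallucination rate before, as reported
after_pct = 4.22    # hallucination rate after, as reported

absolute_drop = before_pct - after_pct            # in percentage points
relative_drop = relative_reduction(before_pct, after_pct)

print(f"Absolute drop: {absolute_drop:.2f} percentage points")
print(f"Relative reduction: {relative_drop:.1%}")
```

On the quoted numbers, this works out to a drop of 7.87 percentage points, or a relative reduction of about 65%.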

Written by

@DdoS · Security Researcher

Do Son

Do Son is the Founder and Editor of SecurityOnline.info. Working in cybersecurity since 2013, he reports on vulnerabilities, malware, and emerging threats, providing timely analysis to help organizations and individuals stay ahead of evolving risks.
