09:44
OpenAI employees publicly criticize benchmark results for xAI's latest AI model, Grok 3, as misleading
OpenAI employees accused xAI of publishing misleading benchmark results for its Grok 3 AI model; xAI denied any wrongdoing. According to the chart xAI released, Grok 3 outperformed OpenAI's strongest model on AIME 2025, but critics alleged that the chart did not include all comparison data. xAI co-founder Igor Babushkin argued that OpenAI had itself released similar charts in the past to compare model performance.