O3 Geometry - Search News

OpenAI's newest o3 and o4-mini models excel at coding and math – but hallucinate more often

A hot potato: OpenAI's latest artificial intelligence models, o3 and o4-mini, have set new benchmarks in coding, math, and multimodal reasoning. Yet, despite these advancements, the models are ...

Hosted on MSN · 25d

OpenAI’s o3 model might be costlier to run than originally estimated

When OpenAI unveiled its o3 “reasoning” AI model in December, the company partnered with the creators of ARC-AGI, a benchmark designed to test highly capable AI, to showcase o3’s capabilities.

TechCrunch · 7d

OpenAI’s o3 AI model scores lower on a benchmark than the company initially implied

A discrepancy between first- and third-party benchmark results for OpenAI’s o3 AI model is raising questions about the company’s transparency and model testing practices. When OpenAI unveiled ...

Yahoo Finance · 8d

OpenAI's o3 AI model scores lower on a benchmark than the company initially implied

Epoch found that o3 scored around 10%, well below OpenAI's highest claimed score. That doesn't mean OpenAI lied, per se. The benchmark results the company published in December show a lower-bound ...
