Which lab will publish the model leading OpenRouter monthly rankings on July 1st, 2025?
3
Ṁ204
Jun 30
OpenAI: 56%
Google DeepMind: 21%
Anthropic: 16%
DeepSeek: 1.6%
xAI: 2%
Meta: 2%
Other: 2%

At the moment, the leading general-purpose evaluation, LMArena, is based on user preferences. This drew criticism after the Llama 4 fiasco, in which a model optimized for the arena ranked second but dropped to rank 38 without the optimization, showing how the leaderboard can be gamed.

Following criticism of LMArena's pre-release testing practices, Andrej Karpathy suggested using the OpenRouter rankings, which tally the tokens used per model through the API router.

This market will resolve based on the monthly rankings on https://openrouter.ai/rankings?view=month as of July 1st, 2025. Token counts for model variants with the same core version (e.g., Gemini 2.5 Pro Experimental and Gemini 2.5 Pro Preview, Claude 3.7 Sonnet and Claude 3.7 Sonnet (thinking)) will be aggregated, but substantially different versions will be counted separately (e.g., Gemini 2.0 Flash and Gemini 2.0 Flash Lite).
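The aggregation rule above could be sketched roughly as follows. This is only an illustration, not how the market creator will actually tally results: the `CORE_VERSION` mapping, the `leading_lab` helper, and the token counts are all hypothetical, and the lab attributions are my assumption. Only the variant pairs named in the description are included.

```python
from collections import defaultdict

# Hypothetical mapping: listed variant -> (lab, core version).
# Variants of the same core version share an entry; substantially
# different versions (Flash vs. Flash Lite) get separate entries.
CORE_VERSION = {
    "Gemini 2.5 Pro Experimental": ("Google DeepMind", "Gemini 2.5 Pro"),
    "Gemini 2.5 Pro Preview": ("Google DeepMind", "Gemini 2.5 Pro"),
    "Claude 3.7 Sonnet": ("Anthropic", "Claude 3.7 Sonnet"),
    "Claude 3.7 Sonnet (thinking)": ("Anthropic", "Claude 3.7 Sonnet"),
    "Gemini 2.0 Flash": ("Google DeepMind", "Gemini 2.0 Flash"),
    "Gemini 2.0 Flash Lite": ("Google DeepMind", "Gemini 2.0 Flash Lite"),
}


def leading_lab(token_counts):
    """Sum tokens per core version and return (lab, core version, tokens)
    for the largest total. Raises KeyError for an unmapped variant."""
    totals = defaultdict(int)
    for variant, tokens in token_counts.items():
        totals[CORE_VERSION[variant]] += tokens
    (lab, core), tokens = max(totals.items(), key=lambda item: item[1])
    return lab, core, tokens


# Made-up placeholder counts (billions of tokens), NOT real OpenRouter data.
example_counts = {
    "Gemini 2.5 Pro Experimental": 40,
    "Gemini 2.5 Pro Preview": 70,
    "Claude 3.7 Sonnet": 60,
    "Claude 3.7 Sonnet (thinking)": 45,
    "Gemini 2.0 Flash": 100,
    "Gemini 2.0 Flash Lite": 30,
}

print(leading_lab(example_counts))
```

With these placeholder numbers, Gemini 2.0 Flash is the largest single listing, but the two Gemini 2.5 Pro variants together come out ahead, which is the kind of case the aggregation rule is meant to settle.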

I may bet on this market, as the resolution criteria are pretty clear.

The month that counts is about to begin. This week GPT-4o-mini is looking really strong with 483B tokens, Sonnet 4 has taken second place with 229B, and Gemini 2.0 Flash and 2.5 Pro are also over 200B tokens!
