Who will have the highest ranking model on web.lmarena.ai by EOY 2025?
OpenAI: 12%
Anthropic: 57%
Google: 26%
Other: 5%

LMSYS released a new leaderboard that ranks AI models by their web dev skills. It is super fun to play around with. Give it a shot here.

  • Update 2025-12-04 (PST) (AI summary of creator comment): The market may be resolved approximately 18 days early (around mid-December 2025) due to the leaderboard being sunset before the original close date.

  • Update 2025-12-18 (PST) (AI summary of creator comment): The creator is reconsidering the resolution criteria and is currently leaning towards one of the following options:

    • Resolving based on the last status of the old dashboard.

    • Resolving the market as N/A.


@Soli It seems this leaderboard will be sunset before the market closes

@SqrtMinusOne i think it is fine, we are almost at the end of the year anyways. @traders any objections to resolving the market ~18 days earlier?

@Soli i would just think this should clearly resolve to https://lmarena.ai/leaderboard/webdev which is clearly meant to be where that leaderboard is moved for consistency around their leaderboard location conventions

@Bayesian makes even more sense, let's do this (FYI @traders )

@Soli That's a different dashboard with a different methodology though. The order is very different :P

@SqrtMinusOne hmmm good point, @Bayesian do you have a counter to this?

good point tbh. I checked the models that were in both leaderboards:

in common:

  1. opus-4-1

  2. sonnet-4-5-thinking

  3. gemini-2.5-pro

  4. minimax-m2

  5. glm-4.6

  6. sonnet-4-5

  7. qwen3-coder

  8. haiku-4-5

  9. grok-code-fast-1

Order in the old version:

1, 2, 3, 4, 5, 6, 7, 8, 9

Order in the new version:

2, 1, 6, 5, 4, 7, 8, 3, 9

one is a good proxy of the other, but not perfect and sometimes pretty bad (gemini 2.5 pro)
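
To put a rough number on "good proxy but not perfect", here is a minimal sketch that computes the Spearman rank correlation between the two orderings. It assumes each digit string lists the model labels from the "in common" list in ranked order (so the new version starts with label 2, sonnet-4-5-thinking, in first place); the function and variable names are illustrative only and do not come from either leaderboard.

```python
# Labels follow the "in common" list: 1 = opus-4-1, ..., 9 = grok-code-fast-1.
models = [
    "opus-4-1", "sonnet-4-5-thinking", "gemini-2.5-pro", "minimax-m2",
    "glm-4.6", "sonnet-4-5", "qwen3-coder", "haiku-4-5", "grok-code-fast-1",
]

# Assumption: each sequence gives the labels from best to worst.
old_order = [1, 2, 3, 4, 5, 6, 7, 8, 9]
new_order = [2, 1, 6, 5, 4, 7, 8, 3, 9]


def ranks(order):
    """Map each label to its 1-based position in the ordering."""
    return {label: pos for pos, label in enumerate(order, start=1)}


def spearman(order_a, order_b):
    """Spearman rank correlation for two orderings of the same labels (no ties)."""
    ra, rb = ranks(order_a), ranks(order_b)
    n = len(order_a)
    d2 = sum((ra[label] - rb[label]) ** 2 for label in ra)
    return 1 - 6 * d2 / (n * (n * n - 1))


rho = spearman(old_order, new_order)

# How many places each model moved between the two leaderboards.
old_ranks, new_ranks = ranks(old_order), ranks(new_order)
shifts = {
    models[label - 1]: abs(old_ranks[label] - new_ranks[label])
    for label in old_order
}
biggest = max(shifts, key=shifts.get)

print(f"Spearman rho: {rho:.3f}")
print(f"Largest shift: {biggest} ({shifts[biggest]} places)")
```

Under that reading, the correlation comes out at about 0.67, and gemini-2.5-pro is the biggest mover at 5 places, which matches the impression above. Nine models is a small sample, so this is only a rough indication.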

bought Ṁ40 YES

@Soli Before I buy in more, I wanted to double check my assumption that the dashboard Bayesian linked is the one this market resolves with, since the other one is now unavailable?

@prismatic not sure at this point, but leaning towards either (1) resolving according to the last status of the old dashboard or (2) N/A-ing the market
