When will AI be able to solve 50% of official Jane Street puzzles?

2029

2025: 20%
2026: 52%
2027: 73%
2028: 79%
2024: Resolved

Every month for around the last 10 years, Jane Street (a trading firm) has released a difficult puzzle on their website: https://www.janestreet.com/puzzles/archive/.

Right now, the best publicly accessible AIs (GPT-4 and Gemini Ultra) are not very good at this. I tried running the February puzzle through both. GPT-4 gave a few definitions and then said the problem was complex (though it did correctly simulate it afterward, even though the problem asked for an exact answer), and Gemini Ultra wasn't even close.

During which year will a publicly accessible AI be able to solve at least 6 of the 12 puzzles released during the year? (Each year in which this happens resolves YES. Multiple years can resolve YES.)

Clarifications

Must be a general-purpose AI model, not AlphaGeometry or something
Publicly accessible = reasonably accessible by an average interested member of the public
Puzzles must be solved with minimal human input, aside from maybe "Let's think step by step" or something. I want to basically just copy-paste the puzzle and have it give a solution.
The model is not allowed to search for the solution or copy from a similar puzzle; it must clearly be solving the puzzle itself.
Different AIs can solve different puzzles, as long as they are released before the end of the month of the puzzle they are solving and are still general-purpose. (If GPT-5 can solve all the puzzles and is released in October of this year, it can't retroactively count for the earlier puzzles)
Resolves N/A if the puzzles stop being published.
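
As a rough illustration of how the counting rule above could be checked, here is a minimal Python sketch. The `Attempt` record and the function names are hypothetical, made up for this example, and it assumes a model released on or before the last day of a puzzle's month counts as eligible for that puzzle. It does not capture the judgment calls (general-purpose, publicly accessible, minimal human input, no searching).

```python
from calendar import monthrange
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Attempt:
    """Hypothetical record: one model's result on one monthly puzzle."""
    puzzle_year: int
    puzzle_month: int      # 1-12, the month the puzzle was released
    model_released: date   # public release date of the model
    solved: bool


def eligible(a: Attempt) -> bool:
    """Assumption: the model must be released by the last day of the puzzle's month."""
    last_day = monthrange(a.puzzle_year, a.puzzle_month)[1]
    return a.model_released <= date(a.puzzle_year, a.puzzle_month, last_day)


def year_resolves_yes(attempts, year, threshold=6):
    """True if at least `threshold` distinct puzzles from `year` were solved by eligible models."""
    solved_months = {
        a.puzzle_month
        for a in attempts
        if a.puzzle_year == year and a.solved and eligible(a)
    }
    return len(solved_months) >= threshold


# A model released in October that solves every 2025 puzzle only gets credit
# for October, November and December, so it falls short of 6.
late = [Attempt(2025, m, date(2025, 10, 15), True) for m in range(1, 13)]
print(year_resolves_yes(late, 2025))   # False

# The same results from a model released on June 1 cover June through December.
early = [Attempt(2025, m, date(2025, 6, 1), True) for m in range(1, 13)]
print(year_resolves_yes(early, 2025))  # True
```

Because solved puzzles are collected into a set of months, different models can each contribute different puzzles without any puzzle being counted twice.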

This question is managed and resolved by Manifold.

#AI

#Technical AI Timelines

10 Comments

Wouldn't JS just tweak the puzzles if AI could easily solve them? What would be the point of publishing them and keeping a list of solvers, etc.?

@Lorenzo What tweak do you think would work for this? The humans still need to be able to solve them. You probably get less interesting puzzles if you’re optimizing for AI not being able to solve them, also

@dominic Check if a puzzle is solvable by AI, if so, tweak it a bit until it isn't.

> You probably get less interesting puzzles if you’re optimizing for AI not being able to solve them

Eh, idk, maybe people will find them less interesting if AIs can solve them

IMO this would require some really high quality planning & CoT architecture that doesn’t seem achievable for a general public AI model in the next 1-2 years. E.g. the Feb 2025 puzzle requires (1) a hypothesis about how the features of the puzzle are related, (2) a lot of exploration to find “the trick” of connecting the clues to the answers, and (3) an intuition for how to stitch the answers together to derive the final puzzle answer. Right now LLM/transformer-based models just don’t seem to have the creative knack to solve more than half of these kinds of problems. Could be wrong.

@pricemaker It's tough, but I do think the advent of reasoning models in 2024 helped the models go from "completely hopeless" to "making some genuine attempts". So who knows what the next generation will be able to do.

o1-mini and o1-preview both fail the most recent puzzle. But they do reasonable enough things, and are good enough at math, that I am not confident full o1 won’t be able to solve it - I think most of these are now underpriced, though I won’t trade because it’s my question

where's 2028+

@ZoravurSingh This is unlinked MC, so if you don’t think it will happen before 2028 you can bet NO on the pre-2028 options. I didn’t want to add too many years because I think it’s more difficult to have a good prediction the farther out you go

Can it run code like in Code Interpreter?

@ahalekelly Sure. It can't be a back and forth thing with the human, but it can use Code Interpreter. I wouldn't expect it to help a ton though; I think the puzzles are mostly not easily brute-forceable in that way.
