Will there be realistic AI generated video from natural language descriptions by the start of 2026?
59% chance

Resolves YES if there is a model that receives a natural language description (e.g. "Give me a video of a puppy playing with a kitten") and outputs a realistic-looking video matching the description.

It does *not* have to be *undetectable* as AI generated, merely "realistic enough".

It must be able to consistently generate realistic videos >=30 seconds long to count.

DALL-E 2 (https://cdn.openai.com/papers/dall-e-2.pdf) counts as "realistic enough" *image* generation from natural language descriptions. (I am writing this before the model is fully available. If it turns out that all the samples are heavily cherry-picked, DALL-E 2 does not count, but a hypothetical model as good as the cherry-picked examples would.)

Duplicate of https://manifold.markets/vluzko/will-there-be-realistic-ai-generate

  • Update 2024-23-12 (PST) (AI summary of creator comment): Videos must be coherent throughout the full duration, meaning they must maintain consistency with the original prompt for the entire video without shifting between unrelated scenes.

    • Looped scenes do not count

    • A single example of a successful video is not sufficient for resolution

    • The video must show continuous action/motion (like "two people walking down a city street having a conversation") for the full duration

  • Update 2024-24-12 (PST) (AI summary of creator comment): The success rate must be at least 66% of DALL-E 2's rate, not a flat rate.

  • Update 2025-05-01 (PST) (AI summary of creator comment): Evidence must be publicly available. Having the model publicly available does not suffice.

    • If sample videos meeting the criteria are found, the market will be delayed until more information is available.
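
The relative-rate rule from the 2024-24-12 update can be read as a simple comparison. The sketch below is only one possible reading, not the creator's actual procedure: the function name, the way successes are counted, and the baseline value for DALL-E 2 are all assumptions made for illustration, since the market does not state DALL-E 2's success rate.

```python
def meets_relative_rate(successes: int, trials: int,
                        baseline_rate: float, factor: float = 0.66) -> bool:
    """Return True if successes/trials is at least factor * baseline_rate.

    baseline_rate stands in for DALL-E 2's (unstated) success rate;
    factor=0.66 is the 66% figure from the update.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    return successes / trials >= factor * baseline_rate


# Hypothetical numbers only: suppose DALL-E 2 succeeded on 90% of prompts.
# The video model would then need a rate of at least 0.66 * 0.9 = 0.594.
print(meets_relative_rate(6, 10, baseline_rate=0.9))  # 0.6 clears 0.594
print(meets_relative_rate(5, 10, baseline_rate=0.9))  # 0.5 does not
```

Under this reading the bar moves with the baseline: a lower DALL-E 2 rate lowers the rate a video model must reach, which is the difference from a flat threshold.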


I'm somewhat confused by the criteria. Sora could definitely generate realistic-looking videos back in spring 2024. It obviously handles some subjects and actions better than others, and what counts as "realistic enough" is unclear to me. Consistency also depends on the kind of prompt you use: some prompts give good results 8/10 times and others only 1/10, so "consistently" isn't well defined either.

What would you say are the specific improvements current video AI would need to reach the required level? E.g.
