Will there be realistic AI generated video from natural language descriptions by the start of 2026?

Ṁ186

2026

56% chance

Resolves yes if there is a model that receives a natural language description (e.g. "Give me a video of a puppy playing with a kitten") and outputs a realistic-looking video matching the description.

It does *not* have to be *undetectable* as AI generated, merely "realistic enough".

It must be able to consistently generate realistic videos >=30 seconds long to count.

DALL-E 2 (https://cdn.openai.com/papers/dall-e-2.pdf) counts as "realistic enough" *image* generation from natural language descriptions. (I am writing this before the model is fully available. If it turns out that all the samples are heavily cherry-picked, DALL-E 2 does not count, but a hypothetical model as good as the cherry-picked examples would.)

Duplicate of https://manifold.markets/vluzko/will-there-be-realistic-ai-generate

Update 2024-12-23 (PST) (AI summary of creator comment):
- Videos must be coherent throughout the full duration: they must stay consistent with the original prompt for the entire video, without shifting between unrelated scenes
- Looped scenes do not count
- A single example of a successful video is not sufficient for resolution

Update 2024-12-24 (PST) (AI summary of creator comment): The success rate must be at least 66% of DALL-E 2's rate, not a flat rate.

Update 2025-01-05 (PST) (AI summary of creator comment): Evidence must be publicly available.

Update 2025-01-18 (PST) (AI summary of creator comment):
- Models must be able to generate videos consistently and handle a wide variety of prompts
- The video must be produced in a single shot; videos stitched together from multiple segments do not count
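The criteria above can be read as a per-video check plus a relative success-rate threshold. The following is a minimal sketch of that reading, not the creator's actual resolution procedure. The `Sample` record, the function names, and the `dalle2_rate` input are all hypothetical; the description does not give a numeric value for DALL-E 2's success rate, so it is left as a parameter.

```python
from dataclasses import dataclass

MIN_DURATION_S = 30.0           # description: videos must be >= 30 seconds long
RELATIVE_RATE_THRESHOLD = 0.66  # update: at least 66% of DALL-E 2's rate


@dataclass
class Sample:
    """One generated video as judged by a human rater (hypothetical record)."""
    duration_s: float
    realistic: bool    # "realistic enough", not necessarily undetectable
    coherent: bool     # consistent with the prompt for the full duration
    looped: bool       # looped scenes do not count
    single_shot: bool  # stitched-together segments do not count


def sample_passes(s: Sample) -> bool:
    """Apply the per-video criteria from the description and updates."""
    return (
        s.duration_s >= MIN_DURATION_S
        and s.realistic
        and s.coherent
        and not s.looped
        and s.single_shot
    )


def success_rate(samples: list[Sample]) -> float:
    """Fraction of samples that pass every per-video criterion."""
    if not samples:
        return 0.0
    return sum(sample_passes(s) for s in samples) / len(samples)


def meets_bar(samples: list[Sample], dalle2_rate: float) -> bool:
    """Compare against a relative threshold; dalle2_rate is an assumed input."""
    return success_rate(samples) >= RELATIVE_RATE_THRESHOLD * dalle2_rate
```

Under this reading, a single successful video never suffices on its own: the rate is computed over a whole set of samples, which should span a wide variety of prompts.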

This question is managed and resolved by Manifold.

#AI

#Technical AI Timelines

5 Comments

7 Holders

11 Trades

I'm somewhat confused by the criteria. Sora could definitely generate realistic-looking videos back in spring 2024. It obviously can do some subjects and actions better than others, and what counts as "realistic enough" is unclear to me. The consistency also depends on what kinds of prompts you use: some things give good results 8/10 times, while others 1/10, so "consistently" isn't well defined either.

@ProjectVictory Existing models all fail on the >=30 second video criterion. Sora and Veo are generally realistic enough; they just can't maintain that for more than a few seconds.

What would you say are the specific improvements current video AI would need to reach the required level? E.g.

@TheAllMemeingEye Models need to be able to do this consistently, and with a wide variety of prompts. Also it's hard to tell if this is multiple videos stitched together - that doesn't count, the model must do it single shot.

@vluzko thanks 👍

Related questions
