Is it possible to align scaffolded LLMs with human values?
70% chance
This question is managed and resolved by Manifold.
Related questions
Do LLMs experience qualia?
25% chance
Are LLMs easy to align because unsupervised learning imbues them with an ontology where human values are easy to express?
32% chance
By 2025 end, will it be generally agreed upon that LLM produced text/code > human text/code for training LLMs?
11% chance
Will "LLMs for Alignment Research: a safety priority?" make the top fifty posts in LessWrong's 2024 Annual Review?
14% chance
Will an LLM improve its own ability along some important metric well beyond the best trained LLMs before 2026?
50% chance
Will relaxed adversarial training be used in practice for LLM alignment or auditing before 2028?
79% chance
EOY 2025: Will open LLMs perform at least as well as 50 Elo below closed-source LLMs on coding?
30% chance
Will LLMs become a ubiquitous part of everyday life by June 2026?
82% chance
Will LLMs be the best reasoning models on these dates?
Will there be an LLM which scores above what a human can do in 2 hours on METR's eval suite before 2026?
70% chance