AI alignment evaluations

id: vfr_14b9f65ab4037bac
license: CC-BY-4.0
findings: 16
accepted core: 0
contested: 0

avg conf: 0.84

Finding bundle

Interactive evaluation environments (agentic task suites with tool use) reveal capability gaps: frontier models pass only 28% of practical multi-step tasks despite 80th percentile benchmark performance.

id: vf_0d42e2d04ee3cc14
frontier: AI alignment evaluations
version: 1
confidence: 0.83

record state

frontier-owned

Review status

This finding is part of accepted frontier state. Review events, reviewable changes, and proof state explain how it can change.

unreviewed

finding statement

finding type

observational

No entity list is declared.

evidence

source-bound

1 atom

theoretical · manual state transition

proof impact

packet context

1 event

1 reviewable change and 0 evaluation records are attached to this finding id.

Evidence and conditions

method

manual state transition

evidence type

theoretical

conditions

species_unverified
species_verified
text: Reflects real-world task complexity; may correlate with deceptive behavior if models recognize evaluation context

Provenance

source title

Hierarchy of Agentic Capabilities paper (2025); Frontier Model Performance assessments

authors

reviewer:will-blair

Source records

source record

declared

Hierarchy of Agentic Capabilities paper (2025); Frontier Model Performance assessments

vs_d6a40fc9e1985d2d

title:Hierarchy of Agentic Capabilities paper (2025); Frontier Model Performance assessments

2025 · manual_curation

Evidence atoms

vea_8a2c1e1b56e515cf · theoretical · unknown
Interactive evaluation environments (agentic task suites with tool use) reveal capability gaps: frontier models pass only 28% of practical multi-step tasks despite 80th percentile benchmark performance.
vs_d6a40fc9e1985d2d · manual_curation

Typed links

outgoing

contradicts vf_40e409d40571b207
Interactive evals reveal gaps not captured by behavioral safety; traditional benchmarks miss agentic failure modes

incoming

No incoming links.

Review, event, and evaluation records

events

vev_f5b1a6f83a707f74 · finding.asserted
Manual finding added to frontier state
reviewer:will-blair · 2026-05-29

reviewable changes

vpr_19bc37637b83e49c · finding.add
Manual finding added to frontier state
applied · reviewer:will-blair · 2026-05-29

evaluations

No evaluation record targets this finding id.
