evidence boundary
unknown · theoretical
An evidence atom is an inspectable support unit. It is not a finding by itself; it supports or challenges a finding through review.
Evidence atom
finding binding
bound · Interactive evaluation environments (agentic task suites with tool use) reveal capability gaps: frontier models pass only 28% of practical multi-step tasks despite 80th percentile benchmark performance.
source binding
source-bound · vs_d6a40fc9e1985d2d
review context
unverified · 1 reviewable change and 0 evaluation records target this atom or its bound objects.
Evidence statement
Interactive evaluation environments (agentic task suites with tool use) reveal capability gaps: frontier models pass only 28% of practical multi-step tasks despite 80th percentile benchmark performance.
extraction method
manual_curation
support relation
unknown
condition refs
vcnd_f4a2cc605f323ad6
Caveats
events
vev_f5b1a6f83a707f74 · finding.asserted · Manual finding added to frontier state
reviewer:will-blair · 2026-05-29
reviewable changes
vpr_19bc37637b83e49c · finding.add · Manual finding added to frontier state
applied · reviewer:will-blair · 2026-05-29
evaluations
No evaluation rows are attached.
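sketch
The fields on this page could be modeled roughly as follows. This is a minimal illustration only: the class name, the field names, and the rule that an atom with no evaluation rows counts as "unverified" are assumptions drawn from the labels shown above, not the system's actual schema. The page shows the finding as "bound" without giving its ID, so that field is left empty.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EvidenceAtom:
    """Hypothetical model of the record above; all field names are assumptions."""
    statement: str
    source_id: str
    extraction_method: str
    finding_id: Optional[str] = None   # not shown on the page, only "bound"
    support_relation: str = "unknown"
    condition_refs: List[str] = field(default_factory=list)
    caveats: List[str] = field(default_factory=list)
    evaluations: List[str] = field(default_factory=list)

    def review_status(self) -> str:
        # Assumed rule: no attached evaluation rows means "unverified".
        return "unverified" if not self.evaluations else "evaluated"


atom = EvidenceAtom(
    statement=(
        "Interactive evaluation environments (agentic task suites with tool use) "
        "reveal capability gaps: frontier models pass only 28% of practical "
        "multi-step tasks despite 80th percentile benchmark performance."
    ),
    source_id="vs_d6a40fc9e1985d2d",
    extraction_method="manual_curation",
    condition_refs=["vcnd_f4a2cc605f323ad6"],
)
print(atom.review_status())
```

Keeping the atom separate from the finding it is bound to mirrors the description at the top of the page: the atom supports or challenges a finding through review and is not itself a finding.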