record state
frontier-owned
Review status
This finding is part of accepted frontier state. Review events, reviewable changes, and proof state explain how it can change.
Finding bundle
finding statement
finding type
No entity list is declared.
evidence
source-bound
theoretical · manual state transition
proof impact
packet context
1 reviewable change and 0 evaluation records are attached to this finding id.
Evidence and conditions
method
manual state transition
evidence type
theoretical
conditions
Provenance
source title
Frontier AI Safety Frameworks review (multiple labs, 2024)
authors
reviewer:will-blair
Behavioral safety evaluations (refusal-based testing on harmful content categories) show strong surface-level safety but do not assess deeper deception, sandbagging, or scheming capabilities.
vs_0ed2b819f71baff6 · manual_curation
outgoing
No outgoing links.
incoming
contradicts · vf_201b5c921b23410b
supports · vf_c949999dbbad515b
events
vev_f495bad59887515c · finding.asserted · Manual finding added to frontier state
reviewer:will-blair · 2026-05-29
reviewable changes
vpr_162253ea3f0f4c69 · finding.add · Manual finding added to frontier state
applied · reviewer:will-blair · 2026-05-29
evaluations
No evaluation record targets this finding id.