evidence boundary
unknown · theoretical
An evidence atom is an inspectable support unit. It is not a finding by itself; it supports or challenges a finding through review.
Evidence atom
finding binding
bound
Why do circuit faithfulness metrics (KL divergence, logit difference) fail to detect cooperative inhibition heads that individually score near-zero on attribution but prove critical for behavior, and what principled attribution metric would catch such non-additive interactions?
source binding
source-bound
vs_4d88f4c63fd49a20
review context
unverified
1 reviewable change and 0 evaluation records target this atom or its bound objects.
Evidence statement
Why do circuit faithfulness metrics (KL divergence, logit difference) fail to detect cooperative inhibition heads that individually score near-zero on attribution but prove critical for behavior, and what principled attribution metric would catch such non-additive interactions?
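The non-additive pattern the statement describes can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `logit_diff` function is a hypothetical stand-in for a behavioral metric, not a real model, and the head names `A`, `B`, and `C` are invented. Heads `A` and `B` redundantly inhibit a wrong token, so ablating either one alone changes nothing, while ablating both removes the inhibition. A second-order ablation difference (the joint effect minus the sum of single effects) is one candidate signal for such interactions; it is not offered as an answer to the open question.

```python
from itertools import combinations

HEADS = ("A", "B", "C")  # hypothetical head names


def logit_diff(active):
    """Toy metric: A and B redundantly inhibit a wrong token; C adds directly."""
    inhibition = 2.0 if ("A" in active or "B" in active) else 0.0
    direct = 1.0 if "C" in active else 0.0
    return inhibition + direct


def single_effect(metric, heads, h):
    """Drop in the metric when only head h is ablated from the full set."""
    full = frozenset(heads)
    return metric(full) - metric(full - {h})


def pair_interaction(metric, heads, h1, h2):
    """Joint ablation effect minus the sum of the two single-ablation effects."""
    full = frozenset(heads)
    joint = metric(full) - metric(full - {h1, h2})
    return joint - single_effect(metric, heads, h1) - single_effect(metric, heads, h2)


singles = {h: single_effect(logit_diff, HEADS, h) for h in HEADS}
pairs = {
    (a, b): pair_interaction(logit_diff, HEADS, a, b)
    for a, b in combinations(HEADS, 2)
}

print(singles)  # A and B each score 0.0 individually; C scores 1.0
print(pairs)    # only the (A, B) pair has a non-zero interaction term
```

In this toy setting, single-head attribution ranks `A` and `B` as irrelevant, yet their pairwise interaction term equals the full inhibition strength. The cost of checking all pairs grows quadratically with the number of heads, which is one reason a principled metric remains an open question here.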
extraction method
manual_curation
support relation
unknown
condition refs
vcnd_52be67291d5b4c56
Caveats
events
vev_a861131b43c2f3a2 · finding.asserted · Manual finding added to frontier state
reviewer:will-blair · 2026-05-29
reviewable changes
vpr_6a5c8894b2da8c78 · finding.add · Manual finding added to frontier state
applied · reviewer:will-blair · 2026-05-29
evaluations
No evaluation rows are attached.