evidence boundary
unknownfrontiers / frontier
AI-for-science benchmark state
- id: vfr_efc649fd772a1ff1
- license: CC-BY-4.0
- findings: 12
- accepted core: 12
- contested: 0
- links: 0
- sources: 1
- evidence: 12
- avg conf: 0.30
e24/24 · finding.noted · reviewer:will-blair · 2026-06-10 · 6c12→d02f
Evidence atom
BENCHMARK META (ProteinGym). ProteinGym benchmarks variant-effect prediction against deep mutational scanning (DMS) assays: a substitution benchmark (~217 assays) and an indel benchmark, with zero-shot and supervised tracks, scored by Spearman correlation (and AUC/MCC). KNOWN TRUST ISSUE: v1.0 vs v1.1 differ in assay set and splits; zero-shot vs supervised numbers are not comparable; MSA-dependent methods vary with the MSA pipeline. STATE: dataset-version + track-conflation hazard.
- id: vea_ad36ad1c4b0f546f
- frontier: AI-for-science benchmark state
- source: vs_066123dd29a9c5b4
- finding: vf_ec4bb8feca206bf2
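The statement above names Spearman correlation as the primary score. As a minimal sketch of what that metric computes for a single assay, the following ranks predicted and measured values (averaging ranks across ties) and takes the Pearson correlation of the ranks. The function names `average_ranks` and `spearman` are illustrative and are not taken from ProteinGym's own scoring code; how per-assay scores are aggregated into a headline number is not covered here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(predicted, measured):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

Because only the ordering matters, a model whose scores are on a different scale from the assay readout can still reach a correlation of 1 if it ranks the variants identically.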
finding binding
bound · computational
BENCHMARK META (ProteinGym). ProteinGym benchmarks variant-effect prediction against deep mutational scanning (DMS) assays: a substitution benchmark (~217 assays) and an indel benchmark, with zero-shot and supervised tracks, scored by Spearman correlation (and AUC/MCC). KNOWN TRUST ISSUE: v1.0 vs v1.1 differ in assay set and splits; zero-shot vs supervised numbers are not comparable; MSA-dependent methods vary with the MSA pipeline. STATE: dataset-version + track-conflation hazard.
source binding
source-bound · manual finding
vs_066123dd29a9c5b4
review context
unverified · 2 events
2 reviewable changes and 0 evaluation records target this atom or its bound objects.
statement
BENCHMARK META (ProteinGym). ProteinGym benchmarks variant-effect prediction against deep mutational scanning (DMS) assays: a substitution benchmark (~217 assays) and an indel benchmark, with zero-shot and supervised tracks, scored by Spearman correlation (and AUC/MCC). KNOWN TRUST ISSUE: v1.0 vs v1.1 differ in assay set and splits; zero-shot vs supervised numbers are not comparable; MSA-dependent methods vary with the MSA pipeline. STATE: dataset-version + track-conflation hazard.
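The hazard named in the statement can be made mechanical: before two reported scores are placed side by side, check that they share a dataset version, benchmark, track, and metric. The sketch below is one way to do that. The `ScoreRecord` fields and the `comparability_issues` function are hypothetical names chosen for illustration; they do not correspond to any ProteinGym or tracker API. The MSA check is an assumption about how to apply the third caveat: it flags only the case where both records declare an MSA pipeline and the two differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScoreRecord:
    """One reported score plus the context needed to compare it (hypothetical schema)."""
    method: str
    version: str                        # dataset version, e.g. "v1.0" or "v1.1"
    benchmark: str                      # "substitution" or "indel"
    track: str                          # "zero-shot" or "supervised"
    metric: str                         # "spearman", "auc", or "mcc"
    value: float
    msa_pipeline: Optional[str] = None  # None for methods that use no MSA


def comparability_issues(a: ScoreRecord, b: ScoreRecord) -> List[str]:
    """List the reasons two scores should not be compared directly."""
    issues = []
    for name in ("version", "benchmark", "track", "metric"):
        left, right = getattr(a, name), getattr(b, name)
        if left != right:
            issues.append(f"{name}: {left} vs {right}")
    # Assumption: only flag MSA differences when both sides declare a pipeline.
    if a.msa_pipeline and b.msa_pipeline and a.msa_pipeline != b.msa_pipeline:
        issues.append(f"msa_pipeline: {a.msa_pipeline} vs {b.msa_pipeline}")
    return issues
```

An empty list means none of the listed hazards were detected; it does not by itself establish that the two numbers are comparable.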
extraction method
manual_curation
support relation
unknown
condition refs
vcnd_80e110739db57437
caveats
- missing evidence locator
Review, event, and evaluation records
4 events
vev_b11de7b18f8b9f24 · finding.noted · HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.
reviewer:will-blair · 2026-06-10
vev_f17f5a864754e2a0 · finding.asserted · Manual finding added to frontier state
reviewer:will-blair · 2026-06-10
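The capping rule in the HARDENING note above can be read as a ceiling on record status. The sketch below illustrates that reading. The record does not enumerate the rungs of the trust ladder, so the three-level `LADDER` ordering is an assumption made only for this example, as are the function name `capped_status` and its parameters.

```python
# Hypothetical ladder, lowest to highest. Only "unverified" and "verified"
# appear in the record; the middle rung is assumed for illustration.
LADDER = ("unverified", "attested", "verified")


def capped_status(requested: str, label_provenance: str, has_rederivation: bool) -> str:
    """Apply the cap: attested labels stay below 'verified' without a rederivation."""
    if requested not in LADDER:
        raise ValueError(f"unknown status: {requested}")
    if label_provenance == "attested" and not has_rederivation:
        ceiling = LADDER.index("verified") - 1
        return LADDER[min(LADDER.index(requested), ceiling)]
    return requested
```

Under these assumptions the cap only ever lowers a status; a record already below the ceiling is returned unchanged.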
reviewable changes
vpr_9496dacae43645bc · finding.note · HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.
applied · agent:hardening-2026-06-10 · 2026-06-10
vpr_d3f3228bb463c2d9 · finding.add · Manual finding added to frontier state
applied · reviewer:will-blair · 2026-06-10
evaluations
No evaluation rows are attached.