Vela

AI-for-science benchmark state

constellation seal · derived from vfr_efc649fd772a1ff1
id: vfr_efc649fd772a1ff1
license: CC-BY-4.0
findings: 12
accepted core: 12
contested: 0
links: 0
sources: 1
evidence: 12
avg conf: 0.30

used by 0 · replayed by 1 producer · second seat open

e24/24 · finding.noted · reviewer:will-blair · 2026-06-10 · 6c12→d02f

Reviewable change

verified (a frozen deterministic verifier re-checked the claim and passed) · accepted

BENCHMARK META (ProteinGym). ProteinGym benchmarks variant-effect prediction against deep mutational scanning (DMS) assays: a substitution benchmark (~217 assays) and an indel benchmark, with zero-shot and supervised tracks, scored by Spearman correlation (and AUC/MCC). KNOWN TRUST ISSUE: v1.0 vs v1.1 differ in assay set and splits; zero-shot vs supervised numbers are not comparable; MSA-dependent methods vary with the MSA pipeline. STATE: dataset-version + track-conflation hazard.

id: vpr_d3f3228bb463c2d9
frontier: AI-for-science benchmark state
kind: finding.add
created: 2026-06-10
findings: +1
state: null → bc813b05

accept gate

2 of 4 on record
signature: reviewer:will-blair · key 4892f938
chain: null → bc813b05
witness: no verifier attachment on record for this target
grade: in state · unreviewed

timeline

  1. 2026-06-10 · propose · proposed · finding.add · reviewer:will-blair · vpr_d3f3228bb463c2d9 · Manual finding added to frontier state
  2. 2026-06-10 · accept · finding.asserted · reviewer:will-blair · null → bc813b05 · vev_f17f5a864754e2a0 · Manual finding added to frontier state

proposed

reason: Manual finding added to frontier state
finding type: computational
proposed confidence: 0.30
confidence basis: operator-supplied frontier prior; review required

provenance

proposed by: reviewer:will-blair
actor type: human
created at: 2026-06-10
target type: finding

vf_ec4bb8feca206bf2

Diff

Read-only frontier; diff not recomputed.

Review chain

  1. 01 request · Change request

    AI-for-science benchmark state receives a reviewable source, finding, caveat, replication, evaluation, or proof-affecting edit.

  2. 02 packet · Diff packet

    The packet names affected record objects, evidence, rationale, reviewer-facing fields, and expected proof impact.

  3. 03 checks · Check output

    Schema, provenance, benchmark, contradiction, and proof checks decide whether the request is ready to read.

  4. 04 review · Reviewer decision

    A steward accepts, rejects, caveats, revises, or retracts the request under an inspectable identity.

  5. 05 accepted · Accepted event

    Only the accepted event mutates frontier state. Atlases, constellations, and search update from that record state.
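
The rule that only the accepted event mutates frontier state can be sketched as a small staged workflow in Python. This is an illustration under assumptions, not Vela's implementation: the stage names are taken from the chain above, while `ChangeRequest`, `FrontierState`, `advance`, and `apply` are hypothetical names, and only the accept path is modeled (reject, caveat, revise, and retract are left out).

```python
from dataclasses import dataclass, field

# Stage names follow the review chain; the ordering is assumed to be linear.
STAGES = ("request", "packet", "checks", "review", "accepted")


@dataclass
class ChangeRequest:
    """Hypothetical change request carrying one proposed finding."""
    finding: str
    stage: str = "request"

    def advance(self):
        """Move to the next stage; the final stage is terminal."""
        i = STAGES.index(self.stage)
        if i == len(STAGES) - 1:
            raise ValueError("request is already accepted")
        self.stage = STAGES[i + 1]


@dataclass
class FrontierState:
    """Hypothetical frontier state: a list of accepted findings."""
    findings: list = field(default_factory=list)

    def apply(self, request):
        """Mutate state only for an accepted request; return whether it changed."""
        if request.stage != "accepted":
            return False
        self.findings.append(request.finding)
        return True
```

Under this sketch, a request that is still in `checks` or `review` leaves `FrontierState.findings` untouched, and derived views would be rebuilt only from what `apply` has recorded.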

finding.noted · reviewer:will-blair · 1 day

renders the record as of vev_d199cb2e · 1,338 events · hub
