record state
AI-for-science benchmark state
- id: vfr_efc649fd772a1ff1
- license: CC-BY-4.0
- findings: 12
- accepted core: 12
- contested: 0
- links: 0
- sources: 1
- evidence: 12
- avg conf: 0.30
e24/24 · finding.noted · reviewer:will-blair · 2026-06-10 · 6c12→d02f
Finding bundle
BENCHMARK CLAIM (ProteinGym) — ProteinNPT (non-parametric transformer, supervised track) REPORTS gains by attending across labelled neighbours. VERIFICATION STATE: author-reported; SUPERVISED — not comparable to zero-shot numbers; depends on the cross-validation split. NOT re-run here. Open obligation: re-run under the official supervised CV split; never compare against zero-shot rows.
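The open obligation in this claim (re-run under the official supervised CV split; never compare against zero-shot rows) can be read as a comparability rule. A minimal Python sketch of such a guard follows. The `BenchmarkRow` type, the track labels `"supervised"` and `"zero_shot"`, the split name `"split_A"`, and the baseline model names are illustrative assumptions, not part of ProteinGym or of this record. No scores are included because none were re-run here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BenchmarkRow:
    """One leaderboard row (hypothetical shape, for illustration only)."""
    model: str
    track: str               # assumed labels: "supervised" or "zero_shot"
    cv_split: Optional[str]  # CV split name for supervised rows, else None


def comparable(a: BenchmarkRow, b: BenchmarkRow) -> bool:
    """Rows are comparable only within one track and, for supervised
    rows, only under the same cross-validation split."""
    if a.track != b.track:
        return False
    if a.track == "supervised":
        return a.cv_split is not None and a.cv_split == b.cv_split
    return True


npt = BenchmarkRow("ProteinNPT", "supervised", "split_A")
other_supervised = BenchmarkRow("supervised_baseline", "supervised", "split_A")
zero_shot = BenchmarkRow("zero_shot_baseline", "zero_shot", None)

print(comparable(npt, other_supervised))  # same track, same split
print(comparable(npt, zero_shot))         # supervised vs zero-shot
```

Treating the split as part of the comparison key reflects the claim's statement that the supervised result depends on the cross-validation split.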
- id: vf_41030d44f59eae22
- frontier: AI-for-science benchmark state
- version: 1
- confidence: 0.30
no incoming links yet
file
/frontier/benchmark-state#e=16
scrub position · after_hash dd9be8a780d7ab98…
cite · vf_41030d44f59eae22 · benchmark-state · https://vela-site-next.fly.dev/frontier/benchmark-state#e=16
raw json · vf_41030d44f59eae22 (2.6 KB)
{
"annotations": [
{
"author": "reviewer:will-blair",
"id": "ann_ab3ecb7585481522",
"text": "HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.",
"timestamp": "2026-06-10T23:01:45.061196+00:00"
}
],
"assertion": {
"direction": null,
"entities": [],
"relation": null,
"text": "BENCHMARK CLAIM (ProteinGym) — ProteinNPT (non-parametric transformer, supervised track) REPORTS gains by attending across labelled neighbours. VERIFICATION STATE: author-reported; SUPERVISED — not comparable to zero-shot numbers; depends on the cross-validation split. NOT re-run here. Open obligation: re-run under the official supervised CV split; never compare against zero-shot rows.",
"type": "computational"
},
"conditions": {
"age_group": null,
"cell_type": null,
"clinical_trial": false,
"concentration_range": null,
"duration": null,
"human_data": false,
"in_vitro": false,
"in_vivo": false,
"species_unverified": [],
"species_verified": [],
"text": "Manually added finding; requires evidence review before scientific use."
},
"confidence": {
"basis": "operator-supplied frontier prior; review required",
"extraction_confidence": 1,
"kind": "frontier_epistemic",
"method": "expert_judgment",
"score": 0.3
},
"created": "2026-06-10T06:50:55.944113+00:00",
"evidence": {
"effect_size": null,
"evidence_spans": [],
"method": "manual state transition",
"model_system": "",
"p_value": null,
"replicated": false,
"replication_count": null,
"sample_size": null,
"species": null,
"type": "computational"
},
"flags": {
"contested": false,
"declining": false,
"gap": true,
"gravity_well": false,
"negative_space": false,
"retracted": false
},
"id": "vf_41030d44f59eae22",
"links": [],
"previous_version": null,
"provenance": {
"authors": [
{
"name": "reviewer:will-blair",
"orcid": null
}
],
"citation_count": null,
"doi": null,
"extraction": {
"extracted_at": "2026-06-10T06:50:55.944101+00:00",
"extractor_version": "vela/0.691.0",
"method": "manual_curation",
"model": null,
"model_version": null
},
"journal": null,
"openalex_id": null,
"pmc": null,
"pmid": null,
"review": {
"corrections": [],
"reviewed": false,
"reviewed_at": null,
"reviewer": null
},
"source_type": "expert_assertion",
"title": "manual finding",
"year": null
},
"updated": null,
"version": 1
}
Unsealed — 0 attachment(s) on record, awaiting independent verification.
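The raw JSON above carries several fields that signal the finding is not yet ready for downstream use. As a rough sketch, assuming the record has been loaded into a Python dict, a consumer might collect those signals as follows. The function name `blocking_reasons` and the choice of which fields count as blocking are assumptions made for illustration, not documented Vela behaviour. The embedded record is a trimmed copy of fields shown above.

```python
import json

# Trimmed copy of the record above; only the fields the check reads.
RAW = """
{
  "id": "vf_41030d44f59eae22",
  "evidence": {"evidence_spans": [], "replicated": false},
  "flags": {"gap": true, "retracted": false},
  "provenance": {"review": {"reviewed": false}}
}
"""


def blocking_reasons(record: dict) -> list:
    """Return human-readable reasons the finding is not yet usable.

    The rules here are illustrative assumptions, not a Vela policy.
    """
    reasons = []
    if not record["provenance"]["review"]["reviewed"]:
        reasons.append("not reviewed")
    if record["flags"]["gap"]:
        reasons.append("flagged as gap")
    if record["flags"]["retracted"]:
        reasons.append("retracted")
    if not record["evidence"]["evidence_spans"]:
        reasons.append("no evidence spans")
    if not record["evidence"]["replicated"]:
        reasons.append("not replicated")
    return reasons


record = json.loads(RAW)
for reason in blocking_reasons(record):
    print(f"{record['id']}: {reason}")
```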
0 attachments · 0 distinct checker actors · 0 methods
blame · custody trail
vev_4869af225af70848 · history · 2 events
finding statement
finding type: computational
No entity list is declared.
evidence
source-bound: 1 atom
computational · manual state transition
proof impact
packet context: 2 events
2 reviewable changes and 0 evaluation records are attached to this finding id.
evidence
- method: manual state transition
- evidence type: computational
conditions
- species_unverified: (none)
- species_verified: (none)
- text: Manually added finding; requires evidence review before scientific use.
provenance
- source title: manual finding
- authors: reviewer:will-blair
Source records: 1
Evidence atoms: 1
- vea_3818a2b502e64c42 · computational · unknown
BENCHMARK CLAIM (ProteinGym) — ProteinNPT (non-parametric transformer, supervised track) REPORTS gains by attending across labelled neighbours. VERIFICATION STATE: author-reported; SUPERVISED — not comparable to zero-shot numbers; depends on the cross-validation split. NOT re-run here. Open obligation: re-run under the official supervised CV split; never compare against zero-shot rows.
vs_066123dd29a9c5b4 · manual_curation
Typed links: 0
outgoing
No outgoing links.
incoming
No incoming links.
Review, event, and evaluation records: 4
events
vev_4869af225af70848 · finding.asserted · Manual finding added to frontier state
reviewer:will-blair · 2026-06-10
vev_a73023eb43fa7387 · finding.noted · HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.
reviewer:will-blair · 2026-06-10
reviewable changes
vpr_adea2a2f9e4ba533 · finding.add · Manual finding added to frontier state
applied · reviewer:will-blair · 2026-06-10
vpr_edef714318aa82be · finding.note · HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.
applied · agent:hardening-2026-06-10 · 2026-06-10
evaluations
No evaluation record targets this finding id.
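The hardening note recorded for this finding states that attested label provenance caps the record below 'verified' until a deterministic rederivation exists. A minimal sketch of that capping rule, assuming a simple ordered ladder, is given here. Only the `attested` provenance value and the `verified` level appear in the record. The other level names (`asserted`, `reviewed`), the `rederived` parameter, and the function `cap_trust` are hypothetical.

```python
# Ordered from least to most trusted. Only "verified" is named in the
# record; "asserted" and "reviewed" are hypothetical placeholder levels.
LADDER = ["asserted", "reviewed", "verified"]


def cap_trust(requested: str, label_provenance: str, rederived: bool) -> str:
    """Return the highest level the record may hold.

    Assumed rule: with attested label provenance and no deterministic
    rederivation, the record is held one step below "verified".
    """
    if requested not in LADDER:
        raise ValueError(f"unknown trust level: {requested!r}")
    capped = label_provenance == "attested" and not rederived
    if capped and requested == "verified":
        return LADDER[LADDER.index("verified") - 1]
    return requested


print(cap_trust("verified", "attested", rederived=False))
print(cap_trust("verified", "attested", rederived=True))
```

Levels below the cap pass through unchanged, so the rule only bites when a record would otherwise be promoted to the top of the ladder.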