Vela

AI-for-science benchmark state

constellation seal · derived from vfr_efc649fd772a1ff1
id: vfr_efc649fd772a1ff1
license: CC-BY-4.0
findings: 12
accepted core: 12
contested: 0
links: 0
sources: 1
evidence: 12
avg conf: 0.30

used by 0 · replayed by 1 producer · second seat open

e24/24 · finding.noted · reviewer:will-blair · 2026-06-10 · 6c12→d02f

Finding bundle

BENCHMARK CLAIM (ProteinGym) — ProteinNPT (non-parametric transformer, supervised track) REPORTS gains by attending across labelled neighbours. VERIFICATION STATE: author-reported; SUPERVISED — not comparable to zero-shot numbers; depends on the cross-validation split. NOT re-run here. Open obligation: re-run under the official supervised CV split; never compare against zero-shot rows.
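
The rule stated in the claim (supervised results depend on the cross-validation split and must never be compared against zero-shot rows) can be enforced mechanically. The following is a minimal sketch, not Vela's or ProteinGym's actual tooling: `BenchmarkRow`, its field names, and the `compare` helper are all hypothetical and exist only to illustrate the guard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRow:
    """One benchmark result row; every field name here is hypothetical."""
    model: str
    track: str      # "supervised" or "zero_shot"
    cv_split: str   # name of the cross-validation split; "" for zero-shot rows
    score: float


def comparable(a: BenchmarkRow, b: BenchmarkRow) -> bool:
    """Rows are comparable only within one track and one CV split."""
    return a.track == b.track and a.cv_split == b.cv_split


def compare(a: BenchmarkRow, b: BenchmarkRow) -> float:
    """Return a.score - b.score, refusing cross-track or cross-split pairs."""
    if not comparable(a, b):
        raise ValueError(
            f"refusing to compare {a.model} ({a.track}/{a.cv_split}) "
            f"with {b.model} ({b.track}/{b.cv_split})"
        )
    return a.score - b.score
```

Raising an error instead of returning a sentinel value keeps a mixed-track comparison from silently entering a results table.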

id: vf_41030d44f59eae22
frontier: AI-for-science benchmark state
version: 1
confidence: 0.30

no incoming links yet

file

/frontier/benchmark-state#e=16 · scrub position · after_hash dd9be8a780d7ab98…
vf_41030d44f59eae22 · benchmark-state · https://vela-site-next.fly.dev/frontier/benchmark-state#e=16
raw json · vf_41030d44f59eae22 (2.6 KB)
{
 "annotations": [
  {
   "author": "reviewer:will-blair",
   "id": "ann_ab3ecb7585481522",
   "text": "HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.",
   "timestamp": "2026-06-10T23:01:45.061196+00:00"
  }
 ],
 "assertion": {
  "direction": null,
  "entities": [],
  "relation": null,
  "text": "BENCHMARK CLAIM (ProteinGym) — ProteinNPT (non-parametric transformer, supervised track) REPORTS gains by attending across labelled neighbours. VERIFICATION STATE: author-reported; SUPERVISED — not comparable to zero-shot numbers; depends on the cross-validation split. NOT re-run here. Open obligation: re-run under the official supervised CV split; never compare against zero-shot rows.",
  "type": "computational"
 },
 "conditions": {
  "age_group": null,
  "cell_type": null,
  "clinical_trial": false,
  "concentration_range": null,
  "duration": null,
  "human_data": false,
  "in_vitro": false,
  "in_vivo": false,
  "species_unverified": [],
  "species_verified": [],
  "text": "Manually added finding; requires evidence review before scientific use."
 },
 "confidence": {
  "basis": "operator-supplied frontier prior; review required",
  "extraction_confidence": 1,
  "kind": "frontier_epistemic",
  "method": "expert_judgment",
  "score": 0.3
 },
 "created": "2026-06-10T06:50:55.944113+00:00",
 "evidence": {
  "effect_size": null,
  "evidence_spans": [],
  "method": "manual state transition",
  "model_system": "",
  "p_value": null,
  "replicated": false,
  "replication_count": null,
  "sample_size": null,
  "species": null,
  "type": "computational"
 },
 "flags": {
  "contested": false,
  "declining": false,
  "gap": true,
  "gravity_well": false,
  "negative_space": false,
  "retracted": false
 },
 "id": "vf_41030d44f59eae22",
 "links": [],
 "previous_version": null,
 "provenance": {
  "authors": [
   {
    "name": "reviewer:will-blair",
    "orcid": null
   }
  ],
  "citation_count": null,
  "doi": null,
  "extraction": {
   "extracted_at": "2026-06-10T06:50:55.944101+00:00",
   "extractor_version": "vela/0.691.0",
   "method": "manual_curation",
   "model": null,
   "model_version": null
  },
  "journal": null,
  "openalex_id": null,
  "pmc": null,
  "pmid": null,
  "review": {
   "corrections": [],
   "reviewed": false,
   "reviewed_at": null,
   "reviewer": null
  },
  "source_type": "expert_assertion",
  "title": "manual finding",
  "year": null
 },
 "updated": null,
 "version": 1
}
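
As a rough illustration of how a consumer might read this record before relying on it, the sketch below collects the fields that signal outstanding work. The field names are taken from the raw JSON above; the function `open_obligations` and the choice of which fields count as blockers are assumptions made for this example, not a documented Vela rule.

```python
import json


def open_obligations(record: dict) -> list[str]:
    """List reasons the record is not ready for scientific use.

    Field names come from the raw JSON above; the rules are illustrative.
    """
    reasons = []
    if not record["provenance"]["review"]["reviewed"]:
        reasons.append("unreviewed")
    if record["flags"]["gap"]:
        reasons.append("gap flag set")
    if not record["evidence"]["replicated"]:
        reasons.append("not replicated")
    if not record["evidence"]["evidence_spans"]:
        reasons.append("no evidence spans")
    return reasons


# A trimmed copy of the record above, keeping only the fields the check reads.
raw = """
{
  "id": "vf_41030d44f59eae22",
  "evidence": {"evidence_spans": [], "replicated": false},
  "flags": {"gap": true},
  "provenance": {"review": {"reviewed": false}}
}
"""
record = json.loads(raw)
print(record["id"], open_obligations(record))
```

For this record all four checks trigger, which is consistent with the conditions text: the finding requires evidence review before scientific use.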

Unsealed — 0 attachment(s) on record, awaiting independent verification.

0 attachments · 0 distinct checker actors · 0 methods

blame · custody trail

produced by: reviewer:will-blair · finding.asserted · 2026-06-10 · vev_4869af225af70848
checked by: no verifier attachment on record
accepted by: no accept signed

history · 2 events

record state

frontier-owned

Review status

claimed (no verifier run, no signed judgment) · unreviewed

finding statement

finding type

computational

No entity list is declared.

evidence

source-bound

1 atom

computational · manual state transition

proof impact

packet context

2 events

2 reviewable changes and 0 evaluation records are attached to this finding id.

evidence

method

manual state transition

evidence type

computational

conditions

species_unverified: (none)
species_verified: (none)
text: Manually added finding; requires evidence review before scientific use.

provenance

source title

manual finding

authors

reviewer:will-blair

Source records

1

Evidence atoms

1
  • vea_3818a2b502e64c42 · computational · unknown

    BENCHMARK CLAIM (ProteinGym) — ProteinNPT (non-parametric transformer, supervised track) REPORTS gains by attending across labelled neighbours. VERIFICATION STATE: author-reported; SUPERVISED — not comparable to zero-shot numbers; depends on the cross-validation split. NOT re-run here. Open obligation: re-run under the official supervised CV split; never compare against zero-shot rows.

    vs_066123dd29a9c5b4 · manual_curation

Typed links

0

outgoing

No outgoing links.

incoming

No incoming links.

Review, event, and evaluation records

4

events

  • vev_4869af225af70848 · finding.asserted

    Manual finding added to frontier state

    reviewer:will-blair · 2026-06-10

  • vev_a73023eb43fa7387 · finding.noted

    HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.

    reviewer:will-blair · 2026-06-10
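
The hardening note describes a cap: attested label provenance keeps a record below 'verified' until a deterministic rederivation exists. A minimal sketch of that cap follows. The source names only the 'attested' provenance value and the 'verified' level; the `LADDER` ordering, the reuse of "claimed" and "attested" as level names, and the function `capped_level` are assumptions made purely for illustration.

```python
# Hypothetical ordering, lowest to highest. The source names only "attested"
# provenance and the "verified" level; the rest is a placeholder.
LADDER = ["claimed", "attested", "verified"]


def capped_level(requested: str, label_provenance: str,
                 has_deterministic_rederivation: bool) -> str:
    """Cap `requested` below 'verified' when labels are only attested."""
    verified_rank = LADDER.index("verified")
    if (label_provenance == "attested"
            and not has_deterministic_rederivation
            and LADDER.index(requested) >= verified_rank):
        # Fall back to the level directly below 'verified'.
        return LADDER[verified_rank - 1]
    return requested
```

Under these assumptions, a request to mark this record 'verified' would be held at the level just below until a rederivation is attached, while lower requested levels pass through unchanged.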

reviewable changes

  • vpr_adea2a2f9e4ba533 · finding.add

    Manual finding added to frontier state

    applied · reviewer:will-blair · 2026-06-10

  • vpr_edef714318aa82be · finding.note

    HARDENING (benchmark-state): label_provenance=attested (records-not-reruns; ground truth is an answer key, not a frozen-verifier rederivation), valid_as_of=2026-06-10, model_cutoff=unknown. Under the trust ladder, attested label provenance caps this record below 'verified' until a deterministic rederivation exists.

    applied · agent:hardening-2026-06-10 (machine actor, no signing key) · 2026-06-10

evaluations

No evaluation record targets this finding id.

finding.noted · reviewer:will-blair · 2 days

renders the record as of vev_d199cb2e · 1,338 events · hub
