Vela

AI-for-science benchmark state

constellation seal · derived from vfr_efc649fd772a1ff1
id
vfr_efc649fd772a1ff1
license
CC-BY-4.0
findings
12
accepted core
12
contested
0
links
0
sources
1
evidence
12
avg conf
0.30

used by 0 · replayed by 1 producer · second seat open

e24/24 · finding.noted · reviewer:will-blair · 2026-06-10 · 6c12→d02f

Reviewable change

verified: A frozen deterministic verifier re-checked the claim and passed.
accepted

BENCHMARK CLAIM (MiniF2F) — HyperTree Proof Search (HTPS, Lample et al.) REPORTS a miniF2F pass rate via learned best-first proof search. VERIFICATION STATE: author-reported; search budget and version-specific. NOT re-run here. Open obligation: re-run at the stated budget on a pinned split.

id
vpr_8ebb01be4aedad3b
frontier
AI-for-science benchmark state
kind
finding.add
created
2026-06-10
findings
+1
state
null → 42db6392

accept gate

2 of 4 on record
signature
reviewer:will-blair · key 4892f938
chain
null → 42db6392
witness
no verifier attachment on record for this target
grade
in state · unreviewed
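
The "2 of 4 on record" tally can be read as a count over the four gate items listed above. The sketch below is an illustration under that assumption only: the `gate` dictionary, the `gate_summary` helper, and the choice of which items count as "on record" are hypothetical and are not taken from Vela itself.

```python
# Hypothetical reading of the accept gate: each item is either on record or not.
# Which items count as "on record" is an assumption inferred from the page text.
gate = {
    "signature": True,   # reviewer:will-blair · key 4892f938
    "chain": True,       # null → 42db6392
    "witness": False,    # no verifier attachment on record for this target
    "grade": False,      # in state · unreviewed
}


def gate_summary(items: dict[str, bool]) -> str:
    """Summarize how many gate items are on record."""
    on_record = sum(1 for present in items.values() if present)
    return f"{on_record} of {len(items)} on record"


summary = gate_summary(gate)
print(summary)
```

Under this reading, attaching a verifier witness and grading the finding would be the two remaining items.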

timeline

  1. 2026-06-10 · propose · proposed · finding.add · reviewer:will-blair · vpr_8ebb01be4aedad3b · Manual finding added to frontier state
  2. 2026-06-10 · accept · finding.asserted · reviewer:will-blair · null → 42db6392 · vev_03b2b7f5e7e0be96 · Manual finding added to frontier state

proposed

reason

Manual finding added to frontier state

finding type

computational

proposed confidence

0.30

confidence basis

operator-supplied frontier prior; review required

provenance

proposed by

reviewer:will-blair

actor type

human

created at

2026-06-10

target type

finding

BENCHMARK CLAIM (MiniF2F) — HyperTree Proof Search (HTPS, Lample et al.) REPORTS a miniF2F pass rate via learned best-first proof search. VERIFICATION STATE: author-reported; search budget and version-specific. NOT re-run here. Open obligation: re-run at the stated budget on a pinned split.

vf_9a454a597ddee070

Diff

Read-only frontier; diff not recomputed.

Review chain

  1. 01 request

    Change request

    AI-for-science benchmark state receives a reviewable source, finding, caveat, replication, evaluation, or proof-affecting edit.

  2. 02 packet

    Diff packet

    The packet names affected record objects, evidence, rationale, reviewer-facing fields, and expected proof impact.

  3. 03 checks

    Check output

    Schema, provenance, benchmark, contradiction, and proof checks decide whether the request is ready to read.

  4. 04 review

    Reviewer decision

    A steward accepts, rejects, caveats, revises, or retracts the request under an inspectable identity.

  5. 05 accepted

    Accepted event

    Only the accepted event mutates frontier state. Atlases, constellations, and search update from that record state.
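
The rule that only the accepted event mutates frontier state can be sketched as a small event log. This is a minimal illustration, not Vela's implementation: the `Frontier` class, the `apply` method, the `STAGES` tuple, and the event field names are all hypothetical.

```python
from dataclasses import dataclass, field

# Stage names follow the five steps of the review chain; the tuple is hypothetical.
STAGES = ("request", "packet", "checks", "review", "accepted")


@dataclass
class Frontier:
    """Toy frontier: an append-only event log plus the derived finding state."""

    findings: list[str] = field(default_factory=list)
    log: list[dict] = field(default_factory=list)

    def apply(self, event: dict) -> None:
        """Log every event, but change findings only on an accepted finding.add."""
        if event["stage"] not in STAGES:
            raise ValueError(f"unknown stage: {event['stage']!r}")
        self.log.append(event)
        if event["stage"] == "accepted" and event["kind"] == "finding.add":
            self.findings.append(event["finding_id"])


frontier = Frontier()
for stage in STAGES:
    frontier.apply(
        {"stage": stage, "kind": "finding.add", "finding_id": "vf_9a454a597ddee070"}
    )
    print(stage, len(frontier.findings))
```

In this sketch, the earlier stages leave `findings` untouched, so any derived view (an atlas, a constellation, search) could be rebuilt from the accepted events alone.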

finding.noted · reviewer:will-blair · 1 day

renders the record as of vev_d199cb2e · 1,338 events · hub
