Vela

AI-for-science benchmark state

constellation seal · derived from vfr_efc649fd772a1ff1
id: vfr_efc649fd772a1ff1
license: CC-BY-4.0
findings: 12
accepted core: 12
contested: 0
links: 0
sources: 1
evidence: 12
avg conf: 0.30

used by 0 · replayed by 1 producer · second seat open

e24/24 · finding.noted · reviewer:will-blair · 2026-06-10 · 6c12→d02f

Reviewable change

verified: A frozen deterministic verifier re-checked the claim and passed.
accepted

FAITHFULNESS HAZARD (miniF2F). A reported 'solve' is only as good as the match between the autoformalized statement and the intended problem; the miniF2F Revisited effort found statements that were mis-stated or trivially true. VERIFICATION STATE: faithfulness of the FORMAL statement to the INFORMAL problem is the under-checked axis. Open obligation: every banked miniF2F solve needs a statement-faithfulness attestation (vela attest --scope formalism-fidelity).
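
The hazard can be illustrated with a small Lean 4 sketch. Both statements below are hypothetical and are not taken from miniF2F; the sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available. The first theorem matches its informal problem. The second carries a mistyped extra hypothesis that contradicts the first, so the statement is vacuously true and a prover can "solve" it even though the stated conclusion is wrong for the intended problem.

```lean
-- Hypothetical illustration; these are not actual miniF2F statements.
-- Informal problem: "If x is a natural number with x + 2 = 5, show x = 3."

-- Faithful formalization: hypothesis and goal match the informal problem.
theorem faithful (x : Nat) (h : x + 2 = 5) : x = 3 := by
  omega

-- Unfaithful formalization: the extra hypothesis h' contradicts h,
-- so the statement holds vacuously and even a wrong goal is provable.
theorem vacuous (x : Nat) (h : x + 2 = 5) (h' : x > 5) : x = 4 := by
  omega
```

A proof checker accepts both theorems, which is why a passing proof alone says nothing about whether the formal statement is faithful to the informal one.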

id: vpr_965bbdde5ff53044
frontier: AI-for-science benchmark state
kind: finding.add
created: 2026-06-10
findings: +1
state: null → 7241f04d

accept gate

2 of 4 on record
signature: reviewer:will-blair · key 4892f938
chain: null → 7241f04d
witness: no verifier attachment on record for this target
grade: in state · unreviewed

timeline

  1. 2026-06-10 · propose · proposed · finding.add · reviewer:will-blair · vpr_965bbdde5ff53044 · Manual finding added to frontier state
  2. 2026-06-10 · accept · finding.asserted · reviewer:will-blair · null → 7241f04d · vev_31b40ef5e25c88b6 · Manual finding added to frontier state

proposed

reason: Manual finding added to frontier state
finding type: computational
proposed confidence: 0.30
confidence basis: operator-supplied frontier prior; review required

provenance

proposed by: reviewer:will-blair
actor type: human
created at: 2026-06-10
target type: finding

vf_dce7a34adf2878f2

Diff

Read-only frontier; diff not recomputed.

Review chain

  1. 01 request

    Change request

    AI-for-science benchmark state receives a reviewable source, finding, caveat, replication, evaluation, or proof-affecting edit.
  2. 02 packet

    Diff packet

    The packet names affected record objects, evidence, rationale, reviewer-facing fields, and expected proof impact.
  3. 03 checks

    Check output

    Schema, provenance, benchmark, contradiction, and proof checks decide whether the request is ready to read.
  4. 04 review

    Reviewer decision

    A steward accepts, rejects, caveats, revises, or retracts the request under an inspectable identity.
  5. 05 accepted

    Accepted event

    Only the accepted event mutates frontier state. Atlases, constellations, and search update from that record state.
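
The rule that only the accepted event mutates frontier state can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Vela's implementation: the names `Stage`, `ChangeRequest`, and `Frontier` are hypothetical, and the sketch ignores the reject, caveat, revise, and retract outcomes of the review step.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """The five review-chain stages, in order (hypothetical names)."""
    REQUEST = 1
    PACKET = 2
    CHECKS = 3
    REVIEW = 4
    ACCEPTED = 5


@dataclass
class ChangeRequest:
    finding: str
    stage: Stage = Stage.REQUEST

    def advance(self) -> None:
        """Move to the next stage; ACCEPTED is terminal."""
        if self.stage is not Stage.ACCEPTED:
            self.stage = Stage(self.stage.value + 1)


@dataclass
class Frontier:
    findings: list[str] = field(default_factory=list)

    def apply(self, request: ChangeRequest) -> bool:
        """Mutate state only for a request that reached ACCEPTED."""
        if request.stage is not Stage.ACCEPTED:
            return False
        self.findings.append(request.finding)
        return True
```

In this sketch, calling `apply` on a request that is still in the request, packet, checks, or review stage returns `False` and leaves `findings` unchanged; derived views would be rebuilt from `findings` only after an accepted request is applied.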

finding.noted · reviewer:will-blair · 1 day

renders the record as of vev_d199cb2e · 1,338 events · hub
