| manual finding reviewer:will-blair vs_066123dd29a9c5b4 | paper | missing | missing | 1 |
| mechinterp causal sweep agent:replicator vs_2a6614cf8c092119 | synthetic_report | missing | missing | 9 |
| Quantifying LLM Attention-Head Stability: Implications for Circuit Universality (2026) reviewer:will-blair vs_3cd33009aa660811 | paper | missing | missing | 1 |
| Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition (2024) reviewer:will-blair vs_40006cec800a3ba9 | paper | missing | missing | 1 |
| Evaluating Sparse Autoencoders for Monosemantic Representation (2024); Sparse Autoencoders Find Highly Interpretable Features in Language Models (2023) reviewer:will-blair vs_47c534f38adcf038 | paper | missing | missing | 1 |
| Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence (2025) reviewer:will-blair vs_482cf7e9d0e55b5d | paper | missing | missing | 1 |
| Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework (2024); Transformer Circuit Faithfulness Metrics are not Robust (2024) reviewer:will-blair vs_4d88f4c63fd49a20 | paper | missing | missing | 1 |
| Open Problems in Mechanistic Interpretability (Jan 2025) reviewer:will-blair vs_51c2dec40002e687 | paper | missing | missing | 1 |
| Open Problems in Mechanistic Interpretability (Jan 2025); Quantifying LLM Attention-Head Stability (2026) reviewer:will-blair vs_55b823b7ab851047 | paper | missing | missing | 1 |
| Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework (2024) reviewer:will-blair vs_6f41cb13fb21f61c | paper | missing | missing | 1 |
| mechinterp causal sweep wave2 agent:replicator vs_77dbc3bbe15ee463 | synthetic_report | missing | missing | 2 |
| Circuit-Aware Reward Training: A Mechanistic Framework for Longtail Robustness in RLHF (2025) reviewer:will-blair vs_7e9999ec54d14123 | paper | missing | missing | 1 |
| Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework (2024); Towards Automated Circuit Discovery for Mechanistic Interpretability (2024) reviewer:will-blair vs_8c1dee0513603eb3 | paper | missing | missing | 1 |
| Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence (2025); Induction Heads in Transformers (emergentmind.com) reviewer:will-blair vs_992b9389fe5bde7a | paper | missing | missing | 1 |
| Beyond Induction Heads (2025); Induction Heads & In-Context Learning (emergentmind.com) reviewer:will-blair vs_a1bf940c53a584fa | paper | missing | missing | 1 |
| A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models (2024) reviewer:will-blair vs_a507d1ffeb01ee4b | paper | missing | missing | 1 |
| Weight-sparse transformers have interpretable circuits (2024) reviewer:will-blair vs_ad78c3c984009cfb | paper | missing | missing | 1 |
| Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence (2025); Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures (2024) reviewer:will-blair vs_b8a04f98c5991ecb | paper | missing | missing | 1 |
| Quantifying LLM Attention-Head Stability (2026) reviewer:will-blair vs_bd3d16868b182c1d | paper | missing | missing | 2 |
| Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures (2024) reviewer:will-blair vs_dbe73234e241a413 | paper | missing | missing | 1 |
| Evaluating Sparse Autoencoders for Monosemantic Representation (2024) reviewer:will-blair vs_e71ac2fca8d56384 | paper | missing | missing | 2 |
| mechinterp circuit harness agent:replicator vs_f1c2a9e52cbd0c39 | synthetic_report | missing | missing | 10 |