2026-01-20 | LISA Coworking - Circuits
Goal: LISA Coworking - Circuits Prerequisites | Plan and start project | DQN Lecture
Summary: Started reading A Mathematical Framework for Transformer Circuits, but realized I should work backwards from ACDC instead; started watching the video overview from Nanda + Conmy
Work sessions
| In | Out |
|---|---|
| 09:30 | 22:00 |
Plan for Today
- Read A Mathematical Framework for Transformer Circuits
- Read Automated Circuit Interpretation via Probe Prompting: https://www.lesswrong.com/posts/zQqGhKPqaCBZZDCge/automated-circuit-interpretation-via-probe-prompting
2.5. IOI: either implement (Neel Nanda has a video) / ARENA, and read the paper
- Read more Anthropic Circuit related papers
- Read Circuit Tracing Automation related papers
- Develop a tractable plan for a minimal project and writeup by the EOW (Friday, January 23rd)
- Thinking now: I have about 18 hours over the next 3 days, so perhaps I can do a Neel Nanda-style MATS application project on this!
Meetings
- Prep for meeting with Evan Feder
- Prep for meeting with J Rosser
- Prep to learn about Singular Learning Theory from Ilya Shirokov
Checklist Concepts to know by EOD
- Negative writes (either through MLP or MHA)?
Interesting Linear Subspace Write Strength for Layer 0 Head 0 for Qwen 2.5
