
2026-01-20 | LISA Coworking - Circuits

Goal: LISA Coworking - Circuits Prerequisites | Plan and start project | DQN Lecture

Summary: Started reading A Mathematical Framework for Transformer Circuits but decided to work backwards from ACDC instead; started watching the video overview from Nanda + Conmy

Work sessions

In Out
09:30 22:00

Plan for Today

  1. Read A Mathematical Framework for Transformer Circuits

  2. Read Automated Circuit Interpretation via Probe Prompting https://www.lesswrong.com/posts/zQqGhKPqaCBZZDCge/automated-circuit-interpretation-via-probe-prompting

2.5. IOI: implement it (following Neel Nanda's video or ARENA) and read the paper

  3. Read more Anthropic circuit-related papers

  4. Read circuit tracing automation-related papers

  5. Develop a tractable plan for a minimal project and writeup by the EOW (Friday, January 23rd)

  6. Thinking now: I have about 18 hours over the next 3 days, so perhaps I can do a Neel Nanda-style MATS application project on this!

Meetings

  1. Prep for meeting with Evan Feder

  2. Prep for meeting with J Rosser

  3. Prep to learn about Singular Learning Theory from Ilya Shirokov

Checklist: Concepts to know by EOD

  • Negative writes (either through MLP or MHA)?

Interesting Linear Subspace Write Strength for Layer 0 Head 0 for Qwen 2.5

[Figure: Linear Subspace]
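
Note to self: a rough sketch of one way "write strength into a linear subspace" and "negative writes" could be measured. Everything here is an assumption for illustration, not necessarily how the plot above was made: the function names `subspace_write_strength` and `signed_write` are hypothetical, the row-vector convention (`out = x @ W_OV`) is assumed, and the matrices are toy values, not Qwen 2.5 weights.

```python
import numpy as np

def subspace_write_strength(W_OV, basis):
    """Fraction of a head's output energy that lands inside span(basis).

    W_OV:  (d_model, d_model) map from residual input to the head's write,
           using the assumed row-vector convention out = x @ W_OV.
    basis: (d_model, k) columns spanning the subspace (assumed full column rank).
    """
    Q, _ = np.linalg.qr(basis)           # orthonormalize the basis, shape (d_model, k)
    coords = W_OV @ Q                    # subspace coordinates of every possible write
    return float(np.sum(coords**2) / np.sum(W_OV**2))

def signed_write(x, W_OV, direction):
    """Signed size of the write along `direction` for input x.

    A negative value means the component writes against the direction,
    which is one reading of a "negative write".
    """
    unit = direction / np.linalg.norm(direction)
    return float((x @ W_OV) @ unit)

# Toy example: a 4-dimensional "residual stream".
W_toy = np.diag([1.0, 2.0, 0.0, 0.0])    # writes only into the first two coordinates
in_plane = np.eye(4)[:, :2]              # subspace spanned by e0, e1
off_plane = np.eye(4)[:, 2:3]            # subspace spanned by e2

strength_in = subspace_write_strength(W_toy, in_plane)
strength_off = subspace_write_strength(W_toy, off_plane)

e0 = np.eye(4)[0]
negative = signed_write(e0, -np.eye(4), e0)  # a head that flips e0 writes against it
```

In this toy case all of the write energy sits in the e0/e1 plane and none in e2, and the sign-flipping map gives a negative signed write along e0. For a real head, `W_OV` would be built from the model's value and output projection weights, which is left out here.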