2026-03-22 | Transformer From Scratch
Goal: Transformer from scratch: Implement Toy IOI Model
Summary: Finished almost all of the transformer architecture
Work sessions
| In | Out |
|---|---|
| 11:05 | 11:30 |
| 12:40 | 13:40 |
| 20:15 | 20:50 |
Goals
AutoInterp
- Implement Toy IOI Transformer from scratch
- Use Claude to verify that my PyTorch Transformer implementation is correct
- Create Dataset of IOI Sentences
- Train Model and verify model generalization
- Understand how layer norms can be folded into the surrounding parts of a model
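
A minimal sketch of the LayerNorm folding idea from the last goal above, written with plain Python lists so it runs without dependencies. It assumes a LayerNorm feeding directly into a linear layer: the scale `gamma` can be absorbed into the columns of the weight matrix and the shift `beta` into the bias, leaving only the parameter-free normalization. The helper names (`normalize`, `fold_layer_norm`) and the toy values are illustrative assumptions, not taken from my implementation.

```python
import math

def normalize(x, eps=1e-5):
    """Center x and scale to unit variance (LayerNorm without gamma/beta)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)  # biased variance, as in LayerNorm
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def layer_norm(x, gamma, beta, eps=1e-5):
    """Full LayerNorm: normalize, then apply elementwise scale and shift."""
    return [g * v + b for v, g, b in zip(normalize(x, eps), gamma, beta)]

def linear(x, W, b):
    """Linear layer with W of shape (d_out, d_in) stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]

def fold_layer_norm(W, b, gamma, beta):
    """Absorb gamma into the columns of W and beta into the bias."""
    W_folded = [[w * g for w, g in zip(row, gamma)] for row in W]
    b_folded = [bias + sum(w * be for w, be in zip(row, beta))
                for row, bias in zip(W, b)]
    return W_folded, b_folded

# Toy values (hypothetical, chosen only for illustration).
x = [0.5, -1.0, 2.0, 0.25]
gamma = [1.5, 0.5, -2.0, 1.0]
beta = [0.1, -0.2, 0.3, 0.0]
W = [[0.2, -0.4, 0.1, 0.7], [1.0, 0.3, -0.5, 0.2]]
b = [0.05, -0.1]

# Original path: LayerNorm with learned parameters, then the linear layer.
original = linear(layer_norm(x, gamma, beta), W, b)

# Folded path: parameter-free normalization, then the folded linear layer.
W_folded, b_folded = fold_layer_norm(W, b, gamma, beta)
folded = linear(normalize(x), W_folded, b_folded)

max_diff = max(abs(p - q) for p, q in zip(original, folded))
print(max_diff)
```

The two paths should agree up to floating-point error. The sketch only covers the scale and shift; the mean-centering step is left in `normalize` rather than folded into upstream weights.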
Linguistics
- Hodge Laplacian: understand how to derive it and calculate it by hand
Progress Reflection
- Finished almost all of the transformer architecture, including understanding LayerNorm and how it can be folded into the surrounding parameters
- No progress on finishing model training or the IOI task
- No progress on the Hodge Laplacian
On the bright side, I think my jet lag is putting me in a good position to sleep at ~9-10pm and wake up at ~5am, which will be nice for having a work block in the morning!