
2026-03-22 | Transformer From Scratch

Goal: Transformer from scratch: Implement Toy IOI Model

Summary: Finished Transformer Architecture

Work sessions

In Out
11:05 11:30
12:40 13:40
20:15 20:50

Goals

AutoInterp

  1. Implement Toy IOI Transformer from scratch
  2. Use Claude to verify that my PyTorch Transformer implementation is correct
  3. Create Dataset of IOI Sentences
  4. Train Model and verify model generalization
  5. Understand how layer norms can be folded into the surrounding parts of a model
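
As a note for goal 5, here is a minimal sketch of folding a LayerNorm's learned scale and shift into the Linear layer that follows it. It assumes the LayerNorm feeds directly into a single `nn.Linear`; the helper name `fold_layernorm` and the toy dimensions are made up for illustration and are not from my actual model. The idea: since LN(x) = gamma * x_hat + beta, the next layer computes W(gamma * x_hat + beta) + b = (W * gamma) x_hat + (W beta + b), so gamma and beta can be absorbed into W and b.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, d_out = 8, 4  # toy sizes, chosen only for this sketch

ln = nn.LayerNorm(d_model)
linear = nn.Linear(d_model, d_out)

# Randomize gamma and beta so the check below is not trivially true
# (they default to ones and zeros).
with torch.no_grad():
    ln.weight.normal_()
    ln.bias.normal_()


def fold_layernorm(ln: nn.LayerNorm, linear: nn.Linear) -> nn.Linear:
    """Return a Linear layer that absorbs ln's scale (gamma) and shift (beta)."""
    folded = nn.Linear(linear.in_features, linear.out_features)
    with torch.no_grad():
        # W has shape (out, in); multiplying by gamma (in,) scales each input column.
        folded.weight.copy_(linear.weight * ln.weight)
        # New bias is b + W @ beta.
        folded.bias.copy_(linear.bias + linear.weight @ ln.bias)
    return folded


# LayerNorm with no learned parameters: only centers and normalizes.
plain_ln = nn.LayerNorm(d_model, eps=ln.eps, elementwise_affine=False)
folded = fold_layernorm(ln, linear)

x = torch.randn(3, 5, d_model)
max_diff = (linear(ln(x)) - folded(plain_ln(x))).abs().max().item()
print(f"max abs difference: {max_diff:.2e}")
```

The two paths should agree up to floating-point error. The centering and normalizing steps stay in the model; only the learned affine part moves into the neighboring weights.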

Linguistics

  1. Hodge Laplacian: understand how to derive it and calculate it by hand
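
A small worked example to check hand calculations against, assuming the simplicial-complex definition of the 1-Hodge Laplacian, L1 = B1^T B1 + B2 B2^T, where B1 is the node-to-edge boundary matrix and B2 is the edge-to-triangle boundary matrix. The complex here (a single filled triangle on nodes 0, 1, 2) and the orientation convention (edges point from lower to higher node index) are choices made for this sketch.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Edge order: (0,1), (0,2), (1,2). Each column has -1 at the tail, +1 at the head.
B1 = [
    [-1, -1, 0],  # node 0
    [1, 0, -1],   # node 1
    [0, 1, 1],    # node 2
]

# Boundary of triangle (0,1,2) is (1,2) - (0,2) + (0,1), written in the edge order above.
B2 = [
    [1],   # edge (0,1)
    [-1],  # edge (0,2)
    [1],   # edge (1,2)
]

# Sanity check: the boundary of a boundary is zero.
boundary_check = matmul(B1, B2)

lower = matmul(transpose(B1), B1)  # B1^T B1, the "down" part
upper = matmul(B2, transpose(B2))  # B2 B2^T, the "up" part
hodge_1 = add(lower, upper)

for row in hodge_1:
    print(row)
```

For this complex the off-diagonal terms of the two parts cancel, so L1 comes out as 3 times the identity.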

Progress Reflection

  1. Finished almost all of the transformer architecture (understanding LayerNorm and how it can be folded into the surrounding parameters)

  2. No progress on training the model or on the IOI task

  3. No progress on Hodge Laplacian

On the bright side, I think my jet lag is putting me in a good position to sleep at ~9-10pm and wake up at ~5am, which will be nice for having a work block in the morning!