2025-11-12 | Neo et al. 2024 cont.
Goal: Reproducing Neo et al. 2024
Summary: Solidify/document understanding in Interpreting Context Look Ups Notes
Work sessions
| In | Out |
|---|---|
| 20:15 | 20:50 |
| 21:50 | 23:55 |
Neo et al. 2024
- Work through derivations of Next-token neurons
- In-progress documentation of Individual Head Attribution
TODOs for Reproduction
1. Finish notes section on Head Attribution
2. Load in the GPT-2 model and find the positions of Next-token neurons
   a. Verify that neurons in the final layers are more correlated with token prediction
   b. Can attempt to use GPT to explain the pattern
3. Find Next-token neurons in the final layers
4. Attribute heads for each Next-token neuron found in (3)
5. Use GPT to provide explanations for contexts
6. Ablations for confirmation
7. Additional note: I should also fix the writeup section on the (4) steps to success
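
Rough sketch for the "load GPT-2 and find Next-token neurons" TODOs. This assumes a neuron is scored by the dot product between its MLP output weights and each token's unembedding vector (my working reading of the method, not yet checked against the paper), and it skips the final layer norm for simplicity. The function names `congruence_scores`, `top_next_token_neurons`, and `load_gpt2_layer_weights` are my own placeholders.

```python
import numpy as np


def congruence_scores(w_out, w_u):
    """Score every (neuron, token) pair.

    w_out: (n_neurons, d_model) MLP output weights for one layer.
    w_u:   (d_model, vocab) unembedding matrix.
    Returns an (n_neurons, vocab) array of dot products.
    """
    return w_out @ w_u


def top_next_token_neurons(w_out, w_u, k=5):
    """Return the k neurons with the highest single-token score.

    Each entry is (neuron_index, best_token_id, score).
    Assumption: a high max score marks a candidate next-token neuron.
    """
    scores = congruence_scores(w_out, w_u)
    best_token = scores.argmax(axis=1)   # most-promoted token per neuron
    best_score = scores.max(axis=1)      # that token's score
    order = np.argsort(-best_score)[:k]  # neurons sorted by descending score
    return [(int(n), int(best_token[n]), float(best_score[n])) for n in order]


def load_gpt2_layer_weights(layer):
    """Pull one layer's MLP output weights and the unembedding from GPT-2 small.

    Requires the Hugging Face `transformers` package (imported lazily).
    """
    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2")
    # c_proj is a Conv1D whose weight is stored as (d_mlp, d_model)
    w_out = model.transformer.h[layer].mlp.c_proj.weight.detach().numpy()
    # lm_head.weight is (vocab, d_model); transpose to (d_model, vocab)
    w_u = model.lm_head.weight.detach().numpy().T
    return w_out, w_u
```

To use on the real model: `w_out, w_u = load_gpt2_layer_weights(11)` then `top_next_token_neurons(w_out, w_u, k=20)`. Repeating this per layer and comparing the max scores would be one way to check item 2a.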