2026-01-29 | Recursion Reading Group
Goal: Kick off Neural Network/Safety Group
Summary: Introduce Emergent Misalignment Papers | Scaffold MNIST NN Project
Work sessions
| In | Out |
|---|---|
| 19:30 | 21:00 |
Relevant Papers
- Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
- Natural Emergent Misalignment from Reward Hacking in Production RL
- Reward Hacking Boat
- To end on a positive note: Research Using Interpretability to Identify a Novel Class of Alzheimer's Biomarkers