2025-10-30 | Apps

Goal: Lead first AI Safety Session!

Summary: First time formally introducing AI Safety to the Recursion Reading Group; briefly skimmed Anthropic's Signs of introspection in large language models

Work sessions

In Out
20:30 21:45

First AI Safety Meeting

  1. It was nice to see people from different disciplines come together (CS, Cog Sci, Linguistics, Business) united by a shared interest in learning how to build NNs from scratch and captivated by AI Safety (we skimmed Detecting and reducing scheming in AI models)

  2. We followed the beginning of Karpathy's Zero to Hero Micrograd, but this didn't seem to be the best way to engage folks. The Reading Group was so successful because we talked face-to-face and the main point of meeting was to discuss complex ideas, not to watch a static pre-recorded video.

  3. Next week, we will try prepping a visualization-based, first-principles curriculum for starting work on NNs. Perhaps this previous writeup could be a good start: Interpreting LLM Arithmetic Deep Dive