2026-04-02 | MOLTs + PCDs

Goal: Mixture of Linear Transforms (MOLTs) and PCD Repro

Summary: Gemma3-1B MOLT Transforms collapse while GPT-2 MOLTs do not

Work sessions

In	Out
08:30	09:15
14:30	16:00
18:30	20:00

Plan for Today

🟢 Understand how the Tanh vs. L1 and ReLU vs. JumpReLU choices affect MOLTs (and why they could lead to transform collapse)
🟢 Run a follow-up sweep over 3 other setups: Tanh + ReLU, L1 + ReLU, L1 + JumpReLU with \(\lambda = 0\) (no sparsity penalty for now)
[Stretch goal] Evaluate Jacobian faithfulness for a single-layer MOLT vs. the Gemma Scope2 Transcoder (skip and non-skip)

Results

The Gemma3-1B MOLT collapses to a single transform, but the GPT-2 MOLT does not --> will investigate tomorrow
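
One way to put a number on "collapses to a single transform" is to log per-transform usage during training. The sketch below is only an assumption about how this could be measured, not what the runs above used: `gates` is a hypothetical list of per-token gate activations (one value per transform), and both helper names are made up for illustration. It reports the fraction of tokens on which each transform fires and an effective transform count, exp(entropy) of the normalized gate mass, which is near 1 for a collapsed MOLT.

```python
import math


def transform_usage(gates):
    """Fraction of tokens on which each transform's gate is strictly positive.

    gates: list of per-token lists, one gate activation per transform
    (hypothetical layout, assumed for this sketch).
    """
    n_tokens = len(gates)
    n_transforms = len(gates[0])
    return [
        sum(1 for tok in gates if tok[k] > 0.0) / n_tokens
        for k in range(n_transforms)
    ]


def effective_transforms(gates):
    """exp(entropy) of the normalized total gate mass per transform.

    Roughly 1.0 when a single transform carries all the mass (collapse),
    and equal to the number of transforms when mass is spread evenly.
    """
    n_transforms = len(gates[0])
    mass = [sum(max(tok[k], 0.0) for tok in gates) for k in range(n_transforms)]
    total = sum(mass)
    if total == 0.0:
        return 0.0  # no transform ever fired
    probs = [m / total for m in mass if m > 0.0]
    return math.exp(-sum(p * math.log(p) for p in probs))
```

Tracking both numbers per checkpoint for the Gemma3-1B and GPT-2 runs would show whether collapse happens at initialization or develops during training.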

Things to Look Into

⏭️ Understand the vanilla JumpReLU backward pass → understand surrogate loss vs. straight-through estimator (STE)

a. Check whether JumpReLU should have a learnable threshold \(\theta\)

b. Timebox 30 min for reading the JumpReLU paper (using Claude to help explain it): https://arxiv.org/pdf/2407.14435

⏭️ Understand why Gemma3 transforms collapse regardless of the \(\lambda\) value

a. Note: GPT-2 seems to train MOLTs well. Why Gemma3 does not is an interesting theoretical problem, but we chose Gemma3 for experiment speed: Gemma Scope2 already provides Transcoders and SAEs to compare against. In theory, we could instead stay on GPT-2 and train SAEs and Transcoders from scratch.

⏭️ Increase the sparsity penalty to \(\lambda \in \{10^{-3}, 10^{-2}\}\) to see whether the learned JumpReLU threshold \(\theta\) becomes positive
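
For the JumpReLU backward-pass item above, here is a minimal scalar sketch of the forward pass and the STE pseudo-derivatives as I understand them from the linked paper. JumpReLU is \(z \cdot H(z - \theta)\), whose true gradient with respect to \(\theta\) is zero almost everywhere, so the paper substitutes a kernel-based pseudo-derivative inside a window of width \(\varepsilon\) around the threshold. The rectangle kernel and the function names here are assumptions made for this sketch, and \(\varepsilon\) is a bandwidth hyperparameter whose value is not taken from any of the runs above.

```python
def heaviside(x):
    """Heaviside step: 1 for x > 0, else 0."""
    return 1.0 if x > 0.0 else 0.0


def rect_kernel(x):
    """Rectangle kernel: 1 inside (-1/2, 1/2), else 0."""
    return 1.0 if abs(x) < 0.5 else 0.0


def jumprelu(z, theta):
    """Forward pass: pass z through unchanged if it exceeds theta, else output 0."""
    return z * heaviside(z - theta)


def jumprelu_grad_z(z, theta):
    """Ordinary derivative w.r.t. z (defined away from the jump at z == theta)."""
    return heaviside(z - theta)


def jumprelu_pseudo_grad_theta(z, theta, eps):
    """STE pseudo-derivative of JumpReLU w.r.t. theta.

    Non-zero only when z lies within eps/2 of the threshold.
    """
    return -(theta / eps) * rect_kernel((z - theta) / eps)


def heaviside_pseudo_grad_theta(z, theta, eps):
    """STE pseudo-derivative of H(z - theta) w.r.t. theta (used by an L0-style penalty)."""
    return -(1.0 / eps) * rect_kernel((z - theta) / eps)
```

Only pre-activations that fall inside the \(\varepsilon\)-window around \(\theta\) contribute any gradient to the threshold, so it is worth checking how many pre-activations land in that window when reading the \(\theta\) trajectories from the \(\lambda\) sweep.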