Doing The Thing

2026-04-28 | MOLT Steering

Goal: Steering with MOLT features and a top-activating-prompts dashboard; see also the PR for MOLTs to master and the WandB runs

Summary: Feedback from Georg on using a translation task to verify necessary and sufficient conditions

Work sessions

In	Out
00:30	01:20
02:00	06:00

Feedback on Scaling

Train MOLTs on all layers of GPT-2 and Gemma-3-4B-IT

Necessary and Sufficient

Necessary --> If we ablate this transform, does the expected behavior go away?
Sufficient --> If we activate this transform on its own (e.g., by steering), does the expected behavior appear?
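
A toy sketch of both checks as interventions, assuming a MOLT layer can be idealized as a gated sum of linear transforms, y = sum_i a_i(x) * (W_i @ x). The function `molt_forward`, its `ablate` and `force` arguments, the hand-picked weights, and the `behavior` score are all hypothetical and are not taken from the MOLT PR. Sufficiency is read here in the interventional sense: switching the transform on where it was off should produce the behavior.

```python
import numpy as np

def molt_forward(x, gates, transforms, ablate=None, force=None):
    """Toy gated sum of linear transforms: y = sum_i a_i * (W_i @ x)."""
    force = force or {}
    y = np.zeros(transforms[0].shape[0])
    for i, (g, W) in enumerate(zip(gates, transforms)):
        a = max(0.0, float(g @ x))  # ReLU gate activation (assumed gate form)
        if i == ablate:
            a = 0.0                 # necessity check: zero this transform out
        if i in force:
            a = force[i]            # sufficiency check: clamp this transform on
        y += a * (W @ x)
    return y

e = np.eye(4)
b = e[3]                            # hypothetical "behavior" direction in the output
gates = [e[1], e[0]]                # transform 0 fires on x[1], transform 1 on x[0]
transforms = [np.outer(b, e[0]),    # transform 0 writes x[0] along b
              np.outer(e[2], e[0])] # transform 1 writes x[0] along an unrelated axis

def behavior(y):
    """Hypothetical behavior score: projection of the output onto b."""
    return float(y @ b)

x_on = np.array([1.0, 1.0, 0.0, 0.0])   # prompt where transform 0 is active
x_off = np.array([1.0, 0.0, 0.0, 0.0])  # prompt where transform 0 is inactive

base_on = behavior(molt_forward(x_on, gates, transforms))
ablated = behavior(molt_forward(x_on, gates, transforms, ablate=0))
base_off = behavior(molt_forward(x_off, gates, transforms))
steered = behavior(molt_forward(x_off, gates, transforms, force={0: 1.0}))

print("necessary: ", base_on, "->", ablated)   # 1.0 -> 0.0
print("sufficient:", base_off, "->", steered)  # 0.0 -> 1.0
```

In practice the `behavior` score would be replaced by a task metric, for example on the translation task mentioned in the summary.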

WandB Runs

Scaling MOLT Multiplier and Tokens:
MOLT Training with Varying Sparsity: sparsity settings of 1e-4 and 1.5e-4 give a balanced trade-off between MSE and L0
BF16 Mixed Precision Training
Rebase MOLT on Master Branch: Verify that runs after the rebase essentially match the pre-rebase runs
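
For the MSE vs. L0 comparison in the sparsity sweep, a minimal sketch of how the two quantities could be computed from a batch, assuming L0 means the average number of active transforms per token. The function `mse_and_l0`, its arguments, and the tiny arrays are hypothetical illustrations, not values or code from the WandB runs.

```python
import numpy as np

def mse_and_l0(target, recon, gate_acts, eps=0.0):
    """Return (mean squared error, mean count of active transforms per token)."""
    mse = float(np.mean((target - recon) ** 2))
    # A transform counts as active on a token when its gate exceeds eps.
    l0 = float(np.mean(np.sum(gate_acts > eps, axis=-1)))
    return mse, l0

# Made-up batch: 2 tokens, 2 output dimensions, 3 transforms.
target = np.array([[1.0, 0.0], [0.0, 1.0]])
recon = np.array([[1.0, 0.0], [0.0, 0.0]])
gate_acts = np.array([[0.5, 0.0, 0.0], [0.2, 0.3, 0.0]])

mse, l0 = mse_and_l0(target, recon, gate_acts)
print(mse, l0)  # 0.25 1.5
```

A higher sparsity setting would typically lower L0 at the cost of higher MSE, so logging both per run makes the trade-off visible.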