2026-04-01 | MOLTs

Goal: Mixture of Linear Transforms (MOLTs)

Summary: MOLT transforms collapse to a single transform, even with sparsity penalty = 0

Work sessions

In Out
08:00 09:15
02:00 03:30
  1. Swept the sparsity coefficient over lambda = [0, 1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 4e-3, 1e-2]; every run ended with a single transform being used (collapse to L0 = 0, even when lambda = 0)

  2. The MOLTs paper uses two sparsity penalties (tanh and L0) and two activations (ReLU and JumpReLU)

  3. Experiment 1 used tanh/ReLU. Since the preliminary paper does not specify the pairing, we will try the other 3 combinations tomorrow as a sanity check (see collapse below)
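
To keep tomorrow's runs organized, the remaining grid can be enumerated up front. This is only a minimal sketch: the lambda values are copied from item 1, while the string labels, `remaining_configs`, and `active_transform_count` are hypothetical names used for illustration, not part of the actual training code. The counting helper assumes a transform counts as "used" when its gate value is above a small threshold.

```python
from itertools import product

# Sparsity coefficients copied from the sweep in item 1.
LAMBDAS = [0, 1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 4e-3, 1e-2]

# Hypothetical labels for the two penalty and two activation options.
PENALTIES = ["tanh", "l0"]
ACTIVATIONS = ["relu", "jumprelu"]

# Experiment 1 already covered this pairing.
ALREADY_RUN = ("tanh", "relu")


def remaining_configs():
    """Return (penalty, activation, lambda) triples not covered by experiment 1."""
    return [
        (penalty, activation, lam)
        for penalty, activation in product(PENALTIES, ACTIVATIONS)
        if (penalty, activation) != ALREADY_RUN
        for lam in LAMBDAS
    ]


def active_transform_count(gate_values, eps=0.0):
    """Count transforms whose gate value exceeds eps (an L0-style measure).

    Assumption: a transform is 'used' on an input when its gate is > eps.
    """
    return sum(1 for g in gate_values if g > eps)


configs = remaining_configs()
print(len(configs))  # 3 remaining combinations x 8 lambdas
```

Logging `active_transform_count` per batch for each configuration would make it easy to see at which training step the collapse sets in, and whether it depends on the penalty/activation pairing.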

Lambda 0 MOLT Training Collapse