2025-11-04 | Dyck Probe Debugging

Goal: Interpretability Probe Debugging

Summary: Enforce reproducibility by fixing the torch seed (torch.seed() was causing run-to-run differences)

Work sessions

In	Out
00:23	01:00

Reproducibility

In the original implementation of the paper from earlier this year, performance was at ceiling (close to 98% accuracy). However, reproducing the model gave different training accuracies across runs. After verifying that the input data and its order were consistent, the root cause of the run-to-run differences turned out to be torch.seed(), which reseeds the RNG with a non-deterministic value each time it is called.
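A minimal sketch of the seeding fix, assuming the probe is initialised from PyTorch's default RNG. The helper names set_seed and init_probe_weights, the 4x4 weight shape, and the seed value 0 are hypothetical and only illustrate the idea; they are not taken from the original code.

```python
import random

import torch


def set_seed(seed: int) -> None:
    """Fix the Python and PyTorch RNGs so repeated runs draw the same numbers."""
    random.seed(seed)
    # torch.manual_seed seeds the default generator on all devices.
    torch.manual_seed(seed)


def init_probe_weights(seed: int) -> torch.Tensor:
    # Hypothetical stand-in for probe initialisation: a small random matrix.
    set_seed(seed)
    return torch.randn(4, 4)


# With a fixed seed, two "runs" start from identical weights.
same = torch.equal(init_probe_weights(0), init_probe_weights(0))
print("fixed seed gives identical init:", same)

# torch.seed() picks a fresh non-deterministic seed and returns it,
# so logging the returned value is the only way to replay such a run.
used_seed = torch.seed()
print("non-deterministic seed chosen by torch.seed():", used_seed)
```

If runs still diverge on GPU after fixing the seed, torch.use_deterministic_algorithms(True) may be worth trying as a further check; whether it is needed here is not established by these notes.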