Blog

Writing

Notes and longer pieces, mostly on machine learning and interpretability.

What mechanistic interpretability says about whether an LLM commits to a secret in 20 Questions.