
Kartik Agaram 2023-08-10 22:15:09

Grokking

In 2021, researchers made a striking discovery while training a series of tiny models on toy tasks. They found a set of models that suddenly flipped from memorizing their training data to correctly generalizing on unseen inputs after training for much longer. This phenomenon – where generalization seems to happen abruptly and long after fitting the training data – is called grokking and has sparked a flurry of interest.

pair.withgoogle.com/explorables/grokking

📝 Do Machine Learning Models Memorize or Generalize?

An interactive introduction to grokking and mechanistic interpretability.
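
For a sense of what such an experiment looks like, here is a minimal sketch of the kind of setup where grokking is typically reported: a tiny network trained on modular addition with a small training split, strong weight decay, and many more steps than needed to fit the training data. This is an illustrative approximation, not the explorable's actual code; the architecture and hyperparameters below are assumptions.

```python
# Hypothetical minimal grokking-style experiment (illustrative values only).
import torch
import torch.nn as nn

P = 97                                         # modulus for the toy task (a + b) mod P
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P

# Small training split: the model memorizes it quickly; accuracy on the
# held-out pairs is what may eventually jump long after that.
perm = torch.randperm(len(pairs))
train_idx, test_idx = perm[: len(pairs) // 3], perm[len(pairs) // 3 :]

class TinyNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Embedding(P, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, P))

    def forward(self, x):
        e = self.embed(x)                      # (batch, 2, dim)
        return self.mlp(e.flatten(1))          # logits over the P residues

model = TinyNet()
# Weight decay is a key ingredient in most grokking reports.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(100_000):                    # train far past fitting the training set
    opt.zero_grad()
    loss = loss_fn(model(pairs[train_idx]), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            train_acc = (model(pairs[train_idx]).argmax(1) == labels[train_idx]).float().mean()
            test_acc = (model(pairs[test_idx]).argmax(1) == labels[test_idx]).float().mean()
        print(f"step {step}: train {train_acc:.2f}, test {test_acc:.2f}")
```

Under a setup like this, training accuracy typically saturates early while test accuracy stays near chance for a long stretch before (sometimes) climbing abruptly, which is the pattern the article explores.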