WHAT IS IT?
OpenAI published a new research paper detailing a technique that uses its GPT-4 language model to write explanations for the behavior of neurons in its older GPT-2 model, albeit imperfectly. It's a step forward for "interpretability," which is a field of AI that seeks to explain why neural networks create the outputs they do.
WHY IMPORTANT
While large language models (LLMs) are conquering the tech world, AI researchers still don't know a lot about their functionality and capabilities under the hood. In the first sentence of OpenAI's paper, the authors write, "Language models have become more capable and more widely deployed, but we do not understand how they work."