How do you think about the interpretability research? #231
Thanks for sharing, this looks like an interesting article. I remember seeing it shared somewhere a few weeks ago, but I haven't had time to fully read it yet. They say "We mostly treat AI models as a black box" ... well, I'd say that's partly because they (the proprietary LLM providers like Anthropic) also only present their LLMs to customers as a black box. I can't speak for everyone, but I know many researchers who do like to look at and analyze attention maps, myself included.

I'll have to read it in more detail, but based on skimming it here, I don't think they are doing anything new or unusual by looking "under the hood". Even the first language model attention papers did something similar to this. E.g., the 2014 Neural Machine Translation by Jointly Learning to Align and Translate paper (https://arxiv.org/abs/1409.0473) already visualized the attention/alignment weights between source and target tokens.
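For anyone curious what "looking at attention maps" can mean in practice, here is a minimal sketch using Hugging Face Transformers. The choice of `gpt2` and of the layer/head indices is arbitrary and purely illustrative, not tied to the article being discussed:

```python
# Minimal sketch: inspect attention maps of a small pretrained model.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
attn = outputs.attentions[0][0, 0]  # layer 0, head 0
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# For each token, print the token it attends to most strongly.
for i, tok in enumerate(tokens):
    j = attn[i].argmax().item()
    print(f"{tok:>10} -> {tokens[j]}")
```

Plotting `attn` as a heatmap (e.g., with `matplotlib.pyplot.imshow`) gives the kind of alignment visualization shown in the 2014 paper linked above.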
Last month, Anthropic published their research about the interpretability of Claude 3. Do you have any comments about it?
I'm quite interested in how LLMs work internally, but it seems not many people are working on that. I hope that by investigating LLMs, people can find some clues about the basics of intelligence.
I'm a beginner in this area. Your book and blogs really helped me a lot. I'd like to share them with my Chinese friends. Thank you!