There's a developing body of research exploring the possibility that generative AI, and chat models in particular, exceeds its original design parameters in ways that suggest a level of understanding. For example:
Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing.
www.quantamagazine.org
Nice article -- thanks for the reference. As is often the case, a lot depends on what we mean by "understanding".
The core technology absolutely is just predicting the next word from a sequence of words. However, that prediction is done with a vast number of parameters. The question of understanding, then, is whether those parameters indicate understanding.
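To make that mechanic concrete, here's a deliberately toy sketch in Python (a bigram model, not a transformer; the corpus and counts are invented for illustration). The "parameters" here are just word-following counts, but the interface is the same: given the words so far, predict the next one.

```python
# Toy sketch of the core mechanic: predict the next word from the words so far
# by sampling from a learned conditional distribution. A real LLM conditions on
# the whole context with billions of parameters; here the "parameters" are just
# bigram counts from a tiny invented corpus, which is enough to show the idea.
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# "Training": count which words follow which.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

# "Inference": repeatedly sample the next word given the previous one.
sequence = ["the"]
for _ in range(6):
    candidates = follows[sequence[-1]]
    if not candidates:  # reached a word with no observed successor
        break
    words, counts = zip(*candidates.items())
    sequence.append(random.choices(words, weights=counts, k=1)[0])

print(" ".join(sequence))
```

Scale those counts up to billions of learned weights conditioned on the whole context, and the question becomes whether that much structure amounts to understanding.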
Now the article is generally good and well worth reading, but it has a bit of a straw man in it:
The team is confident that it proves their point: The model can generate text that it couldn’t possibly have seen in the training data, displaying skills that add up to what some would argue is understanding.
It is very clear that the model does not simply generate text it saw in its training data -- it generates combinations of fragments of text that it saw in its training data. If I ask it to write a poem about hobbits, St Bridgit and calculus, it will absolutely "generate text that it couldn’t possibly have seen in the training data".
The article builds a separate model for text that ties "skill nodes" to "word nodes", the idea being that you can then correlate word usage with skill usage, and skill usage is what defines understanding. So LLMs can be said to have understanding if their word output shows that they are using skills in sensible ways. Apologies to the authors for this huge simplification of their argument.
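As a rough illustration of that bipartite idea (my own toy sketch; the skills, words, and edges below are invented, and the paper's actual construction is far more careful), you can picture skill nodes on one side, word nodes on the other, and then read off which skills a piece of output text appears to exercise:

```python
# Illustrative-only sketch of a bipartite skill/word graph: an edge links a
# skill to words that exercising the skill tends to produce. Skill usage is
# then inferred from which skill nodes connect to the words a model emits.
# The skills, words, and edges here are invented for illustration.
from collections import Counter

skill_to_words = {
    "rhyme":      {"moon", "June", "soon"},
    "metaphor":   {"heart", "stone", "fire"},
    "arithmetic": {"sum", "twelve", "plus"},
}

def skills_evidenced(text: str) -> Counter:
    """Count, per skill, how many of its associated words appear in the text."""
    tokens = set(text.lower().split())
    return Counter({
        skill: len(words & tokens)
        for skill, words in skill_to_words.items()
        if words & tokens
    })

# Output whose words connect to several skill nodes at once would, on this
# view, be evidence of composing skills the training data may never have
# combined.
print(skills_evidenced("her heart was stone and the sum was twelve"))
# Counter({'metaphor': 2, 'arithmetic': 2})
```

On this view, text whose words light up several skill nodes at once is evidence of composing skills, which is what the authors take as the marker of understanding.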
I have some issues with this:
- The researchers aren't really proving that LLMs understand anything, but that they behave the same way as something that understands. Their quantification is helpful for science, but honestly, if you read some AI-generated text, it's pretty clear that it behaves the same way we do -- and we (hopefully) are understanding engines, so this isn't really anything new.
- Their statement that understanding is equivalent to skill usage is one definition of understanding, but I'm not sure I'm 100% on board with it being sufficient.
- They state: “What [the team] proves theoretically, and also confirms empirically, is that there is compositional generalization, meaning [LLMs] are able to put building blocks together that have never been put together. This, to me, is the essence of creativity.” -- is it, though? Is it really creative to randomly put things together that have never been put together before? I feel there needs to be a bit more than that.
Overall, a great paper, and the use of bipartite knowledge graphs is a very clever idea that will hopefully allow us to quantify the skill level of an LLM. I look forward to seeing this used in the future. However, I still feel that the LLM is a stochastic parrot, but the stochastic process is so complex that the results simulate understanding without having actual understanding.
I also realize that there is a strong and valid philosophical position that if the results look like understanding, then it is understanding (the "if it looks like a duck" argument). Totally valid, and if that's your feeling, I cannot refute it. For me, though, it's not.