
ChatGPT and the Meaning of Meaning.

In the last decade there has been an explosion of literature on Large Language Models. The availability of GPT-3, GPT-4, and ChatGPT has created an interest in the topic among the general public. The resulting literature has to some degree led to a rehashing of age-old debates in philosophy which began with the onset of symbolic AI systems in the 1980s. John Searle, with his 1980 Chinese Room argument, claimed that there was no reason a syntactic system should automatically acquire semantic content. Searle had his critics, such as Dennett with his 1991 systems objection, and it is fair to say that most computer scientists took Dennett’s side in the debate.

                With the rise of ChatGPT and other LLMs this debate has arisen again, but in a slightly different form. Virtually nobody thinks that LLMs are conscious agents. Furthermore, nobody thinks that, as they stand, they exhibit artificial general intelligence. Nonetheless, a debate has begun as to whether, as they stand, they exhibit any meaning. Leaving aside the “hallucinations” which sometimes occur when people use GPT, the answers the system provides generally seem pragmatically appropriate, syntactically well formed, and semantically coherent from the point of view of the people interpreting them.

                But despite users finding the models semantically interpretable, theorists argue that there is little reason to attribute any semantics to the systems. To evaluate this claim one needs to be explicit about what is meant by semantics in such discussions. There is a long tradition in linguistics, model theory, and philosophy of thinking of semantics in terms of a referential relation. Thus, the meaning of “Apple” is a physical apple in our intersubjective world of experience, and the same would be true of other words in our language. On this conception LLMs clearly have no semantics, because they refer to no mind-independent properties and objects. The story would go that there is a causal interaction between humans, their socio-linguistic community, and shared objects of experience as humans interact with their world and with each other. On this view it is blindingly obvious that ChatGPT doesn’t have any semantics in the sense of a word-world relation.

                But referential semantics isn’t the sole criterion of semantics, and critics of referential semantics abound in the literature. Some, such as Chomsky, think of language as involving a representational system yet do not think of language as being explicable in terms of a referential relation (Chomsky 2000), while philosophers such as Quine have drawn similar conclusions using different arguments (Quine 1960). Reference aside, it is apparent that human words clearly have meaning, even if this meaning cannot be accounted for in terms of a crude referential relation. And if our words have meaning without a crude referential relation, then there is no reason to think that GPT-4 or any other LLM needs a referential relation in order to mean something by what it says.

                Nonetheless there seems to be something almost perverse in attributing semantics to an unthinking Large Language Model. But as Sogaard (2022) has noted, we need some concept of meaning to explain how we can differentiate between syntactically identical sentences such as (1) “Colourless green ideas sleep furiously” and (2) “Happy purple dreams dance silently”. We need something to decide what distinguishes these syntactically identical sentences, and a key candidate is the meanings of the individual words (however we parse meaning). In the next section we will try to explicate what we mean by meaning and whether LLMs exhibit meaning in any of the traditional senses.

                    Meaning as Reference.

                A commonsensical view is that people understand meaning in terms of reference. Nonetheless, experimental philosophy has revealed some cultural variation in how people conceive of the reference relation (Deutsch 2009). However, while it is interesting to note cultural variation in how people intuitively understand the reference relation, the job of the philosopher is to evaluate the logical coherence of understanding meaning in terms of a reference relation.

                Obviously, there is a class of words which clearly cannot be cashed out in terms of a reference relation: grammatical particles and affixes such as “to”, “not”, “-ing”, “-s”, and “-ed”, which modify nouns and verbs. These particles obviously cannot be parsed in terms of a reference relation. The same is true of logical operators such as “and”, “or”, “not”, “if and only if”, etc. While any attempt to understand the grammatical particles in terms of a reference relation is defunct, understanding the logical operators in terms of reference seems equally hopeless unless one were to appeal to reference to some non-physical Platonic realm.

                But logical operators and grammatical particles aside, people typically think of the reference relation as best cashed out in terms of nouns and the categories which describe and modify nouns, such as verbs and adjectives. Thus, a sentence such as “The tall man runs fast” can be cashed out referentially as referring to a particular man to whom we are ascribing certain properties. At an intuitive level we could argue that the meaning of this sentence can be cashed out in referential terms.

                But even with nouns we run into intractable problems in cashing out meanings in terms of reference. Fodor and Pylyshyn (2015) have summed up these problems. We use words to describe objects which cannot be cashed out in sensory experience. We talk about extremely large objects, such as the multiverse, which cannot be cashed out in any simple way in terms of a referential relation, and extremely small objects, such as quarks, which cannot be directly observed (and hence cannot be cashed out in terms of a reference relation). There are impossible objects, such as a round square, which clearly have a meaning but which do not refer to any object. And there are fictional objects, such as a unicorn, which do not refer to any object but which clearly have meaning.

                When combined with the fact that grammatical particles and logical constants cannot be cashed out in terms of reference, the above considerations show that many of our concepts which have meanings cannot be cashed out in terms of reference. Therefore, it is rational to doubt that meanings can be cashed out in terms of reference. But this brings us full circle to our consideration of LLMs and whether the words they use have meanings.

                Some points to note. The sentences used by LLMs clearly have a meaning to us as interpreters: we can parse their sentences as sensible or senseless, as true or false, etc. But the question is whether the words the LLMs use have meaning for the LLMs themselves. The answer to this question is clearly no, and one of the key reasons is that they do not acquire their words through a grounding in sensory experience. Hence none of the words they use mean anything to them; they are nothing other than collections of words grouped together on the basis of statistical patterns in the data the LLM was trained on.

                       The Contradiction.

But we seem to have arrived at a contradiction here. We have argued that LLMs do not have meanings because their words are not grounded in sensory experience, but when it comes to humans we have argued that we cannot cash out meanings in terms of reference. This conflict is more apparent than real. Sensory grounding occurs in the context of what Quine calls the mid-level scheme of ordinary enduring physical objects, or what Husserl would call the lifeworld: the intersubjective, shared world in which embodied humans live.

                It will help to begin with a simple case: the grounding of the word “Apple”. Before the child acquires any words, their cognitive apparatus already embodies ontological assumptions. Children will perceptually categorise the objects they experience, and they will have implicit assumptions about object behaviour such as object permanence and object solidity (Carey 2009). Such ontological categories, shared by all humans, ensure a surprising convergence between humans when they interpret the sounds others are using in relation to shared objects of experience (Pinker 1994). But in order to interpret what the sounds being spoken to him signify, the child will need more than pre-linguistic ontological commitments to guide him. He will also need to know that the sounds being used in his presence are intended communicatively. To do this he will need a theory of mind, the ability to interpret pointing in terms of a shared object of experience, the ability to track eye contact in terms of a shared object of experience, and a cooperative instinct.

                Thinking about this issue in terms of epistemology, Quine called it the problem of how we go from our meagre input (the impact of light and sound on our sensory receptors) to torrential output (our total scientific theory of the world). Quine recognised that we couldn’t just appeal to shared objects of experience to explain how we go from stimulus to science; we must explain how this is done. His explanation relied on both a preestablished harmony of perceptual similarity spaces built by natural selection and on shared empathy, which ensures that we understand that others who are gesturing towards objects have a psychology and motivation similar to our own.

                Sanford and Hayes (2014) cashed out our ability to go from simple referential relations between two people to more complex capacities, such as the capacity to grasp the frame of coordination and other complex relational frames. They do this by appealing to group selection as a factor which selected for cooperation amongst humans, which in turn made it easier to triangulate on shared objects of experience.

                The key point to note is that on this picture (1) prelinguistic ontological commitments to an object ontology (hinted at by Quine’s pre-established harmony and demonstrated by results in developmental psychology by Carey, Spelke et al.) and (2) cooperation and empathy (an empathic instinct argued for by Quine and demonstrated by Tomasello 2013 and Sanford & Hayes 2014) together make it possible for a child to triangulate with a parent on a shared object of experience when acquiring their basic concepts. This epistemological triangle results in the child’s early concepts being grounded in an umwelt shared with their species and social community. Once a child has acquired his grounded base-level concepts, he will have the capacity to learn other concepts which are not grounded, e.g. unicorn, quark, round square. But this process will involve using combinatorial syntax (merge), along with analogical reasoning, and perhaps the use of relational frames, to create more complex concepts which cannot be cashed out in terms of our shared sensory umwelt.

                This removes our supposed contradiction. A lot of our complex concepts are not explicable in terms of our shared umwelt. Nonetheless, our base-level concepts are grounded through triangulation on shared objects of experience as humans acquire them. This triangulation is something that LLMs do not have. Rather than triangulating on shared objects of experience, they simply detect statistical patterns in the input they receive, modified by whatever biases and/or reinforcement learning the model is subjected to.
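
For concreteness, here is a minimal sketch (in Python, over an invented three-sentence corpus) of what “detecting statistical patterns in the input” amounts to in its very simplest form: a bigram model that predicts the next word purely from co-occurrence counts. This is a drastic simplification of a transformer-based LLM, not a description of one, but it makes the point vivid: everything such a model “knows” about “apple” is which words tend to surround it, with no triangulation on any shared object of experience.

```python
# A toy bigram next-word predictor: pure statistical pattern detection,
# with no grounding in any object of experience. The corpus is invented.
from collections import Counter, defaultdict

corpus = [
    "the child sees the apple",
    "the child eats the apple",
    "the child sees the dog",
]

# Count how often each word is followed by each other word in the corpus.
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        bigram_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation of `word` in the corpus."""
    if word not in bigram_counts:
        return None
    return bigram_counts[word].most_common(1)[0][0]

print(predict_next("the"))    # 'child' (follows "the" 3 times, vs 'apple' 2, 'dog' 1)
print(predict_next("child"))  # 'sees' (follows "child" 2 times, vs 'eats' 1)
```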

                So, if we go back to our reference problem from above: we noted that impossible objects, very small objects, very large objects, and fictional objects have meanings, but these meanings cannot be cashed out in terms of reference. This created a puzzle, because when LLMs answer questions their answers appear meaningful to us, yet the sceptic would argue that they cannot be meaningful because the words aren’t picking out mind-independent objects. But our consideration of human concepts shows that a lot of our concepts do not have meanings conferred in terms of mind-independent objects. And if cashing out meanings in terms of mind-independent objects isn’t what makes human words meaningful and LLM words meaningless (unless interpreted by humans), one wonders what is. The answer is that when humans acquire a word they do so by triangulating on a shared object of experience, a process made possible by shared concepts of objects and a shared instinct to cooperate. This means that as humans acquire their base-level concepts, they do so in terms of a shared world of experience in which multiple humans are causally interacting with the world while communicating with each other. While humans can later learn new concepts which are not grounded directly in the lived world of sensory experience, the concepts they first acquire are grounded in the humanly shared world of experience. Whereas for LLMs none of their words are grounded, hence they mean nothing by what they say, though we as agents with grounded concepts can attribute meaning to their linguistic output (Harnad, 2024).

                                  Some Objections

                Not everyone agrees that grounding is what separates humans and LLMs. Some theorists even go as far as to argue that LLMs can derive meaning via reference in a manner similar to humans.

“Referential semantics or grounding now amounts to learning a mapping between the Transformer model vector space and this target space. But why, you may ask, would language model vector spaces be isomorphic to representations of our physical, mental, and social world? After all, language model vector spaces are induced merely from higher-order co-occurrence statistics. I think the answer is straight-forward: Words that are used together, tend to refer to things that, in our experience, occur together. When you tell someone about your recent hiking trip, you are likely to use words like mountain, trail, or camping. Such words, as a consequence, end up close in the vector space of a language model, while being also intimately connected in our mental representations of the world. If we accept the idea that our mental organization maps (is approximately isomorphic to) the structure of the world, the world-model isomorphism follows straight-forwardly (by closure of isomorphisms) from the distributional hypothesis.” (Sogaard, 2022 pp. 442-443).

Even if we accept Sogaard’s tendentious claim that words which are statistically likely to occur together form a space which is to some degree isomorphic to structural features of the world, we still have a gap to fill to tell us how such vector spaces are used to refer to objective features of the mind-independent world. It is a leap to go from saying that pattern A is roughly isomorphic with pattern B to saying that a subject is using pattern A as a representation of pattern B. And thus far there is no reason to think that GPT or any other LLM is using words in a representational manner at all.
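
To see concretely what the distributional hypothesis Sogaard appeals to delivers, and what it does not, here is a minimal sketch (again over an invented toy corpus) of a co-occurrence vector space in which “mountain” and “trail” come out close together while “mountain” and “poem” do not. The geometry falls out of usage statistics alone; nothing in the construction tells us that any of these vectors is being used to refer to anything.

```python
# A toy co-occurrence vector space. The corpus is invented for illustration;
# real language-model vector spaces are learned by transformers, not counted by hand.
import math
from collections import Counter

corpus = [
    "we followed the trail up the mountain before camping",
    "the mountain trail was steep so we delayed camping",
    "he read a novel and a poem before bed",
    "she lent him a novel and he lent her a poem",
]

vocab = ["mountain", "trail", "camping", "novel", "poem"]

def vector(word):
    """Represent `word` by counts of vocabulary words co-occurring with it in a sentence."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if word in words:
            for other in words:
                if other in vocab and other != word:
                    counts[other] += 1
    return [counts[v] for v in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

# Words used together end up with similar vectors; words that are not do not.
print(cosine(vector("mountain"), vector("trail")))  # 0.5
print(cosine(vector("mountain"), vector("poem")))   # 0.0
```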

                 Bender and Koller (2020) used an interesting thought experiment, useful as an intuition pump, to demonstrate the lack of grounding in LLMs’ speech. They ask us to think of two agents trapped on separate islands who can communicate using a code they send to each other through a wire. They then ask us to imagine an octopus, a statistical genius, who floats beneath the ocean. This genius manages to intercept the wire, learns how to interpret the signals (by picking out statistical patterns in the data), and communicates with the islanders. He becomes so effective that he manages to fool Islander A into thinking he is Islander B. But Bender and Koller argue that if Islander A asked the octopus to find some coconuts, build a coconut catapult on the island, and then report its findings back, the octopus would not be able to do so. The fact that ChatGPT could answer such a question by having access to the world’s written material is irrelevant. The point is that while the octopus can mimic the islanders by detecting statistical patterns in their conversations, he would not have the capacity to think up an answer to the catapult question unless he was either repeating what he had been exposed to before, behaving like a large “stochastic parrot”, or was an embodied agent whose understanding of the concepts was grounded in experience of interacting with the objects under discussion.

                Bender and Koller’s thought experiment reiterates what we have been arguing throughout this blogpost. While LLMs can appear surprisingly sophisticated when answering questions about various topics, there is no evidence that their words have any meaning in the sense that human words have meaning. And the reason for this is that the models are not embodied and are not causally interacting with a world which they are communicating about with other agents.