Dennett and The Typical Mind Fallacy

DENNETT’S UNCONSCIOUS AUTOBIOGRAPHY

INTRODUCTION

 

Gradually it has become clear to me what every great philosophy so far has been: namely, the personal confession of its author and a kind of involuntary and unconscious memoir.  (Nietzsche: Beyond Good and Evil Sec 6)

 

In this paper we will argue that Dennett’s views on the nature of the mind are based on his unconsciously projecting his own idiosyncratic type of mind onto all other people. We will argue that he is guilty of what William James called the “Typical Mind Fallacy”: assuming that one’s own type of mind is representative of all types of mind. We will show that Dennett has a particular type of consciousness and that it is not representative of the consciousness of all other people; specifically, we will demonstrate that Dennett has a “Bodily-Linguistic Type of Consciousness”. Prior to doing this, however, we will sketch the typology we are using to interpret Dennett’s mind and the evidence that supports the existence of this particular typology.

We are working with the typology which David Berman invented in his 2008 book Penult. This typology divides people into three main types: (1) Type 1: Mentalistic Consciousness, (2) Type 2: Bodily Consciousness, (3) Type 3: Socio-Linguistic Consciousness. The evidence for this typology comes from the variability people show in their mental abilities. Some clear examples of mental variability which have been confirmed are:

(1) Galton provided evidence that people have different abilities to form mental imagery (see also Kosslyn, James, and others).

(2) Dalton discovered that some people are colour blind.

(3) Some people have synesthesia; this was denied for years but is now widely accepted.

(4) There are variations in how people experience pain. See, for example, Roger Fillingim (2005), “Sex and Gender Issues in Pain”, as well as Young et al. (2008), “Genetic Differences in Pain Variability”.

The above four examples show variation in people’s cognitive abilities and subjective experiences which went undetected for years before being discovered by empirical research. We suggest that such variation in mental abilities inadvertently influences the theories of the mind which philosophers construct. There is, for example, evidence that people’s psychological abilities affect the philosophical theories they accept; see Berman, “Philosophical Counselling for Philosophers” (2013), as well as William James, “Principles of Psychology” (1892), and Holtzman, “Do Personality Effects Mean That Philosophy Is Intrinsically Subjective?” (2013).

Showing that there is variation in people’s mental abilities, and that these variations influence the type of theories which philosophers accept, is one thing; it is not by itself evidence for Berman’s Tripartite Typology. In Penult Berman uses evidence from the history of philosophy, evidence from interviews with living subjects, and his own introspective experiences to justify his Typology. His Typology makes sense of the views of the great philosophers throughout history. It also shows why so many of the great philosophers of the past, despite being great thinkers, could not agree with each other about the nature of the mind: all of these philosophers were operating on the mistaken assumption that there is only one type of mind. The diagram below shows how Berman divides the philosophers of the past according to his Tripartite Typology.

[Diagram: Berman’s Typology. The figure distributes the great philosophers among the three types and their blends: Kant, Nagel, and Searle appear under Type 1, while Descartes, Berkeley, Plato, James, Spinoza, Russell, and Hegel are placed elsewhere in the figure, with Dennett marked as a blend of major Type 2 / minor Type 3 and Rorty as major Type 3 / minor Type 2.]

Type 1 philosophers have a strong experience of their own consciousness; this experience is called unbounded consciousness. Type 2 philosophers have no experience of unbounded consciousness; their experience is entirely bodily and linguistic. Type 3 is a socio-linguistic type of consciousness. The different types of consciousness can be blended, with one type being dominant. So, for example, Berman argues that both Rorty and Dennett are blends of Types 2 and 3: in Dennett Type 2 is dominant, while in Rorty Type 3 is dominant (Penult p. ).

Berman’s three main psychological types are vividly represented by Rene Descartes (Type 1), William James (Type 2), and Richard Rorty (Type 3). The following quote from Descartes illustrates his Mentalistic abilities:

When I consider the mind—i.e. consider myself purely as a thinking thing—I can’t detect any parts within myself; I understand myself to be something single and complete. The whole mind seems to be united to the whole body, but not by a uniting of parts to parts, because if a foot or arm or any other part of the body is cut off, nothing is thereby taken away from the mind. As for the faculties of willing, understanding, of sensory perception and so on, these are not parts of the mind, since it is one and the same mind that wills, understands and perceives. They are (I repeat) not parts of the mind, because they are properties or powers of it.  (Descartes: Meditations on First Philosophy p. 11)

The above quote illustrates that, as a Type 1 philosopher, Descartes experiences himself as primarily a mental entity. James’ experience of himself is entirely different:[1]

Let the case be what it may be for others, I am as confident as I am of anything that, in myself, the stream of thinking (which I emphatically recognize as a phenomenon) is only a careless name for what, when scrutinized, reveals itself to consist of the stream of my breathing. (James 1904)

There is, I mean, no aboriginal stuff or quality of being, contrasted with that of which material objects are made, out of which our thoughts of them are made, but there is a function in experience which thoughts perform…(namely)…knowing. Consciousness is supposed necessary to explain the fact that things not only are, but get reported, are known. (James 1904, p. 101)

Everyone assumes that we have direct introspective acquaintance with our thinking activity as such, with our consciousness as something inward contrasted with the outer objects which it knows. Yet I must confess that for my part I cannot feel sure of that conclusion. Whenever I try to become sensible of my thinking activity as such, what I catch is some bodily fact, an impression coming from my brow, or head, or throat, or nose. It seems as if consciousness as an inner activity were rather a postulate than a sensibly given fact. (James 1892: Text Book of Psychology p. 467)

James, above, is denying that we introspectively experience consciousness; his distinctive subjective experience is entirely different from Descartes’. While Descartes and James are good exemplars of Types 1 and 2, Richard Rorty is a good exemplar of Type 3. Here are some quotes from Rorty which illustrate his type of mind:

The temptation to look for criteria is a species of the more general temptation to think of the world, or the human self, as possessing an intrinsic nature, an essence. That is, it is the result of the temptation to privilege some one among the many actual and possible languages in which we habitually describe the world or ourselves. (Richard Rorty: Contingency, Irony and Solidarity p. 15)

This is that there is nothing deep inside each of us, no common human nature, no built-in solidarity, to use as a moral reference point… To be a person is to speak a particular language, one which enables us to discuss particular beliefs and desires with particular sorts of people. It is a historical contingency whether we are socialized by Neanderthals, Ancient Chinese, Eton, Summerhill, or the Ministry of Truth. (ibid, p. 177)

We can see from these quotes that Rorty is incapable of conceiving of himself independently of language. This is a clear indication that he is a Type 3 thinker.

In Penult Berman goes into much more detail in defending his typology. Here we have shown what the evidence for the typology is and how it explains the philosophical views of various great philosophers. Throughout this paper we will analyse Dennett’s writings in detail to show that he is a combination of a Type 2 and a Type 3 thinker, and that he incorrectly generalises his type of mind as representative of all minds. We will analyse Dennett’s views on mental imagery, dreams, and pain, and show that his views on these topics are derived from his own subjective experience of these phenomena.

DENNETT: EARLY CHILDHOOD AND PHILOSOPHICAL DEVELOPMENT

We will make the case that Dennett’s philosophy of mind is partly derived from his own idiosyncratic psychology. This claim portrays Dennett as having experiences which are radically different from the experiences of other philosophers, which leaves the question of how Dennett came to have the type of mind that he in fact has. Was his type of mind derived from his genetic code, or did socialisation play a more important role? Any answer to this question will be highly speculative, as little has been written about Dennett’s upbringing and obviously we have no information about the structure of his genetic code. However, trying to understand Dennett’s life, and the extent to which his experiences influenced the type of mind he developed, will help to make sense of his strange views on the nature of the mind. Dennett wrote a short autobiography detailing his life experiences, and discussing it will help us understand his type of mind better.

Dennett’s brief discussion of his early life and his early education in philosophy is interesting but fairly sparse in details. Certain things do stand out, though. Dennett’s early loss of his father was obviously a traumatic experience for him. According to Dennett, he was expected to follow his father into academic life in the humanities. So while Dennett’s father was no longer alive, he was present to Dennett as an idealised figure whom people expected Dennett to live up to and to follow into the humanities. Dennett was interested in engineering and loved building things, yet was expected to go into the humanities. This situation would create resentment in most teenagers: his own dreams and aptitudes were subordinated to the expectations and wishes of his parents. He managed a minor rebellion against his family’s wishes by not going to Harvard; he instead went to Wesleyan University, where he studied maths and English. It was in his advanced maths class that he first discovered Quine’s work, when he read Quine’s book “Mathematical Philosophy”. When Dennett first read Quine’s “From A Logical Point of View” he claimed he was impressed but found something wrong with Quine’s position. So Dennett contacted Quine and transferred to Harvard. It is worth noting that after less than a year Dennett’s rebellion against his parents’ wishes had ended: he was studying where they wanted him to study, and he was studying philosophy, a humanities subject. So Dennett’s discovery of Quine was doubly beneficial for him: he had found a thinker who really interested him, and he had found a way of pleasing his family (in particular his dead, idealised father) while keeping himself happy at the same time. By studying Quine he would be studying a humanities subject, keeping his family happy, while Quine’s interest in natural science and formal logic chimed well with Dennett’s scientific and engineering interests. So it is understandable that Dennett was excited by the discovery of Quine, as it helped him to fulfil his own desires and his deep need to live up to his father’s expectations.

Dennett eventually wrote his undergraduate thesis on Quine, calling it “Quine and Ordinary Language”. There are no details of the thesis available, though Dennett mentions that it was critical of Quine and that Quine thought highly of it. Dennett said that at this stage of his development he, unlike most of his fellow students, held Ryle’s work in high regard; he thought Ryle’s “The Concept of Mind” was one of the best works of philosophy he had read. Dennett finished college at the age of twenty, and at this early stage of his development his primary influences were Ryle and Quine, both of whom were behaviourists. So it is tempting to argue that Dennett’s strange eliminativist views on the mind resulted from his being educated by behaviourists. But this cannot be correct: both Nagel and Kripke were students at Harvard alongside Dennett, and they certainly did not end up sharing Dennett’s view of consciousness. So we need something more than the influence of Dennett’s teachers to explain his views on consciousness. Something in Ryle and Quine chimed with Dennett. We argue that their theories seemed intuitively correct to him because they rang true to his own lived experience. Those with different types of mind, for example Nagel and Kripke, found Ryle’s theory counter-intuitive because the way they experienced the world was so different from Ryle’s and Quine’s descriptions of the mind.

Because he thought so highly of Ryle, Dennett went to Oxford to study under him. Ryle was the supervisor of Dennett’s doctoral thesis, which eventually became his book “Content and Consciousness”. Dennett notes that he saw problems with qualia as early as 1963, and even then he believed that our conscious experience only seemed to be as rich as we believed it was. So we can see that from a very young age he already felt that the philosophical community was deeply wrong about the nature of the mind. He would spend the next fifty years developing arguments and finding empirical evidence to support his intuitions about the nature of the mind. Pretty much from day one in Oxford he tried to inform his theories of the mind with the best neuroscientific evidence available. His PhD was so deeply immersed in neuroscience that a neuroscientist was asked to examine his thesis. Dennett had managed to follow in his father’s footsteps while still getting to study the science which so interested him.

The most important thing to note about his description of his early development is that from the age of seventeen, when he was first exposed to philosophy, he was well disposed to the philosophies of Quine and Ryle. This indicates that he saw something in their behaviouristic explanations of mind which chimed with his own experience. He felt that ‘qualia’ was a useless theoretical term as early as the age of twenty-three. His entire philosophical life was dedicated to providing philosophical arguments and scientific evidence to support his intuitions about the mind.

What appealed to Dennett about Ryle and Quine was their emphasis on language and behaviour. Subjective experience was not important to either Ryle or Quine, and both believed that language plays a central part in making us human. Dennett, being primarily a bodily-linguistic thinker, would have related to Ryle and Quine perfectly.

DENNETT ON CONSCIOUSNESS AND LANGUAGE

Dennett’s theories of mind and consciousness have always emphasised the importance of language for thought and experience. In “Consciousness Explained” Dennett described the self as follows:

Selves are not independently existing soul-pearls, but artifacts of the social processes that create us, and, like other such artifacts, subject to sudden shifts in status. The only “momentum” that accrues to the trajectory of a self, or a club, is the stability imparted to it by the web of beliefs that constitute it, and when those beliefs lapse, it lapses, either permanently or temporarily. (Consciousness Explained: p 423)

We suggest that Dennett’s description is of his own type of mind and not of the minds of all people. Language is such a deep part of his self that he is incapable of thinking independently of it. We will provide evidence for this claim by examining his views on mental imagery, dreams, and colour, among other topics.

The importance of language for Dennett becomes even more apparent when one considers his theory of consciousness. Dennett’s theory of consciousness, which in 1996 he dubbed the “Fame in the Brain” model, is meant to replace the tempting idea of the “Cartesian Theatre”. Throughout his philosophical career Dennett has attempted to replace the “Cartesian Theatre” picture with increasingly more apt models, from his “Multiple Drafts” model to his “Fame in the Brain” model to his “Fantasy Echo” model. On the Fame in the Brain model, consciousness is more like fame than television: it is not a special “medium of representation” in the brain into which content-bearing events must be “transduced” in order to be conscious (Sweet Dreams: p. 160); rather, one content-bearing event achieves something like fame in competition with other fame-seeking events. This metaphor is obviously imperfect. Fame is typically an intentional object of a number of agents (ibid, p. 161), and he concedes that it makes little sense to think of multiple homunculi all holding some content as an intentional object and elevating it to fame. So he instead opts for the less troublesome metaphor of consciousness as “influence in the brain”.

For Dennett, conscious events are attention-grabbing events: thoughts that persist in the mind for a sustained period of time. To help us think about this he asks us to consider fame, and how short a period of time a person can be famous for. Dennett plausibly argues that a person cannot be famous for a few seconds; likewise, a person being viewed by millions of people on television will not necessarily be famous. No matter how many people may see me in the background of some sitcom, this will not make me famous. To be famous I would have to be a person whom others talked about enough to ensure that my presence reverberated within the stream of public discourse. Achieving this importance within the public stream of discourse takes time; it does not happen instantly. Some may argue that it does in fact occur instantly, pointing to people who seem to achieve instant fame as a result of some event; an assassin of a president, for example, may achieve a level of instant fame. Nonetheless, despite appearances, instant fame does not really occur: it is conferred post hoc, as a result of the influence the event has on the public discourse of others. Without the reverberations within the linguistic community the shooter would not achieve fame. Dennett argues that consciousness has similar features to the type of fame that is achieved in the external world.

For Dennett, fame in the brain, like fame in the world, requires echo-making capacities. To say that something is conscious even though it has no echo-making properties within the mind is, for Dennett, senseless. Dennett uses the example of the aroma of classroom library paste as something that has echo-making properties for him: the aroma has strong echoic properties because it evokes various vivid childhood memories and associations for him.

To be conscious is precisely to have a thought that yields enough influence to direct a lot of homunculi to engage with it; fame, he argues, is like this too. He also notes that the nature of fame has changed with the invention of electronic media:

“a recursive positive feedback became established, dwarfing the initial triggering event and forcing the world to wallow in wallowing in wallowing in reactions to reactions to reactions to the coverage in the media, and so forth.” (ibid p. 167)

He thinks that such fame was very difficult to achieve in the pre-electronic age. He argues that language did for human consciousness what electronic media did for fame: it transformed it. It is for this reason that he calls us ‘Joycean machines’. The recursive capacity which is a central feature of natural language is one of the key features of consciousness. Dennett even makes the speculative prediction that non-human animals may have little echoic capacity and so, strictly speaking, will not have consciousness. Dennett’s view on animal consciousness is definitely a minority position; the majority of scientists now accept that animals are conscious. Obviously, science is not a majority-rules activity, so the fact that Dennett’s position is not widely accepted does not prove that it is incorrect. The primary reason people argue that non-linguistic animals are conscious is their complex behavioural capacities: behavioural tests have shown that some animals have complex concepts of number, agency, causality, and so on. However, Dennett points to the fact that people often reason with these concepts unconsciously, and asks whether we can be sure that animals’ competence with these concepts does not entirely involve unconscious reasoning. He also notes that we cannot be sure that animals’ behaviour isn’t a kind of blind-sight: such animals would be able to predict certain aspects of their environment without having any conscious experience of doing so. It is worth noting, though, that some mammals respond to anti-depressants and anti-anxiety medications in much the same way that humans do, which indicates that their experiences are similar to ours. There is little to recommend the view that such animals suffer from total blind-sight in all of these areas, though it is admittedly theoretically possible.

Dennett’s speculation that non-linguistic animals may not be conscious is not central to his theory of consciousness. If his claim that non-linguistic animals are not conscious were refuted (and we think it will be), this would not refute his theory of consciousness as influence in the brain. Non-linguistic animals may have basic recursive capacities that make possible the type of conscious experience which Dennett talks about.

At this point we should clarify what Dennett means by recursive capacities. In an obvious sense Dennett cannot mean that recursion is entirely unique to humans. Dennett accepts the computational theory of the brain, and he accepts that computational procedures govern animal brains; it is a fair bet that the computational procedures which govern human and animal brains sometimes involve recursion, and Dennett does not deny this obvious fact about the structure of animal brains. When he says that human language has a unique recursive structure he is speaking of recursive structure as a kind of software. He calls this software a virtual von Neumann machine (Consciousness Explained, p. 201). Dennett argues that the mind is like a software programme installed on the parallel neural networks of the brain (Densmore and Dennett 1996, p. 1). The software is installed onto the hardware of the brain through memes. Our linguistic abilities give us the capacity to understand and produce a potentially infinite number of sentences. Through communicating with each other, humans can learn different ways of thinking about the world. Our culture, made possible by our linguistic capacities, means that every child born into a linguistic community will have the capacity to use the tools for thinking which were created by previous generations, such as mathematics, logic, and the scientific method. These tools, when learned, install programmes which can give brains the capacity to think in ways far superior to those available to animals unaffected by culture. This is analogous to the way in which software (e.g. an app), when installed on hardware, gives the hardware capacities which it previously did not have.
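
The open-ended recursive capacity of language which Dennett appeals to here can be illustrated with a short sketch. The following toy Python program is our illustration, not Dennett’s (the embedding rule and the example sentence are invented for the purpose): a single rule applied to its own output yields an unbounded family of grammatical sentences from finite means.

# A toy illustration (ours, not Dennett's model) of the recursive capacity of
# natural language: one embedding rule, applied to its own output, generates
# ever longer grammatical sentences without limit.

def embed(sentence: str, depth: int) -> str:
    """Recursively embed a sentence under 'somebody thinks that ...' clauses."""
    if depth == 0:
        return sentence
    return "somebody thinks that " + embed(sentence, depth - 1)

base = "the cow is purple"
for d in range(4):
    print(embed(base, d))
# depth 0: the cow is purple
# depth 1: somebody thinks that the cow is purple
# depth 2: somebody thinks that somebody thinks that the cow is purple
# ... and so on, indefinitely, from one finite rule.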

We can see from the above description that Dennett’s conception of consciousness is strongly tied up with linguistic competence. This equating of consciousness with language is so strong that he goes as far as to deny that animals are conscious. Dennett’s equating of consciousness with language is at odds with a lot of neurological data, yet he stubbornly persists in claiming that language and consciousness are intimately connected. A possible explanation for his position is that he is unconsciously projecting his own mental abilities onto those of others. To analyse this contention we will examine his overall views on consciousness. In Part 1 we will examine his views on mental imagery and dreams; in Part 2 we will examine his views on the nature of pain. We will show, by analysing Dennett’s conception of mental imagery, dreams, and pain, that he is guilty of unconsciously committing the typical mind fallacy.

PART 1: DENNETT ON MENTAL IMAGERY

We claim that Dennett is a non-imager whose views on the nature of mental imagery are derived primarily from his own idiosyncratic psychological abilities. We argue for this conclusion by considering various psychological tests which indicate that people do indeed have differing abilities to form mental imagery, ranging from non-imagers to eidetic imagers. We will analyse Dennett’s various discussions of mental imagery and show, through textual analysis, where he indicates that he may be a non-imager. We will compare Dennett’s reports of his imaging abilities with those of other people with different levels of imaging ability. Overall we will conclude that it is highly probable that Dennett is a non-imager.

Section 1: Dennett: Experience is not what you think it is

Dennett discusses pain in the context of a general evaluation of our supposed phenomenological world in Chapter 2 of Consciousness Explained. Dennett proposes a number of experiments designed to show us that our knowledge of our phenomenological world is less accurate than we assume. He proposes an experiment on mental images (p. 27), one on our sense of touch (p. 47), one on sound (p. 49), and one on vision (p. 54), as well as the experiment on pain (p. 60). Dennett argues as follows from his experiments:

Did anything you encountered in the tour of the phenom in the previous chapter surprise you? Were you surprised for instance that you could not identify the playing card until it was almost dead centre in front of you? Most people, I find, are surprised-even those who know about the limited acuity of peripheral vision. If it surprised you, then that must mean that had you held forth on the topic before the surprising demonstration, you would very likely have got it wrong. People often claim a direct acquaintance with more content in their peripheral visual field than they in fact have. Why do people make such claims? Not because they directly and incorrigibly observed themselves to enjoy such peripheral content, but because it seems to stand to reason…Am I saying that we have absolutely no privileged access to our conscious experience? No, but I am saying that we tend to think we are much more immune to error than we are.  (Consciousness Explained p.68).

The card experiment is easily testable because the card exists in the external world: when we ask people what card they are seeing, we can test their claims against a fact in the external world. Asking people about things like mental images is different; when people make claims such as ‘I am a vivid imager’ we cannot easily test whether they are exaggerating their experiences in the way we can with the card test. Likewise with people’s experience of pain. Dennett is asking how we can know that people are not spinning stories based on theories they hold, as opposed to observing internal objects. However, let us assume that his comparison of the card trick with our internal experiences of mental images, and our felt experiences of pains, is valid. What would this mean for our theory? Suppose that in the card experiment some of us discovered that our visual experiences were not as accurate as we believed: we did not thereby discover that we have no experiences; our experience of the card at the centre of our visual field was fine, and it was only our peripheral vision which was affected. If our introspective experiences are analogous to our perceptions, and the same errors occur, then this means that we have to be careful with our introspection, not that introspection is somehow inherently flawed.

Section 2: Mental Imagery

William James and Francis Galton famously warned us against committing the typical mind fallacy of wrongly assuming that all people have minds which are the same as ours. Galton’s breakfast-table questionnaire showed that people have differing abilities to form mental images, ranging from non-imagers to weak imagers to strong imagers to eidetic imagers. More recently, the psychologists Reisberg, Pearson, and Kosslyn, in their paper “Intuitions and Introspections about Imagery: The Role of Imagery Experience in Shaping an Investigator’s Theoretical Views”, demonstrated that researchers working on the imagery debate were influenced by their own imaging abilities in the views they held on the nature of imagery. Their study showed that, as scientists did more experiments and learned more about the topic, their imaging abilities played less of a role in their views because they were more influenced by the scientific evidence. Kosslyn, however, noted that a small percentage of those studied were not swayed by the scientific evidence:

In addition, the VVIQ scores were correlated with the responses for the views in 1980: the results indicated that researchers who reported less vivid imagery were more inclined toward the propositional view. Thus, in the early stages of the debate, researchers’ personal experiences were related to their theoretical stances. In contrast, when the VVIQ scores were correlated with their current views, the scores were found not to be correlated with their current attitude. This finding suggests that scientists really do pay attention to the data and let the data take precedence over their introspections or images. As results from increasing numbers of imagery experiments were reported and the case for depictive representations grew stronger, even many of those with poor depictive imagery became convinced that depictive representations exist and are used in imagery. Nevertheless, some of the extreme cases-who reported no imagery-persisted in denying the existence of mental imagery. (Kosslyn et al: The Case for Mental Imagery p 181)

Kosslyn claims that in some extreme cases people who have no imagery persist in denying the existence of mental imagery. In an online article for edge.org Kosslyn argued that Zenon Pylyshyn was a non-imager whose views on the imagery debate were strongly derived from his lack of ability to form mental images. We aim to demonstrate in this paper that Dennett’s views on mental imagery can also be explained by his being a non-imager.

Dennett typically interprets people’s claims that they think using mental images as nothing more than a theorist’s fiction; in other words, a metaphor[2]. The reason he gives for this belief is partly based on his heterophenomenological method and partly on his theoretical belief that, since people have no internal eyes, having mental images is impossible. We do not doubt that some of Dennett’s reasons for treating mental images as metaphors are theoretical; however, we argue that the primary reason Dennett treats mental images this way is that he is a non-imager. This is not meant as any kind of personal slight on Dennett; both of the authors of this paper are extremely poor imagers. In fact one of the authors, David Berman, has written a paper, “Philosophical Counselling for Philosophers” (2008), which recounts how his own philosophical views were inadvertently affected by the fact that he is a poor imager. It is vitally important that such facts are discovered if we are to ascertain the truth rather than merely assume that all thinkers’ minds are the same.

In his “Heterophenomenology Revisited” (2007) Dennett makes clear his view that mental images are mere metaphors which describe neurological facts we do not yet understand. We will quote his views at length, as they are very instructive:

The standard presumption is that “I know because I see it” is an acceptably complete reply when we challenge a reporter, but where a subject is reporting on mental imagery, for instance, we create an artefact. We ask a subject to tell us how many windows there were in the front of the house he grew up in, and he closes his eyes for a moment and replies “four”. We ask: How do you know? “Because I just ‘looked’… and I ‘saw’ them!” But he didn’t literally look. His eyes were closed (or were staring unfocused into the middle distance)…When we confront this familiar vacuum, there is an almost irresistible temptation to postulate a surrogate world-a mental image-to stand in for the part of the real world that a reporter observes…The “recollected image” of the house has a certain richness and accuracy that can be checked, and its limits gauged. These limits give us important clues about how the information is actually embodied in the brain, however much it seems to be embodied in an “image” that may be consulted. This is where the experimental work by Shepard, Kosslyn, and Pylyshyn and many others comes into play. (Heterophenomenology Revisited, pp. 256-257)

One important point to note is that Dennett cites the experimental work of both Kosslyn and Pylyshyn as a way of finding out what is actually going on in the brain. This is interesting because of course Kosslyn is a defender of mental images, while Pylyshyn uses his experiments to cast doubt on the existence of mental images. This may give the impression that Dennett is open to the existence of mental images, that he is merely doubting them for the purposes of collecting heterophenomenological data, which he will later subject to experimental tests. However, his discussion of Shakey the robot shows that Dennett actually subscribes to the more radical view that mental images do not exist.

Shakey is a robot that Dennett introduces into Consciousness Explained for a thought experiment. It is an actual robot, built in 1966, which was programmed to move boxes around a room. Below (Fig. 1) is a picture of Shakey as it appeared in 1966 when it was first built:

Fig. 1: Shakey

Shakey cannot speak; however, Dennett speculates that if we could programme it with a language, we could programme it to report that it moves boxes by rotating mental images of the boxes in its mind’s eye. We as designers know that Shakey does not actually rotate mental images in its mind before moving the boxes. Dennett suggests that something similar occurs when humans report rotating mental images:

If a human subject says she is rotating a figure in her mind’s eye, this is a claim about what it is like to be her, and indeed she intends to describe, unmetaphorically and accurately, a part of the real world (a part that is somehow inside her) as it seems to her. But she is speaking at best metaphorically and at worst creating an unwitting fiction, for we can be quite sure that the process going on in her head that inspires and guides her report (to put it as neutrally as possible) is not a process of image rotation. (ibid, p. 258)

Here we can see clearly that Dennett, despite his claims of neutrality, is denying that people really have mental images. The above statement casts interesting light on a recent comment Dennett made about his own imaging abilities:

And when I do these mental gymnastics I make heavy use of my imagination, exploring various diagrams and pictures in my head, for instance. In short I exploit what Descartes would disparage as merely my imagination to accomplish what he would celebrate as my conception. (Intuition Pumps And Other Tools For Thinking: p. 289)

Here Dennett is claiming that he makes heavy use of mental images to think. However, when this is placed alongside his views on Shakey, it is obvious that he cannot be claiming that he experiences mental images. Suppose I see what I think is a ghost in front of me. If it is afterwards shown that what I thought was a ghost was merely a projected image created as a hoax, this will affect my beliefs about the nature of what I saw; it will not, however, change the fact that I did indeed experience something. The case of Shakey is different, according to Dennett. We and Shakey may sincerely report rotating a mental image in our mind’s eye; however, we do not actually rotate the mental object or experience ourselves doing so; we merely report doing so. It is inconceivable that Dennett could subscribe to the view that all we are doing is giving a verbal report of something that we do not experience if he did in fact experience mental imagery. We suggest that, given Dennett’s views on the nature of imagery, his casual remark about thinking in imagery must itself be merely metaphorical.
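
Dennett’s Shakey point can be made concrete with a small sketch. The following Python toy is our illustration, not the actual Shakey software (the function names and the canned report are invented): it answers a shape-comparison question by arithmetic on coordinates, yet its built-in self-report speaks of rotating an image in a mind’s eye. The report is sincerely produced, but it does not describe the process that generated the answer.

# A minimal sketch (ours, not the actual Shakey software) of Dennett's point:
# a system whose sincere verbal self-report does not describe its real
# internal process.
import math

def same_shape(shape_a, shape_b):
    """Judge whether two point-sets have the same shape by comparing their
    sorted pairwise distances -- pure arithmetic; nothing image-like is
    rendered or rotated anywhere."""
    def signature(points):
        return sorted(
            round(math.dist(p, q), 9)
            for i, p in enumerate(points)
            for q in points[i + 1:]
        )
    return signature(shape_a) == signature(shape_b)

def how_did_you_know():
    # The canned self-description the designers chose to build in.
    return "I rotated one shape in my mind's eye and superimposed it on the other."

square = [(0, 0), (0, 1), (1, 0), (1, 1)]
rotated = [(0, 0), (-1, 0), (0, 1), (-1, 1)]   # the same square rotated 90 degrees

print(same_shape(square, rotated))   # True
print(how_did_you_know())            # a report about image rotation, though no
                                     # image figured in the computation above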

Dennett spends much of “Heterophenomenology Revisited” arguing that he is neutral about the reality of the descriptions people give of their internal world. However, his manner of characterising people’s reports in terms of fiction contradicts this supposed neutrality. His discussion of after-images nicely illustrates this point:

Just as the fictional Sherlock Holmes can be correctly described as taller than the real Max Beerbohm, the fictional red stripes on your afterimage can be correctly described by you as somewhat more orange than the real red stripes on the flag flying outside the window. Fiction is parasitic on fact, and the afterimage stripe is red in exactly the same way that Holmes is tall.  (ibid, p.263).

So here we see that Dennett, despite his claims of neutrality, is in fact denying the reality of afterimages. We suggest that there are two sources for his scepticism: (1) a theoretical commitment to materialism, and (2) the fact that he is himself a non-imager. We will primarily focus on (2).

When criticizing traditional phenomenology, Dennett complains that we are unable to agree on what is and is not phenomenologically manifest (ibid, p. 261). We suggest that the reason phenomenologists have not discovered one true description is that different theorists have differently structured minds. However, for the purposes of this paper we will primarily evaluate Dennett’s kind of mind and compare it with the minds of other kinds of thinkers. We will examine some of the claims Dennett makes in his book Consciousness Explained to illustrate this point.

In Consciousness Explained Dennett asks his readers to perform the following experiment:

When you close your eyes, imagine, in as much detail as possible, a purple cow.

Done? Now:

(1)   Was your cow facing left or right or head on?

(2)   Was she chewing the cud?

(3)   Was her udder visible to you?

(4)   Was she a relatively pale purple, or deep purple?

(Consciousness Explained p. 27)

Dennett speculates that most people would be able to answer these questions, and that if they were not able to answer them they probably did not try to call up the image but merely said the words ‘call up a purple cow’ to themselves. This seems to indicate that Dennett is himself capable of calling mental images to mind, which is interesting because in his discussion of Shakey the robot he argued that people only have mental images metaphorically, yet in the above discussion of the purple cow he seems to admit that he can form mental images. However, a closer look at Dennett’s views shows that he is not in fact claiming to be able to form mental images. Dennett claims that there are many things which we believe we experience but which, upon closer reflection, we do not experience. So, for example, in Consciousness Explained pp. 53-55 he discusses our visual field. He notes that from the point of view of naive reflection our visual field seems to be uniformly detailed from the centre out to the boundaries. However, he asks his readers to perform an experiment of trying to identify cards at the periphery of the visual field; people always fail this test. Dennett takes this to establish that while it may seem to people that they have certain experiences, experimental studies can reveal otherwise. He views the case of imagery as similar to the case of the visual field: it often seems to people that they have experiences of mental images, yet a closer inspection reveals this to be wrong. In his Content and Consciousness Dennett speaks of a mental image of a tiger. He notes the following:

Consider the Tiger and his Stripes. I can dream, imagine or see a striped tiger, but must the tiger I experience have a particular number of stripes? If seeing or imagining is having a mental image, then the image of a tiger must- obeying the rules of imaging in general-reveal a definite number of stripes showing, and one must be able to pin this down with such questions as ‘more than ten?’, ‘less than twenty?’. If, however, seeing or imagining has a descriptional character, the question needs no definite answer. Unlike a snap shot of a tiger, a description of a tiger need not go into the number of stripes at all; ‘numerous stripes’ may be all the description says. (Dennett: Content and Consciousness p.154).

Dennett’s argument is that people cannot count the stripes of their supposed mental images of tigers; therefore, purported mental images only seem to be images, and upon closer inspection they are revealed to be mere descriptions of a tiger[3]. Dennett is doing a similar thing when asking people to call up a mental image of a cow: he admits that it seems like we can do so; however, he asks us further questions which aim to cast doubt on the reliability of our introspections.
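
The contrast Dennett is drawing between descriptional and depictive representations can be illustrated with a toy sketch. The following Python fragment is our illustration, not Dennett’s (the data structures are invented for the example): in the descriptional record the number of stripes is simply unspecified, whereas in the crude depiction the question has a determinate answer that can be read off.

# A toy contrast (ours, not Dennett's) between a descriptional and a depictive
# representation of a striped tiger.

# Descriptional: a set of predicates. The number of stripes is not specified,
# so "how many stripes?" has no determinate answer here.
tiger_description = {"kind": "tiger", "colour": "orange", "stripes": "numerous"}

# Depictive: a crude "image" in which each '|' column is a stripe. Here the
# question has a determinate answer that can be read off the representation.
tiger_picture = [
    " | | | | | ",
    " | | | | | ",
    " | | | | | ",
]

def count_stripes(picture):
    """Count the stripe columns in the top row of the depiction."""
    return picture[0].count("|")

print(tiger_description.get("stripe_count", "not specified"))  # not specified
print(count_stripes(tiger_picture))                            # 5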

Dennett follows up his question about an imagined purple cow by asking us to imagine a yellow cow and to answer the same four questions about the mental image of the yellow cow. He then asks a fifth question: (5) What is the difference between imagining a yellow cow and a purple cow? Dennett argues that since nothing in the brain is purple or yellow, and since even if it were we have no internal eyes with which to see such creatures, such claims must be mere seemings rather than something we actually experience. Some philosophers have argued that since we can experience a yellow or a purple cow in our mind’s eye, and these experiences by Leibniz’s law cannot be identical with brain states which are neither yellow nor purple, and they are not events in the external world, they must belong to a different realm, the realm of the non-material mind. Dennett’s discussion of Shakey above is his way of doing away with this problem. The thought experiment of Shakey is designed to show a creature which can report on mental states that it does not experience, and Dennett argues that as a matter of empirical fact we are such creatures. We will argue, on the contrary, that Dennett is, as Nietzsche says, giving an unconscious autobiography of his own mental type. We will develop this point further when discussing Dennett’s take on Kosslyn’s experiment on mental rotation later in this paper.

Later in Consciousness Explained Dennett makes further claims about introspection which give us a deeper insight into the type of mind he has. He notes the strange fact that since Descartes’ time philosophers have been trying to describe the basic facts of their internal experiences in as clear a manner as possible, and yet have produced no clear results on which all agree. He briefly considers the possibility, which we are developing here, that people have different types of mind and that each is accurately describing his or her own type of experience, but he never really considers any evidence for or against this possibility. Instead he explains the lack of agreed-upon introspective results in terms of the fact that our internal world is so vague that over-theorising our vague internal experiences results in us unwittingly creating fictional internal worlds. It is important to note that Dennett does not offer much evidence to support his view that introspection is theorising on a vague, practically non-existent internal world. He points out that some experiments prove that people are wrong about aspects of their perceptual experiences about which they have made confident assertions. However, the fact that people can be wrong about their perceptual experiences does not really prove anything about their introspective experiences. It does show that people can at times be gullible theorisers, but what this points to is that we should be careful when introspecting and perceiving, not that introspection and perception are bad tools.

So why does Dennett deny that the reason people give different descriptions of their internal worlds is that they have different types of minds? We argue that he is guilty of committing the Typical Mind Fallacy. Dennett makes the following claim:

I suspect that when we claim to be just using our powers of inner observation, we are always actually engaging in a sort of impromptu theorising-and we are remarkably gullible theorizers, precisely because there is so little to ‘‘observe’’ and so much to pontificate about without fear of contradiction. (ibid, p.68)

The key claim to note from the above is Dennett’s claim that ‘there is so little to observe’ when we introspect. We argue that we should take Dennett at his word here: when he introspects there are no images to observe because he, like 5% of the population, is a non-imager. Both of the authors of this paper report that they find little to observe when they introspect: they experience virtually no mental images, and almost all of their thought is in terms of internal talking, although from time to time they do experience some mental images. So from an introspective point of view they can understand Dennett’s more extreme position, that of having no mental images, and we can see why Dennett would believe that introspection involves a kind of unconscious theorising. The important point is that while Dennett, David Berman, and David King are drawing on their own experiences, the reports which have been gathered by introspective psychologists indicate that not everybody shares the same experiences as Dennett. We argue, and will establish empirically, that people who are eidetic imagers will find Dennett’s claims absurd. They are right to view Dennett’s claims as absurd from the point of view of their own minds; however, they would be wrong to think that Dennett is incorrect about his own mind. Dennett’s mistake was unconsciously generalising from his own mind to all minds. We will illustrate in the next section that Dennett’s views on mental imaging are derived not only from his own experiences but also from theoretical considerations.

Section 3: Dennett on Rotating Mental Images

In chapter 10 of Consciousness Explained Dennett discusses some experiments which have been done on the rotation of mental images. The experimental data he discusses is extremely interesting. The first experiment is noteworthy because one of the authors (Dave King) was unable to perform it at all, owing to a difficulty in calling up mental images. In the experiment, subjects are asked to check whether two objects placed beside each other, but at different angles, are the same shape.

The subjects typically reply that they are. When asked how they know, they reply that they rotated one of the objects in their mind’s eye and superimposed it on the other. Now obviously this could be explained away as being just a way of talking which the subjects have. However, further experimental research casts doubt on this view. Subjects were asked to rotate an object which was at a 45-degree angle to an object beside it, to see if the two were the same size and shape; they were then asked to do the same for an object at a 90-degree angle to the object beside it. The idea was that, if subjects rotate the objects in their mind’s eye at a constant speed, the 90-degree case should take twice as long as the 45-degree case. The experimenters (Kosslyn, 1980) found that it did indeed take twice as long to rotate an object through a 90-degree angle as through a 45-degree angle. This would appear to show that some people do indeed think with mental images.
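
The logic of the timing prediction can be made explicit with a small sketch. The numbers below are illustrative assumptions of ours, not the actual experimental data: under a constant-speed rotation model, response time grows linearly with the angle, so the rotation component for 90 degrees takes exactly twice as long as for 45 degrees.

# An illustrative model (our numbers, not the experimental data) of the timing
# prediction: if images are rotated at a constant angular speed, response time
# grows linearly with the angle to be rotated through.

BASE_TIME = 1.0         # seconds assumed for encoding, comparing and responding
DEGREES_PER_SEC = 60.0  # assumed constant speed of "mental rotation"

def predicted_response_time(angle_degrees: float) -> float:
    """Predicted response time under the constant-speed rotation model."""
    return BASE_TIME + angle_degrees / DEGREES_PER_SEC

for angle in (0, 45, 90, 135, 180):
    print(f"{angle:>3} degrees -> {predicted_response_time(angle):.2f} s")

# Rotation component only: 90 / 60 = 1.5 s, exactly twice the 45-degree 0.75 s.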

Dennett is primarily worried about explaining these facts in a manner which does not commit him to the existence of a Cartesian Theatre, and he gives a theoretical account of the experiment in terms that do not presuppose one. When discussing the human ability to rotate mental images, Dennett compares it with the ability of a CAD system to rotate various shapes and display them on its cathode ray tube (CRT). He asks us to imagine a blind engineer using a CAD system as a prosthetic device to rotate images, communicating with the system by braille. This CAD system, which he calls the Mark 1 CADBLIND, is fitted with a computer vision system, complete with a TV camera aimed at the CRT; the vision system reads the shapes on the CRT and communicates the results to the blind engineer. Dennett goes on to argue that if we can design a Mark 1 CADBLIND system, then developing a Mark 2 CADBLIND will be easy:

We just throw away the CRT and the TV camera looking at it, and replace it with a simple cable. Through this cable the CAD system sends the Vorsetzer a bit-map, the array of zeros and ones that defines the image on the CRT. (ibid p 291)

So after the CAD system performs its calculation, it passes the information to the Vorsetzer (the visual system) and the information is translated. Dennett notes that we are not really saving much in terms of the calculations which still need to be done; we are just getting rid of some unnecessary hardware. He therefore proposes a Mark 3 CADBLIND system which actually saves on calculation:

So our Mark 3 CADBLIND will exempt itself from huge computational tasks of image-rendering by taking much of what it “knows” about the represented objects and passing it on to the Vorsetzer subsystem directly, using the format of simple codes for properties, and attaching “labels” to various “places” on the bit-map array, which is thereby turned from a pure image to something like a diagram. Some spatial properties are represented directly-shown-in the (virtual) space of the bit map, but others are only told about by labels. (ibid p293).

Here Dennett notes that such cancelling out only works if the two systems that need to communicate speak the same language; there would be a problem if the information which the CAD system had was not in a format that the Vorsetzer could use. Dennett argues that, given that the human brain was designed by the tinkering processes of natural selection, we should expect to find such difficulties of communication between the different areas of the brain. He speculates that, since diagramming is an effective way of communicating, the brain may use such processes. He argues as follows: ‘Diagrams do indeed amount to re-presentations of the information-not to the inner eye, but to the inner pattern recognition mechanism that can also accept input from the outer eye’ (ibid, p. 293).
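
Dennett’s idea of a bit-map that is turned from a pure image into something like a diagram by attached labels can be sketched in a few lines. The following Python toy is our rendering of the idea, not Dennett’s or anyone’s actual implementation (the grid, the label format, and the colour code are invented): the spatial layout is shown directly in the array, while colour is only told about by a label.

# A toy rendering (ours, not Dennett's code) of the Mark 3 CADBLIND idea: some
# properties are shown in the (virtual) space of the bit-map, others are only
# told about by labels attached to places on the array.

# 4x4 bit-map: the spatial extent of a shape is shown directly.
bitmap = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Labels attached to "places" on the array: properties told, not shown.
labels = {(1, 1): {"colour_code": 37, "part": "left edge"}}

def query_colour(cell):
    """Answer a colour question by consulting the label, not by rendering the
    colour into the pixels: the array itself contains only numbers."""
    return labels.get(cell, {}).get("colour_code", "no label here")

print(sum(row.count(1) for row in bitmap))  # 6 cells: the shape, shown spatially
print(query_colour((1, 1)))                 # 37: the colour, merely told about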

Twenty years after Dennett wrote Consciousness Explained, experiments by Kosslyn have confirmed that roughly two-thirds of the brain areas that are activated when a person actually sees (including the occipital lobe) are also activated when the person forms a mental image.

Not everyone accepts Kosslyn’s results (see, for example, Pylyshyn 2003); however, a substantial majority do. It has furthermore been shown that when people form mental images, geometric patterns of these images are formed in the brain (Kosslyn et al. 2006). Experimental work has also been done on monkeys which shows that topographic images are formed on their visual cortex when they view patterns of lights flashed to them by experimenters (Kosslyn et al. 2006).

Notably, despite arguing for an architecture similar to Kosslyn’s, Dennett draws a different conclusion. Kosslyn took these experiments to show that people really do rotate images in their mind’s eye; Dennett reads them entirely differently:

Or in other words, it is asking whether any of the pixel-demons want to tell it, “color number 37 here”. All the red is gone-there are only numbers in there. In the end, all the work in a CADBLIND system must be done by arithmetic operations on bit strings, just as we saw at the lowest level of Shakey in Chapter 4. And no matter how quasi-pictorial or imagistic the processes are that eventuate in the Vorsetzer’s verbal answers to questions, they will not be generated in an inner place where the lost properties (the properties merely “talked about” in the bit map) are somehow restored in order to be appreciated by a judge that composes the answers…People are not CADBLIND systems. The fact that a CADBLIND system can manipulate and inspect its “mental Images” without the benefit of a Cartesian Theatre doesn’t by itself prove that there is no Cartesian Theatre in the human brain, but it does prove that we don’t have to postulate a Cartesian Theatre to explain the human talent for solving problems in “the mind’s eye”. There are indeed processes that are strongly analogous to observation, but when we strip down Kosslyn’s CRT metaphor to its essentials we remove the very features that would call for a Cartesian Theatre.  (ibid, p. 297)

Dennett is in effect arguing that while it may seem to some people that they are forming mental images in their mind’s eye, closer inspection reveals that they are doing no such thing: he is trying to explain away people’s verbal statements that they are forming mental images. Most people find Dennett’s view on this topic a bit strange. They can understand how Dennett can treat other people’s verbal reports using the heterophenomenological methodology, which treats people’s verbal reports as epistemically on a par with the verbal reports of Shakey the robot. However, the problem which most people have with Dennett’s characterisation of mental images comes from the first-person point of view. Independently of my verbal reports to others, if I see a purple cow in my mind’s eye, then I see it. Consider, by analogy, a table that I see in front of me. It could be argued that, despite appearances, the table is really just a colourless collection of quarks and gluons; nonetheless, I still see a table in front of me. Likewise, while my mental image of a purple cow may be caused by messages to my occipital lobe from the temporal lobe, what I see is a purple cow. Dennett’s denial of mental images is only explicable, from a first-person point of view, if he is in fact incapable of forming such images. If he experienced vivid images then surely he would find his own CADBLIND and Shakey thought experiments less impressive than he does.

He goes on to ask us to perform the following introspective experiment:

Here is a simple test to remind us of how limited our imaging abilities actually are: In your mind’s eye, fill in the following three by three crossword puzzle, writing the following three words in the columns, starting with the left column: GAS OIL DRY. (ibid p.295)

He notes that if the words were written on a page the horizontal words would pop out, whereas in the mind’s eye they do not. Here he is guilty of the typical mind fallacy: he assumes that because he cannot do it, nobody can. He admits there may be variation in people’s abilities to do crossword puzzles in their mind’s eye, and even says that some people may be better imagers than others. However, in the next sentence Dennett reminds us that in the ultimate sense it is all tell and no show (ibid, p. 295); in other words, we do not really use mental images to solve the crossword puzzle. So here again we suggest that the fact that Dennett is a non-imager plays a massive role in the theoretical views he holds about the nature of mental imagery.
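
For readers who, like the present authors, cannot complete the exercise in the mind’s eye, the completed grid can simply be built on the page. The short Python sketch below is ours, for illustration only: it writes Dennett’s three words down the columns and reads off the rows, which is where the horizontal words “pop out”.

# The completed grid for Dennett's exercise, built on the page rather than in
# the mind's eye: write GAS, OIL and DRY down the columns and the horizontal
# words pop out when the rows are read off.

columns = ["GAS", "OIL", "DRY"]

# Transpose the column words into the rows of the 3x3 grid.
rows = ["".join(word[i] for word in columns) for i in range(3)]

for row in rows:
    print(" ".join(row))
# G O D
# A I R
# S L Y
print("Horizontal words:", ", ".join(rows))  # GOD, AIR, SLY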

Oliver Sacks recently wrote about a patient of his doing crosswords in her mind’s eye. We will give the quote in full here, because it contrasts strongly with Dennett’s view:

I was reminded, when Lillian told me of this, of a patient I had seen in the hospital some years before, who overnight became totally paralyzed from a spinal cord infection, a fulminating myelitis. When it became evident that no recovery was forthcoming she fell into despair, felt that her life was over-not only the great things of life but the little familiar pleasures of each day, like doing the New York Times crosswords, to which she was addicted. She requested that the Times be brought to her each day so that she could at least look at the puzzle, get its configuration, run her eyes along the clues. But when she did this something extraordinary happened, for as she looked at the clues, the answers seemed to write themselves in the spaces. Her visual imagery strengthened over the next few weeks, until she found that she was able to hold the entire crossword and its clues in her mind, after a single, tense inspection, and then solve it mentally, at her leisure later in the day. This became a source of great solace to her, in her paralysis; she had no idea, she later told me, that such powers of memory and imagery were available to her. (Oliver Sacks: The Mind’s Eye p. 24)

This example is obviously anecdotal, and a verbal report cannot simply be assumed to be a correct description of an experience. Nonetheless, since introspective reports, behavioural evidence and neuroscientific evidence all indicate that people differ widely in their imaging abilities, we have no reason to doubt Sacks’s patient. We suggest that Dennett could benefit from reflecting on this variation in people’s mental imagery, and that such reflection is necessary to stop him from unconsciously projecting his own type of mind onto all other people’s minds. One of the authors of the present paper (Dave King) tried Dennett’s introspective task of imaging the crossword. He described his experiences as follows:

When trying to imagine the crossword in my mind visual images played a limited role. I spelled the words out ‘verbally’ in my mind, and sometimes could see a glimpsy word in my mind’s eye. And I figured out what the words were by a process of memory, spelling out each word ‘verbally’ and then through memory figuring out what each horizontal word would be. Some imagery was present; I vaguely ‘saw’ a square and red dots appear where I would imagine the words would be if in a crossword. Looking at each red dot seemed a physical experience; I could feel my eyeballs move as I looked at the dots. The images did not seem to play any role in figuring out the words though, as I said, images were present a lot of the time.

In an earlier paper, the co-author of the present paper David Berman described his own experience of trying to form a mental picture of a similar game:

Here is another similar test for image impairment. I say: let’s play noughts and crosses. You agree. But instead of your playing it on paper, I ask you to play it in your head or mind’s eye, using the a, b, c and 1, 2, 3 grid, as used by chess players. Here again the weak or non-imager is going to show his impairment; whereas the photographic and eidetic imager could not only easily play noughts and crosses in his head, but (amazingly to me) actual chess as well. Of course, working out the place of a letter in a word, or playing mental noughts and crosses, is not a task that comes up that often. But what does arise rather more normally is a question like: ‘Dad, how do you spell the word . . . ?’ For a low imager, that isn’t difficult if the word is short. But it is extremely difficult, as I know from my own experience, if the word is long, for then it is hard to be sure which letter one has reached by the middle or end of the word. Again, it is easy enough to do it if you can write the word down, and that is precisely what the strong imager can do, although in his mind’s eye. (Philosophical Counselling for Philosophers: p. 7)

The views of Dennett, Sacks’s patient, David Berman and Dave King are all different. Dennett, because he is, we believe, a non-imager, assumes that the differences result from people theorising on the basis of their limited, vague internal experiences. We argue that he believes this because when he introspects he experiences no mental imagery, and he incorrectly assumes that all people have minds which are the same as his.

                    Section 4: Dennett as a Heterophenomenological Subject

Dennett discusses pain in Chapter 3 of Consciousness Explained; the chapter is called ‘A visit to the phenomenological garden’. When discussing pain Dennett asks us to perform the following introspective experiment:

Close your eyes and imagine that someone has just kicked you, very hard, in the left shin (about a foot above your foot) with a steel-toed boot. Imagine you almost faint, so nauseatingly sharp and overwhelming is the jolt of pain you feel. You imagined it vividly; did you feel any pain? Might you justifiably complain to me that following my directions has caused you some pain? Some find it somewhat disturbing, and others find it a rather enjoyable exercise of the mind, certainly not as unpleasant as the gentlest pinch on the arm that you could call pain. (Consciousness Explained, p. 60)

Dennett then asks the following two questions:

(1) Imagine you dreamt of experiencing the same kick in the shin that you imagined in your introspective experiment, and you woke up clutching your shin. Are dreamed pains real pains, or are they a sort of imagined pain?

Dennett’s answer is that dream pains are at least something that we do not like having, whereas imagined pains are something nobody minds having. He then asks another question:

(2) Compare dream pains to pains that arise in you while you sleep, when you roll over and inadvertently twist your arm into an awkward position and then, without waking up, without noticing it at all, roll back into a more comfortable position. Are these pains? Dennett is ambivalent in his answer to this question. He seems to ask it only to illustrate how unclear the notion of pain is: despite some philosophers thinking that qualia like pain are immediately known, the experience is in fact much vaguer than we think.

To support our view that Dennett is a non-imager inadvertently committing the Typical Mind Fallacy, we asked 12 people to do his introspective task. We aimed for a qualitative approach: we asked our participants to detail their experiences in as rich a manner as possible. We argue that such an approach is in keeping with the manner in which Dennett answered his own question. We sent our introspective tests to over 40 people and used the results from the 12 people who answered. Our aim is merely to illustrate the divergences which occur within our small sample. We think that the divergences are suggestive and warrant further research on a larger scale, with more rigorous statistical analysis which takes account of age, sex, culture, educational level, etc. For the purposes of this paper we argue that our small sample is enough to show that Dennett is wrong to generalise from his own type of mind to the structure of all minds.

Dennett’s manner of answering his own question, combined with other studies, reveals a lot about his kind of mind. He argues that imagined pain is something that people do not mind having, while dreamed pain is something people do mind having. Most of the people who answered our questionnaire did not agree with him on this. We also think that Dennett’s account reveals that his dream experiences are more real for him than his imagined experiences. This is significant because one of the authors of this paper, David Berman, has evidence that non-imagers typically have vivid dream experiences[4]. This pattern was also observed by the psychologist Bill Faw[5]. So the fact that Dennett admits by implication that his dreams are more vivid and real than his imaginings fits well with our claim that he is a non-imager.

So we argue there are four key facts which indicate that Dennett is a non-imager who is guilty of unconsciously assuming that all people have the same mental architecture as himself. (1) There is evidence of great philosophers making this mistake in the past[6]. This obviously does not prove that Dennett is guilty of the same mistake; it is merely intended to establish historical precedent and to open up the theoretical possibility that Dennett may be guilty of it. (2) The fact that Dennett describes himself in terms of Shakey and the Mark 3 CADBLIND, creatures who have no imagery whatsoever, indicates that he, like them, has no mental imagery. (3) The study of Reisberg et al. (2002) indicates that non-imagers typically remain sceptics about mental imagery despite the overwhelming evidence in favour of its existence. (4) Dennett’s confession that he finds dream pains more vivid than imagined pains is consistent with the findings of Faw (1997) and Berman (2009) that non-imagers typically experience vivid dreams.

                                     Section 5: Dennett: Are Dreams Experiences?

The psychologist Bill Faw has argued that people who are non-imagers typically experience vivid dreams. A similar pattern has been noted by David Berman in his studies of people’s introspective abilities over the years. In his 2009 “A Manual of Experimental Philosophy” two of his subjects were non-imagers, and both of them reported having strong dream experiences. One of his subjects, Timo, made the following report to Berman:

I must confess that I cannot see any images in my mind, except when I fall asleep. But I can for instance freely wander around my childhood home in Vaasa, from room to room, and I can ‘see’ quite exactly what was there. I go back to that flat. But I do not have visual images which are comparable with those of dreams. I just ‘remember’… I suspect that people call the pictures of imagination pictures even if they do not see any pictures. They conventionally call them so… David, believe me, I don’t see any pictures. I do the construction conceptually, and I remember what I thought in the (black) visual space. When I walk through my home, I use words. Believe me I see nothing except in some wide figurative sense perhaps. But no images. It is all about thinking not seeing. (A Manual of Experimental Philosophy: pp. 63-66)

We can see from Timo’s report that his view of images as a type of metaphor is similar to Dennett’s. Like Dennett’s, his report indicates dream experiences much more vivid than anything his imaging abilities provide when awake. The other non-imager in Berman’s study, Marek, made a similar claim about his dream experiences:

Yes I do have visual dreams, in fact the images (plus voices, of course) are so vivid that I try and do interact with the people in the dreams. So like Descartes predicted, I am fooled in my dreams and take them for reality. Sometimes though I can tell that I am dreaming when something very illogical happens. (ibid p. 70).

 

We have argued in this paper that Dennett is a non-imager and that, like many non-imagers, he has vivid dream experiences. The evidence we cited above from Berman and Faw indicates that there is a correlation between non-imagery and strong dream experiences. We have also argued that Dennett’s discussion of the difference between imagined pain and dreamed pain further illustrates our point. He claims that dreamed pain is more vivid than imagined pain, and this claim fits nicely with our empirical data that non-imagers have vivid dream experiences[7].

There are, however, two objections which can be made to our claim. (1) Our claim that Dennett has vivid dream experiences seems inconsistent with our claim that he denies that mental images exist: if Dennett does indeed experience mental images while dreaming, then this seems to undermine our claim that he denies the existence of images because he never experiences them. (2) Dennett has written two papers about the nature of dream experiences, and his views there do not, on the face of it, sit well with our view that he has strong dream experiences. We will deal with Dennett’s theoretical papers on dreams first, and then answer objection (1).

In Dennett (1978b) he aims to undermine the authority of the received view that dreams are experiences. The received view claims that dreams are experiences that occur during sleep, and that these experiences consist of sensations, thoughts, images and feelings organised in narrative form, occurring somehow in awareness even though we are not consciously aware at the time. Dennett aims to cast doubt on the received view and to outline a view of experience and memory which he can incorporate into his physicalistic theory of consciousness.

Dennett notes that the received view is an empirical theory, and that there is some anecdotal evidence which counts against it. For example, people are often woken by an alarm which they also hear in their dream, even though the REM period in which the dream supposedly occurred took place hours before; so unless the dreamer has precognition, he cannot have dreamt of the alarm at that time. Dennett claims that this suggests that dreams may be mere recollections, not something we actually experience. He conjectures that science may show that dreams are like déjà vu, i.e. it only seems as if we had them before (Dennett: Are Dreams Experiences p. 155). We may, he speculates, have dream memories constructed in our brains ready to be ‘recollected’ upon waking.

Dennett argues that the difference between the received view and his alternative (cassette) conjecture is that on the new view, unlike the old, there is nothing it is like to have a dream (though there is something it is like to remember a dream). For Dennett there is nothing to distinguish the received view from the cassette alternative. He notes that only our common-sense intuition counts against his alternative, and he argues that common-sense intuitions should not decide a scientific matter.

Dennett points out that one criterion which counts against the received view is ordinary language. We would normally say that to experience something we must be conscious of it; on the received view, however, we supposedly have unconscious experiences. So this is a reason to prefer the alternative view: it is less vulnerable to conceptual confusion. He offers a further thought experiment which he claims shows that dreams may not be experiences. Suppose a person, as a result of some brain twinge, claims to have seen a ghost at 12pm. Suppose further that we are with the person at 12pm, and that between 12pm and 12.14pm he behaves normally and gives no indication of seeing a ghost. Now suppose that at 12.15pm this person becomes agitated and claims that he saw a ghost at 12pm. Since the person gave no indication of this at 12pm, we will assume that at 12.15pm, as a result of the brain twinge, he is having a hallucination of a recollection (ibid, p. 20). Dennett claims that this hallucination story has implications for the theory of dreams. Normal dreams are not reported until after the fact, so we do not have sufficient evidence to say that people experience dreams as they occur. He makes an exception for nightmares, because there we have behavioural evidence at the time in the form of increased heart rate, moans and cries. Bad dreams which lack such behavioural evidence at the time we claim to have experienced them would not, on his alternative, be experiences, though he allows that we could remember the dream in agony (ibid p. 20).

He argues quite plausibly that if paralysis is only peripheral and more central processes are lit up, e.g. the visual system is lit up (which it is), this may count towards us saying that dreams are experiences. However, given that the visual area is lit up in blind sight and hysterical blindness this shows that we cannot say for certain that a lit up visual area means that we are having dream experiences. He notes that it is an open theoretical question whether dreams are experiences.

He argues that we will want to investigate four things if we are to construct an accurate theory of dreams.

(1)   The role of experiencing in guiding current behaviour.

(2)   Our current abilities to say what we are experiencing.

(3)   Our recollective capacities to say what we have experienced.

(4)   The functional saliencies which emerge from empirical investigation.

So, in effect, Dennett argues that we cannot at present decide between the alternatives; whichever alternative we adopt must be established empirically.

In his later paper “What is Dreaming for, if anything?” (2013) Dennett defends this Malcolm-type view of dreams against an alternative proposed by the psychiatrist Allan Hobson, whose theory of the nature of dreams is a version of the received view. Dennett again offers a series of arguments which he thinks show that his view is as viable as the received view. The one area where he expands upon his earlier theory is in emphasising the theoretical possibility that linguistic competence plays a central role in developing our hallucinations of dream experiences.

It might be felt that the view outlined in Dennett’s two papers on dreams, namely that dreams may not be experiences, is strong evidence that our claim that he is a vivid dreamer is incorrect. We think, on the contrary, that Dennett’s papers on dreams support our position rather than going against it.

It is worth noting that the empirical evidence that non-imagers have vivid dreams seems, on the face of it, to be a strange fact: if a person is incapable of forming mental images, it is odd that they can have vivid dreams, which would seem to involve imagery. However, from a phenomenological point of view mental images are different from hallucinations. People never take mental images for the real thing (even eidetic imagers do not), whereas people typically confuse hallucinations with real experiences. The fact that dreams are typically confused with reality indicates that they have the quality of a hallucination rather than of imaging. Dennett, while claiming that dreams may not be directly experienced, does indicate that we experience our memories of them and that they are hallucinations of recollections of something we never experienced (ibid, p. 20). But we experience the actual hallucination, and the hallucination of a bad dream can be remembered in agony (ibid, p. 20). What this means is that although the hallucination is a false memory, we still experience the hallucination itself, and it can leave us in agony. So this view is actually consistent with Dennett’s claim that imagined pain is not as severe as dreamed pain.

We will not evaluate Dennett’s defence of Malcolm’s claim that dreams are not directly experienced, because it is not relevant to this paper and Dennett admits it is only a theoretical possibility. It is enough to note that nothing in Dennett’s papers is inconsistent with our claim that he is a non-imager who has vivid dream experiences. Even in his papers on dreams, Dennett indicates that, if we consider dreams as hallucinated memories, we can still experience these hallucinations in agony. This contrasts strongly with his claim in Consciousness Explained that people cannot imagine real pain, and it is consistent with his claim that dreamed pain is something we mind having. It is also consistent with our view that Dennett is a non-imager who is a vivid dreamer. Dennett is guilty of assuming that his idiosyncratic psychology is representative of human minds. He is not alone in this: Berman has argued convincingly that Locke and Berkeley were also guilty of it. We think it is worth researching whether other thinkers in the philosophy of mind, like Nagel and McGinn, are guilty of making a similar mistake. Philosophers like Dennett and Quine have argued that philosophy should become more naturalistic and empirical. We aim to extend this empiricism to the study of the individual minds of philosophers and of how their minds unconsciously influence their philosophical theories.

Section 6: Real Seeming

In his Content and Consciousness Dennett claims that while it may seem to some people that they have mental imagery, closer examination reveals that what they call a mental image is really only a description. More than two decades later, in his Consciousness Explained, when discussing Kosslyn’s experiments on mental imagery, Dennett again claimed that, despite appearances, mental imagery is really all tell and no show. One curious thing about Dennett’s view is his claim that, despite the way things seem, mental imagery is really mental description. What is strange about this is the assumption that a description can seem like an image; this is a very odd way to understand the word ‘seem’. A paradigm example of an x seeming like a y is given by Descartes: he talks about how a stick in transparent water will seem to be bent because of light refraction, though in reality the stick is not bent. What Descartes means by ‘seems to be’ is ‘appears to be’, and this of course is the standard meaning of ‘seems to be’. However, even to a weak imager like me, it is patently obvious that mental images are nothing like mental descriptions. If something really seemed (as in appeared) to me like an image, then I would have an experience of something image-like, and a description is in no way image-like. This leads to the question of what Dennett could possibly mean when he admits that it at least seems to some people that they experience mental images.

In Consciousness Explained Dennett carefully explains what he means by the word ‘seems’; evaluating his views on this will help clarify his strange beliefs about the nature of images. In Chapter 5, Section 5 of Consciousness Explained Dennett discusses the colour phi experiment[8], and in this discussion he makes explicit his strange views on the nature of ‘seeming’. The colour phi phenomenon is apparent motion. We see examples of it on our television screens every day, where a series of still pictures flashed one after the other at a certain speed creates the illusion of motion. Dennett discusses a simple example of phi in which two spots separated by as much as 4 degrees of visual angle are flashed in rapid succession, creating the illusion of a single spot moving back and forth (ibid, p. 114). Kolers and von Grünau (1976) did a phi experiment with two dots, one red and one green, flashing on and off. This gave the illusion of a red spot starting to move and changing colour to green mid-passage. Since the red dot is not moving and does not turn into a green dot, we need to ask what is going on with this illusion. As the red dot appears to move, we see it turn green on its way to its destination. The question is: how do we see the dot turn green before we actually see the green dot? One might think that the mind must first register the red dot and then the green dot, and that only after that point is the apparent motion played before the mind’s eye. To think this way, Dennett warns, is to demonstrate that one is still in the grip of the metaphor of the Cartesian Theatre (ibid p. 115).

To loosen the grip of this picture on our minds Dennett discusses two fictional processes which one could attribute to the brain. He calls them the Orwellian Process and the Stalinesque Process.  The Orwellian Process occurs when I misremember something because my brain tampers with what I remember so that I no longer remember accurately. The Stalinesque Process is where the brain projects a false picture of reality into the mind’s eye. Dennett notes that while a distinction between Orwellian and Stalinesque processes makes sense in the external world it is an illusion to assume that it makes sense as an explanation of what is going on at the level of the brain.

Let us think of both of these processes as they apply to the case of colour phi.  In the Orwellian case we did not see the apparent motion; our brain merely revised our memory and informed us that we did see the motion. In the Stalinesque case we unconsciously registered the two dots and afterwards our brain created a kind of mock event for us to watch. Dennett notes that once we give up the notion of Cartesian Materialism, we will see that there is no answer to the question of whether the Orwellian or Stalinesque process took place. He puts things as follows:

So here is the rub: We have two different models of what happens to the color phi phenomenon. One posits a Stalinesque “filling in” on the upward, pre-experiential path, and the other posits an Orwellian “memory revision” on the downward, post-experiential path, and both of them are consistent with whatever the subject says or thinks or remembers…Both models can deftly account for all the data-not just the data we already have, but the data we can imagine getting in the future (ibid, pp. 123-124)

So there is no fact of the matter which can decide between the two different stories. Dennett argues that the reason that we cannot decide between the two accounts is that there is really only a verbal difference between them. With Dennett’s rejection of Cartesian Materialism and his alternative multiple-drafts theory of consciousness we can no longer draw a non-arbitrary line to decide when an event becomes conscious.  There is therefore no fact of the matter as to whether an Orwellian or a Stalinesque process took place.

When Dennett claims that we cannot decide between the Stalinesque and Orwellian alternatives we are left with what seems like a mystery. In the external world a red dot is not really moving and turning into a green dot, yet Dennett is also denying that a Stalinesque show trial is played before the mind’s eye. So the obvious question is: where does the movement of the dot occur? Dennett’s answer is that the dot does not move and turn green; it only seems to. However, to say that the dot seems to move is to say that people have an experience of the dot moving. And this leads us back to our original question: what generates this experience, and how is it generated? Dennett thinks that this is a bad question, because the brain does not need to create an experience of the dot moving; it merely has to form a judgment that such movement occurred:

The Multiple Drafts model agrees with Goodman that retrospectively the brain creates the content (the judgment) that there was intervening motion, and that this content is then available to govern activity and leave its mark on memory. But the Multiple Drafts model goes on to claim that the brain does not bother “filling in” the blanks. That would be a waste of time and (shall we say?) paint. The judgment is already in, so the brain can get on with other tasks. (ibid, p. 129)

This claim of Dennett’s is extremely strange. He is claiming that the brain judges that the motion occurred, but that, as a matter of fact, we do not experience the motion; we only think we do. The obvious reply to this is to state categorically that I do experience the movement, and that I judge that the movement occurred on the basis of this experience. In other words, the experience is prior to the judgment. The experience is not of a fact in the external world (where no movement occurred); it is rather an experience of the person’s subjective qualia. When Dennett denies that it is the experience that leads to the judgment, he is leaving the phenomenal experience out and focusing entirely on access consciousness.

The claim that Dennett is denying the existence of phenomenal consciousness is, on the face of it, an incredible one. So before proceeding it is important that we show that this really is Dennett’s position, which we will now do by quoting him at some length. When discussing phenomenal space Dennett makes the following claim:

Now what is phenomenal space? Is it a physical space inside the brain? Is it the onstage space in the theatre of consciousness located in the brain? Not literally. But metaphorically? In the previous chapter we saw a way of making sense of such metaphorical spaces, in the example of the “mental images” that Shakey manipulated. In a strict but metaphorical sense, Shakey drew shapes in space, paid attention to particular points in that space. But the space was only a logical space. It was like the space of Sherlock Holmes’s London, a space of a fictional world, but a fictional world systematically anchored to actual physical events going on in the ordinary space of Shakey’s “brain”. If we took Shakey’s utterances as expressions of his “beliefs”, then we could say that it was a space Shakey believed in, but that did not make it real, any more than someone’s belief in Feenoman would make Feenoman real. Both are merely intentional objects. (Ibid, pp. 130-131)

The above passage is very instructive. It speaks to our topic of mental images and again shows that Dennett thinks of them as theorists’ fictions. Furthermore, his invoking of Shakey, which, despite its verbal reports, is not experiencing any items in phenomenal space, shows that Dennett thinks that we, like Shakey, despite our verbal reports, are not experiencing anything in phenomenal space. Dennett is claiming that our brains may tell us that we have such and such experiences, and that as a result of this brain report we form the judgment that we saw a red light move and turn into a green light. However, this judgment, despite appearances, is not grounded in a phenomenal experience.

It is worth noting that many thinkers misinterpret Dennett’s claims about ‘seeming’ and colour phi as indicating that he denies that we experience colours. This is not the case. Dennett’s arguments above apply only to colour hallucinations; he tells a different story about how we perceive colour in the external world.

To understand Dennett’s views on colours it is helpful to think in terms of the primary/secondary quality distinction. One of the central motivations for claiming that the world is in your head is the existence of secondary qualities. When one looks at a beautiful garden one sees a variety of different colours: the bright yellow sunflower, the green grass, the multi-coloured butterflies, the blue sky and the bright yellow sun. Since the seventeenth century, people like Galileo and Locke have been telling us that colours do not exist in the mind-independent world: colours are effects of light reflecting off objects and hitting our retinas in a variety of different ways. The majority of scientists since Galileo accept this dichotomy between primary and secondary qualities. The primary qualities are solidity, extension, motion, number and figure, while the secondary qualities are colour, taste, smell and heard sound. One consequence of accepting this picture is that the world is not as it reveals itself to us in our experience; in particular, colours do not exist in the mind-independent world. A further consequence is that the rich world we experience, consisting of tastes, smells and colours, exists only within our minds. So on this view we have a subject who is presented with certain experiences, only some of which correspond to mind-independent entities. The Cartesian materialist who accepts this world picture has a difficult job on his hands. Nowhere in the brain is the experience of a blue sky or a yellow daffodil located. He may be able to provide neural correlates for these experiences, but he will not be able to point to the spatio-temporal location where the experience and the subject are located. So presumably the Cartesian materialist will have to argue for a strong emergence thesis.

Rather than going down this road, Dennett interprets the dichotomy between primary and secondary qualities differently from most contemporary theorists. He has discussed the status of colours throughout his philosophical development: in particular in his 1968 Content and Consciousness, his 1991 Consciousness Explained and his 2005 Sweet Dreams: Philosophical Obstacles to a Science of Consciousness. I will now try to give a synoptic view of Dennett’s views on the topic of colours. In his first major discussion of colours he noted that while most believed that colours are secondary qualities and do not exist in the external world, there are reasons to doubt this conclusion.

He centres his criticism on language and on what we are referring to when we use colour words. If we view colours as secondary qualities, we are committed to the view that when I refer to something red I am referring to something within my mind. But if we accept this view, then when two people claim that they are referring to something red we cannot know whether they are referring to the same thing, since their inner experiences of red may differ, and we have nothing public against which to compare their experiences. If we do not want to admit that a teacher can never know when his pupil has actually learned the meaning of the word ‘red’, we must admit that colour words refer to publicly observable entities.

One difficulty with solving this sceptical problem of colour reference by arguing that colour words refer to publicly observable entities is that it leaves us with a conundrum about where colours exist. They do not exist in the mind-independent world, and they do not exist in the mind, and there is nowhere else for them to exist. So one is led to the silly conclusion that colours do not exist anywhere. This conclusion must be wrong, and Dennett correctly notes that colour words refer to publicly observable entities that one can be right or wrong about (Content and Consciousness, p. 161). So colours seem to exist, and to exist in a publicly observable sphere.

For Dennett, since colours are publicly observable entities which we can be right or wrong about, they must be a property of the external world. This leaves him with the question: what property exactly are they? He notes that colours are not reflective properties of surfaces which can be cashed out at a sub-atomic level. This is because:

“Although the sub-atomic characteristics of surfaces that reflect light predominantly of one wavelength can now be described in some detail, those different types of surface do not correspond neatly to the colours we observe things to be.” (ibid, p. 161)

Also, different wavelengths of reflected light can cause the same colour experience in a person. So the job of characterising what property colours actually are is more complex than one might assume. Dennett notes that when a person is referring to red we will need to cash out what property they are referring to in terms like: the person is referring to the reflective property of x or y or z…(and the disjunction associated with one colour might be very long).

Dennett asks: if the properties in the disjunction associated with a person’s experience of a colour have little in common with each other, are we driven to the conclusion that colours do not exist? To think through this possibility he considers colour-blind people, who have poorer discriminative capacities than us, and a hypothetical alien whose colour-discriminative capacities are greater than ours. He notes that we would say that the colour-blind man who says that ripe (red for us) apples and grass are the same colour is suffering from a cognitive illusion. On the other hand, if an alien had greater discriminative capacities than us, so that it constantly saw things as changing colour, we would also say that it was experiencing colour illusions. This is because the meaning of colour terms is defined in terms of OUR discriminative capacities, which means that WE judge certain things in the world to be red, green etc. So relative to our form of life these other perceivers would be suffering from a form of cognitive illusion.

Dennett concludes with the following statement:

Colour is not a primary physical property like mass, nor is it a complex of primary properties, a structural feature of surfaces. Nor again is it a private ‘phenomenal’ quality or an ‘emergent’ quality of certain internal states. Colours are what might be called functional properties. A thing is red if and only if when it is viewed under normal conditions by normal human observers it looks red to them, which only means: they are demonstrably non-eccentric users of colour words and they say, sincerely, that the thing looks red. Their saying this does not hinge on their perusal of an internal quality, but on their perception of the object, their becoming aware that the thing is red (ibid, p.163)

I am not really sure whether Dennett manages to avoid the problem of where the experience of red is located. However, it should be obvious that he is not denying that colours exist; rather, he is claiming that they are not paraded in a Cartesian Theatre.

In Part 1 of this paper we have shown that Dennett denies that mental images exist. He treats us as robots like Shakey, whose programming forces us to judge that we experience mental imagery. He explains away the fact that people report that it seems to them that they experience mental imagery by denying that ‘real seemings’ exist. Dennett’s account fails: he cannot explain why some people report experiencing vivid mental imagery while others report that they do not, and it cannot accommodate the heterophenomenological reports of ordinary human subjects. Furthermore, when judged from a heterophenomenological point of view, his own reports are consistent with those we would expect from a non-imager. So we conclude that when it comes to mental imagery Dennett is guilty of committing the Typical Mind Fallacy.

In our introduction we discussed Berman’s Typology, which divides philosophers into three main types, and we argued that Dennett is a combination of the Type 2 and Type 3 thinker. We have seen in Part 1 that Dennett denies that things like the colour phi phenomenon, afterimages and mental images are presented before the mind’s eye. This denial is consistent with the type of claims made by William James. Dennett’s denials are strong indications that he is not a Type 1 Mentalistic Thinker like Descartes; rather, he is, like William James, a Type 2 Bodily Thinker. In Part 2 we will show that while Dennett is primarily a Type 2 Bodily Thinker, he is also, to a lesser extent, a Type 3 Socio-Linguistic Thinker. We will show this by evaluating Dennett’s views on the nature of pain.

PART 2: DENNETT ON PAIN

Dennett primarily explored the nature of pain in the early days of his philosophical development, between 1968 and 1978. His views on pain are subtle, and understanding them will give us an insight into the nature of his type of mind. We will begin by discussing his views on pain as developed in his 1968 doctoral thesis Content and Consciousness. We will then discuss his later views as developed in “Why You Can’t Make a Computer That Feels Pain”. Once this is done we will show what these views reveal about the nature of Dennett’s mind.

On page 6 of Content and Consciousness Dennett denies the identity theory. He argues that pains are too different from brain states for one to be identified with the other. He further cites Putnam’s argument that the identity theory must be false because it leads to absurd, counter-intuitive results. Thus, if the identity theory were true, then only creatures with identical chemistry could think the thought that ‘Spain is in Europe’; this is surely false, since clearly a robot or an alien could think that thought. So Dennett argues that the identity theory is false. However, without the identity theory we are faced with all the problems of dualism (how do the mind and body interact, etc.). He argues that many people are drawn to the identity theory because it seems less problematic than the intractable dualistic picture. Nonetheless, he argues that the identity theory, like dualism, is false, so we need a new way of dealing with these problems.

Dennett’s suggestion is a kind of blend of Ryle and Quine. Ryle had long maintained that it is a category mistake to try to reduce mind talk to brain talk, because they belong to different logical types. Critics of Ryle noted that he never really spelled out what he meant by ‘different logical levels’ or by things belonging to different ‘conceptual categories’. Dennett aimed to make sense of talk of the mental and the physical belonging to different categories by appealing to Quine’s philosophy of language as spelled out in Word and Object. Quine had noted that certain phrases, such as ‘for-the-sake-of’, are not translatable into the syntax of quantification and so have no clear identity conditions which would force us to be ontically committed to the existence of ‘sakes’. We can say things like ‘for-the-sake-of-John’, but we cannot say whether there is one ‘sake’ or two, and we cannot say of ‘sakes’ that they have attributes such as being ‘a good sake’ or ‘a big sake’. Dennett argues that, despite appearances, terms like ‘sakes’ are non-referential, and that confusing non-referential terms with referential terms can result in a series of intractable problems. To see this he asks us to think of the term ‘voice’. ‘Voice’ is used in various phrases, and these phrases give the appearance that ‘voice’ refers to an entity in the world. So when, for example, it is said that ‘John has lost his voice’, an identity theorist will assume that ‘voice’ is identical with some physical process going on in John, that the word ‘voice’ refers to a physical process x. There are problems with this view. Suppose one were to identify John’s voice with a particular fact about John’s vocal cords. This identification is not complete: there are digital recordings of John’s voice which will survive even after John has died. Given that the recordings are in different media, it makes little sense to identify John’s voice with vocal cords, and CD recordings, and records; all that these various things have in common is a disposition to cause certain experiences in normal people. Do we say that ‘John’s voice’ denotes various different concrete objects, so that ‘John’s voice’ is a universal? Do we say that ‘John’s voice’ is somehow identical with all the various instantiations? If so, it is hard to make sense of claims like ‘John’s voice is strained’ or ‘John has lost his voice’. When we say that ‘John has lost his voice’, do we mean that his vocal cords are damaged and that he has lost all recordings of his voice? Dennett claims that such silly questions can be avoided by accepting that ‘voice’, like ‘sakes’ and ‘dints’, is a non-referential term. These are ways of talking which are useful in some contexts but which do not carve nature at its joints. He suggests that we can treat ‘the mind’ in a similar way to ‘sakes’ and ‘voices’: a useful way of talking which does not pick out an entity in the external world.

The concept of pain is one of the central concepts for any theory of the mind. Some people have used pain as a paradigm case for the identity theory: to be in pain state x is to be in neural state y. Dennett notes that this position has its difficulties, because pains are nothing like brain states. If pain state x has intrinsic properties which are not shared by neural state y, then by Leibniz’s law they cannot be the same thing. So Dennett admits that brain states and pain states are not the same thing. Since he does not want to defend any kind of emergent dualism, he instead decides to dissolve the problem by analysing ‘pain’ talk as non-referential, just like talk of ‘voices’, ‘sakes’ and ‘dints’.

Dennett notes that if we try to explain the nature of pain at the personal level we will run into a series of cul-de-sacs. (1) We cannot give a non-circular analysis of the claim ‘I am in pain’. If someone asks me how I know I am in pain, or how I can tell pain from non-pain, no real explanation is possible. I could say that pain hurts but non-pain does not; however, this merely relies on the unanalysed notion of ‘hurt’, which will presumably have to be explained by invoking ‘pain’, so our purported explanation is circular. The truth is that we just do discriminate pain from non-pain: it is automatic, and we follow no explicit theory when judging that we have experienced pain. (2) At the personal level we likewise just automatically locate our pains; again, we do not follow an explicit theory when doing this. (3) We cannot answer the question of what it is about painfulness that makes us want to avoid it. Saying that we want to avoid pain because it hurts is no answer, because it raises the unanswerable question of why we want to avoid being hurt. For these three reasons, Dennett correctly concludes that we cannot explain pain at the personal level: any such purported explanation will of necessity presuppose the undefined notion of pain. Since explanations at the personal level have failed, perhaps we should try to explain things at the sub-personal level. However, he correctly observes that such a sub-personal explanation will necessarily involve a change of subject; at the sub-personal level we will no longer be speaking about pain:

But when we abandon the personal level in a very real sense we abandon the subject matter of pains as well. When we abandon mental process talk for physical process talk we cannot say that the mental process analysis of pain is wrong, for our alternative analysis cannot be an analysis of pain at all, but rather of something else-the motions of human bodies or the organization of the nervous system. Indeed, the mental process analysis of pain is correct. Pains are feelings, felt by people, and they hurt. People can discriminate their pains and they do so not by applying any tests, or in virtue of any describable qualities in their sensations. Yet we do talk about the qualities of sensations and we act, react and make decisions in virtue of these qualities we find in our sensations. Abandoning the personal level of explanation is just that: abandoning the pains and not bringing them along to identify with some physical event. The only sort of explanation in which ‘pain’ belongs is non-mechanistic; hence no identification of pain with brain processes makes sense… (Content and Consciousness, p. 103)

So Dennett is arguing that if we give a personal-level explanation we will end up with a vacuous explanation, while a sub-personal explanation will be a real explanation but will not deal with the personal-level phenomenon of pain. Since the personal-level explanation is a non-explanation, we need an explanation at the sub-personal level, and this, of course, is not an explanation of the common-sense term ‘pain’. So Dennett is an eliminativist about pain when it comes to scientific explanation. However, it seems almost incoherent to claim that Dennett is an eliminativist about pain. In the above quote he claims that at the personal level people can say truly that they are in pain and that the pain is awful. So Dennett could be accused of making contradictory claims about the nature of pain: at the scientific level he acts as an eliminativist about pain, whereas at the personal level he admits that pain exists and is awful. On pain of admitting true contradictions into his theory, Dennett needs to decide which of the two levels of explanation is correct. As a good naturalist, he sides with the scientific level of description and aims to explain away the personal level. He begins his attack on the personal level ten years later with his famous 1978 paper “Why You Can’t Make a Computer That Feels Pain”, in which he argues that the personal-level concept of pain is internally contradictory and so can be eliminated without a guilty conscience. In his earlier discussion in Content and Consciousness, by contrast, Dennett had simply claimed that ‘pain’ is a non-referential term: pain talk is a way for subjects to organise their experiences, but such talk will disappear when one describes organisms in a scientific manner.

Dennett offers the following physicalistic explanation of what is known at the personal level as pain:

When a person or animal is said to experience a pain there is afferent input which produces efferent output resulting in certain characteristic modes of behaviour centring on avoidance or withdrawal, and genuine pain behaviour is distinguished from feigned pain behaviour in virtue of the strength of the afferent-efferent connections-their capacity to overrule or block out other brain processes which would produce other motions. That is, the compulsion of genuine pain behaviour is given a cerebral foundation. (ibid, p. 106)

Immediately on reading the above passage, the reader will object that Dennett is not talking about actual pain; rather, he is talking about pain behaviour. Dennett anticipates this objection and replies as follows:

Now would this account of pain behaviour suffice as an account of real pain behaviour, or is there something more that must be going on when a person is really in pain? It might be supposed that one could be suddenly and overwhelmingly compelled to remove one’s finger from a hot stove without the additional ‘phenomenon’ of pain occurring. But although simple withdrawal may be the basic or central response to such stimulation, in man or higher animals it is not the only one. Could any sense be made of the supposition that a person might hit his thumb with a hammer and be suddenly and overwhelmingly  compelled to drop the hammer, suck the thumb, dance about, shriek, moan, cry, etc., and yet still not be experiencing pain? That is, one would not be acting in this case, as on a stage; one would be compelled. One would be physically incapable of responding to polite applause with a smiling bow. Positing some horrible (but otherwise indescribable) quality or phenomenon to accompany such a compelled performance is entirely gratuitous. (ibid, p.106)

Here again we see Dennett pointing out that the personal-level concept of ‘pain’ is explanatorily barren, non-referential and pointless for a scientific theory. He argues that if we want to explain ‘pain’ we must do it in terms of behaviour and nerve impulses, and of course in that case we will not be explaining the personal-level concept of pain at all. This approach has won Dennett few adherents; it seems to leave out the most important aspect of pain, its horrible feeling, and few people could accept an explanation of pain which leaves that out. Dennett’s later attacks on the notions of qualia and real seemings were further designed to undermine the idea that pain is a thing which we experience, as opposed to a mere seeming. However, prior to developing those arguments, he attacked the very coherence of the personal-level concept of pain.

In “Why You Can’t Make a Computer That Feels Pain” Dennett develops a detailed thought experiment on whether it would be possible to build a computer which feels pain. He begins by discussing a computer simulation of pain, noting that such a simulation will not feel pain any more than a computer simulation of a hurricane will get us wet. The simulation will have to be able to make accurate predictions of pain behaviour. Thus it must predict that, for example, a person who has an anvil dropped on his unanaesthetized left foot will jump around in agony on his right foot, with a tear in his eye, screaming. He anticipates that people will be unsatisfied with such a programme, as it leaves out the internal aspect of the behaviour. So he first improves the programme by stipulating that it can predict the internal behaviour of C-fibres etc. To the objection that the programme still leaves out the feel of pain, he stipulates that the programme can report pain states and how vivid those states are:

Again we feed in

‘An anvil is dropped from a height of two feet on S’s left foot,’

And this time we get back:

There is a pain, P, of the in-the-left-foot variety in S; P begins as a dull, scarcely noticeable pressure, and then commences to throb; P increased in intensity until it explodes into shimmering hot flashes of stabbing stilettoes of excruciating  anguish (or words to that effect)….; S’s C-fibres are stimulated… (Why You Can’t Make a Computer That Feels Pain p. 418)

Dennett argues that there is no reason in principle why we cannot build a programme capable of making the above predictions. To people who object that the description leaves out the actual experience of pain, Dennett replies that it is meant to be a simulation, so it is no more an objection that it leaves out actual pain than it is an objection to a simulation of a hurricane that it does not make you wet.
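Purely to convey the flavour of the simulation Dennett imagines, here is a toy sketch (our own invention; neither Dennett nor anyone else wrote such a programme, and the stimuli and canned reports below are made up) of a rule-based predictor that maps a stimulus description onto a verbal pain report plus a description of internal states:

    # Toy pain-report simulator in the spirit of Dennett's thought experiment.
    # The rules and wording are invented for illustration only.
    PAIN_RULES = {
        "anvil dropped on S's left foot": (
            "There is a pain, P, of the in-the-left-foot variety in S; "
            "P begins as a dull pressure and commences to throb. "
            "S's C-fibres are stimulated; withdrawal behaviour is predicted."
        ),
        "gentle pinch on S's arm": (
            "There is a mild, barely noticeable pain in S's arm; "
            "low-level nociceptor activity, no withdrawal behaviour."
        ),
    }

    def simulate_pain(stimulus: str) -> str:
        """Return the canned report the simulation would print for a stimulus."""
        return PAIN_RULES.get(stimulus, "No pain predicted for this stimulus.")

    print(simulate_pain("anvil dropped on S's left foot"))

However detailed such canned reports become, nothing in the programme feels anything, which is exactly the force of Dennett’s hurricane analogy.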

He then argues that another project which the sceptic may ask for is the building of a machine which can actually experience pain: a machine which does things in its environment as opposed to merely simulating them. At this point Dennett discusses our old friend Shakey. He argues that what the sceptic really wants us to do is build a Shakey-type robot which can feel pain.

Dennett argues that to build our robot we will need to instantiate our simulation/theory of pain in the robot’s circuits. We will have to modify the theory in ways that connect with the robot’s motor abilities. It is easy enough to build a robot that can shed a tear; it is obviously a much more complicated matter to build a robot with sufficient motor control to dance around on one foot when struck on the foot. However, none of this is a problem in principle; it just requires complex engineering. Such a robot would exhibit pain behaviour and have the ability to report that it was in pain; Dennett even argues that such a robot would be able to report the intensity and quality of its pain:

But what about the rest of our earlier simulation? What happens to the hot flashes and dull throbs mentioned in our descriptive program? These parts of the output we transform into labelled flashing lights and leave them that way: sometimes the ‘dull throb’ light is on (blinking slowly if you like) and sometimes the ‘hot flash’ light is on. If the skeptic insists on more verisimilitude here, what can he be asking for? Remember that these lights are not blinking randomly. The ‘dull throb’ light goes on only at appropriate times, the robot can then say ‘there is a dull, throbbing pain’ and the other apposite side effects of dull, throbbing pains are presumed to be arranged to coincide as well. (ibid, p. 420)

Most people will not be happy with this robot analogue; they will argue that the robot is not actually experiencing pain despite the complex pain behaviour it is engaging in. Dennett thinks that such people are looking for a robot whose pain is identical to human pain. He argues that they are holding AI researchers to excessively high standards: nobody would argue that a robot cannot walk because its legs are not made of flesh and bone and hence are not identical with human legs, so why claim that robot pain must be identical to human pain? As long as they are functionally equivalent, Dennett sees no problem with equating his imagined robot’s pain with human pain. However, he recognises that most people do not agree with him on functionalism; such sceptics will not be happy until computers feel actual pain in the same way normal humans do. Dennett, though, argues that since the personal-level concept of pain is in such bad shape and so full of contradictions, it may be impossible to instantiate in any creature, not just a robot.

He further notes that it was assumed above that we would have no difficulty constructing a theory of pain, and that our difficulty would come only in instantiating this theory in the robot. This assumption can be questioned, so to test it he tries to construct a theory, or model, of pain which is consistent with our various intuitions about pain. His model tries to take account of various neurological facts about our pain receptors. So, for example, he notes that when our nociceptors are stimulated through injury, the signals travel to the brain through two different types of fibres: rapidly through the large myelinated A-fibres, and slowly through the narrow unmyelinated C-fibres (ibid p 424). A-fibres and C-fibres have different functions: C-fibres transmit ‘deep’, ‘aching’ or ‘visceral’ pains, while A-fibres cause ‘sharp’, ‘bright’, ‘stabbing’ pains. Following Melzack and Wall, Dennett notes that the neurological evidence indicates that A-fibres inhibit the effect of C-fibres, closing the gate to pain input transmission. Simply put, the A-fibres transmit to the limbic system (shared with all animals), while the C-fibre signals are transmitted to higher cortical areas which only higher primates have. The fact that different fibres transmit to different brain areas predicts a distinction between the hurtfulness and the awfulness of pain[9]. Dennett goes on to develop his version of Melzack and Wall’s model of pain, and then tests how this model deals with a variety of reports about the nature of pain. For example, many people report that when they concentrate on their pain it loses some of its adverse quality. This is explained because concentration “raises the volume” of the A-fibres, thus inhibiting the effects of the C-fibres. He extends this discussion to show how the model can handle data such as the effects of yoga, the phantom limb phenomenon, etc.
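To make the gate-control idea a little more concrete, here is a toy numerical sketch of the single mechanism the discussion relies on, namely that A-fibre activity inhibits the transmission of C-fibre input (‘closing the gate’). The linear formula and all the constants are our own illustrative assumptions, not Melzack and Wall’s equations or Dennett’s model:

    # Toy sketch of the gate-control idea: A-fibre activity "closes the gate",
    # attenuating how much C-fibre input is transmitted onward.
    # The linear form and constants are illustrative assumptions only.
    def transmitted_pain(c_input: float, a_input: float, gain: float = 0.8) -> float:
        """C-fibre signal that passes the 'gate' after A-fibre inhibition."""
        gate_openness = max(0.0, 1.0 - gain * a_input)  # more A activity -> gate more closed
        return c_input * gate_openness

    # Concentrating on the pain "raises the volume" of the A-fibres,
    # which on this toy model reduces the transmitted C-fibre signal.
    print(transmitted_pain(c_input=1.0, a_input=0.2))  # weak inhibition  -> ~0.84
    print(transmitted_pain(c_input=1.0, a_input=0.9))  # strong inhibition -> ~0.28

In caricature, this is why the model predicts that attending to a pain can blunt its adverse quality: attention boosts A-fibre activity, which closes the gate on the C-fibre input.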

He then considers the effects of various drugs on pain and notes some strange properties which his model can handle:

A further curiosity about morphine is that if it is administered before the onset of pain (for instance, as a pre-surgical medication) the subjects claim not to feel any pain subsequently (though they are not numb or anesthetized-they have sensation in the relevant parts of the bodies); while if morphine is administered after the pain has commenced, the subjects report that the pain continues (and continues to be pain), though they no longer mind it. Our model suggests that morphine and other analgesics must work on the old low path while leaving the new high path relatively in order, and such is the case. (ibid p. 432)

So again Dennett's model is shown to capture the relevant facts about the phenomenon of pain. He goes on to discuss how the model can successfully handle the facts of analgesia before returning to his original question of whether the theory he has sketched is really a theory of pain at all.

His model of pain is represented in a flow chart, and he asks whether we can locate pain, as opposed to its typical causes and effects, anywhere in that flow chart. He admits that the flow chart does indeed leave something out: the actual experience of pain. The flow chart is pragmatically useful and can explain things like hypnotic suggestion, the effects of drugs, and the effects of meditation; however, it leaves the philosophical questions untouched. Because it does not deal with the feel of pain, he argues, explanations at the sub-personal level will leave out the personal level explanation of pain. However, this is not necessarily a bad thing, because the personal level concept of pain is incoherent.

To show this he discusses the case of pain asymbolia, a brain disorder in which people feel pain but do not mind having it. Such people will laugh if pricked violently with a needle. They admit that they feel the pain; they just do not mind having it. The sub-personal level of explanation can account for this phenomenon by pointing to brain lesions which damage people's ability to have an appropriate response to pain stimuli even though they still feel the pain. However, at the personal level it seems impossible that a person could feel pain but not mind feeling it. Dennett claims that this is because the personal level explanation of pain is incoherent and inadequate.

He further bolsters his views on the incoherency of the personal level concept of pain by discussing a variety of other curious phenomena:

The ordinary use of the word ‘pain’ exhibits incoherencies great and small. A textbook announces that nitrous oxide renders one ‘insensible to pain’, a perfectly ordinary turn of phrase which elicits no ‘deviancy’ startle in the acutest ear, but it suggests that nitrous oxide doesn’t prevent the occurrence of pain at all, but merely makes one insensible to it when it does occur (as one can be rendered insensible to the occurrence of flashing lights by a good blindfold). Yet the same book classifies nitrous oxide among analgesics, that is, preventers of pain (one might say painkillers) and we do not bat an eye. (ibid, p. 443)

Dennett notes that the ordinary language concept of pain is full of inconsistencies like the above one. He goes on to claim:

The philosophical questions that an identity theory (or other ‘philosophical’ theory of pain) would be designed to answer are generated by our desire to put together an account that consistently honours all, or at any rate most of our intuitions about what pain is. A prospect that cannot be discounted is that these intuitions do not make a consistent set. (ibid. p. 445)

Again Dennett is arguing that there may be no explanation of the personal concept of pain because such an inconsistent set cannot exist outside of the vague space of beliefs. In ordinary language people sometimes speak as though pains only exist in the mind of the beholder, yet at other times ordinary language implies that we can have unperceived pain. Again he asks how such a contradictory object can exist.  As a result of these contradictions Dennett argues that the ordinary concept of pain is impeached, and what we require is a better concept. Ultimately he recommends we need to do away with the ordinary concept of pain altogether:

I recommend giving up incorrigibility with regard to pain altogether, in fact giving up all ‘essential’ features of pain, and letting pain states be whatever natural kind states the brain scientists find (if they do find any) that normally produce all the normal effects. When that day comes, we will be able to say whether masochists enjoy pain, whether general anesthetics prevent pain or have some other equally acceptable effect, whether there are unfelt pains, and so forth. These will be discoveries based on a somewhat arbitrary decision about what pain is, and calling something pain doesn’t make it pain. (ibid, p. 449).

This discussion of Dennett’s amounts to an admission that the ordinary language concept of pain will not be explained by our scientific theory. However, this is something we do not need to worry about because the ordinary language concept does not really pick out anything; it is just a bad theory. He further concludes that we will not be able to build a robot which feels pain (in the incoherent ordinary language sense of pain), however we will still be able to build a robot that can feel pain in the scientific sense of implementing a theory of pain.

So we have seen that in Dennett's two main discussions of pain he denies that the ordinary concept of pain refers to a real entity in the external world. In his earlier discussion of pain he argues that pains cannot be identical with brain states without breaking Leibniz's law. Since he could not find a good reason to break Leibniz's law, and since he argued that dualism was untenable, he concluded that a particular brain state cannot be identical with a particular pain. He avoided the claim that pain terms actually refer to brain states by analysing pain language as non-referential, in the same way that words like 'sakes' and 'dints' are non-referential. He argued that any scientific explanation of pain behaviour and verbal reports will not account for the ineffable qualia of pain. However, he does not think that this is a problem, because he has already shown that the ordinary language concept of pain is merely a way of talking; it does not refer to any entity.

Ten years later, in his "Why You Can't Make a Computer That Feels Pain", Dennett expanded on the above points. He further analysed the ordinary language concept of pain and found it to be contradictory. This contradictory object, he argued, could not exist, so our scientific concept of pain does not need to be consistent with a non-existent object. Such a strange eliminativist view of pain flies in the face of common sense. Surely if we can know anything, we can know we are in pain; it seems absurd to deny something as fundamental as the existence of pain. In one sense, he argues, pains are real; they are as real as dollars, in the sense that they are intentional objects. We can have a scientific theory of dollars, but it will not answer questions like 'How much does that cost in real money?'. The same is true of pain. People have beliefs and intuitions about the nature of pain, but a scientific theory cannot always capture all of people's intuitions. Since Dennett thinks he has shown that ordinary language talk of pain is vague and contradictory, he thinks that a scientific theory can disregard this talk. This means, of course, that our theory is no longer a theory of pain.

Obviously, Dennett's views on pain have serious implications for our theory that Dennett is guilty of the typical mind fallacy. In the case of mental imagery we surmised that Dennett, as a non-imager, assumed that we were all non-imagers and unjustifiably built his theory on the assumption that we are all like him. One strand of evidence we used in support of this position was Dennett's comparison of our ability to call up mental imagery with Shakey the robot's ability to do so. Shakey did not, of course, actually experience mental imagery; it merely reported having such experiences. Dennett argued that we were like Shakey in the sense that, for us, it was a case of all tell and no show. We argued that Dennett was right about his kind of mind, but incorrect that all minds were like his.

By parity of reasoning we should treat pain in the same way as mental imagery. So the reader may expect us to argue that Dennett has no pain experience and that this is why he is an eliminativist about pain. The reader may then expect us to argue that Dennett is guilty of the typical mind fallacy of incorrectly assuming that all people's pain experiences are like his. In a sense we do argue in this manner. However, before outlining our argument, we will consider an obvious objection to our position.

It is plausible to claim that people have radically different experiences of mental imagery, from non-imagers to eidetic imagers, while showing few overt behavioural differences. However, it is completely implausible to assume that people who have no pain experience, or radically powerful pain experiences, will give off few behavioural signals. In fact we know from empirical studies that people who do not experience pain are easily discovered, and are discovered at a very early age. Such people suffer from what is called congenital analgesia, and they typically do not live very long. Here is a famous description of a person suffering from the condition:

As a child, she had bitten off the tip of her tongue while chewing food, and has suffered third degree burns after kneeling on a hot radiator to look out the window. When examined… she reported that she did not feel pain when noxious stimuli were presented. She felt no pain when parts of her body were subjected to strong electric shock, to hot water at temperatures that usually produce reports of burning pain or to prolonged ice bath…A variety of other stimuli, such as inserting a stick up through the nostrils, pinching tendons, or injections of histamine under the skin-which are normally considered forms of torture-also failed to produce pain.

       Miss C. had severe medical problems. She exhibited pathological changes in her knees, hip and spine, and underwent several orthopaedic operations. Her surgeon attributed these changes to the lack of protection to joints usually given by pain sensation. She apparently failed to shift her weight when standing, to turn over in her sleep, or to avoid certain postures, which normally prevent the inflammation of joints.

All of us quite frequently stumble, fall or wrench a muscle during ordinary activity. After these trivial injuries, we limp a little or we protect the joint so that it remains unstressed during the recovery process. This resting of the damaged area is an essential part of recovery. But those who feel no pain go on using the joint, adding insult to injury. (Melzack and Wall 1988, pp. 4-5)[10]

We inserted this long quote because it illustrates the dramatic nature of congenital analgesia, and shows that Dennett is obviously not suffering from this illness. We are not making the claim that Dennett does not experience pain. Rather, we are claiming that his pain experience is different from that of the ordinary person. We argue that Dennett, being primarily a linguistic and physicalistic thinker, experiences pain in a different manner from other people. Furthermore, the nature of his experiences results in him drawing different philosophical conclusions from those drawn by people with other types of mind.

In Berman’s 2008 “Philosophical Counselling for Philosophers” we saw how, in the case of Berkeley and Locke’s debate about abstract ideas, their psychological abilities resulted in them having different philosophical views. More recently in his 2013 paper “Do Personality Traits Mean Philosophy is Intrinsically Subjective” Geoffrey Holtzman has demonstrated that philosophers’ personality traits correlate strongly with how plausible they find various famous philosophical thought experiments. These studies both show that philosophers’ psychological abilities can play a big role in determining their philosophical views. Obviously, though, the fact that some psychological variation accounts for some philosophical differences between philosophers tells us nothing about whether this is the case with Dennett and pain.

To establish whether Dennett has a different pain experience we need first to establish that such variation exists, and then deal with the problem of whether Dennett's pain experience is indeed different from that of others. The case of congenital analgesia obviously differs from normal pain experience; likewise the case of pain asymbolia discussed above differs from normal pain experience. However, these cases are severe disorders which are easily detectable by behavioural tests, and Dennett obviously does not suffer from disorders of this kind. Nonetheless there are some empirical studies which do indeed show that there are variations in the way people experience their own pains. In their 2011 paper "Genetic Basis of Pain Variability: Recent Advances", Young et al. showed that people do indeed display variability in the intensity of the pain they feel. This variability was shown to be related to race; for example, African-American and other non-Caucasian people reported greater pain than Caucasians in the same clinical settings ("Genetic Basis of Pain Variability": p. 1). It was also shown that females reported greater pain than males in the same clinical settings (ibid: p. 1). These were all variations in the intensity of pain, and Young et al. trace them to genetic causes. In his 2005 paper "Sex and Gender Issues in Pain" Roger Fillingim has shown that the sexes do indeed vary in the intensity of pain they report in the same clinical settings; Fillingim focuses on the social determinants of such pain behaviour, as opposed to the genetic focus of Young et al. Another study of note is Thomas Gosden's 2008 psychology PhD thesis "Images of Pain: Exploration of the Characteristics and Functions of Pain Related Mental Imagery in Chronic Pain", in which Gosden demonstrates that people with chronic pain have more severe pain when it is associated with involuntary mental imagery.

None of the studies we have mentioned directly supports our claim that Dennett's views are derived from his own pain experience. However, they are very suggestive. We have seen that philosophers' individual psychological abilities and personalities can affect their philosophical views, and we have also seen that there is ample empirical evidence of wide variation in people's subjective experience of pain. Given Dennett's strange eliminative views on pain, the studies we have gathered together show that it is worth investigating whether those views are indeed derived from his individual experiences. To examine this more fully we will explore Dennett's phenomenology of pain as revealed by his theoretical writings on the nature of pain.

There are four key points which indicate that Dennett has a distinct experience of pain:

(1)   Dennett is a Physicalistic-Linguistic Thinker. Hence, like most physicalistic thinkers, he is a monist[11]. He explains away dreams, pains, and mental imagery in linguistic terms, showing that he is predominantly a linguistic thinker. He is incapable of even imagining pain outside of language.

(2)   When discussing pain in Content and Consciousness he focuses on the fact that the word is not referential. Whether the word ‘pain’ is referential or not is irrelevant to whether pain itself is identical with a brain state. The fact that Dennett cannot conceive of pain itself and focuses on the word ‘pain’ shows that he conceives of pain in linguistic terms.

(3)   In his "Why You Can't Make a Computer That Feels Pain" he again focuses on how the ordinary language word 'pain' has contradictory properties and is therefore theoretically useless. He never considers whether it is language itself which is defective, as opposed to pain itself, which exists independently of our linguistic representations. Again, this is because he cannot conceive of experience outside of language.

(4)   We are not arguing that Dennett has no experience of pain, merely that he has no non-linguistic experience of pain. He cannot see beyond language. All of his consciousness is overwritten by his linguistic and physicalistic nature.

Our analysis of Dennett's views on mental imagery, colour phi, after images, and pain reveals that he has a distinctive type of mind. Like all Type 2 thinkers, he has no experience of the Cartesian Theatre; this is evident from his discussions of mental imagery and 'real seemings'. His discussion of pain reveals that he is also, to a lesser degree, a Type 3 thinker. Dennett has a bodily experience of pain; however, his experience of pain is deeply immersed in his linguistic competence. He is incapable of even conceiving of pain outside of language. This is evident from his theoretical views on pain, which centre entirely on the 'concept' of pain and ignore the actual experience of pain.

We will get a clearer view of Dennett's experience of pain by contrasting his views with those of other thinkers. We analysed Thomas Nagel as a blend of a Type 1 and a Type 2 thinker and, as one would expect, his views on pain are entirely different from Dennett's.

With Nagel and Dennett, buried beneath the surface of their argumentation, what one gets is a clash of intuitions. Or, to put things more precisely, Nagel has stronger intuitions on certain topics than Dennett, and this influences how each of them reacts to philosophical arguments. Over the years, as Nagel and Dennett have sketched their alternative theories of the mind, the issue of intuition has come up over and over again. Here is Nagel in The View From Nowhere:

But philosophy is not like a particular language. Its sources are preverbal and often precultural, and one of its most difficult tasks is to express unformed but intuitively felt problems in language without losing them… ( The View From Nowhere p.11)

Dennett has often been perplexed by this attitude of Nagel’s, and has correctly pointed out that a lot of scientific progress has been made by not treating our intuitions as sacrosanct. Here is Dennett criticising Nagel’s appeals to intuition:

Nagel is the most eloquent contemporary defender of the mysteries and anyone who suspects I have underestimated the problems I pose for my theory will be braced by Nagel’s contrary assertions. Assertions, not arguments. Since Nagel and I start from different perspectives, his arguments beg the question against a position like mine:  what counts for him as flat obvious, and in need of no further support, often fails to impress me. I assume that whatever the true theory of mind turns out to be, it will overturn some of our prior convictions, so I am not cowed by having counterintuitive implications pointed out to me. No doubt, Nagel who calls his book ‘deliberately reactionary’, is equally unshaken when it is pointed out that his allegiance to certain intuitions is all that prevents him from escaping his perplexity down various promising avenues of scientific research ( The Intentional Stance p.6)

Dennett admits that he and Nagel are begging the question against each other; however he argues that his approach has yielded more pragmatic success so he will continue using it.  Our contention is that both thinkers adopt the stance they do because of the nature of their minds.

This comes out when we see how Nagel views the phenomenon of pain. Nagel’s views on pain are diametrically opposed to Dennett’s.  Nagel has very definite views on the nature of pain. He makes the claim that pain is something that is intrinsically bad, not just something we hate (Mind and Cosmos, p.110). He further argues that an instrumentalist account of pain does not capture the actual badness/wrongness of pain. An instrumentalist account of pain is a Darwinian type explanation. So, for example, it could be argued that the feeling of pain is a warning to people that something is wrong, and that action needs to be taken.  People who are incapable of feeling pain would be at a distinctive selective disadvantage. Nagel does not doubt that an explanation of this kind is possible. However, he argues that this Darwinian explanation does not explain the objective judgement that pain is something that is intrinsically a bad thing. For Nagel, pain has a dual existence: (1) a feeling for which we can give an evolutionary explanation, (2) an object which we can reflect on and judge to be intrinsically bad. He even makes the claim that his objective value judgements are so strongly felt by him that even if Darwinian theory contradicts his views he will reject the Darwinian theory (ibid p. 110).

Dennett of course takes the opposite view to Nagel: for him, our intuitions take a secondary role to scientific discoveries. Part of the reason Dennett adopts this attitude is that experimental research in physics has shown how unreliable our intuitions can be; our intuitive conception that the world works according to contact mechanics was refuted by Newton over three centuries ago. So Dennett thinks that we should be open to similar things happening in the philosophy of mind. We argue that Dennett, though partly motivated by reasons of this kind, is also influenced by the nature of his own experiences.

Dennett and Nagel clearly have entirely different ideas of what the nature of pain is. Nagel thinks that pain is something we feel which goes beyond anything we can express in language (The View From Nowhere, p. 11). Dennett, by contrast, views pain as something immersed in language; he thinks of us as Joycean machines, creatures who are overridden by language (Consciousness Explained, p. 275). We argue that their different conceptions of pain are based on their own inner experiences, and that their different theoretical descriptions of the nature of pain show that they experience pain differently.

SOME OBJECTIONS

When discussing Dennett's views on mental imagery, colour phi, and after images we came to the conclusion that he was denying the existence of phenomenal consciousness. We argued that, on Dennett's view, we are told by our brain that we have had such and such an experience, and that this report grounds our belief, not a direct experience of phenomenal consciousness. However, Dennett's discussion of the phenomenon of change blindness complicates our analysis of his views on the nature of consciousness. Before going into how his views on change blindness complicate our interpretation of his theory, we will first outline what the phenomenon of change blindness is and how Dennett interprets it.

Change blindness is a well-known phenomenon in psychology. A famous example of change blindness was studied by Rensink, O'Regan and Clark (1997). They present subjects with near-identical photographs for 250 milliseconds each, with a blank screen presented for 290 milliseconds between each photograph (Dennett: Sweet Dreams, p. 83). The near-identical photographs are flashed back and forth for subjects to watch. Subjects typically do not register any change in the photographs for at least 30 seconds. Once they register the change (a white door turning brown at the corner of the picture), or are told about it, the change becomes obvious to them. Furthermore, prior to their registering the change at a conscious level, their visual system seems to be responding to the changes as they occur. Thus Dennett notes:

In the phenomenon of change blindness for colour changes, for instance, we know that the colour sensitive cones in the relevant region of your retina were flashing back and forth, in perfect synchrony with the white/brown quadrangle. (ibid, p. 88)
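The flicker paradigm just described follows a simple presentation schedule. Here is a minimal sketch of that schedule (our own illustration using the timings reported above; the names are invented and this is not Rensink et al.'s actual experimental software):

```python
# Toy sketch of the flicker paradigm's schedule:
# picture A (250 ms), blank (290 ms), picture B (250 ms), blank (290 ms), repeated.
IMAGE_MS, BLANK_MS = 250, 290

def flicker_schedule(cycles):
    """Yield (stimulus, duration_ms) pairs for the given number of A/B cycles."""
    for _ in range(cycles):
        yield ("picture_A", IMAGE_MS)
        yield ("blank", BLANK_MS)
        yield ("picture_B", IMAGE_MS)   # identical to A except for one changed region
        yield ("blank", BLANK_MS)

# Subjects typically take 30 seconds or more to notice the change; that is
# roughly this many full A/blank/B/blank cycles:
cycle_ms = 2 * (IMAGE_MS + BLANK_MS)   # 1080 ms per full cycle
print(30_000 / cycle_ms)               # ~27.8 cycles before the change is spotted
```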

He argues that we need to do further tests to understand which areas of the brain light up once people become aware of the changes in what they are viewing. Dennett argues that the phenomenon of change blindness poses serious difficulties for the notion of qualia. To illustrate this point he typically gives the change blindness test to people at lectures. After people figure out that the change has occurred, he asks them the following question:

Now before you noticed the panel changing color, were your color qualia for that region changing? We know that the cones in your retinas in the regions where the light from the panel fell were responding differently every quarter of a second, and we can be sure that these differences in transducer output were creating differences farther up the pathways of colour vision in your cortex. But were your qualia changing back and forth white/brown/white/brown-in time with the colour changes on the screen? Since one of the defining properties of qualia is their subjectivity, their “first-person accessibility,” presumably nobody knows- or could know- the answer to this question better than you. So what is your answer? Were your qualia changing or not? (Sweet Dreams p. 83)

Dennett argues that whatever answer is given to this question will make trouble for the theoretical notion of qualia. The question we want to consider is whether Dennett's views on change blindness affect our claim that he is denying the reality of phenomenal experience.

Consider Dennett's solution to the colour phi phenomenon. The problem was that the colour change was not happening in the external world; likewise, he denied that there was a mental presentation of a red spot changing into a green spot in a Cartesian theatre. So we had a problem saying what was occurring in the colour phi phenomenon. Dennett argued that our brain simply tells us that a red spot turned into a green spot; the colour phi phenomenon is not presented to the mind's eye, rather our brain just tells us that we saw the change. Dennett's solution has led people to accuse him of explaining only access consciousness while ignoring phenomenal consciousness.

His position on change blindness helps us better understand why he seems to ignore phenomenal consciousness. For some people the idea of phenomenal consciousness is parasitic on the notion of qualia. When people claim that phenomenal consciousness is a basic fact of our experience they are typically referring to our subjective experience of qualia. Qualia are the basic feel of an experience: the taste of beer or of coffee, the feel of touching a smooth or rough surface, and so on. Qualia are first-person experiences. Philosophical lore has it that no amount of third-person scientific research can reveal what the intrinsic experience of qualia in fact is, because qualia are intrinsic first-person phenomena. Qualia, our intrinsic first-person experiences, just are our phenomenal consciousness. In the case of colour phi Dennett denied that anything is presented before the mind, while in the case of change blindness he went further and argued that the very notion of qualia makes little sense. So it could be argued that his change blindness position is even more radical than his colour phi position.

However, it still seems a bit of a stretch to read Dennett's claims about change blindness as an argument against the notion of phenomenal consciousness. Dennett does, after all, argue that after about 30 seconds people come to recognise the changes between the two pictures. To some, this indicates that, for Dennett, when people become aware of the changes in the pictures the change enters phenomenal consciousness. To think this is to radically misunderstand Dennett's views on consciousness. Dennett argues that there are three different ways to answer his question about whether our colour qualia were changing prior to our noticing the changing pictures: (1) yes, (2) no, (3) I do not know. If someone answers yes, this shows that one of the key criteria of qualia is falsified, namely that qualia are experienced directly in a first-person manner; a person who answers yes is inferring the nature of their qualia from a third-person point of view. If a person answers no, he argues, they risk trivialising the notion of qualia: our qualia get reduced to our judgements that they occurred. This of course means that our qualia can no longer be considered to have an intrinsic nature; their nature will be relational, in that they rely on the judgement of a subject in order to exist. Furthermore, since zombies are behaviourally indistinguishable from ordinary people, a person who answers no to the above question will be committed to the view that a zombie has qualia; after all, the zombie will make the same qualia judgements as everyone else.

If people answer that they do not know whether the qualia are changing back and forth, then there can be two reasons for their confusion: (1) despite what they originally thought, they do not really know what qualia are, or (2) they do not know the answer because qualia have properties which go beyond the reach of both third-person and first-person science. Dennett argues that a person who makes claim (2) is placing themselves outside science, so their views are not really candidates for serious consideration. So he basically argues that claim (1) is the correct conclusion to draw from the fact of change blindness.

So Dennett’s change blindness argument purports to show that the philosophical notion of qualia is a mess. He further supports this claim with his thought experiment “The Strange Case of Mrs Clapgras”. In his earlier paper “Quining Qualia” Dennett presented a series of thought experiments which were designed to show that the philosophical notion of qualia was incoherent.  The combined force of his various arguments is designed to make people less comfortable with using the term qualia, to make them realise that despite appearances the term qualia is senseless.

If we take Dennett's arguments against qualia seriously, this will undercut one of the primary objections to his views on the colour phi phenomenon. The objection was that his account of colour phi is phenomenologically inaccurate: we know from experience that we observe the colour change from red to green, and that we then form our judgements based on the observation of this change; we do not form the judgement independently of the observation. This argument is designed to show that, from the point of view of first-person experience, we know that Dennett's conclusions about colour phi are false; we directly experience the phenomenon despite what he thinks. However, if one takes his arguments against the notion of qualia seriously, one is given pause to doubt whether the argument from first-person experience is sufficient to refute Dennett's claim.

A person who used first-person experience as an objection to Dennett's views on colour phi, mental imagery, etc. would be arguing that, despite the theoretical problems with these phenomena (no evidence for the existence of figment, no evidence for the existence of a mind's eye), our first-person experience shows that the phenomena exist. They would argue that our phenomenal experience grounds our judgement; it is not the case that our brain merely tells us that we experience mental imagery, we actually experience the qualia of an image being presented before the mind's eye. Dennett's discussion of change blindness (and his other thought experiments) shows that this argument against his view of the mind does not work.

Qualia are often presented as the most basic things we can know: we can doubt all else, but we cannot doubt how our most basic experiences seem to us. What Dennett's thought experiments purported to show was that, despite what we may believe, we do not really have any firm handle on what these qualia are. When we examine qualia closely we do not in fact have a clear grasp of their nature. So, for example, the card trick showed that while we may think we are experiencing definite qualia in our peripheral vision, we actually are not; and in the case of change blindness we have no clue as to whether qualia are being flashed back and forth prior to our noticing the change. Taken together, Dennett's thought experiments do seem to show that the philosophical notion of qualia is a mess. This has serious implications for those who argue against Dennett's explanation of the colour phi phenomenon by appealing to facts about how things seem to them. Since Dennett has cast serious doubt on the reliability of our claims about how things seem to us, appealing to such seemings as an incorrigible foundation from which to refute his views on mental imagery and colour phi is questionable, to say the least.

Critics of our position could argue that Dennett's arguments against qualia undercut our claim that he is guilty of the typical mind fallacy. A central strand of our argument relies on people's introspective reports about what they experience, while Dennett's various arguments purport to show that people are not authoritative about what they experience and that they do not really know what they mean when they use the term 'qualia'. We argue, on the contrary, that Dennett's arguments against qualia have no effect on our central claim.

If we argued entirely from people's introspective reports and nothing else, then Dennett's arguments about qualia would undercut our claims. However, we provide much more evidence than this. First, we rely on neurological evidence to support our claims. We have discussed evidence from researchers such as Kosslyn and Ganis et al. which shows that imagers have the occipital lobe activated to a greater degree than non-imagers. So our introspective reports about variations in people's abilities to form mental imagery are backed up by impressive neurological evidence, and by the behavioural evidence summarised earlier in the paper. A critic could reply that, on standard definitions of qualia, we cannot appeal to neurological evidence to support our claims. This criticism does not apply to us; we are not using the theory-laden notion of qualia, rather we simply speak of experience. We see no valid reason why we should be barred from using neurological and behavioural evidence to support our claims. We agree with Dennett that fantasies about zombies which are identical to ordinary people but which lack consciousness are pointless. When we speak of people who have different mental abilities, we argue that these different abilities can be detected via behavioural tests, introspection, verbal reports, neuroscientific tests, and so on. Using such diverse strands of evidence has led to discoveries of previously unknown subjective variation in mental imagery, and to the diagnosis of colour blindness, synaesthesia, etc.

Another type of evidence which we rely on is behavioural evidence, more precisely verbal behaviour, in the form of the texts that our subjects produce. People's philosophical views on various topics often inadvertently reveal the nature of their type of mind. So, bearing in mind that our evidence is not limited to introspective reports, it should be obvious that Dennett's evidence that introspection is sometimes unreliable, and gives us incomplete knowledge of a topic, does not refute our Typical Mind Fallacy argument.

Nonetheless, it does raise a more fundamental question for our theory. We have been assuming that Dennett disagrees with us that people have variations in their imagery abilities. However, this does not have to be the case. Consistent with his own theory of consciousness, he could accept that people report variations in their ability to form mental images, and that people who claim to be non-imagers and people who claim to be imagers show different patterns of neural activation. All of this is consistent with his theory of consciousness; nothing in it forces him to deny any of these facts about imagery. What Dennett has to argue is that, in the case of both the imager and the non-imager, neither is presented with any imagery. Imagers will have some process in their brain which results in the brain forcing them to form the judgement that they experience mental imagery (in a similar way to how we form the judgement that we experience colour phi), while non-imagers' brains will not form such judgements. Now obviously, if Dennett wants to develop such a model he will need to provide neurological evidence to support it; however, he would argue that he is more likely to find evidence for this model in the brain than he is to find a Cartesian theatre there. The main point is that there is no problem in principle with him acknowledging that people vary in their mental imagery.

The problem with Dennett's model is that it does not do justice to the phenomenology of mental imagery. People (even poor imagers like me) experience mental imagery; it is more than being told that they experience such imagery, it is an actual experience. This reply of course brings us back to our original debate: can we be so sure that we have a direct experience of something? We would argue that we can. Our experiences are foundational, and nothing Dennett has said changes this fact. His arguments against qualia are arguments against a philosophical construction; they are not arguments against experience. Nobody who has direct experience of colour phi, mental imagery and pain could deny that direct experience. However, for a thinker like Dennett, whose experience is primarily bodily and linguistic, his theory of consciousness will make perfect sense. In fact, we would expect a person with his type of consciousness to endorse the type of views that he does.

 

CONCLUSION

Dennett's eliminativist views on consciousness have puzzled interpreters for years. Some people find his arguments utterly compelling, while others believe that they leave out key aspects of the mind. We have tried to make plausible the claim that such disagreements result from people having different types of mind and not factoring these differences into their theories. We suggest that philosophers can avoid such mistakes by engaging in introspective research, submitting themselves to psychological tests, and so on. Understanding individual variation between minds, and how such variation influences theorising, can only help to make philosophy more objective than it has heretofore been. Ours is but a first step towards what we hope will be a more scientific way of practising philosophy.


[1] William James quotes taken from Berman’s Penult.

[2] A non-imager interviewed by Galton, the astronomer Major John Herschel, held a similar view to Dennett: "The questions presuppose assent to some kind of proposition regarding 'the mind's eye' and the 'images' which it sees… It is only by a figure of speech that I can describe my recollection of a scene as a 'mental image'." (Galton 1907). Galton's tests revealed that Herschel was in fact a non-imager. We make the same claim about Dennett.

[3] In his 1958 paper "Concerning Mental Pictures" Arthur Danto describes his own imaging abilities, whereby he can perform precisely the tasks which Dennett claims are impossible. This again indicates that Dennett is describing his own type of imagery rather than the images which other subjects have.

[4] See David Berman: A Manual of Experimental Philosophy (2009).

[5] See Bill Faw: Outlining a Brain Model of Mental Imaging Abilities (1997).

[6] See the Introduction to Berman's Berkeley and Irish Philosophy, his A Manual of Experimental Philosophy, and his Philosophical Counselling for Philosophers. See also William James's Principles of Psychology for historical evidence of the typical mind fallacy.

[7] The data on dream experiences and mental imagery is only suggestive. We are not claiming to have proved that people who are non-imagers are always vivid dreamers; rigorous experimental tests would need to be done in order to establish this conclusively.

[8] See Kolers, P. A. and von Grünau, M. (1976) "Shape and Color in Apparent Motion", Vision Research 16, pp. 329-335.

[9] I am here simplifying Dennett's model; however, the simplification does not affect the argument of the paper.

[10] Quote taken from Nikola Grahek: Feeling Pain and Being in Pain p 8.

[11] The link between monism and physicalistic types is explored in more detail in Berman's Penult (2008).

Poverty of Stimulus Arguments and Behaviourism.

POVERTY OF STIMULUS ARGUMENTS AND BEHAVIOURISM

PART 1: CHALLENGES TO THE APS

SECTION 1: IMPLICATIONS FOR CHOMSKY AND QUINE

When it comes to the details of how children learn their first language there is a substantive difference between Chomsky and Quine. The primary difference between them centres on the role each thinks reinforcement plays in a child's learning of his first language. The Quinean picture of a child learning his first language involves the child using his innate babbling instinct as he mouths various words; the parent reinforces these emissions positively and negatively until the child's pattern of verbal behaviour is battered into the external shape of his social environment. As Quine put it in Word and Object:

People growing up in the same language are like different bushes trimmed and trained to take the shape of identical elephants. The anatomical details of twigs and branches will fulfil the elephantine form differently from bush to bush, but the overall outward results are alike (1960, 8)

 

Linguistic nativist Chomsky disagrees with this Quinean picture. He thinks that the outward shape of language results not from the child’s faltering attempts at speech being corrected by his peers, but from the child using his innate universal grammar to structure the data of experience which the child contingently encounters.

            The issue between Chomsky and Quine on this point is a purely empirical one. In the last twenty years much detailed evidence has emerged which can be used to decide between the two theorists. The central idea around which nativism has been built is the poverty of stimulus argument. Chomsky has argued that children display knowledge of language, that this knowledge is not provided by the environment, and that therefore it must be innate. Typically Chomsky uses the subject-auxiliary inversion rule to illustrate how poverty of stimulus arguments work. In their paper "Empirical Assessment of Stimulus Poverty Arguments" Geoffrey Pullum and Barbara Scholz call the subject-auxiliary inversion rule the paradigm case which nativists use to illustrate poverty of stimulus arguments. They cite numerous occasions on which Chomsky uses the example (Chomsky 1965, 55-56; 1968, 51-52; 1971, 29-33; 1972, 30-33; 1975, 153-154; 1986, 7-8; 1988, 41-47), as well as other Chomskian thinkers (including linguists such as Lightfoot, 1991, 2-4; Uriagereka, 1998, 9-10; Carstairs-McCarthy, 1999, 4-5; Smith, 1999, 53-54; Lasnik, 2000, 6-9; and psychologists such as Crain, 1991, 602; Marcus, 1993, 80; Pinker, 1994, 40-42, 233-234) who have endorsed the claim. They argue that this supposed instance of an APS is passed around and repeated over and over again. No surprise, then, that without having knowledge of Pullum and Scholz's article, I chose the subject-auxiliary inversion as my paradigm example of an APS; I had fallen into the same pattern of passing around this well-worn example. In this paper, I examine whether the APS (as applied to syntactic knowledge) actually works, and I discuss how Quine's view of language learning is affected if the APS is sound, and how his theories of language acquisition are affected if the APS turns out to be false.

            Pullum and Scholz (2002) show that poverty of stimulus arguments are used in a variety of not always consistent ways in the literature. Having surveyed some of the literature on the APS, they isolate what they believe to be the strongest version of the argument.  The argument they construct is as follows:

(A) Human infants learn their first languages either by data-driven learning or by innately-primed learning.

(B) If human infants acquire their first languages via data-driven learning, then they can never learn anything for which they lack crucial evidence.

(C) But infants do in fact learn things for which they lack crucial evidence.

(D) Thus human infants do not learn their first languages by means of data-driven learning.

(E) Conclusion: Humans learn their first languages by means of innately primed learning.
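Put schematically, the argument is a modus tollens feeding a disjunctive syllogism. The following is a minimal reconstruction of its logical form (the letter abbreviations are ours, not Pullum and Scholz's):

```latex
% D = infants learn their first language by data-driven learning
% I = infants learn their first language by innately-primed learning
% K = infants learn things for which they lack crucial evidence
\[
\begin{array}{ll}
\text{(A)} & D \lor I \\
\text{(B)} & D \rightarrow \lnot K \\
\text{(C)} & K \\
\text{(D)} & \therefore \lnot D \quad \text{(from B and C, by modus tollens)} \\
\text{(E)} & \therefore I \quad \text{(from A and D, by disjunctive syllogism)}
\end{array}
\]
```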

 

This gloss on the APS is one that Chomsky would accept as an appropriate schematisation, though Chomsky believes that there is more evidence to support a belief in innate domain-specific knowledge than the APS alone provides[1]. Pullum and Scholz claim that the key to evaluating the soundness of the APS is premise (C), the empirical premise of the argument. So, to evaluate the argument, they study the linguistic environment of children. Their aim is to check whether there really is no evidence in the environment which the child can use to formulate a hypothesis about a particular rule. For example, will a child be presented with data such as 'Is the man who is at the shop happy?', which can help him learn the subject-auxiliary inversion? Pullum and Scholz's research programme involves searching the Wall Street Journal corpus to discover whether constructions which Chomsky claims a person could go much or all of their life without encountering are, in fact, more frequent than he would lead us to believe. However, before discussing what the evidence tells us about the frequency of such sentences, I first want to consider what Quine would make of the APS as reconstructed by Pullum and Scholz.

            The first premise of Pullum and Scholz's version of the APS is that a child learns language either by data-driven learning or by innately-primed learning. Quine maintained throughout his philosophical career that our linguistic abilities are not distinct from our overall theory of the world. In fact, he consistently maintained that learning a language is learning a theory of the world, and, furthermore, that learning a scientific language is learning a more explicit, regimented form of ordinary language. According to the picture presented in Word and Object, a child begins by babbling various sounds and having these sounds reinforced in various ways. Through the process of conditioning and reinforcement, the child eventually learns when it is appropriate to use which sounds. According to Quine, at this stage the child has not yet learned any concepts. Quine argues that through processes such as analogical reasoning, abstraction, and so on, children eventually learn to structure some of these sounds into syntactic units. It is only after we have mastered this syntax, and can speak of certain objects as being the same as or different from other objects, that we can be said to have grasped the concept of an object and learned to speak about objects in the world. The important point is that, for Quine, the processes which a child uses to learn a language are the same as the processes he uses to learn about the world. So Quine would not accept that language is learned by innately-primed learning (in the sense of innate domain-specific knowledge). Whether a child learns his first language by data-driven learning is a more complicated question on the Quinean picture.

            On some versions of data-driven learning, the child is presented as a passive observer of verbal behaviour. From the moment they are born (strictly speaking, while in utero as well), children are bombarded with verbal behaviour. So, on one data-driven learning picture, children unconsciously observe the various patterns of verbal behaviour (circumstances of occurrence, order of occurrence, tone used, and so on) and unconsciously construct a model of the language they are presented with.

            Quine does not deny that the child uses such statistical methods to organise the data of experience; so in this sense he agrees that a child learns by data-driven learning. However, for Quine, the word 'data' has a much wider meaning than models constructed from observed linguistic regularities alone. For Quine, an important part of the data is the type of reinforcement that the child receives. The child emits utterances and receives various types of reinforcement, negative or positive, depending on the appropriateness of the utterance. So, on the Quinean picture, as the child is learning his first language he might be reinforced for putting forth a question such as 'Will Mama feed me?'. Now suppose the child had been constructing questions by moving the first auxiliary of various statements to the front of the sentence, and suppose further that the child had been positively reinforced for this behaviour. Given this state of affairs, the child will continue to emit questioning behaviour of this form until he receives negative reinforcement. Now suppose that the Quinean child wants to ask a more complicated question; suppose he wants to discover whether the sentence 'The man who is tall is sad' is true. The child will continue along the pattern of his previous questions, turn the statement into a question, and ask 'Is the man who tall is sad?'*. On the Quinean picture, this questioning behaviour will be negatively reinforced. The child will continue to try different constructions, based on past experience and reinforcement, until eventually his language output is moulded into the shape of the language of his community. So for Quine, the data the child learns from is not merely the statistical patterns of the language he is exposed to, but also includes the ways various constructions are reinforced, negatively and positively. The important point is that, in order for Quine to accept premise (A) of Pullum and Scholz's reconstruction of the APS, he would have to understand data in a wider manner than that obtained by mere passive observation.
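The contrast between fronting the first auxiliary one meets and fronting the auxiliary of the main clause can be made concrete with a small sketch (our own toy illustration; the word list and the relative-clause heuristic are invented for the example and are nothing like a serious parser):

```python
# Toy contrast between a 'linear' question-forming rule and a crude
# 'structure-dependent' one, using the example sentence from the text.
AUXILIARIES = {"is", "are", "was", "were", "can", "will"}

def front_first_auxiliary(words):
    """Linear rule: move the leftmost auxiliary to the front."""
    i = next(idx for idx, w in enumerate(words) if w.lower() in AUXILIARIES)
    return [words[i].capitalize()] + words[:i] + words[i + 1:]

def front_main_clause_auxiliary(words):
    """Crude structure-dependent rule: skip one auxiliary inside a relative
    clause introduced by 'who', then front the next auxiliary found."""
    inside_relative = False
    for idx, w in enumerate(words):
        if w.lower() == "who":
            inside_relative = True
        elif w.lower() in AUXILIARIES:
            if inside_relative:
                inside_relative = False   # this auxiliary belongs to the relative clause
                continue
            return [w.capitalize()] + words[:idx] + words[idx + 1:]
    raise ValueError("no main-clause auxiliary found")

sentence = "the man who is tall is sad".split()
print(" ".join(front_first_auxiliary(sentence)) + "?")        # Is the man who tall is sad?  (ungrammatical)
print(" ".join(front_main_clause_auxiliary(sentence)) + "?")  # Is the man who is tall sad?
```

On the Quinean picture sketched above, the first output is the sort of emission that would be negatively reinforced, while the second is the question the child's community actually uses.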

            Quine would accept premise (B) as long as data-driven learning is understood in this expanded sense (reinforcement plus statistical regularities in the environment). Premise (C) is the crucial empirical premise: infants do in fact learn things for which they lack crucial evidence. The "parade case" offered by linguistic nativists, in which children display knowledge for which they have not been provided crucial evidence, is the subject-auxiliary inversion. Quine emphasises induction, analogy, and reinforcement as the primary tools in language learning. He never endorsed the claim that children have knowledge of rules of language for which they have received no data from their linguistic environment. For Quine, any sentence a child utters is either learned inductively from the child's primary linguistic data (PLD) or is constructed through an analogy with previously heard utterances in the PLD. Through induction, analogical reasoning and reinforcement, the child will eventually arrive at the language of his peers. So Quine would deny the truth of the crucial empirical premise (C). Furthermore, premise (C) is a crucial test of Quine's theory of language acquisition: if it could be demonstrated that a child has knowledge of a rule of language which was not learned by experience, analogy or reinforcement, then this would demonstrate that Quine's theory of language acquisition is seriously incomplete.

By reviewing the Wall Street Journal corpus, Pullum and Scholz have provided evidence that the constructions which Chomsky claims a child will never be exposed to in their lifetime do, in fact, occur. They used the Wall Street Journal because it is easy to obtain and free. People have justly complained that the Wall Street Journal is obviously not representative of the type of data a child will be exposed to. To this they have replied that, since Chomsky claimed that the sentences a child needs in order to learn structure-dependent rules are so rare that one can go much or all of one's life without encountering them, finding such sentences even in the Wall Street Journal is evidence that Chomsky is wrong on this point. Geoffrey Sampson, in his book The 'Language Instinct' Debate, has provided evidence that the constructions which Chomsky claims are vanishingly rare occur in children's books. Furthermore, he has searched the British National Corpus (including a search of child-parent interaction) and found hundreds of examples of the relevant constructions[2].
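The kind of corpus query at issue can be sketched very crudely as follows (our own rough illustration, not Pullum and Scholz's or Sampson's actual procedure; the regular expression is a blunt heuristic that will both over- and under-generate, and any hits would have to be checked by hand):

```python
import re

# Look for polar interrogatives whose fronted auxiliary is followed by a subject
# containing a relative clause with its own auxiliary, e.g.
# "Is the man who is at the shop happy?"
PATTERN = re.compile(
    r"\b(Is|Are|Was|Were|Do|Does|Did|Can|Will)\s+\w+[^.?!]*"
    r"\b(who|which|that)\b[^.?!]*"
    r"\b(is|are|was|were|can|will)\b[^.?!]*\?"
)

def find_candidate_inversions(corpus_text):
    """Return candidate sentences for hand-checking against the claim that
    such constructions are vanishingly rare in the data."""
    return [m.group(0) for m in PATTERN.finditer(corpus_text)]

sample = "Is the man who is at the shop happy? He left early."
print(find_candidate_inversions(sample))   # ['Is the man who is at the shop happy?']
```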

            Pullum et al. think that, since they have shown that premise (C) is not in fact true, the overall argument, while valid, is not sound, and that therefore the argument for linguistic nativism does not go through. Let us assume that Pullum and Scholz are correct and that Chomsky's argument is not sound: what, then, are the implications for Quine?

            As we have seen, Quine thinks that children learn language through data-driven learning, in a broad sense. One of the primary objections to the Quinean picture of language learning is that negative reinforcement does not play the role in language learning that Quine thinks it does. A wide variety of experimental evidence has been put forward by psychologists who claim that it shows that children are not corrected when they speak ungrammatical sentences (see, for example, Marcus 1993, 53-85; Gropen, Pinker, et al. 1989, 203-57; Crain and Nakayama 1987, 113-25). At a superficial level, this seems to show that Quine's picture of language acquisition is incorrect. The empirical evidence seems to indicate that the picture of a child mouthing constructions such as 'Is the man who tall is sad?'* and receiving negative reinforcement is, in fact, incorrect. Therefore, one could conclude that even if Pullum is correct that the child is exposed to some examples which help him learn the structure-dependent rule, this will not help the Quinean conception of language learning.

            However, it does not automatically follow that, because explicit reinforcement is not involved in language learning, a more subtle kind of reinforcement is not at work. Whether or not reinforcement is explicitly used in learning complex grammatical utterances, it is unquestionable that children do receive positive reinforcement for speaking. When a child first begins to speak, every utterance is encouraged and rewarded with affection. In Word and Object, Quine notes that any reinforcement the child receives will be concomitant with a variety of different stimulations. As he writes:

The original utterance of ‘Mama’ will have occurred in the midst of sundry stimulations, certainly; the mother’s face will not have been all. There was simultaneously, we may imagine, a sudden breeze. Also there was the sound of ‘Mama’ itself, heard by the child from its own lips. (1960, 81)

 

So, for Quine, the effect of the reinforcement will be that the child will repeat the word in the presence of Mama's face, in the presence of a breeze, and upon hearing the sound 'Mama'. However, the child will not receive reinforcement for saying 'Mama' merely in the presence of a breeze, and so will eventually stop emitting this behaviour. The child will, however, receive reinforcement for saying 'Mama' in the presence of Mama, and for repeating the word 'Mama' upon hearing someone near him speak it. One helpful consequence of this type of reinforcement, according to Quine, is that the child who is being reinforced for repeating 'Mama' when someone says 'Mama' will, from the parents' point of view, appear to be engaging in mimicry. If the child can recognise that he receives reinforcement not just for sounds used in appropriate contexts but also for mimicking the behaviour of his peers, then he will have had a very useful tool reinforced: he will have realised that it pays to listen to his peers and to try to imitate their behaviour. To this end, the kind of statistical abilities postulated by people such as Lappin and Clark will obviously be useful in helping the child learn his first language. Furthermore, if what is being reinforced is mimicking behaviour, then the fact that certain sentences which Chomsky claims do not occur in the data do, in fact, occur will obviously be of vital importance for Quine's theory. Obviously, Quine's mimicking story will only work if the child actually encounters the constructions of which he displays knowledge. All of this is schematic. While it does not show that Quine's theory of language acquisition is correct, it shows that recent research which purports to show that Chomsky's APS arguments in syntax do not work can also play a role in supporting Quine's theory of language acquisition.

           

                      SECTION 2: RECENT CRITICISMS OF THE APS

The first criticism that I will consider is a logical argument which has been put forward by Geoffrey Sampson. In his The ‘Language Instinct’ Debate, Sampson claims that it does not matter whether there is data which refutes Chomsky’s APS, because the argument is self-refuting. He attributes to Chomsky the following claim: ‘Language has certain properties no evidence of which is available in the data to which we are exposed to when learning the language.’ He then asks how Chomsky can possibly know this. The adult’s conscious knowledge of the properties of the language is based on observations of the language, but Chomsky claims that such observations are insufficient to determine those properties. Therefore, if there is a grammatical rule for which a language learner rarely or never encounters evidence in his data, then there seems no reason why a linguist would encounter such evidence either. Both the language learner and the linguist are exposed to the same kind of data, which Chomsky claims will not determine the rules of the language.

            In essence, what Sampson claims is that if there is no evidence in the data from which a child can learn the rule, then there is no evidence in the data which justifies the linguist in postulating the rule. Hence, for Sampson, the APS is self-defeating. One possible way for the linguist to overcome this difficulty would be to claim that he uses his innate knowledge of grammar, as well as observation, to discover that the rule obtains. However, Sampson correctly notes that to argue in this way is to beg the question against one’s opponent. So he concludes that the APS is either self-defeating or a mere question-begging stipulation.

            A key aspect of Sampson’s argument is his emphasis on Chomsky’s claim that children will never encounter certain constructions in their experience which could help them learn the relevant rule. He quotes the following statement of Chomsky’s given in a 1980 lecture:

The child could not generally determine by passive observation whether one or the other hypothesis is true, because cases of this kind rarely arise; you can easily live your whole life without ever producing a relevant example…you can go over a vast amount of data of experience without ever finding such a case… (1980, 121)

 

In the above quote Chomsky claims that sentences which confirm the subject-auxiliary inversion rule are virtually never encountered. Sampson then asks rhetorically: if such sentences are never encountered, what reason would we have to say the rule exists?

            The main difficulty with Sampson’s argument is that the data a professional linguist is exposed to obviously far exceeds the data a language learner encounters. A child from a professional background will be exposed to about 30 million word tokens by the time they are three years old[3]. A linguist from a similar background (assuming that he has completed a PhD and is around 27) will have been exposed to about 240 million word tokens[4]. So a typical linguist will have encountered at least nine times the number of words that a typical child has. Obviously, if we accept Chomsky’s claim that a child can go much or all of his life without encountering the relevant constructions, then the sheer amount of data the child is exposed to will be beside the point. However, when Chomsky and other linguists discuss APS examples, what they typically state is that the data is insufficient for the child to learn the general rule, not that there is no data at all. So, bearing in mind that a linguist is exposed to at the very least nine times the data that a language learner is, it is quite possible that the linguist will be exposed to enough examples to learn of the existence of the auxiliary-inversion rule, while the child may not have been exposed to enough data to learn this rule from his PLD. Furthermore, the linguist will have access to other languages with which to compare his data from English, will have conversational partners with whom to discuss his findings, and will have access to thousands of books and articles detailing the discoveries of other linguists. This indicates that the linguist will have been exposed to far more than nine times the amount of linguistic data to which the average child is exposed. On these grounds, it is clear that Sampson’s argument is inconclusive at best. To show that Chomsky is making a claim that is self-refuting, Sampson would need to demonstrate that a child and a professional linguist are exposed to the same amount of linguistic data, and such a claim is of course patently absurd.

            While Sampson’s argument does not work as a demonstration that Chomsky’s APS is self-refuting, it does reveal a real weakness in Chomsky’s APS. Chomsky is making claims about the child’s PLD for which he has provided no evidence. So Sampson’s argument at least demonstrates the necessity of Chomsky providing evidence for the controversial claims he is making.

            Pullum and Scholz carried out the first detailed study of how often sentences relevant to the structure-dependence APS appear in the data a child is exposed to. As I discussed above, they began by making the logic of the APS explicit, structuring it as a logical argument. They isolated its third premise, which claims that data relevant to learning the structure-dependent nature of language do not occur often enough in the child’s PLD for him to learn the relevant construction. They set out to test this claim by checking a corpus of linguistic text; they used the Wall Street Journal as their corpus because it was freely and easily available.

            In order to test how often a construction is encountered by a child learning a language, it is first necessary to establish how much linguistic data a child is exposed to. Pullum and Scholz relied on the work of the psychologists Hart and Risley, whose 1995 Meaningful Differences in the Everyday Experience of Young American Children details the amount of linguistic data children are exposed to. Hart and Risley documented the vocabulary development of forty-two children aged 1-3, noting the children’s production and use of language as well as the language they were exposed to. They also noted that the amount of linguistic data a child is exposed to depends greatly on the socio-economic class to which the family belongs. According to their study, a child from a professional household will have been exposed to about 30 million word tokens, a child from a working-class family to about 20 million word tokens, and a child from a family on welfare to about 10 million word tokens.

            Pullum and Scholz also report findings from Hart and Risley’s book indicating that 30% of the speech directed at children is in the form of interrogatives, and that the mean length of utterances directed to children is four words. Pullum and Scholz then argue that if we take the figure for a child whose family is on welfare, 10 million word tokens, and divide it into sentences four words long, we arrive at the conclusion that the child is exposed to 2.5 million sentences every three years. Since 30% of those sentences are interrogatives, the child is exposed to seven hundred and fifty thousand questions every three years, i.e. a quarter of a million questions per year. In their search of the Wall Street Journal, they discovered that questions relevant to learning the structure-dependent rule make up about 1% of the interrogatives in the corpus. From this they conclude that a child will typically be exposed to seven thousand five hundred relevant examples in three years, that is, two thousand five hundred examples per year, or on average about seven relevant questions a day. They conclude their paper by asking whether seven relevant questions a day is enough to learn such a rule, and they correctly claim that if nativists think it is not, they need to set out explicitly a learning theory which shows why it is not.
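To make the arithmetic explicit, here is a brief sketch of the calculation using the figures quoted above (the script and variable names are mine, not Pullum and Scholz’s):

WORDS_PER_3_YEARS = 10_000_000     # welfare-family exposure over three years (Hart and Risley)
MEAN_UTTERANCE_LENGTH = 4          # words per child-directed utterance
INTERROGATIVE_SHARE = 0.30         # proportion of child-directed speech that is questions
RELEVANT_SHARE = 0.01              # proportion of WSJ interrogatives relevant to the rule

sentences = WORDS_PER_3_YEARS / MEAN_UTTERANCE_LENGTH     # 2,500,000 sentences per 3 years
questions = sentences * INTERROGATIVE_SHARE               # 750,000 questions per 3 years
relevant = questions * RELEVANT_SHARE                     # 7,500 relevant examples per 3 years

print(relevant / 3)            # 2,500 relevant examples per year
print(relevant / (3 * 365))    # roughly 6.8, i.e. about seven relevant questions a day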

            The obvious objection to the above argument is that Pullum gets his data from the Wall Street Journal, and such data is hardly representative of the linguistic experience of the child. Pullum cites some research showing uniformity across linguistic texts as evidence that the Wall Street Journal may, in fact, be representative of the child’s linguistic experience. However, Hart and Risley’s finding that child-directed sentences are typically four words long shows this to be incorrect: the average length of sentences in the WSJ will obviously be much greater than four words. Geoffrey Sampson’s research, based on the British National Corpus (not available in America at the time Pullum and Scholz were writing), uses samples of speech between child and parent as well as the ordinary speech of adults, so it avoids some of the difficulties of Pullum and Scholz’s research.

            In his 2002 paper ‘‘Exploring the Richness of the Stimulus’’, Sampson largely agrees with Pullum and Scholz’s research; however, he claims that while his research is complementary to theirs, it is not subject to the same objections. He sampled the normal conversational speech which people typically have with each other and to which a child is routinely exposed. To this end he used the British National Corpus (henceforth BNC), specifically its demographically sampled speech section, which he says contains 4.2 million words. This section of the BNC was constructed by giving recording equipment to individuals selected to be representative of the national population with respect to age, social class, and region (2002, 3). By exploring this corpus, Sampson aimed to avoid the criticism directed at Pullum and Scholz that their corpus did not accurately represent the data a child is exposed to when learning a construction.

            Sampson begins his discussion by making a terminological point. Whereas Pullum and Scholz use the term ‘auxiliary verb’ for a verb that can be the main and sole verb of a clause, Sampson calls a verb ‘auxiliary’ only if it is followed by another verb. For this reason, while Pullum and Scholz would call the following sentences auxiliary inversions, Sampson calls them ‘verb-fronting questions’.

                                     

 

                       VERB-FRONTED CONSTRUCTIONS

 

Here we will discuss what Pullum and Scholz refer to as ‘auxiliary-initial clauses’. Poverty of stimulus theorists claim that children will typically not hear examples of questions formed by fronting verbs which, in the corresponding declarative statements, are preceded by complex constituents. Sampson aimed to test what people actually say when speaking to each other, in order to determine whether the poverty of stimulus theorists are correct. However, when analysing the data he found an unexpected complication. There are two different types of verb-fronting sentences, both of which Pullum and Scholz include in their WSJ search, and the two types occur at radically different frequencies in spoken speech.

            The first type of verb fronting is of the following form:

(1) Will those who are coming raise their hands?

(1a) Those who are coming will raise their hands

Sampson reminds us that in the above constructions, the complex constituent is the subject of the fronted verb. So he calls sentences 1 and 1a verb-fronting sentences which involve complex preverbal subjects.

The second type of verb fronting has the following form:

(2) If you do not need this, can I have it?

(2a) If you do not need this, I can have it.

Sampson reminds us that in (2) the main clause is preceded by an adverbial clause. He calls sentences like (2) and (2a) verb-fronting sentences involving initial adverbial clauses, and he considers questions of the form of (2) first.

                                    

                          INITIAL ADVERBIAL CLAUSES

Sampson searched for initial adverbial clauses in the BNC-demographic (which contains 4.2 million words). He stated that his search was not exhaustive, because an exhaustive search for this grammatical pattern in the BNC would be extremely difficult, though he did not offer any reasons why this particular pattern makes an exhaustive search so difficult. He claimed, however, that such a detailed search was not necessary, since Chomsky had claimed that ‘a person might go through much or all of his life without being exposed to a relevant construction’, so that finding any examples of the construction would refute Chomsky.

            In attempting to find such examples, Sampson targeted cases where the adverbial clause begins with if. He found twenty-two clear cases of initial adverbial clauses. He further claimed that Wh-questions could also be considered relevant, since they too involve moving an auxiliary of the main clause rather than one in the preceding adverbial clause; if this class is counted, he had a further twenty-three cases. However, he recognised that counting Wh-questions would be controversial, so he counted only the twenty-two constructions he found for initial adverbial clauses.
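Sampson does not describe his search procedure in detail. Purely as an illustration of the kind of pattern at issue, a crude approximation over a plain-text corpus might look like the following (the regular expression and the list of auxiliaries are my own simplifications, not Sampson’s method):

import re

# Hypothetical pattern: a sentence-initial 'if'-clause, a comma, then a fronted
# auxiliary beginning the main clause and ending in a question mark.
PATTERN = re.compile(
    r"^if\b[^,?.!]*,\s*"
    r"(am|is|are|was|were|do|does|did|have|has|had|can|could|will|would|shall|should|may|might|must)\b"
    r"[^?.!]*\?",
    re.IGNORECASE,
)

samples = [
    "If you do not need this, can I have it?",
    "If it rains, we will stay in.",          # a declarative, so it should not match
    "If he phones, will you tell him?",
]
for s in samples:
    print(bool(PATTERN.match(s)), s)
# True, False, True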

            Sampson uses Hart and Risley’s estimates of how many words a person is exposed to every three years. He takes the figure they provide for a working-class child: twenty million words every three years. This choice is itself controversial; there is no reason to focus on the stimuli a working-class child is exposed to rather than the stimuli a child from a professional household, or a child from a family on welfare, is exposed to. If we accept that children from linguistically deprived backgrounds develop normal linguistic abilities, then the figure of ten million should be used, because children develop such abilities despite being exposed to only this amount of linguistic data. Furthermore, if the relevant constructions do not occur in that data, and children nonetheless display competence in the rules, then this shows that the rule must be innate. However, Sampson would probably reply that this argument relies on the untested assertion that people from linguistically deprived environments have languages as richly structured as those of ordinary members of the linguistic community. Sampson has long argued against the dogma of convergence, the view that all speakers from all societies speak languages which are equally complex. He holds that if we are to establish that children from linguistically deprived environments have language as complex as that of their better-educated peers, then we will need evidence to support this claim, and he holds further that nativists have so far not provided any evidence of this kind.

So, to avoid begging the question against either nativists or anti-nativists, it is best to start, as Sampson does, with Hart and Risley’s figure of twenty million words every three years. Let us work out the numbers. Sampson found twenty-two relevant constructions in a corpus of 4.2 million words. Using Hart and Risley’s estimate that the average child-directed utterance is four words long, Sampson’s 4.2 million words amount to roughly 1.1 million sentences. Hart and Risley estimate that a working-class child will be exposed to five million sentences (of four words each) in the first three years of life. So if Sampson finds the relevant construction twenty-two times in 1.1 million sentences, we can expect at least one hundred and ten examples in five million sentences. This works out at about thirty-seven relevant examples per year, so a child could expect to encounter a relevant construction at least once every ten days.[5]
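Laying the calculation out explicitly (again, the script and variable names are mine; the result comes out within a handful of examples of the rounded figures just given):

BNC_DEMOGRAPHIC_WORDS = 4_200_000
RELEVANT_HITS = 22                         # initial-adverbial-clause questions Sampson found
MEAN_UTTERANCE_LENGTH = 4                  # Hart and Risley's child-directed estimate
CHILD_WORDS_PER_3_YEARS = 20_000_000       # working-class exposure figure

corpus_sentences = BNC_DEMOGRAPHIC_WORDS / MEAN_UTTERANCE_LENGTH      # about 1.05 million
child_sentences = CHILD_WORDS_PER_3_YEARS / MEAN_UTTERANCE_LENGTH     # 5 million per 3 years
expected = RELEVANT_HITS * child_sentences / corpus_sentences         # about 105 per 3 years

print(expected / 3)              # roughly 35 relevant examples per year
print((3 * 365) / expected)      # roughly one relevant example every ten days or so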

            The question which Pullum and Scholz raise in their paper can be fruitfully asked of Sampson’s results: is one example every ten days enough for the child to learn the construction? The nativist who claims that innate domain-specific knowledge is the only explanation for our competence in the relevant construction owes us an answer as to why it cannot be learned from one example every ten days. Typically nativists have not met this challenge; they have merely pointed to the supposed poverty of the stimulus as evidence that the construction must be innate. Likewise, however, if anti-nativists claim that the relevant construction can be learned through some kind of data-driven learning, then they owe us a model of how this is done. Assessing whether such constructions can be learned from experience will require mathematical models of how learning from so few examples is possible. Other possible tests may involve developing computer programmes which can learn from this amount of data. Such programmes have already been developed: Clark and Eyraud (2007), Perfors et al. (2006), and Reali and Christiansen (2005) have all developed programmes which can learn from less data than that discovered by Pullum, Scholz and Sampson. I will review these models at the end of this paper. Ultimately, what we have learned from this data is that Chomsky’s confident assertion that children cannot learn certain constructions from the data they experience has not been justified with sufficient evidence.

            The other type of verb fronting which Sampson discusses is the type of construction where the complex constituent is the subject of the fronted verb.  An example of this type of construction is:

 (1) Those who are coming will raise their hand.

(2) Will those who are coming raise their hand?

Here Sampson found a surprising result: on this point, Chomsky was correct. In the 4.2-million-word BNC-demographic, Sampson found no constructions of the relevant kind. However, he did not view this as providing support for poverty of stimulus theorists. He claimed, on the contrary, that the reason the construction does not occur in the BNC is that it is not an idiom of ordinary English speech but rather an idiom of written English.

            Sampson’s search of the demographically sampled speech section of the corpus showed that the relevant construction never occurred in 4.2 million word tokens. He did not carry out an exhaustive search of the written-language section of the BNC; instead he provided examples from random searches of that part of the corpus. Here are some of the examples he found:

(14a) Did the fact that he is accompanied by a doctor on the campaign trail help to lose him last week’s TV showdown with Clinton? CAT.00742 (Punch magazine, 1992)

(15b) Did Mr Mortimer, 69, who has an Equity card, enjoy himself? CBC.08606 (Today newspaper, 1992)      

(16c) ‘Is the lady who plays Alice a child or a teenager?’ asked my six-year-old’ B0300647. (Alton Herald newspaper, Farnham, Surrey, 1992)

(17d) Is a clause which is known to be unenforceable in certain circumstances an unreasonable one? J6T.00908 (R. Christou, Drafting Commercial Agreements, Longman, 1993)

(18e) Will whoever is ripping the pages out of the stony new route book please grow up. CG2.1379 (Climber and Hill Walker magazine, George Outram and Co., Glasgow, 1991)

(2002, 18)

Sampson thinks that these examples show that children do not typically form questions using auxiliary fronting when speaking, and that constructing questions in this way is restricted to written English. However, he does not provide any evidence as to how often such constructions occur in written work. His primary point is that, in order for an APS theorist to use the lack of spoken examples of yes/no questions formed by fronting a main-clause verb as evidence for innate knowledge, he has to rule out the possibility that children learn the rule from written language. The fact that people judge certain constructions grammatical despite not encountering them in spoken language is not that important if they have encountered them in written language. Here, in short, Sampson is shifting the burden of proof onto the APS theorist to show that the child cannot learn such rules from written language, and in the absence of such a proof he assumes that the APS does not hold.

            What Sampson and Pullum and Scholz have shown is that the “parade case” of the APS, as put forth by Chomsky, does not offer clear evidence at all. Obviously much more research is needed on the topic. The important point to note is that this APS has been shown to be incorrect in claiming that children learn a particular rule in the complete absence of relevant experience. Hence, this particular APS does not establish that Quine’s conception of language learning is incorrect, and the question of the viability of Quine’s story of how the child learns his first language remains open. Nor, of course, can this APS be used to support Chomsky’s claim that we need to postulate innate domain-specific knowledge to explain language acquisition.

                                    SECTION 3: NEGATIVE EVIDENCE

In order to achieve a complete picture of what contemporary evidence tells us about the debate between Chomsky’s and Quine’s pictures of language learning, we now need to evaluate the state of play with regard to negative evidence. Most linguists believe that the issue of negative evidence is crucial to understanding language acquisition. The issue centres on the fact that children do not typically encounter ungrammatical sentences which are marked as such. A child will not, for example, hear a sentence such as ‘Is the child who beside the man is happy?’ along with a tag indicating that the sentence is deviant. So the question arises as to how children know that such sentences are ungrammatical. They are not presented with these sentences and told that they are ungrammatical; nor (so the theory goes) do they produce these ungrammatical sentences only to be systematically corrected by their peers. Hence, it is argued that the only way to explain how a subject tested by a linguist can reliably tag certain sentences as grammatical and others as ungrammatical is to postulate innate domain-specific linguistic knowledge.

A key premise in the above argument is that children are not systematically corrected for their grammatical mistakes by their peers. This claim goes back to the experimental research of Brown and Hanlon (1970). In particular, Crain and Nakayama (1987) test the claim that children try out grammatical theories and weed out the false ones through explicit teaching from their peers.

Crain and Nakayama elicited yes/no questions from children between the ages of 3;2 and 5;11 in response to prompts such as ‘Ask Jabba if the boy who is watching Mickey Mouse is happy’. They found that (with different frequencies at different ages) children sometimes produced correct forms such as (15a) and sometimes produced various incorrect forms, one example being (15b). However, they never produced the kind of incorrect form predicted by the ‘structure-independent hypothesis’, such as (15c). They offered this as support for the theory of innate linguistic knowledge[6]. Below are the two forms which children sometimes produced, together with the structure-independent form which they never produced:

(15a) Is the boy that is watching Mickey Mouse happy?

(15b) Is the boy who’s watching Mickey Mouse is happy?

(15c) Is the boy who watching Mickey Mouse is happy? (2002, 20)

 

Examples like (15c) are the type of production one would predict on the basis of the structure-independent hypothesis; however, utterances like these never occur. Crain and Nakayama claim that this shows that children are innately predisposed to prefer structure-dependent rules for organising the data of experience (Sampson 2002, 20). The point is that children do not try out constructions like (15c) and have them criticised by their peers; they automatically construct questions like (15a) and (15b), which conform to structure-dependent rules. That is, independently of poverty of stimulus considerations, Crain and Nakayama claim to have shown that children do not construct structure-independent rules, and so do not receive any negative evidence that sentences like (15c) are ungrammatical. If we add to Crain and Nakayama’s claim the fact that children do not hear sentences like (15c) spoken, yet know that they are ungrammatical, we have an argument for innate knowledge based on a lack of negative evidence.

            The argument of Crain and Nakayama is of vital importance to this paper. It offers support to Chomsky’s claim that children are born with an innate language faculty, and it contradicts Quine’s picture of a child learning the rules of syntax through positive and negative reinforcement. Obviously, if children never utter constructions such as (15c), then Quine’s claim that such constructions are shown to be incorrect through negative reinforcement must be false.[7]

            The supposed lack of negative evidence in the case of auxiliary inversion may not be as damning to Quine’s picture of language learning as it appears. The data which Pullum and Scholz gathered from the Wall Street Journal indicate that, in the case of subject-auxiliary inversion, children encounter about seven relevant constructions every day. Using statistical reasoning,[8] a child exposed to this type of experience would, from passive observation alone, have evidence within a few days of birth that the structure-dependent hypothesis was superior to the structure-independent one. A child who was unconsciously analysing the data of experience would then not even try the structure-independent rule exemplified by (15c), though he might try structure-dependent forms such as (15a) and (15b).
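To see why even a small number of relevant examples could be informative, consider the following minimal sketch (my own illustration with invented sentences, not any published learning model): a learner entertaining the structure-independent ‘front the first auxiliary’ rule finds that every attested complex question disconfirms it.

def front_first_aux(declarative):
    # Structure-independent hypothesis: front the first 'is' in the linear string.
    words = declarative.split()
    i = words.index("is")
    return " ".join(["is"] + words[:i] + words[i + 1:])

# Toy (declarative, attested question) pairs standing in for the handful of
# relevant examples estimated from the corpus studies above.
observed = [
    ("the boy is happy", "is the boy happy"),
    ("the boy who is smiling is happy", "is the boy who is smiling happy"),
    ("the dog that is barking is hungry", "is the dog that is barking hungry"),
]

for declarative, attested in observed:
    prediction = front_first_aux(declarative)
    verdict = "consistent" if prediction == attested else "disconfirmed"
    print(f"{attested!r}: structure-independent rule {verdict}")

# The simple question is consistent with both hypotheses, but each complex
# question disconfirms the structure-independent rule, so even a few such
# examples favour the structure-dependent hypothesis.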

            However, this argument fails when one takes account of Geoffrey Sampson’s work. His data show that the number of examples in actual speech of verb-fronted sentences with complex preverbal subjects (excluding the cases, discussed above, where a subordinate clause precedes the main clause) is zero. So the child cannot learn the rule inductively from speech. If Crain and Nakayama are correct that children never try out the barred form, then this indicates that they are right that the rule is innate.

            So it could be argued that Crain and Nakayama’s result, combined with Sampson’s, shows that Quine’s conception of language acquisition is incorrect: together, the two pieces of research seem to show that induction and positive and negative reinforcement play no role in learning this particular rule. However, given that Crain and Nakayama’s experiment relates only to subject-auxiliary inversion, it is obviously not equipped to rule out stimulus-response learning entirely. There is also a more fundamental reply which can be raised against the experiment. The reply is Geoffrey Sampson’s, and it uses data about how people actually speak to cast doubt on Crain and Nakayama’s result.

            Sampson has raised objections to this experiment based on his discovery that children do not typically form questions using auxiliary inversion in speech. He correctly notes that, on the basis of his corpus research, children would not be expected to reply in the manner they do in the experiment. Crain and Nakayama use the fact that children never try out (15c) to support their claim that children do not use the structure-independent hypothesis. However, Sampson points out that in ordinary speech, as revealed by his corpus analysis, people do not use auxiliary inversion to form such questions at all. According to his analysis, children should form the question in the following ways:

(16a) Is he happy, the boy who’s watching Mickey Mouse?

(16b) The boy who’s watching Mickey Mouse is happy, isn’t he?

            Sampson correctly notes that, since we know from the corpus analysis that children do not typically form questions like (15a) and (15b) in speech, it is odd that children would answer in this way in the experiment. He points out that Crain and Nakayama give figures only for children’s ‘correct’ question formation, so it is impossible to tell whether children tried out (16a) and (16b). He further speculates that the fact that the children used an idiom hardly ever found in ordinary spoken discourse may indicate that they were primed for the experiment. Sampson’s discussion does not refute Crain and Nakayama’s experiment, but it does demonstrate that the experiment is far from conclusive. So Crain and Nakayama’s experiment does not refute Quine’s trial-and-error picture of language acquisition; it would need to be replicated, and carried out in different cultures, before it could be viewed as anything more than a suggestive result. The other experiment typically offered as evidence that children do not learn their language through trial and error is Brown and Hanlon’s.

Brown and Hanlon’s 1970 paper purports to show that children do not learn by explicit instruction. However, their paper has negligible impact on Quine’s position, because behaviourism is not committed to reinforcement being explicit. In fact contemporary research in psycholinguistics supports the view that much language instruction is implicit rather than explicit. In the section below I will enumerate experimental research which bears on this point.

                     SECTION 4: EVIDENCE OF NEGATIVE EVIDENCE

 

In their 2003 paper ‘‘Adult Reformulations of Child Errors as Negative Evidence’’, Chouinard and Clark constructed an experiment designed to test whether adults implicitly instruct their children in the rules of their language. The study aimed to discover whether adults use side sequences and embedded corrections as ways of correcting children’s utterances. On pages 9 and 10 of their paper they give the following examples of a side sequence and an embedded correction.

An example of a side sequence: (indented sequence is the correction).

(1) Roger: now-, um do you and your husband have a j-car.

           Nina: have a car?

            Roger: yeah.

       Nina: no-                                               (Svartvik and Quirk 1980: 8.2a 335)

An example of an embedded correction:

(2) Customer in a hardware store looking for a piece of piping:

Customer: Mm, the wales are wider apart than that.

Salesman: Okay, let me see if I can find one with wider threads.

(looks through stock) how’s this?

Customer: Nope, the threads are even wider than that.

 

They claimed that adults made use of side sequences and embedded corrections to correct children’s errors and to keep track of what the children meant to say. The adult reformulations indicate to children (a) that they have made an error, (b) what the error was, and (c) the form needed to correct the error. 

In their experiment they set out to test the following four claims:

(1) Negative evidence is available in adult reformulations.

(2) Negative evidence is available to children learning different languages, and for different types of errors.

(3) More reformulations are available to younger children.

(4) Children detect and make use of the corrections in reformulations. (2003, 12)

 

                              METHODS USED IN THE EXPERIMENT

The experimenters drew their data from five corpora in the CHILDES Archive. Three of the children were acquiring English (Abe from the Kuczaj corpus, Sarah from the Brown corpus, and Naomi from the Sachs corpus) and two were acquiring French (Philippe from the Leveille and Suppes corpus and Gregoire from the Champaud corpus) (ibid., 13). In order to analyse child errors, the experimenters included all spontaneous child utterances in the transcripts, with the exception of utterances containing unintelligible speech and child utterances preceded or followed by unintelligible adult speech. The experimenters first checked whether each child string was conventional. If the string contained an error, they categorised what sort of error it was (morphological, syntactic, etc.). They then checked whether the next adult utterance was a reformulation: an utterance counted as a reformulation if it repeated, in corrected form, the portion of the child’s utterance which had contained the error. They further coded each correction by noting whether it was a side sequence or an embedded correction. Finally, they checked whether children took up the change that had been made, rejected it, or tacitly accepted it.[9]
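Chouinard and Clark coded the transcripts by hand. Purely to illustrate the shape of the coding scheme just described (the categories are theirs, but the data structures, the toy transcript, and the matching heuristic below are mine), a toy version might look like this:

import string
from dataclasses import dataclass
from typing import Optional

def tokens(text):
    # Lower-cased, punctuation-stripped word set.
    return set(text.lower().translate(str.maketrans("", "", string.punctuation)).split())

@dataclass
class Turn:
    speaker: str            # "CHI" for child, "ADU" for adult
    text: str
    error: Optional[str]    # e.g. "morphological", "syntactic", or None if conventional

def is_reformulation(child, adult):
    # Crude stand-in for the coders' judgement: the adult repeats material from an
    # erroneous child utterance in changed (corrected) form.
    return bool(tokens(child.text) & tokens(adult.text)) and tokens(child.text) != tokens(adult.text)

transcript = [
    Turn("CHI", "I falled down", "morphological"),
    Turn("ADU", "you fell down, did you?", None),
    Turn("CHI", "the dog is barking", None),
    Turn("ADU", "yes, the dog is barking", None),   # a replay of a conventional utterance
]

errors = reformulated = 0
for child, adult in zip(transcript, transcript[1:]):
    if child.speaker == "CHI" and adult.speaker == "ADU" and child.error is not None:
        errors += 1
        reformulated += is_reformulation(child, adult)

print(f"{reformulated}/{errors} erroneous child utterances followed by a reformulation")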

            For the analysis of conventional child utterances, they took a random sample of 200 utterances for every age slice for each child. They identified all the error-free child utterances in the sample and tabulated how many of those constructions were repeated by the adult in the next turn; if the adult simply repeated what the child said, they called it a replay (ibid., 14). Two different researchers coded each transcript, agreeing on their codes 90% of the time; where they disagreed, they resolved the disagreement through discussion.

            Once they had coded the transcripts, they coded each of the lines for detailed analysis. For each of the children, they enumerated the total coded lines and the total number of erroneous utterances. They then divided the data into age slices to track developmental trends.

                                        RESULTS

The following is a list of their results as they bear on the four hypotheses put forth at the beginning of their paper.

(1) Negative evidence is available in adult reformulations.

They devised a table representing the four age slices for the three children in the English corpora, dividing the adult replies into those following conventional and those following erroneous utterances. They found that adults repeated erroneous utterances far more often than conventional ones: on average, erroneous utterances were repeated more than twice as often. More interestingly, the percentage of erroneous utterances that were corrected is extremely high. In the age slice 2.0-2.5, the three English-speaking children showed the following pattern: Abe had 67% of his erroneous utterances reformulated, Sarah 65%, and Naomi 48%; in the French corpora, Philippe had 67% reformulated and Gregoire 60%. So in the age range 2.0-2.5 most of the children had at least 60% of their erroneous utterances reformulated, and Naomi, who had the lowest rate, still had almost 50% of her erroneous utterances corrected. Across the other age slices, the lowest rates of reformulation of incorrect utterances were for the ages 3.6 to 3.11. Here there was not enough data for the French children, but the figures for the English speakers were as follows: Abe received reformulations of 28% of his incorrect utterances, Sarah 41%, and Naomi 20%. So it is certain that children do receive reformulations of incorrect utterances; even in the worst case, that of Naomi between 3.6 and 3.11, 20% of her incorrect utterances were corrected. Below, I discuss whether 20% correction is enough for a child to learn various rules. Of the corrections given, side sequences made up the majority of the corrections the children heard, as opposed to embedded corrections: Chouinard and Clark (ibid., 21) report that for the five children, Abe, Sarah, Naomi, Philippe and Gregoire, side-sequence corrections made up 57%, 70%, 70%, 73%, and 62% of corrections respectively. In other words, in the majority of cases the reformulations were designed to check what the child had meant.

NEGATIVE EVIDENCE IS AVAILABLE FOR CHILDREN LEARNING DIFFERENT LANGUAGES, AND FOR DIFFERENT TYPES OF ERRORS

The study found that negative evidence was available to each of the children, whether they were learning French or English. Furthermore, negative evidence was provided at comparable rates whether the error was phonological, morphological, lexical or syntactic. And again, reformulations of erroneous utterances occurred at a much higher rate than repetitions of conventional utterances.

MORE  REFORMULATIONS ARE AVAILABLE TO YOUNGER CHILDREN

In general, this prediction was shown to be correct. Adults tend to decrease their reformulations as children get older and make fewer mistakes. However, there was one exception to this trend: as Naomi got older, her errors were reformulated more. So this question needs to be looked into further.

CHILDREN DETECT AND MAKE USE OF THE CORRECTIONS IN REFORMULATIONS

Obviously, just because adults use reformulations, it does not follow that such reformulations are understood and used by the children. Evidence that children understand and use them can only be obtained by noting how the children respond. Chouinard and Clark discuss four possible ways in which children can respond to an adult reformulation: (1) they can take up the reformulation explicitly, repeating it and in doing so correcting at least part of their original utterance; (2) they can overtly reject the reformulation, thereby signalling that the parent has misinterpreted what they intended, and when the parent tries a different reformulation they may accept it; (3) they can acknowledge the reformulation at the start of the next turn in the conversation; or (4) they can simply continue with the conversation without overtly acknowledging the change or taking it up, such continuations being counted as tacit acceptances of the adult reformulation. Overall, the proportions of responses in which children acknowledged a reformulation or repeated new information, together with those in which they either took up or rejected the reformulation, were as follows: Abe 56%-72%, Sarah 25%-38%, Naomi 39%-100%, Philippe 39%-75%, Gregoire 25%. By any standard, this shows that children attend to reformulations a sizeable percentage of the time.

                                          GENERAL DISCUSSION

Four of the children used in the experiment had a parent who was college-educated; however, as Chouinard and Clark acknowledge, it is unclear whether the results will generalise across social classes.[10] Furthermore, only two cultures were studied, so it is also unclear whether the results generalise across cultures. Chouinard and Clark further discuss the oft-cited evidence of Ochs and Schieffelin (1984), who claim that in some cultures negative evidence is not presented to children, because the parents do not interact with the child until he is a competent speaker. Ochs and Schieffelin’s paper is of central importance because it is usually offered as key evidence in favour of the nativist’s argument for innate domain-specific linguistic knowledge. Usually nativists will point to Brown and Hanlon’s 1970 paper, as well as Crain and Nakayama’s 1987 paper, to demonstrate that children do not receive explicit negative evidence. Anti-nativists typically reply that, while there is evidence that children do not receive explicit instruction, they do in fact receive implicit instruction. To this nativists respond that, while this may be so in our culture, it is certainly not so in all cultures, and to demonstrate the point they cite Ochs and Schieffelin’s paper. It is claimed that since members of all the various human cultures learn a language, and only some receive negative evidence, negative evidence cannot be a key factor in learning language. This objection is clearly relevant to the work of Chouinard and Clark, as they considered only two cultures. However, Chouinard and Clark consider this objection (ibid., 39) and offer a criticism of it. Ochs and Schieffelin had claimed that in Kaluli and Samoan cultures parents do not converse with children who are not yet competent users of the language. Chouinard and Clark claim that, contrary to what is typically believed, Ochs and Schieffelin’s paper is in fact largely consistent with their own findings. It is true that in the Kaluli and Samoan cultures adults do not converse with children who are not yet competent speakers. However, two points need to be made. Firstly, the fact that adults do not converse with children of this age may not be as important as is sometimes thought if older children in the community converse with the younger children.[11] Secondly, even if parents do not converse with children who are not yet competent users of the language, it does not follow that they do not correct their language use. Chouinard and Clark cite a passage from Ochs and Schieffelin which indicates that Kaluli children do indeed receive negative feedback:

Kaluli mothers pay attention to the form of their children’s utterances. Kaluli correct the phonological, morphological, or lexical form of an utterance or its pragmatic or semantic meaning. (1984, 293)

 

Chouinard and Clark note that in the Kaluli culture feedback takes the form of adults telling children what to say on different occasions. So, for example, the adult will prefix a child’s statement with the instruction ‘elema’ (meaning ‘say like that’). If the child makes a grammatically incorrect statement when talking to another person, the adult will face that person and say ‘elema’ followed by the grammatically correct utterance. So in this culture explicit instruction is clearly used to teach the child how to speak. Thus, contrary to what is typically reported, Ochs and Schieffelin do not provide evidence against anti-nativist theories of language acquisition.

            The overall conclusion of Part 1 of this paper is that contemporary evidence is still largely consistent with the picture of language acquisition sketched by Quine in the 1960s. The arguments of Chomsky and those influenced by his paradigm have not shown that Quine’s view of language acquisition is incorrect. However, some nativists claim that Pullum, Scholz, Sampson, et al. are guilty of attacking a straw man: the APS they criticise, it is claimed, is not really the APS which Chomsky uses. In Part 2 of this paper I will consider whether the alternative conception of the APS argued for by John Collins avoids the counterarguments of Pullum et al. I will further analyse how this alternative APS bears on Quine’s picture of language acquisition. I will then discuss Chomsky’s latest attempt to defend the poverty of stimulus argument and demonstrate that it fares no better than earlier versions of the argument.

                                PART 2: CONCEPTIONS OF THE APS

                                            INTRODUCTION

In the first section of this part, I consider an argument by the philosopher and Chomsky scholar John Collins which purports to show that Pullum et al. have misunderstood the nature of Chomsky’s APS. Collins directs his arguments against Fiona Cowie, a philosopher who, in her book What’s Within, uses Pullum’s and Sampson’s data to argue against Chomsky. Since Cowie uses Pullum’s reconstruction of Chomsky’s APS, Collins’s reply to her can be taken as a reply to Sampson, Pullum and Scholz.

            I will set out precisely what Collins takes the real APS to be. I will then show that it is Collins who misconstrues the APS, not Pullum et al. I will therefore argue that, contrary to what some Chomskians claim, Pullum et al. raise serious difficulties for the research programme of generative linguistics. Furthermore, these difficulties show that, despite the rhetoric employed by Chomsky, Quine’s alternative conception of language acquisition is still very much a live option.

                            SECTION 1: THE REAL APS?

 

John Collins’s paper ‘‘Cowie on the Poverty of Stimulus’’ is a review of Fiona Cowie’s book What’s Within. Collins focuses on the sections of her book which criticise Chomsky’s poverty of stimulus arguments. He argues that Cowie has seriously misunderstood the aims and methods of Chomsky’s paradigm in linguistics. Many of Cowie’s criticisms of Chomsky derive from data in a 1996 paper of Pullum’s called ‘‘Learnability, Hyperlearnability and Poverty of Stimulus’’, in which Pullum argued that the data a child is exposed to when learning his first language is not as impoverished as Chomsky and his followers claim. While Cowie presents Pullum as claiming that his data refute nativism, Pullum, in a later paper (2002, co-authored with Scholz), chastised Cowie for this, arguing that their data show only that more research is needed into the child’s PLD, not that nativism has been refuted. I include Collins’s criticism of Cowie in this section because his reply to Cowie’s use of Pullum’s data indicates that he thinks both Cowie and Pullum have misconstrued the nature of the APS. Other authors, for example Legate and Yang (2002), try to meet Pullum and Scholz’s challenge by arguing that the data Pullum and Scholz discovered is, in fact, insufficient for learning the structure-dependence rule. Collins, in contrast, argues that Pullum’s construal of the APS is incorrect, and that because of this his data do not cast doubt on the real APS.

            Thus, for example, Collins writes:

Cowie, to be fair, does have Pullum’s reconstruction of the ‘Chomskian argument’ in mind. Pullum presents the argument so as to refute it, but Cowie finds it an ‘irresistible target’, for it is ‘so much more clearly and forcefully stated than the nativists own versions’. The nativists’ versions are not ‘clearly and forcefully stated’, I have suggested, because no-one serious is interested in knock-down arguments; there are certain empirical and theoretical constraints and a substantive proposal to satisfy them.  (2003, 21-22)

 

I turn next to Collins’s criticism of Pullum’s APS.

            COLLINS ON PULLUM AND SCHOLZ’S VERSION OF THE APS

 

Pullum and Scholz’s version of the APS is admirably clear. It isolates the key empirical premise of the argument, and proceeds to analyse the empirical data to check whether the key premise is in fact correct. 

            Collins, however, does not view Pullum and Scholz’s APS as representing Chomsky’s real APS. One of his first criticisms of their construal is that it implies that Chomskians are searching for a knockdown argument which will prove that language is innate. On their conception, a key premise of the APS is that the child has knowledge which he could not have learned from his environment; if this key premise is proven correct, then Chomsky has a knockdown argument for innateness. We have seen in Part 1 of this paper that Chomsky does indeed use auxiliary inversion to support his claim that children know a rule of language that they could not have learned from their linguistic environment. Collins does not deny that Chomsky has sometimes argued in this manner. However, according to Collins, when Chomsky argues in this way he is merely using auxiliary inversion as an expository device to indicate the way linguists reason; he did not intend it as a knockdown argument against the empiricist.

            Critics of Chomskian nativism have expressed frustration with this mode of arguing. They complain that every time a candidate is presented as evidence for nativism and is then shown to be inadequate, the nativist replies that this example was not the real argument for nativism. Thus Cowie vents her frustration at this perceived dishonest mode of nativist argumentation:

The nativist- say, Chomsky- articulates a version of the argument. The empiricist counters it by pointing to its evidential short-falls and/or its failure to do justice to empiricism’s explanatory potential. But no sooner is one rendition of the APS cut down than myriad other variations on the same argumentative theme spring up to take its place. For every non-obvious rule of grammar (and most of them are non-obvious), there is an argument from poverty of stimulus standing by to make a case for nativism. And for every such argument (or at least for the ones I have seen), there are empiricist counter examples of exactly the kinds we have reviewed in this chapter, waiting, swords at the ready, to take it on.  (1999, 203)

 

When Cowie speaks of a line of rules being put forth and refuted by the empiricist, she is clearly thinking of Pullum and Scholz’s version of the APS. Such postulated rules include subject-auxiliary inversion, subject (verb-object) asymmetry, anaphoric one, etc. Cowie’s frustration is that the subject-auxiliary rule was used by Chomsky, and by many others influenced by him, on countless occasions; it is unquestionably used as the paradigm example of an innate rule,[12] so to be told that it is not the real argument is frustrating to say the least. However, Cowie massively overstates the strength of her position here. The paradigm example of auxiliary inversion is taken to indicate that the child has an innate preference for structure-dependent rules. Pullum and Scholz have shown that Chomsky overstated the strength of his position when he claimed that occurrences which would indicate that the structure-dependent rule is the correct one are vanishingly rare. But from the fact that Chomsky has overstated his position, it does not follow that the empiricist has proven him wrong. Pullum and Scholz themselves correctly claim that their data is suggestive at best, and they call for further studies of the PLD. Their research is certainly not a knockdown argument against nativism; it is rather a timely reminder that, polemics aside, the case for nativism has not been proven. In this context, Legate and Yang (2002) can be seen as an attempt to justify the nativist research programme by situating it in a comparative setting. They consider a construction that both sides admit is learned (the use of null subjects) and analyse the PLD empirically to see how many examples of null subjects the child is presented with when learning that rule. They then compare this with the number of times the child is presented with subject-auxiliary inversion. On the basis of this comparison, they claim that the child is presented with less than half as much evidence in the case of auxiliary inversion, and they argue (incorrectly, in my view) that the data the child is presented with is not enough for him to learn the structure-dependent rule of question formation. I do not want to discuss Legate and Yang’s paper here; I merely want to show that Cowie’s claim that each innate rule postulated by Chomsky et al. has been refuted by attention to the PLD is false. On the contrary, all that has been shown is that more attention needs to be paid to the PLD by both nativists and empiricists. Legate and Yang’s paper can be seen as a nativist response to this request from empiricists such as Pullum.

            Collins would obviously not deny that more research needs to be done on the PLD, and he would presumably welcome Legate and Yang’s attempt to answer the criticisms of Pullum and Scholz. However, he would also argue that Cowie, Pullum and Scholz have misunderstood the nature of Chomsky’s APS, and that replies such as Legate and Yang’s, while useful, concede too much to the empiricist by accepting this reconstrual of the APS. Using Pullum and Scholz’s neat deductive reconstrual, Cowie seems to view nativists as simply inserting an empirical premise into the argument, having it refuted, and then inserting a different empirical premise in its place, and so on. Collins argues that this way of viewing the matter badly misconstrues things; on his view, the APS is not the deductively neat argument that Pullum, Cowie et al. take it to be.

            Collins’s reconstrual of the APS is less aesthetically pleasing than Pullum’s deductive version. However, it does represent a type of APS which Chomsky has used from time to time. Nonetheless, I will argue that Chomsky primarily uses the APS which Pullum et al. critique.

            Collins begins by noting certain features of our linguistic competence. He notes the obvious fact that all humans (barring congenital defect) acquire a particular language, while no other animal does. Furthermore, if you move a child from his birthplace to another country, he will end up speaking a different language, while any other animal will speak no language no matter where it is brought up. Collins acknowledges that these considerations do not militate in favour of nativism; they merely show that the language we do acquire must be acquired as a result of some innate species-specific machinery. This is a truism accepted by both sides of the debate, and it in no way shows that innate domain-specific knowledge is required for a child to learn his first language.

            However, the above facts do indicate a problem which any linguistic theory worth its salt must solve. The problem, in a nutshell, is this: how do we construct a theory of language acquisition which is both descriptively and explanatorily adequate? Collins puts it in the following way:

The descriptive adequacy, therefore, of a general theory of linguistic competence would appear to involve a delineation of the seemingly infinite variety of languages upon which a child may fixate. On the other hand, if our general theory is to be explanatorily adequate, then we need to explain how a child may fixate on any point in this infinity without any such point being favoured prior to the child’s exposure to language. (2003, 30)

 

Any theory of linguistic competence needs to meet this twofold criterion. Collins correctly notes that these constraints do not of themselves tell us (1) what the child’s initial state is, (2) what his final state is, or (3) what data the typical child is exposed to which helps him move from the initial state to the final state. If there were only one language, say English, we could answer questions (1) and (2) instantly: the child’s initial state would be English and his end state would also be English. According to Collins, question (3) would also be answered, because the child would need no data to decide amongst languages; there would be only one language that the child could represent.

            Obviously there are more languages than English. Indeed, if one considers English in different epochs, the English spoken by Chaucer, by Shakespeare, and by Orwell, one must face the fact that ‘English’ itself consists of more than one language. Rough estimates put the number of languages spoken today at 7,000[13]. If one wants to explain a child’s linguistic competence, one needs to account for the fact that a child born in a different place or time could learn any of the 7,000 languages spoken today, any of the languages spoken in other eras, or any of the possible languages to be spoken in the future. So a descriptively adequate theory will have to account for the linguistic competence which enables children to acquire the different types of possible or actual human languages. What is the initial state of the child that makes it possible for him to grasp any of the languages he might be exposed to? If, in order to learn a language, the child needs the capacity to represent the rules of that language, then the more languages there are which the child can learn, the more inclusive must be his initial ability to represent grammar. The child will also need the capacity to represent the rules of possible human languages. So we will need finer and finer data in the particular child’s environment to help him settle on the language he is supposed to learn, and this data will need to be detailed enough to stop him from keying into the other languages that it is possible for him to learn.

            The difficulty with this approach lies in the necessity of postulating richer and richer data to explain the child’s zeroing in on his grammar. The reason this is a problem becomes apparent when we consider the data a linguist has at his disposal when trying to discover the nature of UG or of a particular I-language such as English. The linguist has as much data on the grammar as he could wish for; he has the ability to reflect on it theoretically; he can compare the language with a variety of other languages. Yet, despite all of this data, the linguist still cannot discover the rules which govern the English language. Collins claims that if we argue that the child learns his language through data-driven learning, we will be claiming that the child who learns English has enough data to figure out what linguists have been unable to figure out over the last two thousand years:

But here’s the rub! The linguist has as much data on the grammar of English, say, as he could wish for, he also has the capacity to reflect on it, theoretically or otherwise, and the advantage of comparing it with data from other languages, but he still cannot figure out the grammar of English – that is inter alia, we have linguistics for! If, then, we content ourselves with the bland remark about nativism, we are led to think of the child who successfully acquires English as having enough data to figure out what self-reflective linguistic inquiry has been banging its head against for the last couple of millennia. Something is wrong. (ibid., 4)

 

 He argues that the only explanation for the child achieving what linguists cannot achieve through thousands of years of inquiry is that children are born with innate apparatus:

What the child’s innate equipment is required to do, it seems, is actively constrain its ‘choices’ as to what is part of the language to be attained. But no child is wired to target any particular language: the child can make the right ‘choices’ about any language with equal ease. This suggests that children must begin with ‘knowledge’ specific to language, i.e., the data to which the child is exposed is ‘understood’ in terms of prior linguistic concepts as opposed to general concepts of pattern frequency, say. If this is so, then we can see how a child may acquire a language even though the data itself is too poor to determine the language: the child needs no evidence for much of the knowledge it brings to the learning situation. In crude terms, children always make the right ‘hypotheses’ as a function of their genetic endowment. Thus, since the child can fixate on any language in the face of a poverty of stimulus about each language, and all languages are acquirable, children all begin with the same universal linguistic knowledge. This is the poverty of stimulus. (ibid., 5)

 

 

                 

                 SECTION 2: THE STRUCTURE OF COLLINS’ APS

So Collins’s reconstrual of the APS is as follows:

P1: Language is either acquired through data-driven learning or innately primed learning.

P2: All human children acquire language.

P3: No non-humans acquire language.

C1: Therefore language is acquired because of a unique property of human children not shared with non-humans.

P4: The range of languages it is possible for human children to acquire is infinite.

P5: All linguists using data-driven learning have not discovered a complete grammar of one language.

P6: All human children with less data available have acquired a particular language.

P7: Therefore either human children are smarter than linguists or human children do not acquire language through data-driven learning.

P8: Human children are not smarter than the linguists.

C3: Therefore human children do not acquire language through data-driven learning.

P9: If human children do not acquire language through data driven learning, then the fact that the child acquires a particular language as opposed to other possible languages cannot be explained through data-driven learning.

C4: Therefore human children acquire their particular language through innately primed learning.

 

The first three premises of Collins’s argument are correct, and the conclusion C1 is true as well. However, I have serious difficulties with Premise 4. Nothing in either Chomsky’s or Collins’s argument has proven that the number of languages it is possible for humans to acquire is infinite. A more sensible version of Premise 4 would claim that there is an extremely large number of languages which it is possible for people to learn. Furthermore, it is difficult to see how this claim can be fitted into the overall structure of the argument. Or rather, it is obvious what role the claim is meant to play in the argument, but it is difficult to fit this role into our argument schema. The role it plays is this: once we have shown that the child does not learn his language through data-driven learning, it is difficult to see how the child arrives at the particular language he does, as opposed to the countless other possible languages he is capable of learning. The Collins/Chomsky solution is that the child is born with certain universal principles which are subject to parametric variation, and this explains the range of possible languages which humans can learn. So the child is born in the initial state UG, and his experiences trigger various parameters, resulting in his arriving at his steady state, his I-language, e.g. English or French.

If one took Premise 4 and Premise 8 out of the argument, it would still go through as valid because of Premise 1 and C3: P1 says that language is acquired either through data-driven learning or through innately primed learning, C3 rules out data-driven learning, and C4 follows by disjunctive syllogism. Given that the overall argument could go through without P4 or P8, one may want to ask why these premises are in the argument in the first place. The answer is that without them our theory will not be explanatorily adequate, i.e., it will not explain both the diversity of languages acquired and the mode of acquiring them. So we will want our meta-argument to express that our object argument is designed to meet the criteria of descriptive and explanatory adequacy.

METACRITERION: An argument for an innate language faculty must meet the criteria of descriptive and explanatory adequacy.

With our object argument and our meta-criterion in place, we have Collins’s APS ready to evaluate. The argument has at least seven premises, some of which are uncontroversial, like P2, and some more controversial like P4. Some of the premises are disjunctive, and may seem controversial because they leave out alternatives and assume that certain processes can only occur in one of two different ways.

            Premise 9 and C4 are key aspects of the argument. Premise 9 states that if the child does not acquire language through data-driven learning, then without innate domain-specific knowledge we cannot explain how the child arrives at the particular language he does, as opposed to the countless other languages it is possible for him to acquire. This is certainly true. If the child does not learn from the PLD, then there is no reason, bar innate constraints, why he would target the correct language. C4 states that the child does not learn his language from the PLD and must therefore key in on the correct language through innate domain-specific knowledge. C4 is derived from P5-8, which state that linguists using data-driven learning over the past fifty years have not converged on the correct grammar of English. Given that each child acquires English in a few years with far less data available to him, it follows that unless children are smarter than linguists, they did not learn their language from the PLD.

            Overall, the argument as set out by Collins is not very convincing. It does not amount to a deductive proof or a knockdown argument, as Collins acknowledges. The argument aims to set out the facts of language acquisition which we need to build our theory around. For example, P5-8 do seem to indicate that unless children are smarter than the thousands of linguists working on generative grammar over the past fifty years, they cannot be learning the language from the PLD. However, given that children born in different linguistic environments do arrive at different languages, the PLD is obviously a factor in how children learn their language. The tension between these facts of language acquisition is what a linguist needs to accommodate. To do this, though, one needs to set out what the structure of each I-language is, what they have in common, and where they differ. The theory of Principles and Parameters, which states that children are born with a UG consisting of fixed principles, some of which are subject to parametric variation, aims to accommodate this. On this theory, the different parameters are set by experience, while the universal principles are innate.

            Hence, in Collins’ view, the APS does not depend on every child lacking this or that datum. As Collins construes it, the APS depends on the fact that linguists have access to a much greater PLD than children do, yet children quickly arrive at the grammar of their language while linguists over generations have failed to isolate the correct grammar of any language. Nonetheless, Collins is not claiming that facts about the PLD are unimportant for the generative grammarian. Rather, he is claiming that we can only sensibly interpret the importance of each particular datum in the light of facts about UG and the particular I-language of a particular speaker. As set out by Collins, the APS is overcome by the postulation of a UG. This UG consists of invariant principles that a child is born with, which are subject to parametric variation depending on the experiences of the child. Thus a principle of UG would be that all phrases have a head and a complement. The child is born knowing this. However, it is the child’s experience with his PLD which determines whether phrases are head-first (English) or head-last (Japanese). A toy illustration of this parameter-setting picture is sketched below.
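The sketch below is my own toy illustration of the parameter-setting idea, not Collins’s or Chomsky’s formalism: the principle that every phrase has a head and a complement is taken as given, and exposure to the PLD merely fixes a head-direction parameter. The observation pairs and the majority-vote rule are illustrative assumptions.

# A toy illustration of parameter setting: the head-complement principle is
# assumed, and PLD observations fix the head-direction parameter.

def set_head_direction(observations):
    # Each observation is a (head_position, complement_position) pair taken
    # from a parsed utterance in the child's PLD; return the parameter value
    # supported by the majority of observations.
    head_initial = sum(1 for head, comp in observations if head < comp)
    head_final = len(observations) - head_initial
    return "head-initial" if head_initial >= head_final else "head-final"

# English-like exposure: heads (e.g. verbs) precede their complements.
print(set_head_direction([(0, 1), (2, 3), (1, 2)]))   # head-initial

# Japanese-like exposure: heads follow their complements.
print(set_head_direction([(1, 0), (3, 2), (2, 1)]))   # head-final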

            So the order of explanation would be the following. First, discover the structure of particular languages and try to ascertain what principles are shared by the different languages of the world.  Second, discover the ways these languages differ from one another, and construct a theory in terms of parametric variations that can explain these differences. When one has the bones of the principles and parameters theory set up, one is then in a position to explain how this or that datum results in a particular construction being acquired. As presented by Collins, the APS is a set of considerations which leads one to postulate a UG subject to parametric variation. The particular details are to be formulated within linguistic theory as the various different principles and parameters are discovered. Each discovery will either tell for or against the solution to the APS put forth by Chomsky et al.

            Collins’s version of the APS does not work. The disparity between the child’s ability to learn from his PLD and the linguist’s ability to construct an explicit grammar of that language need not be explained in terms of innate domain-specific knowledge. We do not need to claim that a two-year-old child is smarter than teams of linguists researching grammar over thousands of years. Nor do we have to claim that the child has more data available to him than the linguist. The difficulty with Collins’s argument is that it equates an organism’s ability to acquire a competence in x with an organism’s ability to form an explicit theory of its competence in x. It does not follow from the fact that an organism has difficulty forming an explicit theory of a particular competence x, despite extensively studying data Y, that competence in x cannot be acquired from data Y. Collins’s argument therefore fails because it unjustifiably equates an explicit theory of a competence with an implicit ability to acquire that competence. It is possible, for example, that children use unconscious statistical abilities which help them learn the rules of their language from their PLD. These statistical abilities may not be accessible to consciousness. Our ability to unconsciously detect patterns in our environment may outstrip our ability to construct explicit theories about these patterns. This bare possibility could turn out to be empirically false; whether it is or not is an empirical question. Collins’s argument, as he stated it, gives us no reason whatsoever to hold that innate domain-specific knowledge must be wired into the child.

                   

                SECTION 3: WHICH APS DOES CHOMSKY WORK WITH?

             Having argued that Collins’s version of the APS is not a particularly strong argument, I now want to consider whether Collins’ or Pullum et al.’s versions of the APS correctly characterize Chomsky’s APS. While Chomsky does seem to argue from the same general considerations which Collins has outlined above, he also joins these arguments with APSs of the kind that Pullum and Cowie consider. Any intellectually satisfying characterization of Chomsky’s APS must explain why he felt it necessary not only to argue from general considerations like the ones Collins points to, but also to use APSs like the ones Pullum et al. critiqued.

            In his paper ‘‘Linguistic Nativism’’, Collins uses the Principles and Parameters model of language acquisition which Chomsky developed in the 1980s. When discussing the APS in this era, Chomsky constantly made unsubstantiated claims about the PLD. These claims need to be noted and outlined if we are to really understand Chomsky’s APS. Pullum and Scholz (2002) refer to auxiliary inversion as the experimental crux of the APS. They cite numerous places in which Chomsky and his followers have used auxiliary inversion as an example of the APS. Collins admits that Chomsky does indeed argue like this in various places. However, he claims that when Chomsky argues like this he is not making a claim about the PLD, but is rather setting up a challenge to the empiricist. Collins claims that Chomsky wants to ask the empiricist what it is about the child’s PLD which helps him converge on the correct grammar, and why this data is insufficient for the linguist to learn the same grammar. In this section I will examine Chomsky’s actual writing to see if this interpretation of his APS is correct.

            Since the 1980s, Chomsky has been labelling the problem of language acquisition Plato’s Problem. He characterises this problem by quoting Bertrand Russell’s question ‘How comes it that human beings, whose contacts with the world are so brief and personal and limited come to know as much as they do’ (1986a, xxv). Chomsky argues that these questions arise in the particular sphere of language acquisition in the same way that they do in general epistemology. He claims that when it comes to language acquisition, the solution to the problem is to postulate innate knowledge. He calls the APS Plato’s Problem because he feels that Plato’s discussion of the slave boy displaying knowledge of geometry which he has not previously been taught is a good instance of an APS. It is interesting how he characterises the situation of the slave boy:

This experiment raises a problem that is still with us: How was the slave boy able to find truths of geometry without instruction or information? (1986b, 4)

 

The key words here are ‘without instruction or information’; Chomsky repeatedly claims that children know various principles and rules for which they received no instruction or information. He further claims that this is the APS and that it is overcome by the postulation of innate knowledge. In this context, whether a particular construction is in the PLD or not is extremely important. Likewise, it is important to determine whether the child gains instruction explicitly or implicitly through positive and negative reinforcement. On this construal, the APS does not appear to be merely a challenge to the empiricist but rather an explicit claim about the PLD which is either true or false. I recognise that Chomsky’s vague sketch of what he thinks the APS is need not correlate with how he uses the APS in his linguistics. So in order to evaluate how Chomsky uses the APS, as opposed to what he states the APS is, we will need to situate the APS within the context of his descriptions of the rules of a language.

            In his Language and Problems of Knowledge (henceforth LPK), Chomsky considers both English and Spanish in detail. He tries to distinguish the rules they share from those that exist in only one language and hence are presumably learned from the PLD. Throughout his discussion, he makes claims about the nature of the PLD the child has available and the order in which the child will learn the data. One of the first examples Chomsky offers of Plato’s Problem in LPK concerns a-phrases and reflexive pronouns in Spanish. He begins his discussion by illustrating a particular rule of natural language. He then considers how a child using analogical reasoning would apply this rule to other constructions. He claims that a child reasoning by analogy would create constructions which are incorrect by the lights of native speakers of the language. Here Chomsky’s arguments bear directly on Quine’s model of language learning. Quine had claimed that analogy, along with induction and reinforcement, plays a key role in language learning. However, Chomsky is claiming that when the details of language are looked at closely, we see that a learning model based on analogy will make incorrect predictions about the type of sentences ordinary speakers will find grammatical. He then claims that the child will not try out the false constructions which are derived by analogical reasoning only to receive negative reinforcement. He claims further that the child receives no data from his environment which helps him learn the correct rule. He therefore concludes that the rule must be innate. This claim again runs contrary to Quine’s views on how a child learns his language.

            In LPK (pp. 12-20), Chomsky illustrates what he believes to be a clear case of Plato’s Problem. He begins by discussing simple sentences of Spanish, giving their direct translation into English, and a paraphrase of the translation in ordinary English. The first sentences he discusses are:

(1) Juan arregla el carro.

‘Juan fixes the car.’

(2) Juan afeita a Pedro.

     Juan shaves to Pedro.

     ‘Juan shaves Pedro.’

 

Chomsky notes that sentences (1) and (2) illustrate an interesting fact about a language such as Spanish. He points out that in Spanish, while an animate object of a sentence is preceded by the preposition ‘a’ (to), an inanimate object such as ‘el carro’ does not need a preposition before it. He claims that this feature is not shared by similar Romance languages such as Italian. He then goes on to consider more complex sentences involving causative constructions, which also feature the verbs ‘afeitar’ and ‘arreglar’.

(3) Juan hizo (arreglar el carro).

Juan made (fix the car).

‘Juan had someone fix the car.’

(4) Juan hizo (afeitar a Pedro).

Juan made (shave to Pedro).

‘Juan had someone shave Pedro.’

 

Note that in (3) and (4) the subject of the complement clause is unexpressed, and so is interpreted as someone unspecified. However, Chomsky notes that the subject may be explicitly expressed, as in (5):

(5) Juan hizo (arreglar el carro a Maria).

Juan made (fix the car to Maria).

‘Juan had Maria fix the car.’

We can see the difference between the English and the Spanish versions of the proposition in (5). In Spanish the subject of the embedded clause is an adjoined prepositional phrase (a Maria), whereas in the English sentence Maria appears before the verb. Chomsky asks us to try to construct an analogue to (5) using the phrase afeitar a Pedro instead of arreglar el carro. Doing this we get (6):

(6) Juan hizo (afeitar a Pedro a Maria).*

Juan made (shave to Pedro to Maria).

‘Juan had Maria shave Pedro.’

 

Here Chomsky notes that sentence (6), constructed by analogy with sentence (5), is an unacceptable sentence. So a child using the Quinean process of analogical synthesis would in this case construct a grammatically deviant sentence. However, this fact in isolation tells us nothing about how a child learns this fact of Spanish. An analysis of children’s PLD and a study of children’s linguistic performance would be needed before we could rule out a Quinean conception of language learning. Whether Spanish children try out successive ‘a-phrases’ in speech, only to receive negative reinforcement, is a question which can only be answered by studying actual performance data or by constructing experiments. Until then, whether a child tries to construct a sentence like (6) by analogy with (5), only to receive negative reinforcement, remains an open question. Chomsky claims that the reason sentence (6) is unacceptable is that Spanish has a rule which bars two a-phrases from appearing together. He then sums up what he thinks we have learned so far from this brief analysis:

Summarizing, we have general principles, such as the principle for forming causative and other embedded constructions and the principle of barring successive a-phrases; principles that admit some variation in interpretation, such as the embedded clause property; and low-level rules differentiating very similar languages, such as the rule that requires insertion of a in Spanish before an animate object. Of course, these levels are not exhaustive. The interaction of such rules and principles determines the form and interpretation of the expressions of language. (1988b, 15)

 

Having given this brief outline of some simple rules of Spanish, Chomsky discusses how the child acquires these rules. He claims generally that there are three factors to consider when trying to understand how a child acquires the rules of language: (1) the genetically determined principles of the language faculty; (2) the genetically determined general learning mechanisms; (3) the linguistic experience of the child growing up in a speech community. In relation to the rules discussed above, Chomsky speculates that the rule of a-insertion before animate objects is an idiosyncratic rule of Spanish which is learned from experience. Given that a-insertion before animate objects is not a feature of other closely related Romance languages, he holds that it must be learned from the PLD through processes which we do not as yet understand. He speculates further that the rule which makes (6) unacceptable has its source either entirely in the language faculty, or in a combination of the language faculty and experience. He claims that the embedded clause property must be a parameter which needs some experience to be learned, because it does not occur in all languages. He speculates further that such embedded clausal complements as do occur are not learned but result from general principles of the language faculty. At no point in his analysis does he offer evidence in favour of this interpretation. He merely asserts a series of propositions which are presumably meant to be taken on faith until he justifies them later in the text.

            He then goes on to consider further examples. He asks us to change (2), ‘Juan afeita a Pedro’, by replacing ‘Pedro’ with a reflexive element. There are, he claims, two choices for a reflexive: se or sí mismo. He asks us to consider here just the first of these, and to replace Pedro with se.

(7) Juan afeita a se.*

Juan shaves to himself.

However, (7) is not a proper sentence. Chomsky notes that the element se is what is technically called a clitic, a form that cannot stand alone but must attach to some verb.  According to Chomsky, there is a rule of Spanish that moves se from its ordinary position as direct object of afeitar, attaching it to the verb, yielding:

(8) Juan se afeita.

      Juan self-shaves.

     ‘Juan shaves himself.’

 

So the reflexive form corresponding to (2) would then be (8) rather than (7). Note that on a Quinean account of language acquisition, the child would presumably try (7) by analogy with (2), receive negative reinforcement, and somehow have to work out that (8) is the correct form. Chomsky then asks us to combine the causative and reflexive constructions, replacing Pedro in (4) by the clitic se, yielding:

(9) Juan hizo (afeitar a se).

Juan made (shave to self).

Chomsky notes that since se is a clitic it must attach to a verb, and that there are two different ways that this could be done: se could attach to ‘shave’ or to ‘made’. He notes that in all dialects of Spanish, it is normal to attach it to ‘made’, though only in some is it allowable to attach it to ‘shave’. He sticks to discussing the more common case, where se attaches to ‘made’; this is obviously a simplifying assumption though nothing of much importance attaches to the assumption for our present purposes. He goes on to note that (10) will be the correct transformation of (9):

(10) Juan se hizo (afeitar).

       Juan self-made (shave).

      ‘Juan had someone shave him (Juan).’

He notes that in (10) the embedded complement of the causative verb is subjectless, as in (3) and (4). But of course the subject of the complement can be explicit, appearing as an a-phrase. He argues that if the subject of the complement is, say, los muchachos (the boys), we would expect, by analogy, to derive:

(11) Juan se hizo (afeitar a los muchachos).*

       Juan self-made (shave to the boys).

      ‘Juan had the boys shave him (Juan).’

However, while (10) is an acceptable sentence, sentence (11) is not. So a child trying to derive (11) by analogy with (10) will construct an unacceptable sentence. Again, a child trying to learn the rules of language using analogy will end up producing a deviant sentence such as (11).

            So what conclusion does Chomsky draw from these facts about Spanish? He notes first that the examples give rise to Plato’s Problem. He also claims that such facts show the hopelessness of claiming that language is acquired using analogy. He goes on to make the following empirical claim about the acquisition of these facts by a Spanish child:

The question then is how speakers come to know these facts. Surely it is not the result of some specific course of training or instruction; nothing of the sort occurs in the course of normal language acquisition. Nor does the child erroneously produce or interpret the sentences (11) or (12) ‘by analogy’ to (10) and (5), leading to correction of this error by the parent or other teacher; it is doubtful that anyone has undergone this experience and it is certain that not everyone that knows the facts has done so.  (ibid., 21)

 

Here we have, in Chomsky’s own words, an example of Plato’s Problem. The question we need to ask ourselves is whether this version of Plato’s Problem should be understood the way Pullum and Scholz claim, or whether it is merely, as Collins argues, a challenge to the empiricist. It should be noted that when discussing Plato’s Problem in this context, Chomsky makes three unsupported empirical claims: (1) children do not construct new sentences like (11) using analogy and induction; (2) children do not incorrectly utter sentences with the structure of (11); (3) children are not corrected by their parents or other teachers for producing such utterances. If there is evidence to support these claims, Chomsky does not produce any. He merely asserts that such things never happen, and we are presumably meant to take him at his word. If he wanted to establish that sentences like (11) are never produced, he would need an extensive corpus analysis to justify such a claim. He has never provided such an analysis. Short of corpus analysis, we have no justification for the claim that children never utter such constructions. Chomsky would probably argue that we know that children do not erroneously produce examples like (11) and receive negative reinforcement. To justify this claim, he could cite Brown and Hanlon (1970), who showed that correction for bad grammar is rarely provided, and that when it is provided it rarely has any effect. However, recent studies on implicit instruction undermine this claim. So Chomsky has given us no reason to assume that this particular APS works.

            It is impossible to read this particular APS as a challenge to the empiricist. On the contrary, it is more accurately read as a supposed refutation of empiricism. If one did want to view it as a challenge, it would be a very strange challenge. It would run as follows: Chomsky makes arbitrary, unsupported claims about the PLD; the empiricist has to find evidence to refute them; and until such evidence is provided, we are to assume that Chomsky is correct. Such a challenge is clearly absurd. The burden of proof is obviously on Chomsky to provide evidence to support his claims, not merely to point out that empiricists have not yet refuted his unsupported claims.

            Here it could be argued that I am being a little unfair to Chomsky. He does, after all, offer some evidence to support his claim: for example, the evidence from Crain and Nakayama (1987) and Brown and Hanlon (1970). However, such a defence of Chomsky is historically inaccurate. In his 1968 paper ‘‘Linguistic Contributions: Present’’[14] Chomsky discusses how auxiliary inversion illustrates a particular instance of the universal principle that grammatical transformations are structure-dependent. Here Chomsky makes the following claims:

 There is no a priori reason why human language should make use exclusively of structure-dependent operations, such as English interrogation, instead of structure-independent operations, such as O1, O2, and O3. One can hardly argue that the latter are more ‘complex’ in some absolute sense; nor can they be shown to be more productive of ambiguity or more harmful to communicative efficiency. Yet no human language contains structure-independent operations among (or replacing) the structure-dependent grammatical transformations. The language-learner knows that the operation that gives 71 is a possible candidate for a grammar, whereas, O1, O2, and O3, and any operations like them, need not be considered as tentative hypotheses…Careful consideration of such problems as those sketched here indicates that to account for the normal use of language we must attribute to the speaker-hearer an intricate system of rules that involve mental operations of a very abstract nature, applying to representations that are quite remote from the physical signal. We observe, furthermore, that knowledge of language is acquired on the basis of degenerate and restricted data and that it is to a large extent independent of intelligence and of wide variations in individual experience. (1972a, 54-56)

 

Here Chomsky is making untested claims about the child’s PLD; he is also making unsupported assertions about the structure of all human languages. Chomsky claims the child knows that only the structure-dependent operation yielding 71 is a possible candidate for a grammar, whereas O1, O2, and O3 are not. Here he is implicitly making a claim about the child’s linguistic performance. The only possible evidence that a child does not consider O1, O2, and O3 is that a child never mouths sentences structured according to the rules of O1 and the like. Characteristically, Chomsky offers no such evidence. It is important to note that he is making these claims two years before Brown and Hanlon published their paper, and nineteen years before Crain and Nakayama’s paper was published. So here Chomsky is making claims for which he has provided absolutely no evidence. If such claims are interpreted as a challenge to the empiricist, they are a poor challenge indeed.

            Chomsky’s APS arguments typically rely on claims that the child does not have access to this or that datum. It is claimed that if the child were learning a particular construction by analogy with previously heard constructions, he would produce barred sentences such as x or y. Chomsky then argues that children never produce sentences such as x or y, and that therefore negative or positive reinforcement cannot play any role in learning a particular construction.[15] However, he does not offer any performance data to indicate how children actually speak in particular circumstances. So his claim that children do not produce certain deviant sentences cannot be substantiated until the relevant research is done.

 Chomsky does sometimes argue from general considerations in the way Collins does. However, when doing his linguistics, Chomsky typically makes claims about lack of reinforcement, limited fragmentary data, and the insufficiency of analogy and induction for learning certain constructions. Overall, neither version of the APS casts much doubt on empiricist models of language learning. The APSs which Pullum et al. consider do not tell us either way whether nativism is true. What the research done by Pullum et al. shows is that much more empirical data is needed if we are to discover how children learn their first language. Furthermore, it shows that Chomsky’s lack of interest in performance data cannot be justified. If we are to construct an accurate theory of language acquisition, we will need to consider actual linguistic behaviour, and the circumstances in which such behaviour occurs. Collins’s version of the APS offers no compelling reason to accept nativism. So linguistic nativism has not been justified by any of the poverty of stimulus arguments I have seen so far.

Before leaving this topic, I discuss a recent defence of the poverty of stimulus argument which Chomsky has mounted. Chomsky has co-authored a paper with Berwick, Pietroski, and Yankama called ‘‘Poverty of Stimulus Revisited’’ (2011). This paper does not address the primary criticisms which are raised against the APS in this thesis. In fact, the content of the paper would lead one to believe that Chouinard and Clark, Pullum and Scholz, and Geoffrey Sampson do not exist.[16] However, since this is Chomsky’s most up-to-date defence of the APS, I will consider it in detail. I use it to demonstrate that even this latest defence does not in any way meet the concerns which I have raised in this chapter.

                 

                PART 3: CHOMSKY’S LATEST DEFENCE OF THE APS

In their (2011) paper ‘‘Poverty of Stimulus Revisited’’, Chomsky, Berwick, Pietroski, and Yankama offer a defence of the APS against recent criticisms. The paper is divided into five sections:

(1) An introduction

(2) A discussion of empirical foundations

(3) A discussion of their minimalist solution to the empirical issues

(4) A consideration of three alternatives to their approach

(5) A conclusion

 

In Section 2 of their paper, they set out what they take to be the central theory-neutral facts which need to be explained. They claim that the structure dependence of linguistic rules revealed by auxiliary inversion in polar interrogatives generalises to other rules of natural language. They discuss facts such as constrained ambiguity; consider the following four sentences:

(6) Darcy is easy to please.

(7) Darcy is eager to please.

(8) Bingley is ready to please.

(9) The goose is ready to eat.

 

They claim that children intuitively know that (6) and (7) are unambiguous. In (6), ‘Darcy’ is understood as the object of ‘please’, and the sentence means that it is easy for others to please Darcy. In contrast, in (7) ‘Darcy’ is understood as the subject of ‘please’, and the sentence means that Darcy is eager to please others. Sentences (8) and (9) are ambiguous. ‘Bingley’ can be taken as the subject or the object of ‘please’; the sentence can mean that Bingley is ready to please someone else, or that Bingley is ready to be pleased. Likewise ‘the goose’ can be taken as the subject or the object of ‘eat’; thus (9) can mean that the goose is ready to eat something else, or that the goose is ready to be eaten. Further examples of constrained ambiguity are sentences such as:

(10) The boy saw the man with binoculars.

(11) The senator from Texas called the donor.

 

These sentences are two-way ambiguous rather than three-way ambiguous. They argue that such examples reveal the structure dependence of language in the same way as polar interrogatives do. Chomsky et al. also point to the fact that some sentences have zero readings but are not mere word salad, while other sentences which are word-salad declaratives can be turned into word-salad interrogatives using auxiliary inversion. Having discussed the various cases of constrained ambiguity, they note that they are concerned (at this point) with the knowledge that people have acquired, not with how such knowledge is acquired.

            It should be noted that there are problems with their claim that they are only concerned with the knowledge acquired. First, their belief that certain sentences have zero, one, or two interpretations is obviously derived from tests of people’s intuitions of grammaticality. However, in order for such tests to be considered an accurate sample of people’s competence, we need statistics to support them. I agree with their interpretations of the facts; however, neither my intuitions nor the intuitions of a few linguists can be used on their own to form the foundation of a linguistic theory. Such intuitions need to be justified statistically. We need statistics which show that people of different ages and different socio-economic backgrounds share the relevant intuitions of acceptability and unacceptability about the sentences used to support the belief that constrained ambiguity is a fact of natural language. Such statistics need to make explicit any gradience of acceptability which occurs across different socio-economic environments and different age groups. With this statistical background in place, they would then be in a position to say whether constrained ambiguity is something that all speakers accept. Until such time as this is done, their supposedly theory-neutral facts, which they claim must be explained by any linguistic theory, have not been shown to be actual facts of natural language. Furthermore, appeals to people’s intuitions of grammaticality need to be tested against performance data. Corpus analyses of child-adult interaction, adult-adult interaction, and the linguistic interaction of people from different socio-economic backgrounds need to be carried out. If we are to say that people have intuitions that x is the case, we need to demonstrate that they perform as though they have such intuitions. And if performance data and competence data are at odds, then performance data clearly trumps people’s intuitions about how they believe they perform.

            This discussion demonstrates that Chomsky et al. are not merely pointing out facts about language that any theory must explain; rather, they are making claims about what people know which they have not justified by appeal to empirical evidence. The fact that they do not provide statistical tests to determine whether people of different ages and socio-economic backgrounds have the same linguistic competence demonstrates that from the outset they are presuming that the intuitions of a few linguists are shared across the board. So far from giving a theory-neutral description of the facts of natural language, they are, in fact, from the outset presupposing a particular model of the nature of language.

            They go on to discuss the following sentences:

(21) hiker, lost, kept, walking, circles.

(22) The hiker who was lost kept walking in circles.

(23) The hiker who lost was kept walking in circles.

 

They note that given (21), we would expect (22), rather than (23), to be the corresponding declarative. However, they also ask us to consider the following case:

(24) Was the hiker who lost kept walking in circles?

They note that even given (21), we still read (24) as the interrogative version of (23). From this fact they conclude that, one way or the other, the auxiliary verb was is associated with the matrix verb kept and not with lost. They claim that considerations of coherence alone should lead one to construct a sentence like:

(25) Was ((the hiker who- lost) (kept walking in circles))

as opposed to

(24) Was ((the hiker who lost) (-kept walking in circles))

They note that this shows that the relevant constraint trumps considerations of coherence. However, here again they are not merely stating theory-neutral facts. It is true that I share their intuition that (24) is the interrogative form of (23); however, my intuitions are obviously going to be contaminated by my research into various APSs. For Chomsky et al. to draw the conclusion they want, they need statistics to back up their claim that people of all ages and all socio-economic backgrounds share the intuition; they provide no such statistical evidence.

The next phenomenon which they consider is constrained homophony. They discuss the following sentence and its two candidate structures:

(25) Can eagles that fly eat?

(26) (Can ((eagles that – fly) eat))

(27) (Can ((eagles that fly) (-eat)))

 

 They hold that (27), not (26), reveals the correct structure of (25). Since children cannot see the bracketing of (25), Chomsky et al. ask how children can know that (27), rather than (26), reveals the correct structure. To further elaborate this point, they go on to consider do replacing the auxiliary verb can, since do bears morphological tense (did) but is otherwise semantically null. They indicate the actual position of interpretation with dv, and the logically coherent but incorrect position with dv*, using this notation to indicate constraints on ambiguity/homophony.

(28) (do (eagles that dv* fly) dv eat)

They claim that this notation is entirely descriptive and reveals that (28) is unambiguous. This, then, raises a poverty of stimulus consideration: how do children know that dv* is a barred interpretation while dv is not? However, yet again Chomsky et al. present a supposed fact as a theory-neutral description without presenting any evidence to support the claim. They have not provided any statistical evidence of people’s acceptability judgements across ages and socio-economic backgrounds; nor do they present any performance data. So Chomsky et al.’s claims about the theory-neutral empirical facts which any theory must deal with stand on an extremely weak foundation.

            They go on to claim that other languages such as German and Irish respect the same constraints. Yet again they provide no empirical evidence to support this claim. They then consider further examples of these constraints and claim again (unconvincingly) that they are merely producing theory-neutral facts which any theory must explain. They cite four constraints which must be met by any theorist wanting to explain these supposedly theory-neutral facts:

(1) Yield the correct pairings, for unboundedly many examples of the sort described.

(2) Yield the correct structures, for the purposes of interpretation, for those examples.

(3) Yield the correct language-universal patterning of possible/impossible pairings.

(4) Distinguish v-pairings from w-pairings, in part, while also accounting for their shared constraints.

They argue that if one cannot meet constraints (1)-(4), then one does not have an accurate linguistic theory which can explain the relevant linguistic data.

They then proceed to outline their own account of how these linguistic facts are best explained, in terms of their minimalist programme. Once they have outlined their minimalist alternative, they discuss three contemporary attempts to explain the above facts using domain-general knowledge. They outline the three rival theories and demonstrate weaknesses in all three. Having satisfied themselves that they have refuted their rivals, they proclaim that their minimalist explanation is the best explanation of the above-mentioned facts. They conclude by arguing that after fifty years their poverty of stimulus argument still stands.

            The three empiricist alternative theories which they evaluate are:

 

(1) STRING-SUBSTITUTION FOR ACQUISITION (CLARK AND EYRAUD)

In brief, Clark and Eyraud, following Zellig Harris, postulate ‘discovery procedures’ for grammars. Their inference algorithm, when given examples like (37a) and (37b), will correctly generate examples like (37c) and exclude examples like (37d).

(37a) Men are happy.

(37b) Are men happy?

(37c) Are men who are tall happy?

(37d) *Are men who tall are happy?

 

The method works by weakening the standard definition of syntactic congruence, positing that if two items u and v can be substituted for each other in a single sentence context, then they can be substituted for each other in all sentence contexts. Clark and Eyraud call this notion weak substitutability.
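The following sketch illustrates the core idea of weak substitutability; it is not Clark and Eyraud’s actual algorithm, which operates over substrings of any length and induces a context-free grammar. The toy corpus and helper functions are my own illustrative assumptions.

# Weak substitutability, illustrated with single words: words seen in one
# shared context are treated as interchangeable everywhere, and the corpus
# is then extended by swapping interchangeable words.

from itertools import combinations

toy_corpus = [
    "men are happy",
    "dogs are happy",
    "men are tall",
]

def contexts(sentence):
    # Yield (left, word, right) context triples for each word.
    words = sentence.split()
    for i, w in enumerate(words):
        yield (tuple(words[:i]), w, tuple(words[i + 1:]))

# Group words that have appeared in at least one identical context.
by_context = {}
for s in toy_corpus:
    for left, w, right in contexts(s):
        by_context.setdefault((left, right), set()).add(w)

congruent = set()
for words in by_context.values():
    congruent.update(combinations(sorted(words), 2))

print(congruent)   # {('dogs', 'men'), ('happy', 'tall')}

# Generate new strings by swapping congruent words in attested sentences.
generated = set()
for s in toy_corpus:
    for u, v in congruent:
        for a, b in ((u, v), (v, u)):
            if a in s.split():
                generated.add(" ".join(b if w == a else w for w in s.split()))

print(generated - set(toy_corpus))   # includes 'dogs are tall'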

Chomsky et al. claim that this method fails for two different reasons:

(A) It fails for English even when restricted to only strings that a language generates.

(B) It does not address the original APS which depends on which structures a language generates.

(2) BAYESIAN MODEL SELECTION OF GRAMMARS (PERFORS ET AL.)

In a 2011 paper, Perfors et al. (henceforth PTR) aimed to address the question of domain-general versus domain-specific theories of how natural language grammar is acquired. They considered Chomsky’s famous example of auxiliary inversion, which has been used as a paradigm example of an APS since the sixties. PTR argued that, using Bayesian model selection over grammars, the structure dependence of language revealed by auxiliary inversion could be learned. The Bayesian model is a domain-general theory of acquisition, so the fact that it can learn the structure dependence of grammar purports to show that Chomsky’s APS does not work. It supposedly reveals that language acquisition does not require domain-specific knowledge.

            PTR’s model argues for a notion of a ‘‘Bayes learnable’’ grammar. The model specifies a hypothesis space consisting of three different grammar types and uses Bayesian probability to decide amongst them on the basis of a sample from the corpus[17]. The three grammar types they propose are: (1) flat grammars, which generate the strings of a corpus directly from a single non-terminal symbol S; (2) probabilistic (right) regular grammars (PRGs); and (3) probabilistic context-free grammars (PCFGs). The first two are linear, structure-independent grammars, while the third is a hierarchical, structure-dependent grammar (ibid., 43). They construct a grammar of each type to generate the sentences of the corpus, and score each grammar with a Bayesian probabilistic measure. They use the CHILDES corpus as data for training and evaluating the grammars of the respective types (ibid., 43). From their tests, they discovered that probabilistic context-free grammars are better able to predict the corpus with a smaller grammar than their rivals, and are better at handling new constructions not contained in the corpus (ibid., 50). The only learning prior which PTR use is a preference for a shorter, more compressed hypothesis, and Clark and Lappin correctly note that this learning prior is clearly domain-general. So, given that PTR’s model prefers the structure-dependent hypothesis over the structure-independent ones, we have evidence against Chomsky’s original APS. Contrary to what Chomsky claimed, a learner using a domain-general procedure can indeed learn the structure-dependent rule for natural language. A schematic sketch of this kind of Bayesian comparison is given below.
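The sketch below is a schematic illustration of Bayesian model selection with a simplicity prior, in the spirit of (but far simpler than) PTR’s model. The grammar sizes and corpus log-likelihoods are invented placeholders, not PTR’s results.

# Schematic Bayesian comparison of grammar types: the posterior trades off
# fit to the corpus (likelihood) against grammar size (simplicity prior).

candidates = {
    "flat":                {"size": 5000, "log_likelihood": -12000.0},
    "regular (PRG)":       {"size": 900,  "log_likelihood": -9500.0},
    "context-free (PCFG)": {"size": 300,  "log_likelihood": -9400.0},
}

def log_prior(size, penalty_per_symbol=1.0):
    # Simplicity prior: larger grammars are exponentially less probable.
    return -penalty_per_symbol * size

def log_posterior(grammar):
    # Unnormalised: log P(G | corpus) = log P(corpus | G) + log P(G) + const.
    return grammar["log_likelihood"] + log_prior(grammar["size"])

for name, grammar in candidates.items():
    print(f"{name:22s} log posterior = {log_posterior(grammar):10.1f}")

best = max(candidates, key=lambda n: log_posterior(candidates[n]))
print("Selected grammar type:", best)   # context-free (PCFG)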

            Chomsky et al. reply to this argument as follows:

But even if a Bayesian learner can acquire grammars that generate structured expressions, given child-directed speech but no additional language-specific knowledge, this does not yet make it plausible that such a learner can acquire grammars that exhibit constrained ambiguity of the sort illustrated in Section 2.  In particular children acquire grammars that generate expressions in accord with specific structure-dependent rules that govern interpretations…The question is whether learners can induce that expressions are generated in these human ways.(2011, 19)

 

Here Chomsky et al. are claiming the issue is not whether domain-general procedures can key in on structure-dependent rules, but rather whether the domain-general procedures used by PTR can capture more complicated phenomena such as constrained ambiguity.  Chomsky et al. claim that PTR’s model does not include or suggest any hypothesis about how expressions are generated according to the language-specific constraints which they discussed above. They argue that if one wants to address the real poverty of stimulus argument, then one needs to address the full range of examples which they discussed in Section 2 of their paper, not merely the simple examples of polar interrogatives.

It could be argued that Chomsky et al. are being unfair to PTR here. Throughout his career, Chomsky has claimed that a particular datum cannot be learned from experience, so must therefore be explained in terms of innate domain- specific knowledge. Then when PTR construct a model which can generate the expressions without domain-specific knowledge, Chomsky et al. argue that this fact is irrelevant because there are some further facts about language which the model cannot capture. Here, again, we are back to Cowie’s criticism, that every time a nativist has an APS refuted by an empiricist; the nativist can simply point out some other non-obvious fact which he claims cannot be explained in terms of domain-general learning, and when this claim is refuted another example is manufactured on the spot. The real difficulty with this approach is that it shifts the burden of proof onto the empiricist, any non-obvious fact of language is automatically assumed to illustrate an APS, and the empiricist must refute this claim. However, the burden of proof should not be shifted this way. The burden of proof is on both the nativist and the empiricist. It should not be assumed that some complicated fact of language can automatically be explained in either a nativist or an empiricist manner. Such issues are entirely empirical and should be judged based on the ability of either side to construct accurate models to explain the relevant data.

(3) LEARNING FROM BIGRAMS, TRIGRAMS, AND NEURAL NETWORKS (REALI AND CHRISTIANSEN)

Reali and Christiansen, in their 2005 paper ‘‘Uncovering the Richness of the Stimulus’’, constructed models which aimed to test whether yes/no questions could be learned by domain-general procedures. They used three models: (1) a bigram statistical model, (2) a trigram statistical model, and (3) a simple recurrent network model. They used child-directed speech as training data for the three models.

                         BIGRAM STATISTICAL MODEL

Reali and Christiansen computed the frequencies of word bigrams and then the overall sentence likelihood for any word sequence, even for previously unseen word sequences (ibid., 25). This sentence likelihood was then used to select between opposing test sentence pairs similar to ‘Are men who are tall happy?’ versus ‘Are men who tall are happy?’, the idea being that sentences with the correct auxiliary fronting would have greater likelihood than those with incorrect auxiliary fronting. Chomsky et al. note that in Reali and Christiansen’s experiment, which was conducted on 100 test pairs, the bigram likelihood calculation successfully chose the correct grammatical form 96% of the time. However, Chomsky et al. cite the work of Kam, Stoyneshka, Tornyova, Fodor and Sakas (2008), which shows that this strong success is a result of contingent facts of English rather than of the bigram model itself. They note that the model exploits the fact that who and that are homographs, which are unclear as to whether they are pronouns or relativisers (ibid., 26). When Kam et al. correct for this bias, the performance of the models decreases significantly. A sketch of the basic bigram comparison is given below.
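Below is a minimal sketch of the kind of bigram likelihood comparison described above, not Reali and Christiansen’s actual implementation; the tiny training corpus, the smoothing scheme, and the vocabulary size are illustrative assumptions.

# Compare a grammatical and an ungrammatical auxiliary-fronted question by
# their smoothed bigram log-likelihood under a toy training corpus.

import math
from collections import Counter

training_corpus = [
    "are men happy",
    "men who are tall are happy",
    "the men are tall",
]

# Count unigrams and bigrams, padding each sentence with sentence markers.
unigrams = Counter()
bigrams = Counter()
for sentence in training_corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    unigrams.update(words[:-1])
    bigrams.update(zip(words[:-1], words[1:]))

def log_likelihood(sentence, alpha=0.1, vocab_size=50):
    # Add-alpha smoothed bigram log-likelihood of a sentence.
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(words[:-1], words[1:]):
        num = bigrams[(prev, cur)] + alpha
        den = unigrams[prev] + alpha * vocab_size
        total += math.log(num / den)
    return total

grammatical = "are men who are tall happy"
ungrammatical = "are men who tall are happy"
for s in (grammatical, ungrammatical):
    print(f"{s!r}: {log_likelihood(s):.2f}")
# With this toy corpus the grammatical question gets the higher (less
# negative) score: bigrams like "who are" and "are tall" are attested in
# the training data, whereas "who tall" is not.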

                              RC’S TRIGRAM MODEL

The trigram model uses a test similar to the one used by the bigram model, and it exhibits a similar level of success. However, Chomsky et al. argue that, like the bigram model, the trigram model achieves its success because of contingent facts of English and not because of the model itself. They construct experiments of their own to test this claim. These experiments show that the performance of the model drops significantly once the English-specific bias is accounted for.

              LEARNING FROM SIMPLE RECURRENT NETWORKS

Reali and Christiansen constructed a further experiment using a Simple Recurrent Network (SRN) to try to learn a particular construction. These networks contain a hidden context layer. Reali and Christiansen trained 10 different networks on the Bernstein corpus, and then tested whether they could discriminate between grammatical and ungrammatical minimal pairs (2011, 28). So, for example, they tested whether the networks could correctly discriminate between:

(1) Is the boy who is hungry nearby?

(2) Is the boy who hungry is nearby?

Reali and Christiansen recoded the actual words into one of 14 part-of-speech categories, e.g. DET (the), N (boy), PRON (who), V (is), ADJ (hungry), PREP (nearby), etc. (ibid., 28). Chomsky et al. note that Simple Recurrent Networks output a distribution over possible predicted outputs after processing each word. Reali and Christiansen tested their networks by providing the part-of-speech prefix of test sentences up to the point at which the grammatical and ungrammatical versions diverge. They then checked the predicted output of the trained networks over all word categories to see whether greater activation weight was assigned to the grammatical continuation than to the ungrammatical one. Reali and Christiansen found that the activation for the grammatical continuation was an order of magnitude higher than for the ungrammatical one. In other words, the network was able to predict the correct (1) as opposed to the incorrect (2). Reali and Christiansen take this as evidence that the rule for structure dependence can be learned by this connectionist model. The test procedure is sketched schematically below.
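The sketch below illustrates the architecture and the comparison just described, using a minimal Elman-style simple recurrent network in NumPy. The network is untrained (random weights), so it shows only the shape of the test, not Reali and Christiansen’s results; the category inventory and layer sizes are illustrative assumptions.

# A minimal Elman-style SRN: feed a part-of-speech prefix through the
# network and compare the output weight given to the grammatical versus the
# ungrammatical continuation.

import numpy as np

rng = np.random.default_rng(0)

categories = ["DET", "N", "PRON", "V", "ADJ", "PREP"]   # subset of the 14
idx = {c: i for i, c in enumerate(categories)}
n_in = n_out = len(categories)
n_hidden = 8

# Weight matrices (these would normally be learned by backpropagation).
W_in = rng.normal(scale=0.5, size=(n_hidden, n_in))
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
W_out = rng.normal(scale=0.5, size=(n_out, n_hidden))

def one_hot(cat):
    v = np.zeros(n_in)
    v[idx[cat]] = 1.0
    return v

def srn_predict(prefix):
    # Feed a POS prefix through the network; return the output distribution
    # over the next category.
    h = np.zeros(n_hidden)                    # the hidden 'context' layer
    for cat in prefix:
        h = np.tanh(W_in @ one_hot(cat) + W_rec @ h)
    logits = W_out @ h
    return np.exp(logits) / np.exp(logits).sum()

# "Is the boy who ..." recoded as categories, up to the divergence point.
dist = srn_predict(["V", "DET", "N", "PRON"])

# The test: does the network assign more weight to the grammatical
# continuation (V, as in "who is hungry") than to the ungrammatical one
# (ADJ, as in "who hungry is")? A trained network should; this random one
# answers arbitrarily.
print("P(V)   =", round(float(dist[idx["V"]]), 3))
print("P(ADJ) =", round(float(dist[idx["ADJ"]]), 3))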

            However, Chomsky et al. (following the work of Kam et al.) suggest that Reali and Christiansen’s results may stem from simple brute statistical facts:

In other words, one might wonder whether the success of the SRNs in selecting, for example, V as opposed to ADJ as above might also be attributable simply to ‘‘brute statistical facts’’. Kam et al’s and our findings above suggest that bigram information alone could account for most of the statistical regularity that the networks extract. (ibid, 29)

 

Chomsky et al. tested this claim by analysing the Bernstein corpus to see whether there was a difference between the number of times a PRON-V bigram occurs and the number of times a PRON-ADJ bigram occurs. They found that PRON-V occurs 2,504 times in the corpus, while PRON-ADJ occurs 250 times. So they claim that a bigram statistical model can easily predict the grammatical continuation. They make the following point:

Since SRNs are explicitly designed to extract sequentially organised statistical patterns, and given that the is-is question types can be so easily modelled by sequential two-word patterns, this is not at all surprising. Indeed, it is difficult to see how SRNs could possibly fail in such a statistically circumscribed domain. (Ibid., 32)

 

They then go on to note that it remains to be seen how Simple Recurrent Networks will deal with a more complex interrogative such as:

(1)  Is the boy who was holding his plate crying?

This example is of the sort discussed by Crain and Nakayama. Here the matrix auxiliary is differs from the relative-clause auxiliary was. Chomsky et al. note that until such time as Reali and Christiansen can construct a model which can handle cases of this kind, we can conclude that the Simple Recurrent Network results are far from compelling. However, this reply of Chomsky et al. is dubious because, as I will now discuss, Reali and Christiansen do solve the APS which was raised by Chomsky in his 1975 book Reflections on Language.

            In that book Chomsky writes:

We gain insight into UG, hence LT (H,L), whenever we find properties of language that can reasonably be supposed not to have been learned. (1975, 30).

 

The example which Chomsky uses to illustrate this point is:

(1) The man is tall – is the man tall?

(2) The book is on the table – is the book on the table?

Chomsky notes that a scientist will observe that children form questions in the ways indicated by (1) and (2). The scientist may form the following hypothesis to explain the above fact:

Hypothesis 1: The child processes the declarative sentence from its first word (i.e., from ‘‘left to right’’), continuing until he reaches the first occurrence of the word ‘‘is’’ (or others like it: ‘‘may’’, ‘‘will’’, etc.); he then preposes this occurrence of ‘‘is’’, producing the corresponding question (with some concomitant modifications of the form which need not concern us) (ibid., 31).

 

He then argues that this procedure will lead the scientist into making incorrect predictions when it comes to more complicated sentences. He asks us to consider the following sentences:

(3) The man who is tall is in the room – is the man who is tall in the room?

(4) The man who is tall is in the room – is the man who tall is in the room?*

Obviously a scientist who was using Hypothesis 1 would generate the incorrect sentence (4). Chomsky claims that a scientist will note that children never make mistakes like (4) and will therefore conclude that Hypothesis 1 is incorrect. A reasonable scientist, he notes, will therefore try out a different hypothesis. Hypothesis 2, according to Chomsky, will be a structure-dependent hypothesis which analyses sentences into phrases. It will differ from the structure-independent Hypothesis 1, which operates only on the linear sequence of words, picking out the earliest occurrence of the auxiliary.

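A small sketch (my own illustration, not Chomsky's formulation) shows how the structure-independent rule goes wrong. Hypothesis 1 is modelled as "front the first auxiliary found scanning left to right"; applied to the declarative in (3) it yields exactly the ungrammatical question in (4), whereas the structure-dependent rule would front the auxiliary of the main clause.

AUXILIARIES = {"is", "may", "will"}

def front_first_aux(declarative):
    """Hypothesis 1: move the leftmost auxiliary to the front of the sentence."""
    words = declarative.split()
    i = next(idx for idx, w in enumerate(words) if w in AUXILIARIES)
    return " ".join([words[i]] + words[:i] + words[i + 1:]) + "?"

print(front_first_aux("the man is tall"))
# -> "is the man tall?"                       (works for the simple case (1))

print(front_first_aux("the man who is tall is in the room"))
# -> "is the man who tall is in the room?"    (the ungrammatical (4))

# The structure-dependent rule instead fronts the auxiliary of the main
# clause, i.e. the second "is" here, which requires analysing the sentence
# into phrases; doing that yields the grammatical question in (3):
# "is the man who is tall in the room?"
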
            He further argues that, by any reasonable standard, Hypothesis 1 is simpler than Hypothesis 2. Yet children supposedly use the structure-dependent hypothesis unerringly, never the structure-independent one. Chomsky makes four points: (1) the child will never experience constructions which are relevant to helping him learn the correct rule; (2) the child never makes mistakes like sentence (4); (3) the child is not trained to learn the correct rule; (4) the correct rule is more complex than the incorrect one. Based on these considerations, Chomsky argues that the structure-dependence rule must be innate. Now I do not want to repeat the various arguments against this view which I have already voiced earlier in the thesis. However, in light of Chomsky et al.'s new paper, there are some new points which need to be clarified. Reali and Christiansen managed to construct a Simple Recurrent Network which could learn the is-is construction through training. This demonstrates that their model has solved the original poverty of the stimulus problem which was raised by Chomsky in 1975. Chomsky et al. ask whether the model can handle the types of constructions discussed in Crain and Nakayama's 1987 paper. This is an interesting question. However, the fact that Reali and Christiansen have not answered it does not take away from their achievement. They have demonstrated that it is possible for a domain-general learner to grasp a linguistic rule which Chomsky in 1975 claimed could only be acquired through domain-specific learning priors. This fact is important. It illustrates Cowie's point about how easy it is for a nativist to manufacture a new APS in the face of a refutation of the original claim.

            In this paper, I have evaluated whether Chomsky's APS refutes Quine's conception of language acquisition. Having reviewed the best arguments which Chomsky has put forward in defence of the APS, I conclude that it does not. However, the evidence I have reviewed does not by itself indicate which conception of language acquisition is the correct one. Much further research needs to be done before we can decide between the nativist and the empiricist conceptions of language acquisition.


[1] Independent of his APS argument, Chomsky cites several other strands of evidence to support his belief in a language faculty. He appeals to the supposed universals which exist in all the languages of the world. He also points out that the language faculty grows at a fixed rate and that general intelligence is not affected by the loss of the ability to speak and understand language. These points have all been contested in the literature. However, I will not discuss them here, as my main concern is with the evidence which Chomsky uses to support his APS claims.

[2] I will deal later in the paper with the frequency of the relevant constructions and with whether a child encounters enough of them to learn the rules. For now I want to focus in a schematic way on what these findings mean for a Quinean picture of language learning.

[3] See Hart and Risley's book Meaningful Differences in the Everyday Experience of Young American Children (1995) for a discussion of the linguistic input an average child is exposed to.

[4] Hart and Risley's data refer to children hearing spoken language around them. An educated adult who is an avid reader would have access to much more data per year than the data spoken to him by his peers. Since twenty-seven years is nine times three years, a twenty-seven-year-old would have heard at least nine times the linguistic data that a three-year-old has had available for learning his language. If we also count the data the adult receives from reading, the adult's exposure would be far more than nine times that of the three-year-old child.

[5] Like Pullum and Scholz, Sampson interprets his corpus data using Hart and Risley's figures. One difficulty with this is that Hart and Risley estimate that the sentences a child encounters are 4 words long, whereas the few examples Sampson published from his corpus research contain sentences which are 10 words long. So doubt could be cast on whether his corpus is representative of the child's PLD. Obviously much further research is needed to clarify this matter; however, like Pullum and Scholz, Sampson is to be applauded for beginning this research into the PLD instead of ignoring it, as Chomsky does.

[6] In this section I am following Sampson's (2002, 20) reconstruction of Crain and Nakayama.

[7] Quine nowhere discusses auxiliary inversion. However, his constant emphasis on the fact that language is a social art, in which people's utterances are beaten into shape through reinforcement from their peers, shows that he thinks all our linguistic rules are structured through positive and negative reinforcement.

[8] Later in the paper, I will discuss Reali and Christiansen's mathematical models, which demonstrate that it is possible for a child to learn structure-dependent rules with even less data than Pullum and Scholz discovered.

[9] The study used only adults who were the children’s parents.

[10] See Pinker (1994) on the African-American communities in which parents reportedly do not speak to their young children.

[11] Whether the older children of such a community do, in fact, engage with the younger children in a way that the younger children can use to learn the rules of their language is an empirical question. The point is that Ochs and Schieffelin's paper does not rule out this possibility.

[12] The innate rule is obviously structure dependence, not the auxiliary-inversion rule, which is of course not a rule of all languages.

[13] When I say that there are an estimated 7,000 languages known today, I am speaking of E-languages. Given the difficulties of individuating E-languages, which Chomsky has repeatedly discussed, it is unclear how to calculate how many different languages are known at a given time, how many have been known, or how many are possible. Internal to Chomsky's theories, the question should be rephrased as how many I-languages can be derived from UG given the permitted parametric variation. Answering such a question is impossible until we have a definitive, worked-out conception of UG.

[14] This paper is in Chomsky's book Language and Mind, pp. 21-56.

[15] In Chomsky’s Knowledge of Language he repeatedly makes APS claims whose structures are similar to those outlined by Pullum et al. In this book as well he offers no empirical evidence to support his claims. See Knowledge of  Language pgs:55, 62, 78, 90, 145-149

[16] While they do mention Pullum and Scholz's paper, they do not consider its implications for Chomsky's particular APS.

[17] Here I am following Clark and Lappin's description of PTR on pages 43-45 of their Linguistic Nativism and the Poverty of the Stimulus.