Monthly Archives: April 2015

Existential Psychoanalysis and Behavioural Approaches to The Mind

What is wrong with Reinforcement?

“What the results suggested was that the simple learning observed in Pavlovian conditioning paradigms arises from an information processing system sensitive to the contingencies between stimuli. If this implication is valid, then it changes our conception of the level of abstraction at which this basic learning process operates.” (Gallistel ‘Information Processing in Conditioning’ p. 1)

Gallistel (2002) discussed a 1968 experiment by Robert Rescorla which, Gallistel argued, contemporary theorists have not fully digested. According to Gallistel, what the result showed was that the simple associative learning of Pavlovian conditioning comes from an information processing system that is sensitive to contingencies between stimuli. Pavlov famously managed to associate an Unconditioned Stimulus[1] with a Conditioned Stimulus, and this has been one of the foundational pillars of behaviourism. The standard Pavlovian experiment is done in the following way. A pigeon sees a key light up and a while later it gets food. In the case of the rat, a tone comes on and soon after the rat is shocked. After a while the pigeon starts pecking when it sees its key light up and the rat starts defecating when it hears the tone. According to Gallistel this showed that both the rat and the pigeon were anticipating that the US was coming. This learning was considered by followers of Pavlov to be a paradigm of associative learning.

In order to prove that this pairing was the result of association forming, an experiment was done in which two stimuli were repeatedly presented together, and a control experiment was done in which the stimuli were widely separated in time. So, if the change appeared in the experimental condition but not in the control condition, the experimenter would argue that the pairing had produced an association.

However, Rescorla (1967) found that temporal pairing and contingency could be dissociated from each other; prior to this, people believed that they were the same thing. Rescorla also found that the controls used in previous experiments were not sufficient. Control conditions in which the CS and US are never paired do not eliminate CS-US contingency; rather, they replace one contingency with another. Rescorla pointed out that if we want to determine whether it is temporal pairing or contingency that leads to conditioning, we need a truly random control.

In this condition the occurrence of the CS does not determine in any way the time at which the US may occur, so the US must sometimes occur with the CS. Rescorla (1968) ran his experiment as follows. He tested for conditioned fear in rats. In the first experiment hungry rats were trained to press a lever to obtain food. Once the rats were trained to press the lever regularly there were five sessions during which the lever was blocked (ibid p.3). In each of these sessions twelve tones came on at more or less random times. The rats also received short, mildly painful shocks to their feet. Rescorla manipulated the distribution of the shocks relative to the tones. For one group the shocks were completely contingent on the tone: the shocks only occurred when the tone was on. In another group the rats got 12 shocks when the tone was on and they also got shocks at an equal frequency when the tones were not on. This protocol did not alter the number or frequency of tone-shock pairings, but it did destroy the contingency between tone and shock, and it did increase the number of shocks per session. To check the importance of this, Rescorla had a third group who were shocked at random 12 times without regard to the tone.

Before testing the extent to which the rats had learned to fear the tone, Rescorla eliminated their fear of the experimental chamber by eliminating tones and shocks for a couple of sessions. In the final sessions the rats’ conditioned fear of the tone was measured by seeing how the tone affected their willingness to continue pressing the lever. What he found was that although the rats in the first two conditions had the same tone-shock pairings, the rats in the contingent condition learned to fear the tone, whereas the rats in the truly random condition did not; neither did the rats in the third condition. So it is contingency, not temporal pairing, that drives simple Pavlovian conditioning.
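
The difference between pairing and contingency can be made concrete with a little arithmetic. What follows is a minimal sketch, not Rescorla's own analysis: the session length and tone duration are assumed values chosen only for illustration, and the contingency measure used (shock rate while the tone is on minus shock rate while it is off) is just one simple way of expressing how much the tone tells the rat about the timing of the shock.

```python
# Illustrative arithmetic only. SESSION_MIN and TONE_MIN are assumed values,
# not figures taken from the description above.
SESSION_MIN = 120.0   # assumed session length in minutes
TONES = 12            # tones per session (from the description above)
TONE_MIN = 2.0        # assumed duration of each tone in minutes

TONE_ON = TONES * TONE_MIN          # total minutes with the tone on
TONE_OFF = SESSION_MIN - TONE_ON    # total minutes with the tone off


def summarise(shocks_on, shocks_off):
    """Return (pairings, contingency) for one group.

    Contingency here is the shock rate during tones minus the shock rate
    outside tones (a hypothetical measure chosen for this sketch).
    """
    rate_on = shocks_on / TONE_ON
    rate_off = shocks_off / TONE_OFF
    return shocks_on, rate_on - rate_off


# Group 1: all 12 shocks fall inside tones.
contingent = summarise(12, 0)

# Group 2: 12 shocks inside tones, plus shocks at the same rate outside them.
same_rate_off = (12 / TONE_ON) * TONE_OFF
truly_random = summarise(12, same_rate_off)

# Group 3: 12 shocks spread at random over the session (expected counts).
expected_on = 12 * TONE_ON / SESSION_MIN
random_twelve = summarise(expected_on, 12 - expected_on)

for name, (pairings, contingency) in [
    ("contingent", contingent),
    ("truly random", truly_random),
    ("12 random shocks", random_twelve),
]:
    print(f"{name}: pairings={pairings:.1f}, contingency={contingency:.2f}")
```

On these assumed numbers the first two groups receive the same twelve tone-shock pairings, yet only the first group has a non-zero contingency, which is the pattern that the fear measure tracked.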

“In sum, the evidence that conditioning is driven by contingency unsettles us because it challenges what we have taken for granted about the level at which basic learning processes operate. Contingency, like number, arises at a more abstract level than the level of individual events. It can only be apprehended by a process that operates at the requisite level of abstraction–the level of information processing. A US is contingent on a CS to the extent that the CS provides information about the timing of the US.” (Ibid p.4)

Gallistel has done many other experiments which confirm Rescorla’s experimental findings. In various blog posts I have defended different types of behaviourism. I have noted that behaviourism is not a blank slate theory, that reinforcement does play a role in our language acquisition, and that Applied Behavioural Analysis (in particular the Picture Exchange Communication System) can be very useful in helping children with Autism acquire language. I have also argued that, despite the hype by Chomsky and Pinker, behavioural accounts of language have not been refuted and more research should be done into them. These views of mine have typically been met with rhetoric and sometimes even personal attacks. Typically people don’t respond with reasons; they respond with anger. Not all responses have been so irrational. Linguist David Pereplyotchik has argued that Gallistel’s paper above shows that any appeal to reinforcement alone will only give a partial account of language acquisition.

First, some terminology. Gallistel (2002, 2006, and 2012) presented evidence that classical conditioning relies on subjective computations of partial information. His evidence was largely directed against Pavlovian accounts of learning. Skinner, however, had a different conception of behaviourism; his conception focused on Operant Conditioning as opposed to Pavlov’s focus on Classical Conditioning. Classical conditioning involves placing a neutral stimulus before a reflex, whereas Operant conditioning involves placing either punishment, negative reinforcement, or positive reinforcement after a behaviour. Classical conditioning focuses on taking an involuntary response (instinct) and pairing it with a neutral stimulus to form an association; Operant conditioning instead focuses on voluntary behaviour and a consequence following that behaviour. In Operant conditioning people are rewarded with incentives, while in classical conditioning no such incentives are offered. In therapeutic settings operant conditioning is typically focused on more than classical conditioning. But it is not uncommon for both types of conditioning to be used by an applied behavioural analyst. In Tacting, for example, a child will come to associate a sound with a particular state of affairs in the world. This is a type of classical conditioning where the child’s instinctive imitative behaviour (imitating the sounds of his caregivers) becomes associated with particular states of affairs. When the child says ‘mama’ primarily in the presence of ‘mama’ he has become conditioned to associate a piece of instinctive imitation with a state of affairs, and this is classical conditioning. But it is also combined with operant conditioning, as the parents use various types of positive and negative reinforcement to help increase the likelihood that the child will mouth the sound in the right circumstances.

Now I have discussed at length in other blogs the extent to which reinforcement is necessary to explain things like language acquisition. I have noted that it is one amongst many of the tools necessary for a person to acquire language; it is far from the complete story. Nonetheless I think that it is important to note that since operant and classical conditioning are combined in many different studies, Rescorla’s findings need to be taken account of in any scientific theory which makes use of reinforcement.

The results indicate that classical conditioning takes place at a much more abstract level than theorists previously believed. It is fair to say that no behaviourists currently believe that all of our behaviour can be explained in terms of classical conditioning. In fact Rescorla (1988) explicitly argues that classical conditioning is just a small part of psychology and should not be viewed as the total story, though he thinks that, since it is relatively easy to gain experimental traction on it, it is ideal for integration with neuroscientific research.

Rescorla (1988) explicates some of the same experiments as Gallistel does, and details some other experiments which he thinks are relevant to showing people that they misunderstand the details of Pavlovian conditioning. Rescorla emphasises that Pavlovian conditioning is no longer understood on the simplistic reflex model, and that the experimental results suggest that Pavlovian conditioning is richer and more complex both in the relations it represents and in the way those representations influence behaviour. He even goes as far as claiming that modern versions of classical conditioning involve representations which are closer to the British Empiricist tradition in philosophy than to the tradition favoured by reflex theorists. On Rescorla’s view, conditioning involves the learning of relations, and contiguity between US and CS is neither necessary nor sufficient for this process; the important aspect is the information available to the organism. Rescorla details experiments like the one that Gallistel outlined above, and further experiments such as the ‘blocking effect’, first reported by Kamin in 1968. In the blocking effect two groups of animals receive a compound stimulus (a light and a tone together), but they differ in that for one group prior training with the light makes the tone redundant (ibid p.155). Rescorla notes that the results in the experiment are driven not by contiguity but by the informational relation between the stimuli. Such experiments have been repeated by Rescorla, Gallistel et al. over the last 50 or so years.

It is important to note that Rescorla explicates his view in terms of the subjective amount of information available to the organism, not just the objective features of the environment. Furthermore, Rescorla talks about the organism using this information to represent its environment. As I understand behaviourism, this kind of talk is not typical. This raises a question that is difficult to answer: does claiming that classical conditioning (and hence indirectly operant conditioning) makes use of subjective information mean that one is no longer a behaviourist? People like Fodor don’t think so:

“The heart of the matter is that association is supposed to be a contiguity-sensitive relation. Thus Hume held that ideas became associated as a function of the temporal contiguity of their tokenings. (Other determinants of associative strength were said to be ‘frequency’ and on some accounts ‘similarity’). Likewise, according to Skinnerian theory, responses become conditioned to stimuli as a function of their temporal contiguity to reinforcers. By contrast, Chomsky argued that the mind is sensitive to relations amongst mental or linguistic associations that may be arbitrarily far apart.” (Fodor ‘Language of Thought 2’ p. 103)

Now it could be argued that since the evidence from Gallistel and Rescorla shows that in classical conditioning contiguity is neither necessary nor sufficient for association, their evidence is closer to Chomsky’s views than Skinner’s. And any view that is closer to Chomsky’s than Skinner’s cannot be considered behaviourist. I am not so sure, though. The data used by Rescorla are behavioural data; that the data show Pavlov’s theory to be too simplistic should be viewed as an improvement on behavioural science, not a refutation of it. Debates like these are commonplace in all sciences. Thus some people in Darwinian Theory argue that recent evidence from laws of form, epigenetics, and evo-devo research shows that the neo-Darwinian synthesis has been refuted and that we need a superior theory to capture this new data. It is hard to know what importance should be attached to these debates. Whether we call it ‘Neo-Darwinism’ or something else is irrelevant as long as it deals with the new evidence. Was Darwin refuted by the discovery of genetics and its explanation of heredity in terms of genes, or was his theory merely extended in light of new evidence? Similar questions arise with Rescorla’s discoveries; I prefer to think of them as improvements on behavioural science rather than refutations, just as I think the neo-Darwinian synthesis was an improvement on traditional evolutionary theory rather than a refutation. But the issues are complex and I cannot go into them in any more detail here.

Interestingly, there are aspects of Rescorla’s talk that remind me of Fodor’s Representational Theory of the Mind, except that Rescorla thinks that connectionist models can capture the facts revealed by Pavlovian conditioning. He cites the work of Rumelhart and McClelland (1986) in this connection, noting that:

“Connectionistic theories of this sort bear an obvious resemblance to theories of Pavlovian conditioning. Both view the organism as adjusting its representation to bring it into line with the world, striving to remove any discrepancies. Indeed, it is striking that often such complex models are built on elements that are tied closely to Pavlovian associations. For instance, one of the learning principles most frequently adopted within these models, the so called delta rule, is virtually identical to one popular theory of Pavlovian conditioning, the Rescorla-Wagner Model” (ibid p.158)
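
The resemblance Rescorla points to can be seen by writing the Rescorla-Wagner update out. In its usual textbook form, the change in the associative strength of each stimulus present on a trial is a learning rate multiplied by the prediction error: the strength the US can support minus the summed strength of all stimuli present. This has the same error-correcting shape as the delta rule. What follows is a minimal sketch, with parameter values and trial counts that are assumptions chosen for illustration, applied to the light-and-tone blocking design described earlier.

```python
# Sketch of the Rescorla-Wagner update. Parameter values and trial counts
# are illustrative assumptions, not estimates from any experiment.
ALPHA_BETA = 0.2   # assumed combined learning rate (stimulus salience x US rate)
LAMBDA = 1.0       # assumed maximum associative strength the US supports


def train(strengths, present, trials, us_present=True):
    """Apply the update to every stimulus in `present` for `trials` trials."""
    target = LAMBDA if us_present else 0.0
    for _ in range(trials):
        # The prediction is the summed strength of all stimuli present.
        error = target - sum(strengths[s] for s in present)
        for s in present:
            strengths[s] += ALPHA_BETA * error
    return strengths


# Blocking group: the light alone is trained first, then light + tone together.
blocked = {"light": 0.0, "tone": 0.0}
train(blocked, ["light"], trials=50)
train(blocked, ["light", "tone"], trials=50)

# Control group: only the light + tone compound is trained.
control = {"light": 0.0, "tone": 0.0}
train(control, ["light", "tone"], trials=50)

print("tone strength, blocking group:", round(blocked["tone"], 3))
print("tone strength, control group:", round(control["tone"], 3))
```

In the blocking group the light already predicts the US by the time the tone is introduced, so the prediction error is close to zero and the tone acquires almost no strength, even though it is paired with the US exactly as often as in the control group.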

What is interesting here is that in the same year that the above paper was written, Fodor and Pylyshyn wrote their famous paper attacking the connectionist models of Rumelhart and McClelland. They argued that these connectionist models don’t work for modelling cognition and language because they cannot capture the compositionality of both our language and thought.

Fodor and Pylyshyn emphasise that connectionism is committed to representation and that we need to be careful to set out what our level of explanation is: is it the neural or the cognitive level? They note that since connectionists are dealing with representation they are giving explanations at the cognitive level. According to F and P, classical theories, unlike connectionist theories, are committed to the existence of (1) a language of thought. Classical theories also accept an ABC of assumptions: (A) There is a distinction between structurally atomic and structurally molecular representations. (B) Structurally molecular representations have syntactic constituents that are themselves either structurally molecular or structurally atomic. (C) The semantic content of a molecular representation is a function of the semantic content of its syntactic parts, together with its constituent structure. And classical theories are also committed to (2) the structure sensitivity of our thought processes.

The major difference that Fodor and Pylyshyn see between connectionist architecture and classical architecture is that classical architecture recognises constituent structure, and the semantics of a thought is determined by the semantics of its constituents, whereas connectionist models do not work in that way. Fodor and Pylyshyn argue that there are 4 reasons that researchers don’t always recognise that connectionist architecture cannot handle constituent structure: (1) Failure to understand what arrays of symbols do in classical architecture and what they do in connectionist architecture. (2) Confusion of the question of whether the nodes in connectionist architecture have constituent structure with whether the nodes are neurologically distributed. (3) Failure to distinguish between a representation having semantic and syntactic constituents and a concept being encoded in terms of micro features. (4) Wrongly assuming that since representations in connectionist networks have graph structure they have constituent structure.

“To summarize: Classical and Connectionist theories disagree about the nature of mental representation; for the former, but not for the latter, mental representations characteristically exhibit a combinatorial constituent structure and a combinatorial semantics. Classical and Connectionist theories also disagree about the nature of mental processes; for the former, but not for the latter, mental processes are characteristically sensitive to the combinatorial structure of the representations on which they operate” (ibid p. 21)

Part 3 of their argument centres on explicating why connectionist architectures cannot handle the productivity, systematicity, compositionality, and inferential coherence of thoughts.

Productivity of Thought: We can combine finite atoms of thought (concepts) in productive ways to produce a potentially infinite number of expressions. This capacity is beyond all connectionist architecture. (Some connectionist models get around this fact by denying that our thoughts are potentially infinite.) Note, however, that Elman has created recurrent connectionist models.
Systematicity of cognitive representation: Even if you deny that cognitive capacities are productive you cannot deny that they are systematic. You won’t find a person who can think the thought that ‘John loves the girl’ but who cannot think the thought ‘The girl loves John.’ The reason that the thoughts are connected is that there must be structural connections between the thought ‘John loves the girl’ and ‘The girl loves John’; the structural connection is that the two sentences are made of the same parts. Based on this they argue that mental representations have constituent structure, that we therefore have a language of thought, and that these mental representations cannot be captured by connectionist models.
Compositionality: They connect this with systematicity, and note that the way that sentences are systematic is not arbitrary from a semantic point of view. A lexical item must make approximately the same semantic contribution to each sentence in which it occurs (I am not sure this works for metaphor, idioms, etc.).
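
The link between systematicity and constituent structure can be illustrated with a toy example. This is only a sketch of the classical picture as F and P describe it, not a model of any actual connectionist or classical system; the lexicon, the meaning format, and the function name are all hypothetical. The second half caricatures the alternative they criticise, in which representations are stored as an unstructured list.

```python
# Toy illustration; the lexicon and the meaning format are hypothetical.
LEXICON = {
    "John": "john",
    "the girl": "girl",
    "loves": "loves",
}


def meaning(subject, verb, obj):
    """Build a sentence meaning from the meanings of its constituents."""
    return (LEXICON[verb], LEXICON[subject], LEXICON[obj])


# A system that can build one of these can build the other, because both
# are assembled from the same parts by the same rule.
print(meaning("John", "loves", "the girl"))
print(meaning("the girl", "loves", "John"))

# By contrast, a bare list of whole sentences carries no such guarantee:
# nothing about the first entry forces the second to be present.
SENTENCE_LIST = {"John loves the girl": ("loves", "john", "girl")}
print("The girl loves John" in SENTENCE_LIST)
```

Each word makes the same contribution wherever it occurs, which is the compositionality point, and the capacity to represent one sentence brings with it the capacity to represent the other, which is the systematicity point.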

“So, here’s the argument so far: you need to assume some degree of compositionality of English sentences to account for the fact that systematically related sentences are always semantically related; and to account for certain regular parallelisms between the syntactical structure of sentences and their entailments. So, beyond any serious doubt, the sentences of English must be compositional to some serious extent. But the principle of compositionality governs the semantic relations between words and the expressions of which they are constituents. So compositionality implies that (some) expressions have constituents. So compositionality argues for (specifically, presupposes) syntactic/semantic structure in sentences.” (ibid p. 30)

They argue that some connectionists actually try to deny compositionality to get out of this argument. Connectionists seem to take idioms such as ‘kick the bucket’ as their model for natural language.

Systematicity of Inference: The syntax of mental representations mediates between their semantic properties and their causal role in mental processes. Classical architecture can handle this fact but connectionist models cannot.

Summary of the Argument:

“What’s deeply wrong with connectionist architecture is this: Because it acknowledges neither syntactic nor semantic structure in mental representations, it perforce treats them not as a generated set but as a list. But lists, qua lists, have no structure; any collection of items is a possible list. And, correspondingly, on Connectionist principles, any collection of (causally connected) representational states is a possible mind. So, as far as Connectionist architecture is concerned, there is nothing to prevent minds that are arbitrarily unsystematic. But that result is preposterous. Cognitive capacities come in structurally related clusters; their systematicity is pervasive. All the evidence suggests that punctate minds can’t happen. This argument seemed conclusive against the Connectionism of Hebb, Osgood and Hull twenty or thirty years ago. So far as we can tell, nothing of any importance has happened to change the situation in the meantime.” (ibid p. 34)

David Chalmers, in his 1990 paper ‘Why Fodor and Pylyshyn Were Wrong: The Simplest Refutation’, argued that F and P underestimate the differences between localist and distributed representation. Chalmers notes that distributed representations can be used to support structure sensitive operations in a very different manner than the classical approach. Chalmers notes that the refutation of F and P can be stated in one sentence: If F and P’s argument is correct, as it is presented, then it implies that no connectionist network can support a compositional semantics; not even a connectionist implementation of a Turing Machine, or of a Language of Thought.

It is unclear whether Chalmers’ reply to F and P really works. Philosopher Simon Mc Waters has argued persuasively that the model that Chalmers uses to show that connectionist models can capture systematicity and compositionality is not sufficient to refute F and P. He argues that Chalmers’ model relies on a prior structuring of the symbols used in the model, so the connectionist model doesn’t really operate on the symbols in a direct and holistic manner as Chalmers claimed. Debates on this issue are ongoing, and despite the open-ended nature of the debate, connectionist models have flourished in the 27 years since F and P wrote their critique. I am not sure if a model has been developed that can deal with all of F and P’s criticisms, but I plan to return to the issue in my next blog.

It is unclear whether Fodor and Pylyshyn (1988) would affect Rescorla’s views on Pavlovian conditioning, because he thinks that Pavlovian conditioning plays only a small role in cognition, and he doesn’t tell us what his views on things like operant conditioning are, or what he thinks is the nature of the computational procedures which make learning of all kinds possible. Furthermore, as Paul Churchland has argued in ‘Plato’s Camera’ (2012), Rumelhart and McClelland (1986) is ancient history and more modern connectionist models can deal with the difficulties posed by Fodor and Pylyshyn (I don’t know whether Churchland is correct on this point; I am currently researching the issue). I am not sure where Rescorla would stand on these matters; presumably he thinks it is his job to do the experiments and other theorists’ job to construct mathematical and artificial models which can accommodate the experimental data as they come in. Personally, my hunch is that Bayesian models will be more successful than connectionist ones in helping us accommodate the findings of Rescorla et al. Overall, though, I don’t think that Rescorla’s studies refute reinforcement theory; they merely show that the story is richer than previously believed.

[1] Henceforth the Unconditioned Stimulus is referred to as US and the Conditioned Stimulus as CS.

1984: A POSTMODERN HORROR STORY.

Quine and Chomsky: Dispositions or Rules?

My distinction between fitting and guiding is, you see, the obvious and flat-footed one. Fitting is a matter of true description; guiding is a matter of cause and effect. Behaviour fits a rule whenever it conforms to it; whenever the rule truly describes the behaviour. But the behaviour is not guided by the rule unless the behaver knows the rule and can state it. This behaver observes the rule (Quine: Methodological Reflections on Current Linguistic Theory).

Quine begins his critique of Chomskian linguistics by distinguishing between two different types of rules: Fitting and Guiding. He claims that Chomsky uses a third intermediate type of rule; this is a type of rule which Quine claims is an implicitly guiding rule. Quine claims that Chomsky thinks that ordinary speakers of English are guided by rules even though these rules cannot be stated by the English speaker. According to this Chomskian picture, we can have two extensionally equivalent grammars, each of which fits the behaviour of the child, neither of which explicitly guides the child, and only one of which is true of the child. Quine claims that if this intermediate way of following a rule is to be made sense of, we need some type of evidence which will help us decide which grammar the child is implicitly following. He claims that a person’s disposition to behave in determinate ways, in determinate circumstances, is the way to make sense of which grammar the person is following. However, obviously these dispositions must go beyond well-formedness if we are to use them to explain the distinction, because extensionally equivalent strings are indistinguishable from each other in terms of behaviour. He speculates that such dispositions which may be relevant are those such as the disposition to make certain transformations and not others; or certain inferences and not others. Quine further notes that he has no problem with dispositions; and points out that a body has a disposition to obey the law of falling bodies, while a child has a disposition to obey any and all extensionally equivalent grammars. However, he can make no sense of Chomsky’s intermediate notion of rule following. He alludes to the ironic fact that while Chomsky seems to have no difficulty with the obscure notion of implicit guidance by rules, he has serious difficulties with the humdrum notion of disposition.

Chomsky replied to this criticism by first questioning the analogy of a child following the rules of grammar with a body obeying the law of falling bodies. He argued that this is a singularly misleading analogy because the rules of English grammar do not determine what a speaker will do in a given context in the same way that the law of falling bodies determines that, if a person jumps from a building, he will hit the ground at a specified time. According to Chomsky, what the rules of English grammar tell us is that English speakers will understand and analyse certain sentences in certain ways, and not in other ways. In other words, the linguist in Chomsky’s sense is trying to discover regularities in a person’s linguistic competence; the linguist is not after a theory of performance. This reply of Chomsky’s has two different strands to it: first, his distinction between competence and performance; second, his claim that we can have no scientific theory of performance. His argument that by using idealisations, such as abstracting from memory and performance, we can gain some understanding of subjects like phonology etc. has something to recommend it. Idealisation plays a vital role in any science, and the appropriateness of an idealisation is to be judged by the success of the science which uses it. Given that generative linguistics has had some success, we can tentatively accept his idealisation. However, Chomsky’s confident assertion that we cannot have a science of human performance/behaviour has little to recommend it. Quine’s claim that people are obeying extensionally equivalent rules if their performance conforms to those rules is a perfectly legitimate claim. And Chomsky has offered us no reason to think otherwise.

Chomsky adds that even if we leave aside the purported disanalogy between the law of falling bodies and linguistic competence, Quine’s formulation would still fail because he is guilty of treating physics and linguistics inconsistently. What Quine should have said if he wanted to remain consistent was:

English speakers obey any and all of the extensionally equivalent English grammars in just the sense in which bodies obey the law of falling bodies or the laws of some other system of physics that is extensionally equivalent (Chomsky: Reflections on Language p. 188)

Chomsky claims that when put in these terms, we see that linguistics is no more in need of a methodological cure than physics is. Chomsky’s argument here is merely repeating the criticism which he made against Quine’s Indeterminacy of translation argument. He is claiming that Quine is guilty of treating underdetermination as fatal in linguistics but as harmless in physics.

Quine’s criticism was that Chomsky was using a third type of rule which was obscure because it went beyond fitting the behaviour of the person, but was not consciously guiding the behaviour of the person. Chomsky notes that people are rarely, if ever, guided by rules in the sense of being able to state the rule they are following. Furthermore, he argues that linguists can go well beyond rules that merely fit the behaviour of people. He says that if we accept the same realistic attitude towards linguistics that we do with physics, then we can say that people are obeying rules which are really encoded in their brains, but of which they are usually not consciously aware. So we can use various different pieces of evidence to decide between extensionally equivalent grammars.

To make his criticism of Chomsky’s conception of rules explicit, Quine discussed a toy example of two extensionally equivalent grammars which a person could obey. He asks us to imagine a string (abc) which two extensionally equivalent grammars divide up differently: as (ab) (c) versus (a) (bc). He notes that from a behavioural point of view, we could say that a person is following either of the grammars, and that if we ask the native which is the correct grammar he will not be able to tell us. In this situation, he claims that Chomsky’s view that the person can be said to be implicitly guided by rule (ab) (c) as opposed to (a) (bc) is senseless.
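
Quine's toy example can be written down directly. What follows is a minimal sketch in which the two grammars are represented simply by the bracketings they assign to the string; the nested-tuple encoding and the function name are assumptions made for illustration and come from neither Quine nor Chomsky.

```python
# Two toy grammars for the single string "abc". The tuple encoding of the
# bracketings is an assumption made for illustration.
GRAMMAR_1_TREE = (("a", "b"), "c")   # bracketing (ab)(c)
GRAMMAR_2_TREE = ("a", ("b", "c"))   # bracketing (a)(bc)


def terminal_string(tree):
    """Flatten a bracketed tree into the string a speaker would produce."""
    if isinstance(tree, str):
        return tree
    return "".join(terminal_string(part) for part in tree)


# Same observable output, so the grammars are extensionally equivalent...
print(terminal_string(GRAMMAR_1_TREE) == terminal_string(GRAMMAR_2_TREE))
# ...but the constituent structures they assign are different.
print(GRAMMAR_1_TREE == GRAMMAR_2_TREE)
```

Anything that only inspects the output string cannot tell the two grammars apart, which is Quine's point; the dispute with Chomsky is over whether any further evidence could bear on the difference in bracketing.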

Chomsky argues that there is no mystery in deciding whether the child is implicitly guided by rule system (a) as opposed to (b). In this situation, he argues, all we have is a problem which can be solved by the ordinary methods of the natural sciences. Quine had argued that a natural solution to deciding whether (ab) (c) or (a) (bc) was the rule which implicitly guided the people would be to ask them which rule they followed. He notes, of course, that the people will not be able to tell us. So he argues that since the natives cannot decide which rule they are implicitly following, and both rules are compatible with the dispositions of the ordinary language user, then the notion of one of the rules implicitly guiding the child is senseless. Chomsky agrees that asking the native will not get us far in deciding between rule (a) and rule (b), and he suggests a different way of dealing with the problem. He cites various different types of evidence which could theoretically bear on the problem. He discusses how the linguist could have evidence which suggests that intonation patterns are determined by structure. This evidence could be derived by studying our own language as well as other languages. He argues that such evidence might bear on the choice between the two proposed grammars. So we might discover that the rules needed for other cases give the correct intonation if we take the constituents to be those of (a) instead of (b). Whether this type of evidence really occurs or not is irrelevant. Chomsky’s point is that Quine is incorrect to assume that no evidence on the topic is attainable beyond evidence gained by questioning the natives.

Chomsky cites a quote from Quine’s paper ‘‘Linguistics and Philosophy’’ where Quine claims that there is much innate apparatus which will need to be discovered to tell us how the child gets over the great hump that lies beyond ostension. Quine further notes that if Chomsky’s anti-empiricism says nothing more than that conditioning is not sufficient to explain language learning, then it is of a piece with his indeterminacy of translation argument. Chomsky notes that if Quine really believes that there is further innate apparatus waiting to be discovered by science, then this casts doubt on his claims in both his ‘‘Methodological Reflections’’ paper and in Word and Object. Chomsky asks us to consider the sentence: ‘ABC’. Quine had claimed that we have no evidence which can help us determine whether a subject is implicitly following the rule (ab) (c) or (a) (bc). However, according to Chomsky, given that Quine has no problem with innate mechanisms of any sort, then surely it is possible we will discover an innate mechanism in our species which determines that we follow (ab) (c) as opposed to (a) (bc) or vice versa. So in this respect, he claims, Quine (1969) holds doctrines which run counter to the doctrines he accepts in (1960) and (1970).

Quine would deny this charge of Chomsky’s. What Quine has said is that he has no problem with innate apparatus of any sort as long as such apparatus can be made sense of behaviourally. When he is discussing the notion of rule following, he is claiming that if two grammars both equally fit the behaviour of the subjects, and each subject is not consciously guided by these rules, then to claim that the person is implicitly guided by the rule is senseless. The postulation of innate apparatus which ensures that the child follows the rule (ab) (c) as opposed to rule (a) (bc) is pointless unless we have some behavioural evidence to justify such a postulation. The type of evidence which Chomsky claims could be useful to the linguist in deciding whether the subject is following rule A or B, i.e. perceived structure in intonation, is behavioural evidence so it does not bear on the type of considerations which Quine is concerned with.

A Chomskian could argue that if Quine would accept evidence such as structure intonation as a way of distinguishing between extensionally equivalent grammars, then he would be committed to accepting the Chomskian intermediate notion of a rule which implicitly guides the subject. Quine would of course rightly deny this. He would claim that, if we assume for the sake of argument that Chomsky’s made-up example of structure in intonation pattern is correct, then it would follow that the two extensionally equivalent grammars do not fit the totality of linguistic behaviour of the relevant subjects. He could argue that his flat-footed distinction between the two different types of rules can accommodate the type of example which Chomsky put forth. Quine could further claim that Chomsky has not fully explained his notion of the third intermediate notion of a rule.

This debate about the notion of a rule is a real point of contention between both thinkers. Chomsky has tried to explicate his intermediate notion of a rule as something in the brain which unconsciously guides the language user. This conception of a rule does not fit in neatly with Quine’s two conceptions of a rule; the question is whether Chomsky’s third notion captures something real which is missed by the two Quinean notions.

For simplicity’s sake, let us consider a particular rule of English, the subject-auxiliary rule which states that when forming a question, one must move the main auxiliary to the front of the sentence. According to Chomsky, the subject auxiliary rule is a rule of English (and Spanish) though not of all languages; so it is a rule which requires some experience in order for acquisition of it to take place. Children are only capable of learning this rule because they are genetically programmed to follow certain linguistic universal principles such as the following one: All languages are structure dependent.

On the Chomskian conception, the child is guided by certain rules which he does not know and cannot state (unless he has knowledge of linguistic theory). The question we now need to ask is why Quine would have difficulties with these rules, and can such rules be accommodated within his flat-footed conception of rules? Quine’s emphasis on reinforcement and induction means that he would more than likely feel that the postulation of such innate rules is unnecessary. Nonetheless let us assume that Chomsky is correct that poverty of stimulus considerations dictate that rules such as structure dependence are known innately and that parametric variations of these universal rules result in particular languages being spoken. The question which needs to be asked at this point is whether we can make sense of the notion of ‘rule’ which Chomsky is postulating here?

When Quine wrote his ‘‘MRCLT’’ in the early 1970s Chomsky was operating with a rule-based conception of language. From about 1980 onwards, with the inception of his principles and parameters approach, Chomsky’s conception of language changed. Though the change from a rule-based approach to a principles and parameters approach was a significant empirical advance, it does not have any effect on the criticisms which Quine brings to bear in his ‘‘MRCLT’’. So let us consider a question such as the following one: ‘Is the man who is happy over thirty?’ which is derived from the statement ‘The man who is happy is over thirty’. When forming this question, Chomsky claims the child begins with the statement and unconsciously applies the rule: move the main auxiliary to the front of the sentence. According to Chomsky, none of this is done consciously; rather, what happens is that a mechanism in the brain which is genetically programmed will interpret the data of experience and construct a model. The mechanism will determine what the rules of the language are, using the universal principles of the language and having the parameters set through experience. When these parameters are set, we can say that the child is unconsciously following various different rules of English, such as the subject/auxiliary inversion.
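
The contrast at stake can be sketched in code. The following Python is a toy illustration, not anything Chomsky proposes: `linear_rule` and `structural_rule` are hypothetical names, and the division of the sentence into subject phrase, auxiliary and predicate is supplied by hand rather than computed by a parser. It contrasts a structure-independent rule (move the first ‘is’ in the word string) with the structure-dependent rule described above (move the main-clause auxiliary).

```python
# Toy illustration (hypothetical names): question formation by a linear rule
# versus a structure-dependent rule.

def linear_rule(words):
    """Structure-independent: front the first 'is' found in the word string."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]


def structural_rule(subject, aux, predicate):
    """Structure-dependent: front the main-clause auxiliary and leave the
    subject phrase (including any relative clause inside it) intact."""
    return [aux] + subject + predicate


# Hand-supplied analysis of 'The man who is happy is over thirty'.
subject = ["the", "man", "who", "is", "happy"]   # subject phrase with relative clause
aux = "is"                                        # main-clause auxiliary
predicate = ["over", "thirty"]

statement = subject + [aux] + predicate

linear_question = " ".join(linear_rule(statement))
structural_question = " ".join(structural_rule(subject, aux, predicate))

print(linear_question)      # is the man who happy is over thirty
print(structural_question)  # is the man who is happy over thirty
```

Only the second function yields the question quoted above; the first moves the auxiliary out of the relative clause and produces a string no English speaker would accept.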

Suppose we assume (falsely) that Quine would accept Chomsky’s poverty of stimulus argument and agree that rules such as structure dependence are innate and that these rules determine that ordinary humans will derive rules such as auxiliary inversion when placed in certain linguistic environments. The question we now need to answer is how would Quine characterise these facts? The key point here is that Quine would focus on performance facts; he would ask whether people are disposed to form questions in ways consistent with these rules. Answering these questions would involve studying various different corpuses and seeing how people actually form questions when talking with others, or when writing various different texts. It would also involve constructing various different controlled experiments to see whether the manner in which questions are formed varies in the circumstances of speaking/writing. What Quine would object to is the reliance of Chomsky et al. on the intuitions of others as to what is or is not an acceptable way of forming questions. Quine would claim that the salient aspect of our studies should be the behaviour of the subject, not the intuitions of acceptability or unacceptability by the subjects of various constructions.

However, the next question to ask is what Quine’s response would be if the behaviour of the subjects and the acceptability tests lined up perfectly. Quine would then have the following facts to account for[1]: poverty of stimulus considerations, subjects’ intuitions of the acceptability or unacceptability of certain constructions, and subjects’ behaviour in certain determinate circumstances. In this circumstance, Quine would claim that the person’s behaviour fits with any and all extensionally equivalent grammars which capture the behaviour of the subject. He would say that we are justified in claiming that the person’s behaviour conforms to a particular rule system (and other extensionally equivalent systems). He would have to object to any postulation of innate apparatus that favours one of these systems, because there is no justification for postulating one rule system over another. He would accept any innate apparatus if it was determined by behavioural facts; however, no behavioural facts will help us decide between attributing rule system (A) and rule system (B), if they are extensionally equivalent. So on this Quinean picture, the rules which we claim the behaviour of the child fits do so in the same way the behaviour of physical objects fits certain rules of physics. Rules discovered in this behaviourist manner would easily conform to Quine’s flat-footed conception of rules fitting the behaviour of the subject. There would be no need for the Quinean to postulate an intermediate type of rule which implicitly guides the subject.

There is, however, a difficulty with this Quinean conception of the behaviour of the child fitting certain rules. The difficulty stems from normative issues. The Chomskian picture conforms to our pre-theoretical intuitions about language in one clear sense. It seems obvious that when I construct a new sentence, the sentence will be grammatical or ungrammatical according to the rules of the language. So, for example, most people would believe that if I construct a new sentence there will be a fact of the matter as to whether the sentence is grammatical or not. However, if we are deriving our rules of language by studying how people actually perform, then at best, all we can speak of is the probability of an utterance with a certain syntactic structure occurring or not. We will have no warrant to claim that the sentence is grammatical or ungrammatical. From a behavioural point of view, we can say that the sentence is atypical but not that it is incorrect. This seems like a serious difficulty with Quine’s conception of the nature of rules. He could reply to this concern by claiming that we could test whether the sentence was grammatical by the lights of the linguistic community by asking its members. However, this reply does not solve our problem; rather, it reintroduces the problem in a different guise. If we ask people whether this or that construction is acceptable, we are testing their intuitions in the same manner that Chomsky recommends. If people’s intuitions all agree that a certain construction is ungrammatical, but performance data indicates that the construction is used and accepted in ordinary speech, then we have arrived at an impasse. Furthermore, if people’s intuitions of acceptability disagree with their actual performance, then Quine would argue that we would have to give preference to performance data over people’s intuitions of acceptability. The problem with this is that if people’s intuitions cannot be used to help us decide between a correct and an incorrect linguistic utterance, then we seem to have no way of doing so. All we have are linguistic regularities, and constructions which are irregular from that point of view.

To clarify the above difficulty let us consider a computer. From a performance point of view, the computer exhibits certain regularities. So, for example, if I press the caps lock key when typing in Microsoft Word it results in the words I type afterwards being capitalised. When I press the caps lock key again, the computer no longer types in capital letters. Now imagine if one day I press the caps lock key and the following symbol appears on my screen *. Imagine I continued pressing the key for a while and the symbol * kept appearing. You can be quite sure that if this happened, nobody would think to themselves ‘Strange, the probability of my letters being capitalised when I press the caps lock button has just been reduced’. It is a safe bet that anybody who noticed that the caps lock key being pressed resulted in * being typed would assume that the computer was broken. This assumption would be based on the fact that we know that the computer was designed for a particular purpose which it is no longer achieving efficiently. So we would assume that some part of the computer was broken and set about getting it repaired.

Now in natural language such an approach is possible as well. Take, for example, people with severe schizophrenia, or with some form of aphasia. People with schizophrenia sometimes speak with what is known as word-salad. Such word-salad sentences sometimes exhibit syntactic, semantic and pragmatic deviance. People with some forms of aphasia are sometimes incapable of forming sentences into syntactic units. Analysing sentences from schizophrenics and aphasics is no trivial matter, and understanding the way such sentences go wrong is a flourishing field of psycholinguistics and neurolinguistics. Any theory which claimed that the speech of aphasics and schizophrenics was just statistically unlikely rather than incorrect seems to be seriously deficient. Furthermore, as opposed to the case of the computer, we cannot say in this case that such people are in error because they are behaving contrary to their designers’ intentions. Who designed English that we would be sinning against if we speak ungrammatically? Chomsky would answer that people speaking in such a deviant manner are breaking the implicit rules of universal grammar and this explains our judgements that such people are speaking ungrammatically[2]. Quine, who rejects appeal to implicit rules which govern what sentences we accept as grammatical, will need to tell a different story of how such sentences are viewed as deviant.

In the rough and ready world of ordinary discourse, Quine recognises that intentional idioms are indispensable. It is only when we are trying to limn the true and ultimate scheme of reality that such idioms have no place. It is also in the rough and ready world that people’s behaviour (including their verbal behaviour) is viewed as deviant. It is from this rough and ready vantage point that people are judged to be suffering from schizophrenia and aphasia[3]. Quine would have no problem with using this pragmatic idiom in daily discourse while using the more precise discourse of neurology and behavioural science when trying to limn the ultimate nature of reality. So in this sense the schizophrenic and aphasic objection does not in any way affect Quine’s argument.

To the objection that Quine’s flat-footed conception of rule following cannot handle normative notions like correct and incorrect grammar the same reply as above will suffice. First, Quine can point out that in ordinary discourse, when applying the dramatic idiom of intentionality, we can judge that certain statements are unclear, badly structured, pragmatically deviant, etc. However when we are limning the true and ultimate structure of reality, we can do no better than say that people are following whatever extensionally equivalent grammars can be constructed to systematise their utterances. To people such as Chomsky who claim that people follow one true grammar as opposed to any and all extensionally equivalent grammars, Quine would reply that we have in-principle no behavioural evidence which can decide between them. And without such evidence, all we are justified in positing is Quine’s flat-footed conception of rule following.

As we have already seen, Chomsky would greatly object to this characterisation because he does not think that syntactic rules can be studied in terms of performance. The important thing to note is that at this point we do not know whether a theory of syntax, semantics, etc. is possible in terms of performance. We do not know for certain whether people’s grammatical intuitions match up with their actual linguistic behaviour, though the empirical data to date suggest that they do not. A further difficulty stems from the fact that Chomsky does not typically offer statistical analysis of people’s grammatical intuitions, and when studies have been done[4] the results have revealed much greater variety in people’s intuitions than Chomsky would admit. So Chomsky’s confident assertions aside, we do not yet know whether it is possible to construct a theory of performance, and discovering whether this is possible or not will require empirical research, not rhetoric. This bears on the debate between Chomsky and Quine on the nature of rules. If a science of behaviour is tractable then Quine’s conception of the behaviour of various subjects fitting syntactic rules will be the most accurate way of conceiving the facts. However, if it is shown that Chomsky is correct that a science of human behaviour is impossible, then Chomsky’s conception of a third intermediate type of rule will be the correct picture.

[1] In reality, Quine does not have this collection of facts to account for, because Chomsky’s poverty of stimulus argument is dubious, and performance data does not match up closely with competence data.

[2] Obviously, there is more to schizophrenic word-salad than ungrammatical sentences; however, I am just focusing on ungrammaticality because it is directly relevant to the debate between Chomsky and Quine on this particular point.

[3] The rough and ready vantage point includes the judgements of psychiatric workers, who help themselves to intentional idioms and physical idioms as is useful for their purposes.

[4] For some research on the statistical analysis of people’s grammatical intuitions see Lappin and Clark, ‘Linguistic Nativism and the Poverty of the Stimulus’.

Stephen King: Possible Worlds and the Idealised I

“Perhaps no-one at the end of his life, if he gives the matter sober consideration, and is, at the same time frank, ever wishes to live it over again, he more readily chooses non-existence.” (Schopenhauer: ‘World as Will and Idea’ p. 204)

“This is the best of all possible worlds”. (Leibniz ‘Essays on the Goodness of God, the Freedom of Man and the Origin of Evil’)

Philosophers from Plato to the present day have been constructing grand metaphysical narratives about the nature of reality and man’s place in this grand metaphysical scheme. As well as using rational arguments and empirical evidence to support their positions, philosophers have often used thought experiments to support their theories. Famous thought experiments like ‘The Allegory of The Cave’, ‘Twin Earth’, ‘The Zombie Hunch’ and the ‘Chinese Room’ are part of every undergraduate’s lexicon. In many ways good fiction can serve as an interesting thought experiment for philosophical consumption; thus Dostoevsky’s ‘Crime and Punishment’ can be read as a thought experiment on the nature of morality in a Godless universe. Tolstoy’s ‘War and Peace’ can be read as a large thought experiment on the nature of free will[1]. Voltaire’s ‘Candide’ is a satire of Leibniz’s claim that we live in the best of all possible worlds. ‘Candide’ is a brilliant thought experiment with which we can explore whether we agree with Leibniz’s claim. It should be mentioned, though, that Voltaire’s novel, like all thought experiments, should not be read at face value; we should read it and try to change certain parameters in the book to see if this affects the conclusion Voltaire tries to force on us. Likewise we should read Leibniz’s actual philosophical arguments closely alongside ‘Candide’ and explore the degree to which it is an unfair caricature[2].

It is important to note that a novel doesn’t have to be “high brow” to be useful as a thought experiment for philosophers. Nor does the novelist need to be interested in philosophy or in attacking a particular philosophical system for the novel to be interesting as a thought experiment. Thus, for example, George Orwell didn’t like philosophy, nor did he read much of it. Nonetheless his ‘1984’, in particular the scene where O’Brien and Winston debate with each other about truth, has massive philosophical importance[3].

All novels have the potential to provide philosophical insights, or at least to give us some interesting premises which we can explore. Novels provide us with possible worlds other than our own actual one, and hence can be used to help us think through various themes and issues. So, for example, Stephen King’s science fiction novel ‘11.22.63’ raises some interesting philosophical issues about the nature of the self. The central protagonist of the novel is Jake Ebbing, an English teacher. Ebbing is told of a portal that leads into 1958 by Al, the owner of a diner. Al shows Jake the portal, which is in his diner, and lets Jake go through it. After Jake comes back through the portal Al tells Jake of his plan to stop the murder of Kennedy in 1963. After Al commits suicide Jake decides to fulfil Al’s mission and stop the assassination of Kennedy.

However interesting and central the plot to stop the Kennedy assassination is, I think that the relationship between Ebbing and another character, Harry Dunning, is even more fascinating. Early in the book Ebbing recounts how, when teaching a General Educational Class, he was given an essay by Harry Dunning, a slightly brain damaged and crippled man. Dunning’s story, ‘A day that changed my life’, was about how his alcoholic father murdered his mother and siblings with a hammer and left him brain damaged and crippled. This story had a deep effect on Ebbing, even reducing him to tears.

Dunning’s story had such an influence on Ebbing that he decides to try to help him. The time portal that Al shows to Ebbing can be used to change things that happened in the past, and hence can create a different future. However, if after changing the past you step through the portal again, you will reset history and your previous changes will be erased. When he returns to the past Ebbing tries to change the tragic events which led to Dunning’s family being killed and Dunning being brain damaged and crippled.

The choice of which person to save and why would be mind boggling. People die every day in various preventable ways, whether through accident, or murder etc. When Ebbing decided to change Dunning’s past he would have been moved by the tragedy of Dunning’s story, by the lost possibilities. Dunning’s siblings were wiped out of existence, any potential they had to achieve or experience something was wiped out. By changing past events Ebbing was creating new futures, new potentialities, and new destinies.

Throughout the novel Ebbing managed to save Dunning from his father a number of times. But reality being what it is there are no happy endings[4] to be had. Dunning is saved from his father only to be killed in the Vietnam War in one reality. In the other reality Dunning is saved from his father only to end up living in a post apocalyptic world. Dunning it seems can’t catch a break. In the novel things are such that we don’t get to try out endless possibilities to see if we can find a perfect ending for Dunning. The question raised by Dunning’s alternative realities is: is there a possible world in which Dunning gets his happy ending?

Arthur Schopenhauer in his ‘World as Will and Idea’ argued that the nature of existence is such that all creatures are destined to live a miserable existence. The world is full of living things that need to feed off each other. Every creature by virtue of existing is taking up space and resources from another creature. We can try as hard as we like to decrease suffering, but our very survival depends on killing and eating other living creatures. No matter how hard we strive to achieve happiness our bodies eventually decay, break down and die, and the same thing happens to everyone we will ever know and love. Schopenhauer notes that all our fairy stories have a similar structure. Heroes are challenged by Evil Queens, Wicked Witches, Monstrous Kings etc.; the heroes face challenge after challenge and eventually overcome them. Schopenhauer notes that these fairy stories all end with the claim ‘They all lived happily ever after’. None of the details of these happy endings are ever provided, and the reason, according to Schopenhauer[5], is that even children would recognise the absurdity if they had to read a description of a perfect ending. Such things do not exist.

Now I am too much of a pragmatist to buy into any grand metaphysical scheme, whether optimistic like Leibniz’s, or pessimistic like Schopenhauer’s. I think the best we can do is to try and cope with the flux of experience as best as we can. However, I do think that Schopenhauer has a point to some degree. Any course of action we take will have consequences for other living creatures, either pro or con. This is why in the clichéd science fiction dialogue one protagonist reminds another that any changes they make can have dire consequences. I want to briefly discuss a possible dire consequence of Ebbing changing Dunning’s past: in a sense Ebbing is killing Dunning.

Journalist Brendon O Connor has a daughter, Mary, who has Down syndrome; in 2013 he wrote an interesting article, ‘Would ‘fixing’ our child with Downs’s mean we’d be given back a stranger.’ In it he noted that we do things all of the time to improve the lives of those we love: we give them glasses to improve their eyesight, and send them to behavioural specialists to improve their ability to learn. But he worried that a cure for Downs, if it were possible, would in effect change his child into a different person. Now in a sense Brendon’s sensitive and well thought out views are to some degree misplaced. Any cure would not be a miracle overnight cure. The child would still have suffered developmental delay and would slowly have to learn to think differently once the Downs is cured. The slow process of the child learning once the Downs was cured would mean that Brendon would not feel he was losing his daughter. He would rather just think she had begun to look and think differently.

The case of Dunning would be different. He would have his entire life erased. Without the brain damage and the traumatic experience of seeing his family killed, Dunning would have become an entirely different person. The brain is an incredibly complex organ; the brain injury Dunning suffered, and the incredibly traumatic childhood experiences he had, would have had a huge effect on who he became. By changing that, Ebbing was in effect killing Dunning. But most people would argue: so what? Dunning was damaged goods. Killing this damaged person is a good thing if it makes room for a better person[6].

Ebbing’s belief that he is saving Dunning is understandable. He is going to make Dunning physically and cognitively superior and prevent him from experiencing a terrible trauma. However, there is reason to doubt that changing Dunning’s past is for the good. Any life anybody has is going to have a lot of trauma. No matter what way the chips fall we will all suffer. But surely the amount of suffering we experience differs depending on our environment. On average a person living through a severe famine in a war-torn area will have less happiness than a person living in an affluent, peaceful area. Similar things could be said of Dunning: surely he would have been happier if his family had not been slaughtered in front of him by his father. We can take this as axiomatic. Right? Well, actually we simply don’t have enough data about how this one event affected the world that Dunning lived in to make any claims about how it would have affected his overall happiness. Maybe one of his sisters would have turned out to have been worse than his father. The truth is that no matter what our intuitions tell us, we cannot be sure how this would have affected his overall happiness.

The same is true of his physical injury. Can we say for sure that without it he would have been happier? Who knows? In the book, without the injury he ends up going to war and dying. When it comes to his cognitive abilities it is again hard to be sure. There is little empirical evidence I know of to say that people with higher IQs are happier than those with lower IQs[7]. Is wiping brain damaged Dunning out of existence without any clear empirical evidence on the issue justified? Ebbing doesn’t really feel the need to look for the evidence; he just assumes that brain damaged Dunning’s life was terrible and needed to be changed.

To return to Brendon O Connor; in an interview on The Late Late Show he noted how he went through a process of mourning when his daughter was diagnosed with Downs. He had an implicit idealised image of who his daughter would become. It took a long process of mourning to cope with the fact that his daughter would not go through the typical developmental milestones that other children do. Apparently this is a common phenomenon for parents. But once they get to know their child they learn to love the actual person as opposed to some illusory idealised fiction.

Lacan talks about children going through a mirror phase in which they recognise themselves in a mirror and note the stable nature of the image, which they contrast with their own experience of their bodies, over which they have limited control. He speculates that children who go through this phase begin identifying with this image and that this continues through to adulthood. As developmental psychology there is little reason to take Lacan seriously. But he does have a point about the idealised I. We all seem to have a kind of fictional perfect world of which we and those around us are imperfect exemplars. Ebbing used this implicit belief in an idealised I as justification for fixing Dunning. But he had little justification to do so. He had no real idea of who Dunning was. His beliefs were unverified assumptions. I think we should take King’s novel as a cautionary tale against assuming that some life is an imperfect exemplar of some perfect life that person was denied. There are no perfect lives and no happy endings, just people trying to cope with the flux of experience as best as they can[8].

[1] Obviously the brilliance of the above mentioned books goes beyond any use they have as philosophical thought experiments.

[2] ‘Candide’, because of its narrow focus on refuting Leibniz, is actually a pretty poor novel with one-dimensional characters and silly plot twists. But it works as a caricature of Leibniz. To this degree it is not really good art. Dickens’ ‘Hard Times’ manages both to attack a philosophical view, Utilitarianism, and to tell a brilliant human story; not an easy thing to manage.

[3] For discussion of some of these issues see Pinker’s ‘The Language Instinct’ and ‘The Blank Slate’ as well as Rorty’s ‘Contingency, Irony and Solidarity’.

[4] Shout out to the excellent children’s programme ‘Once Upon a Time’ which explores the issue of happy endings in an easy going manner.

[5] David Berman discusses Schopenhauer and Fairy Tales in the introduction to the Everyman edition of ‘World as Will and Idea’.

[6] It could be claimed that saving Dunning isn’t the only motivation for Ebbing; he is also saving Dunning’s siblings. But this is not Ebbing’s primary motivation. Furthermore, Ebbing has no idea whether saving Dunning’s siblings will lead to a better or worse world.

[7] To think through this premise I would recommend reading ‘Flowers for Algernon’, a science fiction story about a man with intellectual disabilities who has his IQ radically raised. The jump in IQ isn’t followed by a jump in happiness. A novel is no proof on the issue either way, but it can be a spur to thinking about the issue and may inspire some empirical research on it.

[8] Note a parent who tries to cure their child of some disability is not making the same mistake as Ebbing. They don’t need to destroy anyone when using a cure. They are just trying to help their child cope with the flux of experience in the best manner they know how.

MINDS WITHOUT MEANINGS AND NEUROPHILOSOPHY

While reading Fodor and Pylyshyn’s[1] recent book ‘Minds Without Meanings’, my mind was constantly drawn back to Paul Churchland’s (2012) book ‘Plato’s Camera’. Fodor and Churchland are philosophers, while Pylyshyn is a cognitive scientist. Despite the fact that Churchland and Fodor are philosophers, their respective books are chock-full of empirical data and experimental results as well as theoretical claims. Churchland, as well as F and P, considers his theory to be an empirical theory subject to empirical refutation like any other theory in psychology and neuroscience. Churchland draws his data primarily from neuroscientific evidence while F and P primarily use data drawn from cognitive science. In this blog I will consider some key areas of disagreement between F and P and Churchland, reflect on which theory best accounts for the data, and propose some further experiments which will help us decide between the respective theorists’ views on mind and meaning.

F and P begin their book by outlining nine working assumptions: (1) they are realists about belief/desire psychology; (2) they are naturalists; (3) they accept the type/token distinction; (4) they accept the psychological reality of linguistic posits; (5) they assume propositions have compositional structure; (6) they assume mental representations have compositional structure; (7) they accept the Representational Theory of the Mind; (8) they accept the Computational Theory of the Mind; (9) they argue that thought is prior to language.

There are a lot of objections that could be made to any of the above assumptions; I laid out where I stand in relation to all of them in my last blog. Here I will discuss some assumptions that Churchland objects to. One area where he has always disagreed with Fodor is the language of thought thesis; as a result, he would have serious issues with assumptions 1 and 9 above. Here is Churchland on the Language of Thought argument:

“For now, let me announce that, for better or for worse, the view to be explored and developed in this book is diametrically opposed to the view that humans are capable of cognition precisely because we are born with an innate ‘language of thought’. Fodor has defended this linguaformal view most trenchantly and resourcefully in recent decades, but of course the general idea goes back at least to Kant and Descartes. My own hypothesis is that all three of these acute gentlemen have been falsely taken in by what was, until recently, the only example of a systematic representational system available to human experience, namely, human language. Encouraged further by our own dearly beloved Folk Psychology, they have wrongly read back into the objective phenomenon of cognition-in-general a historically accidental structure that is idiosyncratic to a single species of animal (namely, humans), and which is of profoundly secondary importance even there. We do of course use language-a most blessed development we shall explore in due course-but language like structures do not embody the basic machinery of cognition. Evidently they do not do so for animals and not for humans either, because human neuronal machinery, overall differs from that of other animals in various small degrees, but not in fundamental kind.” (Paul Churchland ‘Plato’s Camera’ p.5)

“As noted earlier, Jerry Fodor is the lucid, forthright, and prototype perpetrator on this particular score, for his theory of cognitive activity is that it is explicitly language like from its inception (see, e.g. Fodor 1975)-a view that fails to capture anything of the very different, sublinguistic styles of representation and computation revealed to us by the empirical neurosciences and by artificial neuromodeling. Those styles go wholly unacknowledged. This would be failure enough. However, the ‘Language of Thought’ hypothesis fails in a second monumental respect, this time ironically by undervaluing the importance of language. Specifically, it fails to acknowledge the extraordinary cognitive novelty that the invention of language represents, and the degree to which it has launched humankind on an intellectual trajectory that is impossible for creatures denied the benefits of that innovation, that is to creatures confined only to the first and second level of learning.” (ibid p.26)

Some of Churchland’s reasoning above is unfair to F and P. So, for example, he argues that the neuronal machinery of animals and babies doesn’t differ from ours to any great degree. From this fact he concludes that since neither animals nor babies have language but share a lot of neural machinery with us, there seems little point in assuming that humans’ primary thought processes involve a language of thought. But this is just a question-begging assumption. We have reason to believe that children as young as four months have concepts of objects, and there is evidence that children, before learning their first words at twelve months, have expectations of causality, number etc.[2]. There is also evidence that animals think using concepts; in fact it is a working assumption of most cognitive scientists that they do[3]. The fact that adult humans, babies and non-human animals share a lot of neural architecture is in principle neutral on the issue of whether normal human adults think using a language of thought. Nothing in F and P’s work precludes the possibility that animals and babies who have not yet acquired a public language are thinking in concepts using a proto-language of thought. The degree to which pre-linguistic children and non-human animals have concepts, and whether they can combine these concepts to think, is an open empirical question[4]. So given the open-ended empirical nature of the debate, Churchland cannot just assume that because normal language-speaking humans have a lot of neural tissue in common with children and non-human animals, this has any bearing on the debate about the existence of a language of thought. The fact that he does make this move is simply evidence of him begging the question against F and P.

Churchland also uses another argument that he thinks shows that Fodor’s LoT is false. He argues that, by claiming that our thinking is done through our language of thought, Fodor is ignoring the incredible cognitive benefits that are conferred on our species through having a language. This argument simply doesn’t work. Fodor believes that our public language is derived from our private language of thought, but from this fact it doesn’t follow that a public language has negligible importance. While a private language of thought will give us the ability to combine concepts in a productive manner, giving us the ability to think a potentially infinite number of thoughts, this solitary mode of thinking has its limits. When we have a public language we have not only our own thoughts to rely on but the thoughts of others. A creature who is born into a particular culture will inherit the combined wisdom of the society he is born into. If the culture keeps written records then the child will eventually be able to read the thoughts and experiences of people long dead who have lived in different places at different times. Sharing a language makes it easier for members of a culture to explain to one another how to do various things, and this will have huge benefits. So a shared language with the ability to share information will have huge cognitive benefits, and nothing F and P have said denies this fact. So again Churchland’s attack hits the wrong target.

Churchland goes on to make further claims about the LoT hypothesis:

“But I am here making a rather more contentious claim, as will be seen by drawing a further contrast with Fodor’s picture of human cognition. On the Language of Thought (LoT) hypothesis, the lexicon of any public language inherits its meanings directly from the meanings of the innate concepts of each individual’s innate LoT. Those concepts derive their meanings, in turn, from the innate set of causal sensitivities they bear to various ‘detectable’ features of the environment. And finally, those sensitivities are fixed in the human genome, according to this view, having been shaped by many millions of years of biological evolution. Accordingly, every normal human at whatever stage of cultural evolution, is doomed to share the same conceptual framework as any other human, a framework that the current public language is doomed to reflect. Cultural evolution may therefore add to that genetic heritage, perhaps considerably, but it cannot undermine or supersede it. The primary core of our comprehensive conception of the world is firmly nailed to the human genome, and it will not change until the human genome has changed.

I disagree. The lexicon of a public language gets its meanings not from its reflection of an innate LoT, but from the framework of broadly accepted or culturally entrenched sentences in which they figure, and by the patterns of inferential behaviour made normative thereby. Indeed, the sublinguistic categories that structure any individual’s thought processes are shaped, to a significant degree, by the official structure of the ambient language in which she was raised, not the other way around” (ibid p.28)

I have to admit that I share Churchland’s scepticism about Fodor’s idea that all our concepts are innate. The conclusion on the face of it seems to be incredible. In fact the conclusion is so incredible that the majority of theorists have simply rejected the argument outright. So, for example, Churchland doesn’t say what he thinks is wrong with Fodor’s argument. Rather he simply states that he doesn’t accept Fodor’s conclusions and thinks that concepts get their meaning publicly through our shared culture and developmental history.

Before assessing Churchland’s alternative it is important to consider what evidence Fodor has to support his views on concepts. ‘Minds without Meanings’ is his most recent attempt to explicate those views, so it is worth working through the arguments sketched there. F and P argue that current views on the nature of concepts are radically wrongheaded, and they dedicate an entire chapter to showing that all other theories of the nature of concepts fail.

They begin by arguing that concepts are not mental images and give the following four reasons: (1) We have many concepts that apply to things that we cannot picture. (2) Black Swan Arguments: We can have an image of (A) a black swan, but what about an image (B) that shows that all swans are white? Or an image (C) that shows that (A) and (B) are incompatible with each other? This is not possible because images cannot depict incompatibility. But we do have conceptual knowledge of incompatibility. Therefore images are not concepts. (3) Constituent Structure: Images have parts, not constituents. If we take an image we can divide it up in as many different ways as we like and put it back together in any arbitrary way. Concepts, however, are combined according to rules. They have a syntax that governs how they can be put together. Pictures do not follow rules like this; therefore they are not concepts. (4) Leibniz’s Law: Mental images supposedly occur in the brain (where else could they occur?), but they cannot be identical with any brain area because they have properties that no brain area has. We can have a mental image of a purple cow, but there is no purple in the brain. Therefore, on pain of breaking Leibniz’s Law, we have to admit that mental images are not brain states. But unless we want to become dualists (which contradicts F and P’s naturalist assumption above) we have to argue that mental images only seem to exist; in reality they are propositional at base. Therefore, since mental images don’t really exist, they cannot be concepts.

Secondly they argue that concepts are not definitions, for three reasons. (1) We have very few definitions of any concepts after over two thousand years of philosophers looking for them. (2) Not all concepts can have definitions: logically, some of them must be primitive concepts in terms of which the others are defined, but we have no way of finding out what these primitive concepts are. Some people argue that the primitive concepts are abstract innate concepts like causality, agent, object etc. However, F and P argue that these supposed primitives can be broken down further and so are not really primitive (see Spelke 1990 on objects). The other approach is to say that the primitive concepts are sensory concepts. However, there are few concepts that can be explicated in terms of sensory primitives. (3) Fodor’s Paradox: If concepts were definitions we could not learn any concepts. Take the definition ‘Bachelors are unmarried men’. This means that the concept BACHELOR is the same as the concept UNMARRIEDMAN. So to learn BACHELOR is to learn that bachelors are unmarried men. But BACHELOR and UNMARRIEDMAN are the very same concept. So it follows that you cannot learn the concept BACHELOR unless you already have the concept UNMARRIEDMAN (and vice versa). Therefore you cannot learn the concept at all. So something is obviously radically wrong with the definition story.

Thirdly they argue that concepts are not Stereotypes because: Concepts compose but stereotypes do not. Therefore concepts are not stereotypes. They explicate this with their famous PetFish examples, and Uncat examples.

Fourthly they argue that concepts cannot be inferential roles. They claim that if concepts are inferential roles then we need to be able to say which conceptual content supervenes on which inferential roles. However, they note that there are really only two ways of doing this. (1) By appealing to Holism: But they argue that if holism is true and every inference that a concept is involved in is constitutive, then the content of one’s concepts alters as fast as one’s beliefs do (Minds Without Meanings p. 55). They note that batty consequences follow from this theory. So, for example, two people may agree in some judgement about concept x at time t1, but at t2, because they have both had their concepts modified by new information, they no longer even share the same concept. If people have their concepts modified from moment to moment by their own idiosyncratic experience, then this would make communication very difficult. (2) By appealing to Analyticity: But we have good Quinean reasons to think that appealing to analyticity is a bad way of reasoning, because we cannot explicate analyticity in a non-circular manner etc.

The fifth reason is directly relevant to Churchland. F and P do not think that concepts can be explicated in terms of connectionist models. F and P criticised connectionist models in detail in their 1988 paper. In ‘Minds Without Meanings’ they give an abbreviated version of their 1988 argument. Firstly they note that from the start connectionist models face serious difficulties, because the distinction between thoughts and concepts is not often noted in the literature. For them a thought is a mental representation that expresses a proposition (ibid p. 47). So on an associationist model WATER may be associated primarily with WET. But, they argue, it wouldn’t be right to equate having the thought ‘Water is Wet’ with associating the concept WATER with the concept WET. This is because the thought that ‘Water is Wet’ has logical form: in it we predicate the property of wetness of the object water. Once we use this predication we are making a claim that is true or false. The claim is true if the stuff water has the property predicated of it, namely wetness; it is false otherwise. The thought ‘Water is wet’ has logical form and is made true or false by things in the world. So even if concepts are associative nodes within a semantic network, on F and P’s model concepts are distinct from thoughts. And connectionist models cannot explain thought even if they can explain concepts.

However, despite the fact that they think that connectionist models are in principle incapable of explaining what thought is, they agree to bracket this consideration. They then consider the question of whether connectionist models can explain what concepts are. Again they argue that they cannot.

They ask us to think of our total set of concepts as something that can be represented in a graph of finitely many labelled nodes with paths connecting some of them to some others (ibid p.49). However, there is a severe difficulty with this approach. For the connectionist the content of a concept (for example, whether it is the concept of a dog) is provided by the label. But the connectionist model is supposed to explain what the concept is, and it cannot do this by relying on labels, on pain of circularity. F and P note that Quine is right that most theories of meaning suffer from a serious circularity problem. But if a connectionist wants to explain conceptual content without question begging then he will need another approach.

F and P argue that if we cannot, on pain of circularity, equate the content of a node with its label, then we must say that the content is simply provided by nodes and their various connections to other nodes. But there is a problem with this approach. It means that corresponding nodes in isomorphic graphs have the same content whatever the labels of their connected nodes may be (ibid p.50). This project cannot work because it means that we could have the concept SQUARE in one graph occupying the same location as the concept ROUND occupies in another graph. So for F and P this argument shows that connectionist models are incapable in principle of explaining what the content of different concepts is; hence they cannot explain what concepts are.
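The isomorphism worry can be made concrete with a small Python sketch. This is only an illustration of the objection as I have summarised it, not anything F and P or Churchland provide: the two toy graphs, their node names and the helper functions `edge_set` and `find_isomorphism` are all hypothetical. The code ignores the labels and looks only for a structure-preserving mapping between the nodes of the two graphs.

```python
from itertools import permutations


def edge_set(graph):
    """Return the directed edges of a graph given as {node: [neighbours]}."""
    return {(a, b) for a, neighbours in graph.items() for b in neighbours}


def find_isomorphism(g1, g2):
    """Brute-force search for a node mapping that preserves every edge.

    Labels are treated as arbitrary names: only the pattern of
    connections matters. Returns a dict or None.
    """
    nodes1, nodes2 = sorted(g1), sorted(g2)
    if len(nodes1) != len(nodes2):
        return None
    edges1, edges2 = edge_set(g1), edge_set(g2)
    for candidate in permutations(nodes2):
        mapping = dict(zip(nodes1, candidate))
        if {(mapping[a], mapping[b]) for a, b in edges1} == edges2:
            return mapping
    return None


# Two hypothetical concept graphs with the same shape but different labels.
graph_one = {"SQUARE": ["SHAPE", "CORNER"], "SHAPE": [], "CORNER": []}
graph_two = {"ROUND": ["SHAPE", "CURVE"], "SHAPE": [], "CURVE": []}

mapping = find_isomorphism(graph_one, graph_two)
print(mapping["SQUARE"])  # the node that sits where SQUARE sits
```

In these toy graphs the only node with outgoing connections in the second graph is ROUND, so any structure-preserving mapping must send SQUARE to ROUND. If content were fixed by position in the graph alone, the two nodes would have to count as the same concept, which is the consequence F and P regard as absurd.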

They note that Paul Churchland (2006) tries to get over this difficulty by arguing that basic level concepts are grounded in sensory perception and that is how they get their content. This approach though won’t work because it is vulnerable to the same objections that Berkeley raised against concepts being associated with sense data.

This long detour through the five main objections that F and P make to various theories of conceptual content shows why Fodor originally argued that our concepts must be innate. For him concepts cannot be explained in terms of mental images which are faint impressions of our sensory experiences. They cannot be explained as definitions derived from basic sensory primitives or definitions derived from innate metaphysical concepts of causation, agency etc. They cannot be explained as something that we derive from prototypes and statistical generalisation. They cannot be explained as something derived from inferential roles. And they cannot be explained via connectionist models. Therefore, if we have no other explanation of how concepts are acquired and we think people have concepts, we will be forced to conclude that our concepts must be innate.

F and P now argue that we can avoid paradoxes about how concepts are learned (and, by extension, avoid claiming that all concepts are innate) if we stop thinking of concepts as things that have intensions. Hence they sketch their purely referential theory of concepts, arguing that, intuitions to the contrary notwithstanding, this approach is viable.

I discussed Fodor’s objections to various theories of concepts in my last blog. In a nutshell, I think that he is right in his criticism of concepts as mental images, but that his arguments against concepts being prototypes badly misconstrue Eleanor Rosch’s prototype theory. I think that Fodor’s argument against concepts being definitions is pretty convincing, but that his argument against inferential role semantics is very weak. Nonetheless, here I just want to discuss Churchland’s (2012) objections to Fodor’s views on concepts.

Churchland argues that our lexicon is largely determined by the public language that we have learned in the idiosyncratic environment that we happen to have been born into. Any limited pre-linguistic concepts that we have will be to an extent over-written by the conceptual abilities that the particular culture we are born into gives us. So he disagrees with what he takes to be Fodor’s position that all our concepts are written into our genome and cannot be radically changed by the culture we are born into. It is unclear that F and P need to accept Fodor’s old argument for innate concepts, since they now think that our concepts are determined entirely by their extensions and that intensions play no role. However, their views are still at odds with Churchland’s claim that the lexicon of our public language will radically modify the concepts which we use when thinking. For F and P, since our concepts are determined by their extensions, our public language should not really affect the concepts we use to think about the world.

In the above discussion Churchland claimed that Fodor’s view of concepts was incorrect. However, he did not engage with Fodor’s arguments that there is no way we can have concepts other than by their being innate or being determined by their extensions. Obviously, the key argument that Churchland would object to is F and P’s claim that concepts cannot be explicated in terms of connectionist models. Churchland has criticised both Fodor and Pylyshyn for what he views as their inadequate views on the nature of connectionist models:

“Fodor briefly turns to address, with more than a little scepticism, the prospects for a specifically ‘connectionist’ solution to his problem, but his discussion is hobbled by an out dated and stick-figured conception of how neural networks function, in both their representational and in their computational activities. His own reluctant summary (Fodor 2000, 46-50) wrongly makes localist coding (where each individual cell possesses a proprietary semantic significance) prototypical of this approach, instead of population or vector coding (where semantic significance resides only in the collective activation patterns across large groups of cells). And it wrongly assimilates their computational activities to the working out of ‘associations’ of various strengths between the localist-coded cells that they contain, instead of the very different business of transforming large vectors into other large vectors. (To be fair to Fodor, there have been artificial networks of exactly the kind he describes: Rumelhart’s now ancient ‘past-tense network’ [Rumelhart and McClelland 1986] may have been his introductory and still-dominant conceptual prototype. But that network was functionally inspired to solve a narrowly linguistic problem, rather than biologically inspired to address cognition in general. It in no way represents the mainstream approaches of current neuroanatomically inspired connectionist research). Given Fodor’s peculiar target, his critique is actually correct. But his target on this occasion is, as it happens a straw man. And in the meantime, vector-coding, vector-transforming feed-forward networks-both biological and artificial-chronically perform globally sensitive abductions as naturally and as effortlessly as a baby breathes in and out.” (ibid p.71)

“I here emphasize this fundamental dissociation, between the traditional semantic account of classical empiricism and the account held out to us by a network-embodied Domain-Portrayal Semantics, not just because I wish to criticize, and reject the former. The dissociation is worth emphasizing because the latter has been mistakenly, and quite wrongly, assimilated to the former by important authors in the recent literature (e.g. Fodor and Lepore 1992, 1999). A state-space or domain-portrayal semantics is there characterized as just a high-tech, vector-space version of Hume’s old concept empiricism. This is a major failure of comprehension, and it does nothing to advance the invaluable debate over the virtues and vices of the competing contemporary approaches to semantic theory. To fix permanently in mind the contrast here underscored, we need only to note that Hume’s semantic theory is irredeemably atomistic (simple concepts get their meanings one by one), while domain-portrayal semantics is irreducibly holistic (there are no ‘simple’ concepts, and concepts get their meanings only as a corporate body). Any attempt to portray the latter as just one version of the former will result in nothing but confusion” (ibid p.88)

In the first quote above Churchland is responding to a discussion of the Frame Problem by Fodor in his ‘The Mind Doesn’t Work That Way’. In that book Fodor was criticising attempts by connectionist modellers to overcome the frame problem. Churchland complains that Fodor is wrong in his interpretation of connectionism because he is working from antiquated models. Churchland has a point; Fodor has an irritating habit of treating any theory that disagrees with his own as another form of empiricism. Nonetheless, Fodor’s misunderstanding has no real bearing on his criticism of connectionist models of concepts. We still don’t have an answer to how concepts get their contents in connectionist models.

However whatever we make of F and P’s criticisms of connectionist theories of concept content, Churchland has another argument against their views of concepts being entirely determined by their extensions:

“Before leaving this point, let me emphasize that this is not just another argument for semantic holism. The present argument is aimed squarely at Fodor’s atomism in particular, in that the very kinds of causal/informational connections that he deems necessary to meaning are in general impossible, save as they are made possible by the grace of the accumulated knit of background knowledge deemed essential to meaning by the semantic holist. That alone is what makes subtle, complex, and deeply context dependent features perceptually discriminable by any cognitive system. Indeed it is worth suggesting that the selection pressures to produce these ever more penetrating context-dependent discriminative responses to the environment are precisely what drove the evolutionary development of multi-layered networks and higher cognitive processes in the first place. Without such well-informed discriminative processes thus lifted into place, we would all be stuck at the cognitive level of the mindless mercury column in the thoughtless thermometer and the uncomprehending needle position of the vacant voltmeter.

Beyond that trivial level, therefore, we should adopt it as a (pre-)revolutionary principle that there can be ‘No Representation without at least some comprehension’… In sum, no cognitive system could ever possess the intricate kinds of causal or informational sensitivities variously deemed necessary by atomistic semantic theories, save by virtue of its possessing some systematic grasp of the world’s categorical/causal structure. The embedding network of presumptive general information so central to semantic holism is not the post facto ‘luxury’ it is on Fodor’s approach. It is epistemologically essential to any discriminative system above the level of an earthworm.” (ibid p.97)

The very points that Churchland makes above are addressed in chapters 4 and 5 of F and P’s ‘Minds without Meanings’. This part of the dispute is an entirely empirical one and I will address it in my next blog.

[1] Fodor and Pylyshyn will be referred to as F and P throughout this blog.

[2] For evidence of children’s conceptual abilities pre-learning a public language see Spelke 1990, Carey 2009, and Bloom 2000.

[3] For an excellent discussion of animal cognition and whether animals have concepts see Kristin Andrews’ ‘The Animal Mind: An Introduction to the Philosophy of Animal Cognition’ (2014).

[4] I discussed this question in my blog-post ‘What are Concepts and which Creatures have them?’
