School of Psychology, Birkbeck College

Course PSYC044U (Adaptive Learning And Comparative Cognition.) WEEK 12
May 10th 2007

This is just the first 7 pages of the longer paper handout. Web versions of the other pages in the paper handout are accessible from the side index. If you need to print out the handout, then all the pages are in this 'pdf' file, but this is quite large and may be difficult to download over a telephone modem.



For other Notes see the Easter Handout for Summer Term Lectures

[top of page 1 of handout]

Thorndike called himself a connectionist. Is this just a coincidence, or can comparisons be made between modern accounts of neural networks and previous theories of animal learning?

No 9 on the March 15th list

(NB “Parallel Distributed Processing”, “PDP”, “Connectionism”, “Neo-connectionism”, “New connectionism”, “Neural Networks” and “Neural Network Simulations” can be used almost synonymously. The terms refer to theories about, and demonstrations of, the effects of training systems in which large numbers of simple processing units interact only via positive or negative connections between them.)

Main Sources

[page 1 of wk 12 handout]
Very basic points

  • There is a field called “Connectionism” which has developed very rapidly over the last 20 years.

  • It arises from computer simulation of idealized networks of neurons.

  • Its importance for present purposes comes from the fact that the simulations learn to perform certain tasks, rather than being programmed to do them in a way known in advance.

  • The learning mechanisms used have some similarity to those proposed by some animal learning theorists in that learning takes the form of strengthening connections – or associations – between the idealized neurons.

  • It is similar to early kinds of stimulus-response theory (Watson, Thorndike, Hull, Spence) in assuming that large numbers of simple associations, formed according to a few completely general rules, will be able to accomplish relatively complex cognitive tasks, including those required for human cognition.

[page 1 of wk 12 handout]
Further notes

The theme of the earlier draft notes is similarities and differences between recent connectionist theories and associative theories of animal learning. Before getting on to this we should consider the main thrust of new connectionist theories, which is to give accounts of specifically human cognitive processing (using Rumelhart and McClelland, 1986a, on past-tense learning, as an example).

An illustrative quotation.

“Connectionism is ‘in’. Not since the Dark Ages of the pre-Chomskyan era have we seen so much interest in associationist models of human thinking. Streaming forth from their banishment in the Skinnerian dungeons are dozens of detailed computational models based on the new language of networks, nodes, and connections.” (from MacWhinney, B. and Leinbach, J., 1991)

This is from a paper on simulations of past-tense learning, the topic in the 1986 two-volume work that attracted the most intense criticism (Rumelhart and McClelland, 1986a). Without going into any detail, it is possible to see from the claims and stated goals of this 1986 chapter why it attracted, and still attracts, so much attention.

[bottom of page 1 of wk 12 handout]
1. The fact that the acquisition of English as a first language includes a stage at which children make errors by supplying regular past tense endings for irregular verbs they had initially used correctly (e.g. “goed”, “comed”, or “camed”), and can generate a regular past tense for an invented word, had been used to support the claim that children make use of explicit inaccessible rules, which they discover through the use of a special purpose innately given language acquisition device.

2. Rumelhart and McClelland (1986a) ended up by directly challenging this for the past tense in particular and all other language processing more generally:

“We have shown that a reasonable account of the acquisition of the past tense can be provided without recourse to the notion of a ‘rule’ as anything more than a description of the language..... The child need not figure out what the rules are, or even that there are rules.”

“We view this work .... as a step toward a revised understanding of language knowledge, language acquisition, and linguistic information processing in general.”

[page 2 of handout]

3. Many of the details used by Rumelhart and McClelland (1986a) are not relevant to this overall conclusion since they have been changed in subsequent simulations (e.g. MacWhinney and Leinbach, 1991; Plunkett and Marchman, 1991; Plunkett and Juola, 1999; Joanisse & Seidenberg, 1999, 2005).

Basic points in Rumelhart and McClelland (1986a) are:

  1. The system learns the past-tense forms of English verbs by being exposed to pairs of inputs and correct outputs.

  2. After learning the model makes sensible responses to novel verbs (irregular and regular).

  3. During the course of its learning, it is claimed that the model simulates the stages of children’s learning: it starts with similar performance on a small number of high- and low-frequency verbs, then goes through a phase of incorrect regularizations, then reaches a high level of performance on both regular and irregular verbs.

  4. The pattern of errors made by the model during the phase of incorrect regularization was similar to that of preschool children, in that verbs ending in /t/ or /d/ tended to be treated as the “no change” type, while verbs not ending in t/d were predominantly regularized.

  5. There is a list of stimulus inputs to correspond to the base form of verbs, coded in a special (and artificial) way as a list of numerical values.

  6. There is a corresponding list of specially coded outputs which represent the correct past tense form of both irregular and regular English verbs.

  7. There are connections linking the input units to the output units. But there is no “innate” help given to this network, since all connections are set to zero initially. (In other examples they might be set initially at random).

  8. Note that the model does not decode speech input or produce motor output but only associates artificial codes which abstractly represent verbal material. (The perceptual and motor tasks required for real-life language may be more difficult than what this network accomplishes).

  9. Nevertheless, it is clearly a radical new proposal that phenomena of human language can arise from a system which can be characterized as a set of learned associations between input units and output units. Not surprisingly, such proposals have attracted criticism.


[top of page 3 of WEEK 12 handout]
Criticisms of Connectionist claims

There have been many detailed and lengthy attacks on the claims made by Rumelhart et al (1986) and others subsequently (e.g. Fodor and Pylyshyn, 1988; Pinker & Ullman, 2002, 2003). For present purposes they can be condensed to the two points examined by Kaplan et al (1992):

1. “Connectionism is merely a naive, computerized revival of behaviourism.”

2. “Connectionist models are fundamentally associationist in nature, and this severely limits their cognitive potential.” (pp 91-2)

Kaplan et al endorse the first criticism (“connectionism equals behaviourism”) in the case of standard input-output nets like that of Rumelhart and McClelland (1986a) or those which are more elaborate only in having an intermediate layer of “hidden” units between inputs and outputs;

“..... past tense mapping must be implemented as a direct mapping from stem to past-tense form. This direct mapping means [that standard neural network models] are actually complex, parallel versions of a traditional stimulus-response (S-R) model.” (p.94)

“When a stimulus is presented to the input layer, a response follows immediately which is fully determined by that stimulus. This property, the complete determination of each response by the characteristics of the immediate stimulus, is a more accurate description of the behaviourist position, and it is true of ... networks regardless of whether or not they contain hidden layers.” (p.95).

2. "Connectionist models are fundamentally associationist in nature, and this severely limits their cognitive potential." (pp 91-2)

Kaplan et al do not contest the claim that connectionist models are associationist, but “argue that this has improperly been considered a disadvantage only because the power and ubiquity of association in cognition has been underestimated.” (p.92). They suggest however that connectionist models will need to include mechanisms that correspond to “higher-level processes” such as abstraction and cognitive maps.


Quinlan (1991) has a short section on New connectionism and human reasoning (pp 262-5) in which he reviews the criticism that connectionist networks cannot exhibit the systematicity which is characteristic of the human understanding of sentences, and some forms of animal cognition (Fodor and Pylyshyn, 1988). The term “systematicity” is related to the concept of rule-learning, and Quinlan uses the example of the difference in rule-learning ability that apparently exists between corvids and pigeons, as discussed by Mackintosh (1988), who concluded that “associations alone do not generate rules.”

Thus, stimulus-response theories of animal learning (Thorndike, 1898; Hull, 1943; Spence, 1937) and direct input-output neural network models, have been subjected to the same kind of criticism, that they do not capture cognitive processes such as abstraction, rule-following and the use of cognitive maps.

The criticism is particularly acute for the case of human language:
“An overall impression gained from reading the new connectionist literature is that regardless of the seeming complexities of human language processing, a few general principles of learning and processing will suffice. Framed thus, it is easy to see why rather caustic comparisons have been drawn between new connectionism and old behaviourism.”

(Quinlan, 1991; p. 193: my italics)

    [bottom of page 3 of handout]
  • Connectionists have not addressed in any detail the question of why, if language processing can be simulated using a few general learning principles, it remains unique to the human species even though very extensive training has been given to supposedly high-powered neural network systems in the shape of chimpanzees (see Week 10). The conventional view is of course that human language is based on special purpose, built-in mechanisms which do not depend on experienced associations and which deal with aspects of language such as grammatical rules, either as a result of evolutionary processes (Pinker and Bloom, 1990) or for some other reason (Piattelli-Palmarini, 1989).

    This debate is still continuing, e.g. see Marslen-Wilson and Tyler (1997, 1998, 2003, 2007), Joanisse and Seidenberg (1999, 2005), Harm and Seidenberg (1999), Tikkala (2000), Hutzler et al. (2004), Ullman et al. (2005), Desai et al. (2006), Newman et al. (2007) and Nicoladis et al. (2007).

  • But the criticism is not restricted to language, as the contrast between Tolmanian and stimulus-response accounts of animal learning demonstrates. For human psychology, connectionist approaches currently have a strong presence in theories of cognitive development (Berthier et al., 2005; Colunga & Smith, 2005; Elman, 2005; Mareschal & Johnson, 2002; Thomas & Karmiloff-Smith, 2002, 2003; Westermann et al., 2006, 2007) and are occasionally applied to a variety of other areas, such as intelligence (Garlick, 2002), perceptual processing (Gurney, 2007) and issues in personality and social psychology (Queller & Smith, 2002; Van Overwalle & Jordens, 2002; Read & Urada, 2003).
[top of page 4 of handout]

Kinds of Learning in Connectionist Models

The basic distinction usually drawn between types of learning in connectionist models is between “unsupervised”, “supervised” and “reinforcement” learning (Quinlan, 1991, p. 53; Hinton, 1989). This distinction is related to the kind of feedback available to the system as a consequence of its outputs (responses). There are however some gray areas between the categories because it is often possible to convert one kind of learning procedure into another (Hinton, 1989).

Unsupervised learning

Typical examples use Hebbian rules of association by contiguity, and are able to capture regularities in repeated inputs. At the behavioural level an example is habituation, where a certain stimulus is repeated and comes to be recognized: there is no external feedback for a “right” or a “wrong”. The connectionist equivalent is an “auto-associative network”. In these the same pattern is presented at both the input and the output stages of a pattern associator (Quinlan, p.52; see overhead), and eventually the network can complete the pattern if only a partial input is given.

Simple kinds of Pavlovian conditioning can also be regarded as unsupervised learning: the connectionist equivalent is when a pattern associator is given pairs of different patterns at the “input” and “output” stages, and can subsequently reproduce the output pattern when given just the input. This is unsupervised in the sense of lacking external feedback for right or wrong responses (it is not sensitive to goals).
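The two cases above can be illustrated with a minimal sketch of a Hebbian pattern associator. It is not taken from Quinlan or from any of the simulations cited: the coding of unit activity as +1/-1, the particular patterns, and the names hebbian_train and recall are all assumptions made for illustration. The same contiguity rule is used for auto-association (completing a partial pattern) and for a Pavlovian-style pairing of two different patterns.

```python
def hebbian_train(pairs, n_in, n_out, lr=1.0):
    """Hebbian rule: each weight grows with the product of the activity
    of the input unit and the output unit it connects (co-activation).
    With the assumed +1/-1 coding, two inactive units also count as agreeing."""
    weights = [[0.0] * n_in for _ in range(n_out)]
    for inp, out in pairs:
        for j in range(n_out):
            for i in range(n_in):
                weights[j][i] += lr * out[j] * inp[i]
    return weights


def recall(weights, cue):
    """Each output unit sums its weighted inputs and thresholds at zero."""
    sums = [sum(w * c for w, c in zip(row, cue)) for row in weights]
    return [1 if s > 0 else -1 for s in sums]


# Auto-association: each pattern is presented as both input and output.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
auto_weights = hebbian_train([(p1, p1), (p2, p2)], n_in=8, n_out=8)
partial_cue = [1, 1, 1, 1, -1, 0, 0, 0]   # last three units unknown (0)
completed = recall(auto_weights, partial_cue)

# Pavlovian-style pairing: two different patterns (hypothetical codes).
bell, light = [1, -1, 1, -1], [1, 1, -1, -1]
salivate, blink = [1, 1, -1], [-1, 1, 1]
pair_weights = hebbian_train([(bell, salivate), (light, blink)], n_in=4, n_out=3)

print(completed == p1)
print(recall(pair_weights, bell) == salivate)
```

Note that nothing in the training loop compares the network's response with a correct answer: the weights change only as a function of what was presented together, which is the sense in which the learning is unsupervised.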

[middle of page 4 of handout]
Supervised learning

A. The “Delta Rule” (Lieberman, 2000; pp. 522-523: Quinlan, 1991; pp. 55-6)

Supervised learning involves methods of changing the strength (or weight) of connections between input and output units that are more complicated than the Hebbian rule of contiguity (or “co-activation”: when two units are active at the same time the weight of the connection between them is increased).

The underlying idea is a form of trial-and-error learning, where errors are systematically corrected. Usually starting at random, the network responds to an input, and its output (a list of numbers) is compared to the completely correct “target” output which it is supposed to learn. Then the difference between the actual output and the target output is quantified and used as an index of how much (and in which direction) the weights between input and output units should be changed. The delta rule applies when there are only direct connections between input and output.
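The error-correcting idea can be written out as a short sketch. This is an illustration under stated assumptions rather than the procedure of any particular cited model: it uses a single linear output unit, a learning rate of 0.2, two made-up input patterns, and weights that start at zero (instead of at random) so that the result is reproducible. The function name delta_rule_train is hypothetical.

```python
def delta_rule_train(patterns, n_inputs, lr=0.2, epochs=100):
    """Delta rule for one linear output unit with direct input connections."""
    weights = [0.0] * n_inputs          # assumption: start at zero, not random
    for _ in range(epochs):
        for inputs, target in patterns:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output     # size and direction of the mistake
            # Each weight changes in proportion to the error and to the
            # activity of its own input unit.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    return weights


# Two hypothetical input patterns and the target output for each.
PATTERNS = [([1, 0, 1], 1.0), ([0, 1, 1], 0.0)]
learned = delta_rule_train(PATTERNS, n_inputs=3)

for inputs, target in PATTERNS:
    output = sum(w * x for w, x in zip(learned, inputs))
    print(inputs, "target", target, "output", round(output, 3))
```

When the output already matches the target the error term is zero, so the weights stop changing; the complete target has to be supplied on every trial for the rule to operate.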

This type of supervised learning has an unexpected relation to Pavlovian conditioning, because the formula used by Rescorla and Wagner (1972) to successfully account for classical conditioning phenomena (especially blocking) is formally equivalent to a special case of the delta rule (Sutton and Barto, 1981; see overhead). This is unexpected because Pavlovian conditioning is not supervised in the “right or wrong” sense, but understandable in that the Rescorla and Wagner (1972) principle is that the increment in associative strength to a conditioned stimulus on a given trial is proportional to the difference between the theoretical maximum for the signaled event and the combined current strength of all the stimuli present on that trial. (Blocking follows because a stimulus that already predicts the event leaves little of this difference for a newly added stimulus to acquire.) Another general similarity is that both the delta rule and the modern treatments of Pavlovian conditioning emphasize that the Hebbian rule of contiguity is insufficient.
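A small simulation may help to show why this error-correcting form of the rule yields blocking. It is a sketch only: the single combined learning-rate parameter (0.2), the maximum of 1.0, the 50 trials per phase and the stimulus labels "A" and "B" are all assumed values chosen for illustration.

```python
def rescorla_wagner(trials, alpha_beta=0.2, lam=1.0):
    """Each trial lists the stimuli present; the signaled event always occurs.
    alpha_beta is an assumed single learning-rate parameter shared by all
    stimuli; lam is the maximum strength the event will support."""
    strength = {}
    for present in trials:
        combined = sum(strength.get(cs, 0.0) for cs in present)
        change = alpha_beta * (lam - combined)   # shared prediction error
        for cs in present:
            strength[cs] = strength.get(cs, 0.0) + change
    return strength


a_alone = [("A",)] * 50
a_plus_b = [("A", "B")] * 50

blocked = rescorla_wagner(a_alone + a_plus_b)   # A pre-trained, then A+B
control = rescorla_wagner(a_plus_b)             # A+B from the start

print("B after pre-training A:", round(blocked["B"], 4))
print("B in control group:   ", round(control["B"], 4))
```

In the pre-trained condition A already accounts for almost all of the available strength, so the shared error is close to zero during compound trials and B gains very little. In the control condition A and B divide the available strength between them.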

[bottom of page 4 of handout]

B. Back-propagation (Quinlan, 1991; pp. 56-8)

For present purposes the back-propagation method can be regarded as an elaboration of the delta rule for the purpose of supervising learning in “multi-layer” nets, where there is at least one layer of “hidden units” which intervene between the input and output units. The important points for comparison with ideas derived from studies of animal (or human) learning are:

1. Back-propagation is very widely used in connectionist modelling.

2. As with the delta rule it is of the essence that the complete details of the target, or end-product of learning are provided to the system throughout the learning process.

3. No learning occurs when the system makes a correct output.
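The points above can be made concrete with a minimal back-propagation sketch. It is illustrative only and not drawn from Quinlan or Hinton: the network size (2 inputs, 3 hidden units, 1 output), the sigmoid activation, the learning rate, the random seed and the exclusive-or (XOR) training set are assumptions, and the function names are hypothetical. XOR is used because it is a standard example of a mapping that direct input-output connections cannot learn.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_net(n_in, n_hidden, rng):
    # Small random starting weights; the last weight in each row is a bias.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(net, x):
    w_hidden, w_out = net
    xb = list(x) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def backprop_step(net, x, target, lr=0.5):
    w_hidden, w_out = net
    hidden, output = forward(net, x)
    xb, hb = list(x) + [1.0], hidden + [1.0]
    # Error signal at the output: zero when the output equals the target.
    delta_out = (target - output) * output * (1.0 - output)
    # Error signals for hidden units: the output error passed backwards
    # along the hidden-to-output weights.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    for j in range(len(w_out)):
        w_out[j] += lr * delta_out * hb[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += lr * delta_hidden[j] * xb[i]


def total_error(net, data):
    return sum((t - forward(net, x)[1]) ** 2 for x, t in data)


XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
net = init_net(2, 3, random.Random(0))
error_before = total_error(net, XOR)
for _ in range(5000):                 # many passes: learning is slow
    for x, t in XOR:
        backprop_step(net, x, t)      # the full target is supplied every time
error_after = total_error(net, XOR)
print("summed squared error:", round(error_before, 3), "->", round(error_after, 3))
```

The target value appears in every call to backprop_step (point 2), and because every weight change is multiplied by the output error, a response that already matches the target produces no change at all (point 3).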

C. Problems with back-propagation, especially as a model for natural learning.

1. For connectionist modellers, back-propagation in multi-layer nets is good since it can do things not possible for the delta rule with direct input-output connections.

2. A major drawback in both theory and practice is that it provides very slow learning (Quinlan, 1991; p.69). There are many techniques for speeding it up, but it cannot be a model for “one-trial” learning in real life.

3. There is no theoretical guarantee that the system will not stop learning before it learns to produce the desired target (by “getting stuck in a local minimum”; Quinlan, p. 71). However, in practice it has been made to work for a wide variety of tasks by making networks bigger, and Hinton (1989) believes this is less of a practical problem than slowness in learning.

4. It is computationally convenient to go backwards through a network to change weights, in a way that mirrors in reverse the forward pass from inputs to outputs. But neurons are unidirectional, and thus this mechanism is not physiologically plausible in a direct way. There are plenty of reciprocal (forwards and backwards) connections between different brain structures, but the sorts of computations used mean that “the algorithm falls well outside of the realms of neurological plausibility.”

5. Equally important is the fact that back-propagation is very implausible at the behavioural level for many tasks. In the first place it is clear that in both animal and human learning (see Week 13) learning does not take place by systematic alteration of errors, but by reward for, or practice of, successful responses and strategies. In the second place there are many situations in which it seems unlikely that perfect copies of the “target” behaviour are constantly available during learning. (E.g. are verb stems and their past-tense forms consistently paired during language acquisition? How could a rat compare erroneous routes with correct routes through a maze?)

[bottom of page 5 of wk 12 handout]
Reinforcement learning

In these procedures external feedback is only given globally, to distinguish “right” from “wrong” outputs.

Hinton (1989) notes that there is a large literature on this topic “beyond the scope of this paper”, i.e. not much use was being made of reinforcement procedures in connectionist simulations of learning. There is a technical problem called “credit assignment”: if a reasonably large network produces a correct output, which local connections are responsible? This is potentially solvable, and it is not clear that more use could not, in principle, be made of reinforcement methods.
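As a rough illustration of learning from global feedback alone, the sketch below trains a few stochastic output units using only a single "right" or "wrong" signal for the whole response. It is not a procedure described by Hinton (1989) or by the other sources here: the reward-gated update, the one-weight-per-stimulus coding, the parameter values and the name train_by_reward are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_by_reward(targets, trials=3000, lr=0.5, seed=0):
    """targets[s] is the output pattern that earns reward for stimulus s.
    Each output unit fires at random with a probability set by its weight."""
    rng = random.Random(seed)
    n_units = len(targets[0])
    weights = [[0.0] * n_units for _ in targets]
    for _ in range(trials):
        s = rng.randrange(len(targets))
        probs = [sigmoid(w) for w in weights[s]]
        response = [1 if rng.random() < p else 0 for p in probs]
        # Global feedback: one signal for the whole response. The network
        # is never told which units were wrong, or what they should have done.
        reward = 1.0 if response == targets[s] else 0.0
        for u in range(n_units):
            # Rewarded responses are made more probable; unrewarded trials
            # change nothing in this simple version.
            weights[s][u] += lr * reward * (response[u] - probs[u])
    return weights


targets = [[1, 0, 1], [0, 1, 0]]
weights = train_by_reward(targets)
for s, row in enumerate(weights):
    print("stimulus", s, "firing probabilities",
          [round(sigmoid(w), 2) for w in row])
```

Compared with the delta rule, far less information is supplied on each trial, and in this version learning depends on the correct response first occurring by chance. With more output units that becomes increasingly unlikely, which is one way of seeing why credit assignment is regarded as a problem.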

The fact that more is made of the effects of reward and punishment in analyses of biologically “real” learning may be related to the involvement of motivational factors, which are not mimicked (so far) in neural network research. However, there is currently some interest in reinforcement learning in areas such as robotics (Dean, 1998; Coleman et al., 2005) and there was a special issue of the journal Machine Learning devoted to "Reinforcement Learning" (Kaelbling, 1996). Reinforcement learning may be used for practical purposes (Bingham, 2001; Franklin, 2006), or for simulating biologically realistic reward-related behaviours (Berthier et al, 2005; Hazy et al., 2006; Hampton & O'Doherty, 2007).

Main Sources — Animal Learning and Learning in Connectionist (Neural Network) Simulations (Week 12)

Lieberman, D. (1990/1993/2000) Learning: Behavior and Cognition. Belmont: Wadsworth. ("The Neural Network Solution": pp. 439-455 /1993 edition pp. 511-525; /2000, pp. 517-532)

Quinlan, P. (1991) Connectionism and Psychology. Harvester Wheatsheaf, Hemel Hempstead. Chapter 2 “Memory and Learning in Neural Networks” esp pp.51-56, pp.69-71, and pp. 262-266. [152 QUI & AKCHN(Qui).]

Walker, S.F. (1990/1992) A brief history of connectionism and its psychological implications. AI & Society 4, 17-38. (TIED XEROX/SLC) Or Walker, S.F. (1992) A brief history of connectionism and its psychological implications. In Clark, A. and Lutz, R. (eds) Connectionism in Context. Berlin: Springer-Verlag. 123-144. (BK library AKCHN [Cla] )


References (Not normally required for further reading)

Albright, A., & Hayes, B. (2003). Rules vs. analogy in English past tenses: a computational/experimental study. Cognition, 90(2), 119-161.

Baxt, W. G., Shofer, F. S., Sites, F. D., & Hollander, J. E. (2002). A neural network aid for the early diagnosis of cardiac ischemia in patients presenting to the emergency department with chest pain. Annals of Emergency Medicine, 40(6), 575-583.

Bechtel, W. and Abrahamsen, A. (1991) Connectionism and the Mind: An Introduction to Parallel Processing in Networks. Oxford, Basil Blackwell. (AKCHN).

Berthier, N. E., Rosenstein, M. T., & Barto, A. G. (2005). Approximate optimal control as a model for motor learning. Psychological Review, 112(2), 329-346.

Bingham, E. (2001). Reinforcement learning in neurofuzzy traffic signal control. European Journal of Operational Research, 131(2), 232-241.

Christiansen, M.H. and Chater, N. (1999) Connectionist natural language processing: The state of the art. Cognitive Science, 23, 417-437.

Christiansen, M. H., Chater, N., & Seidenberg, M. S. (1999). Special issue - Connectionist models of human language processing: Progress and prospects. Cognitive Science, 23(4), 415-415.

Coleman, S. L., Brown, V. R., Levine, D. S., & Mellgren, R. L. (2005). A neural network model of foraging decisions made under predation risk. Cognitive Affective & Behavioral Neuroscience, 5(4), 434-451.

Colunga, E., & Smith, L. B. (2005). From the lexicon to expectations about kinds: A role for associative learning. Psychological Review, 112(2), 347-382.

Desai, R., Conant, L. L., Waldron, E., & Binder, J. R. (2006). FMRI of past tense processing: The effects of phonological complexity and task difficulty. Journal of Cognitive Neuroscience, 18(2), 278-297.

Elman, J. L. (2005). Connectionist models of cognitive development: where next? Trends in Cognitive Sciences, 9(3), 111-117.

Elman, JL, Bates, EA, Johnson, MH, Karmiloff-Smith A, Parisi, D. & Plunkett K. (1996) Rethinking Innateness: A connectionist perspective on development. London: MIT Press. (155.7 ELM in Bk Library).

Fodor, J. & Pylyshyn, Z.W. (1988) Connectionism and cognitive architecture: a critical analysis. Cognition, 28, 3-71.

Franklin, J. A. (2006). Jazz melody generation using recurrent networks and reinforcement learning. International Journal on Artificial Intelligence Tools, 15(4), 623-650.

Garlick, D. (2002). Understanding the nature of the general factor of intelligence: The role of individual differences in neural plasticity as an explanatory mechanism. Psychological Review, 109(1), 116-136.

Gurney, K. (2007). Neural networks for perceptual processing: from simulation tools to theories. Philosophical Transactions of the Royal Society B-Biological Sciences, 362(1479), 339-353.

Hampton, A. N., & O'Doherty, J. P. (2007). Decoding the neural substrates of reward-related decision making with functional MRI. Proceedings of the National Academy of Sciences of the United States of America, 104(4), 1377-1382

Harm, M. W., & Seidenberg, M. S. (1999). Phonology, reading acquisition, and dyslexia: Insights from connectionist models. Psychological Review, 106(3), 491-528.

Hartshorne, J. K., & Ullman, M. T. (2006). Why girls say 'holded' more than boys. Developmental Science, 9(1), 21-32.

Hazy, T. E., Frank, M. J., & O'Reilly, R. C. (2006). Banishing the homunculus: Making working memory work. Neuroscience, 139(1), 105-118.

Herd, S. A., Banich, M. T., & O'Reilly, R. C. (2006). Neural mechanisms of cognitive control: an integrative model of stroop task performance and fMRI data. Journal of Cognitive Neuroscience, 18(1), 22-32.

Hinton, G.E. (1989) Connectionist learning procedures. Artificial Intelligence, 40, 185-234.

Hutzler, F., Ziegler, J. C., Perry, C., Wimmer, H., & Zorzi, M. (2004). Do current connectionist learning models account for reading development in different languages? Cognition, 91(3), 273-296.

Joanisse, M. F. (2004). Specific language impairments in children - Phonology, semantics, and the English past tense. Current Directions in Psychological Science, 13(4), 156-160.

Joanisse, M. F., & Seidenberg, M. S. (1999). Impairments in verb morphology after brain injury: A connectionist model. Proceedings of the National Academy of Sciences of the United States of America, 96(13), 7592-7597.

Joanisse, M. F., & Seidenberg, M. S. (2005). Imaging the past: Neural activation in frontal and temporal regions during regular and irregular past-tense processing. Cognitive Affective & Behavioral Neuroscience, 5(3), 282-296.

Kaelbling, L. P. (1996). Special issue on reinforcement learning - introduction. Machine Learning, 22(1-3), 7-9.

Kandel, E. R. (2001). Neuroscience - The molecular biology of memory storage: A dialogue between genes and synapses. Science, 294(5544), 1030-1038.

Kaplan, S. Weaver, M. and French, R.M. (1992) Active symbols and internal models: Towards a cognitive connectionism. In Clark, A. and Lutz, R. (eds) Connectionism in Context. Berlin: Springer-Verlag. 91-110. (TIED XEROX)

Kemp, N., & Bryant, P. (2003). Do beez buzz? Rule-based and frequency-based knowledge in learning to spell plural -s. Child Development, 74(1), 63-74.

Mackintosh, N.J. (1988) Approaches to the study of animal intelligence. British Journal of Psychology, 79, 509-25.

Mareschal, D., & Johnson, S. P. (2002). Learning to perceive object unity: a connectionist account. Developmental Science, 5(2), 151-172.

Marshall, C. R., & van der Lely, H. K. J. (2006). A challenge to current models of past tense inflection: The impact of phonotactics. Cognition, 100(2), 302-320.

Marslen-Wilson, W., & Tyler, L. K. (1997). Dissociating types of mental computation. Nature, 387(6633), 592-594.

Marslen-Wilson, W., & Tyler, L. K. (1998). Rules, representations, and the English past tense. Trends in Cognitive Sciences, 2(11), 428-435.

Marslen-Wilson, W. D., & Tyler, L. K. (2003). Capturing underlying differentiation in the human language system. Trends in Cognitive Sciences, 7(2), 62-63.

Marslen-Wilson, W., & Tyler, L. (2007). Morphology, language and the brain: the decompositional substrate for language comprehension. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 823-836.

McClelland, J. L., & Patterson, K. (2002). Rules or connections in past-tense inflections: what does the evidence rule out? Trends in Cognitive Sciences, 6(11), 465-472.

MacWhinney, B. and Leinbach, J. (1991) Implementations are not conceptualizations: Revising the verb learning model. Cognition, 40, 121-157.

Monaghan, P., & Shillcock, R. (2004). Hemispheric asymmetries in cognitive modeling: Connectionist modeling of unilateral visual neglect. Psychological Review, 111(2), 283-308.

Monaghan, P., & Shillcock, R. (2007). Levels of description in consonant/vowel processing: Reply to Knobel and Caramazza. Brain and Language, 100(1), 101-108.

Newman, A. J., Ullman, M. T., Pancheva, R., Waligura, D. L., & Neville, H. J. (2007). An ERP study of regular and irregular English past tense inflection. Neuroimage, 34(1), 435-445.

Nicoladis, E., Palmer, A., & Marentette, P. (2007). The role of type and token frequency in using past tense morphemes correctly. Developmental Science, 10(2), 237-254.

Penke, M., & Westermann, G. (2006). Broca's area and inflectional morphology: Evidence from Broca's aphasia and computer modeling. Cortex, 42(4), 563-576.

Pinker, S. and Bloom, P. (1990) Natural language and natural selection. Behavioral and Brain Sciences, 13, 707-784.

Pinker, S., & Ullman, M. T. (2002). The past and future of the past tense. Trends in Cognitive Sciences, 6(11), 456-463.

Plunkett, K., & Bandelow, S. (2006). Stochastic approaches to understanding dissociations in inflectional morphology. Brain and Language, 98(2), 194-209.

Plunkett, K., & Juola, P. (1999). A connectionist model of English past tense and plural morphology. Cognitive Science, 23(4), 463-490.

Queller, S., & Smith, E. R. (2002). Subtyping versus bookkeeping in stereotype learning and change: Connectionist simulations and empirical findings. Journal of Personality and Social Psychology, 82(3), 300-313.

Ralph, M. A. L., Braber, N., McClelland, J. L., & Patterson, K. (2005). What underlies the neuropsychological pattern of irregular > regular past-tense verb production? Brain and Language, 93(1), 106-119.

Read, S. J., & Urada, D. I. (2003). A neural network simulation of the outgroup homogeneity effect. Personality and Social Psychology Review, 7(2), 146-169.

Rumelhart, D.E. and McClelland, J.L. (1986a) On learning the past tenses of English verbs. In McClelland, J.L. and Rumelhart, D.E (eds) Parallel Distributed Processing. Volume 2. Psychological and Biological Models. London: MIT Press, 216-271.

Rumelhart, D.E. and McClelland, J.L. (1986b) PDP Models and General Issues in Cognitive Science. In Rumelhart, D.E. and McClelland, J.L. (eds) Parallel Distributed Processing. Volume 1. Foundations. London: MIT Press, 110-46

Sutton, R.S. and Barto, A.G. (1981) Toward a modern theory of adaptive networks: expectation and prediction. Psychological Review, 88, 135-171.

Thomas, M. S. C., & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical phenotypes. Psychological Review, 110(4), 647-682.

Thomas, M., & Karmiloff-Smith, A. (2002). Are developmental disorders like cases of adult brain damage? Implications from connectionist modelling. Behavioral and Brain Sciences, 25(6), 727-+.

Thorndike, E.L. (1905/1919) The Elements of Psychology. New York, A.G. Seiler. [in 'Early Texts' at Senate House]

Tikkala, A. (2000). A connectionist word production tool for Finnish nouns with a model for vowel harmony restrictions. Computer Speech and Language, 14(1), 1-13.

Ullman, M. T., Pancheva, R., Love, T., Yee, E., Swinney, D., & Hickok, G. (2005). Neural correlates of lexicon and grammar: Evidence from the production, reading, and judgment of inflection in aphasia. Brain and Language, 93(2), 185-238.

Van Overwalle, F., & Jordens, K. (2002). An adaptive connectionist model of cognitive dissonance. Personality and Social Psychology Review, 6(3), 204-231.

Walker, S.F. (1992) A brief history of connectionism and its psychological implications. In Clark, A. and Lutz, R. (eds) Connectionism in Context. Berlin: Springer-Verlag. 123-144.

Westermann, G., Mareschal, D., Johnson, M. H., Sirois, S., Spratling, M. W., & Thomas, M. S. C. (2007). Neuroconstructivism. Developmental Science, 10(1), 75-83.

Westermann, G., Sirois, S., Shultz, T. R., & Mareschal, D. (2006). Modeling developmental cognitive neuroscience. Trends in Cognitive Sciences, 10(5), 227-232.

White, R. L., & Snyder, L. H. (2007). Spatial constancy and the brain: insights from neural networks. Philosophical Transactions of the Royal Society B-Biological Sciences, 362(1479), 375-382.