5 Operant conditioning: negative reinforcement and punishment

Unfortunate as it may be, motivating stimuli are not universally enjoyable. Living under the influence of constant social approval, exerting ourselves only to achieve internally-rewarding feedback or attractive conditions in the environment, would be all very well. As it is, most of us spend some of the time working to wipe out debts, putting on extra clothes to stop feeling cold, moving the car to avoid getting a parking ticket, trying to get to meetings on time to avoid social disapproval, drinking to stave off anxiety and so on. In cases like these we are trying to minimize contact with aversive stimuli, rather than to achieve positive goals. Many aversive situations can be construed as a lack of a positive factor: being cold is the opposite of being warm, social rejection is the other side of the coin to social approval and so on. But unpleasant or frightening stimuli usually have some distinctive properties of their own, with particular behavioural repercussions. When cold we may shiver and when overheated sweat and pant; these extremes of discomfort precipitate actions directed towards escaping from the extremes rather than towards bringing about maximum comfort. Anxiety is more than the lack of happiness, and is clearly more than the lack of physical pleasure.

Negative reinforcers must therefore be considered as separate stimuli and motivators in their own right, not just as the absence of positive reinforcers. A negative reinforcer is defined as a stimulus which we would struggle to get away from, and escaping from a negative reinforcer is classified as negative reinforcement.

63

Losing rewards and receiving pain are punishers in this scheme. There seems to be a strong natural tendency to think of punishment as being included under negative reinforcement, but the distinction between negative reinforcement and punishment is necessary. Negative reinforcement fosters the target response as a means of escape, whereas punishment as a rule deters or suppresses response.

Forms of learning with aversive stimuli

Behaviours are influenced by consequences, and this holds true by and large when the consequences are unpleasant. We normally lose enthusiasm for activities which bring about disasters and gain some facility for items of behaviour which alleviate pain or distress. There are, however, a number of special factors which enter into learning motivated by negative reinforcers. The disturbing emotional effects of aversive stimuli are susceptible to classical conditioning and this can alter the nature of the operant learning (see p. 66). There are side effects from the use of severely unpleasant reinforcers which bring stress, aggression and defensive reactions into the picture. Certain types of learning can be observed with few side-effects if mild negative reinforcers are employed, and these are discussed first.

Escape learning

Escape from confinement was one of the types of learning studied by Thorndike in 1898 (see Ch. 1). Putting cats inside a box proved sufficiently motivating to induce the gradual learning of a method of operating a latch which held the door. Since Thorndike many experimenters have employed electrical stimulation as a negative reinforcer for laboratory animals. This is most frequently delivered through metal bars on which the animals are standing, at levels which elicit escape movements but are not physically harmful. By this use of mild electric shocks, rats may be trained to press levers, or run through mazes, in much the same way as they can by using food as a positive reinforcer. Shaping can be used if the current is turned on every thirty seconds or so and turned off again when the animal makes an approximation to the desired response. Behaviour produced by such escape learning is often stereotyped, with less respite from the conditioned responses for exploration

64

or grooming. For instance, when rats have to learn to press down a lever to escape from shock they tend to hold the lever down for long periods after the negative reinforcement of the shock going off. This by-product (rigidity in behaviour) is less pronounced however when weaker aversive circumstances, like loud noise or a draught of cold air, are the stimuli being turned off.

Avoidance learning

If animals are being trained to learn a T-maze (see p. 57) by negative reinforcement it is not necessary for the maze to be a continuously awful environment. Occasional electric shocks delivered in the body of the maze will be quite sufficient to motivate the animal to find a `safe' compartment. In fact the avoidance of dangerous places is one of the strongest sequels to negative reinforcement. Extremely rapid learning can be observed, often as a result of a single experience, when painful stimuli are associated with a particular location. In the `stepdown test', for instance, a mouse is placed on a small `safe' platform, above an electrified floor. Placing one foot on the electrified floor is usually enough to prevent any further moves from the platform for an extended period, which provides a crude measure of the animal's memory of the shock. It is obviously a good behavioural principle, on evolutionary grounds, for animals to avoid signs of danger and unfamiliar situations which are associated with pain (see p. 93). Very rapid learning is observed also when it is a matter of `getting out' rather than `staying put'. If rats are allowed to explore an enclosure for a time and are then shocked, they learn very quickly to jump up on to a `safe' platform.

A special apparatus exploits the `getting out' reaction to negative reinforcers by requiring animals to shuttle back and forth over a barrier, set at their shoulder height, between two compartments. Neither of the compartments is permanently safe, but only one is electrified at a time. The problem for the animal is therefore to find whichever side is safe. The usual procedure is to give a warning signal whenever the electrified side is to be changed, so that the animal can always avoid being shocked if he switches sides when he hears the warning signal. Most animals learn to avoid at least eight out of ten of the shocks by getting over the barrier in time, but the speed of learning is affected by many details of the procedure, and by the prior experience of the subjects. The idea is that emotional reactions

65

to the painful stimulus are conditioned to the warning signal in accordance with the usual results of classical conditioning (p. 36). The anticipatory emotional reaction of fear or anxiety then serves as a negative reinforcer for any responses which reduce fear. This two-factor theory postulates that animals first learn to be afraid and then learn a response to reduce the fear (Mowrer, 1960). In a demonstration of the two-factor theory Miller (1948) trained rats initially to run out of a white room, through a small door into a black room (of appropriate size), by giving shocks in the white room. After this pre-training, the door was closed, and could only be opened by the rats turning a wheel. Although no more shocks were given, the residual `aversiveness' of the white room was sufficient for the rats quickly to learn to turn the wheel so that they could run through into the `safe' room, thus relieving their anxiety.

Making sure prevents finding out. Miller's result is reminiscent of Tolman's experiments with food rewards, where positive incentive was attached to getting to a `good' place where food was usually found, and rats would circumvent or climb over any new barriers along the route to the `good' place. The `conditioned negative reinforcer' of getting out of the white compartment into the black one is almost identical to the `conditioned positive reinforcer' of getting to a black compartment that usually has food in it (see Ch. 7). But there is one crucial difference. If we try to persuade rats to learn new responses in order to get into a black box which used to have food in it, they will very soon stop bothering because they will learn that the black box no longer contains food. But if they are successfully escaping from a white room which used to be dangerous, they may go on indefinitely, because having escaped from the white room they cannot find out that it is no longer dangerous. This is a reason why actions motivated by conditioned fear or anxiety should persist longer than those influenced by positive incentives. Experiments with dogs and human subjects have shown that when strong stimuli are used to establish an avoidance response the response may continue indefinitely if it is not physically prevented (Solomon and Wynne, 1953; Turner and Solomon, 1962). If subjects are concerned enough to make sure that they avoid an anxiety-provoking situation, it will be very difficult to find out if the situation has changed. This is part of the learning theory analysis

66

of the importance of anxiety in neurosis. If someone has a phobia about leaving the house in case something terrible happens, they may not leave the house for years, and this will simply make going out more frightening. The theory is that by going out without something terrible happening, or even imagining going out without anxiety, the conditioned avoidance response can be diminished (see Chs 2 and 3).
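The asymmetry described under `Making sure prevents finding out' can be made concrete with a toy simulation. The sketch below is purely illustrative and is not taken from any of the experiments cited: the function name, the learning rate, the threshold and the simple error-correcting update are all assumptions chosen for the example. It shows only the logical point that an expectation can be revised when the outcome is sampled, and cannot be revised when the response itself prevents sampling.

```python
def responses_before_giving_up(kind, learning_rate=0.3, threshold=0.1, max_trials=200):
    """Count responses made after the original outcome has been withdrawn.

    kind: "reward"    - the response leads into a box that used to hold food
          "avoidance" - the response leads out of a room that used to be dangerous
    All numerical values are hypothetical and chosen only for illustration.
    """
    # Learned expectation of the original outcome (food in the box, or shock
    # in the white room), on an arbitrary scale from 0 to 1.
    expectation = 1.0
    for trial in range(1, max_trials + 1):
        if expectation < threshold:
            return trial - 1  # the animal has stopped bothering
        if kind == "reward":
            # Entering the box reveals that the food is missing, so the
            # expectation is corrected towards zero on every trial.
            expectation += learning_rate * (0.0 - expectation)
        elif kind == "avoidance":
            # Leaving the white room means the animal never observes that
            # shocks have stopped, so there is nothing to correct.
            pass
        else:
            raise ValueError("kind must be 'reward' or 'avoidance'")
    return max_trials  # still responding when the simulation ends


print(responses_before_giving_up("reward"))
print(responses_before_giving_up("avoidance"))
```

With these assumed values the rewarded response dies out within a few trials, while the avoidance response continues for as long as the simulation runs. Keeping the animal in the white room would amount to letting the expectation be corrected in the avoidance case as well.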

Avoidance without fear. Solomon and Wynne trained their dogs to jump out of a compartment by giving a few strong shocks, and found that the dogs continued to jump out of the compartment for hundreds of trials without further shocks. There is little doubt that a state of conditioned anxiety elicited by the situation was responsible for this behaviour. Feelings of anxiety produced by car accidents, muggings, war experiences or childhood traumas may also persist for indefinite periods. But is it necessary that all avoidance behaviours should be accompanied by intense anxiety? Is it possible to make `rational' avoidance responses, like taking an umbrella to avoid getting wet, without any worrying to motivate the response? A certain amount of anxiety, or at least some incentive to avoid unpleasant outcomes, is probably a good thing to have, in as much as it helps to ensure that we catch trains on time and so on. But if we found that, on the whole, carrying an umbrella was a good thing because we were likely to be more comfortable with it than without it, this might be an adequate background for learning to carry an umbrella, without specific anxieties being involved. Herrnstein (1969) has pointed out that it is at least theoretically possible for avoidance responses to be learned on the basis of their useful consequences, without any preparatory anxiety, and has designed some ingenious experiments in which rats are prevented from learning when to be anxious, but yet still make avoidance responses.

The basis for the `avoidance without fear' argument of Herrnstein is avoidance training where there is no warning signal to announce the aversive stimulus. The standard laboratory technique of this kind is the free-operant avoidance procedure introduced by Sidman (1953). In the Skinner box apparatus, very brief shocks are delivered at standard intervals, for instance once every ten seconds. The avoidance response, in this case pressing down the lever, delays the next shock for a period, for instance of eight seconds. The shock can be repeatedly

67

delayed, and the subject therefore has the opportunity of postponing shock indefinitely if it presses the lever at least once every eight seconds. If there is no response, shocks continue at intervals. This procedure has been widely used to assess avoidance learning abilities in physiological work, and can be effective with a range of time values. The shock-shock interval can be exactly the same as the response-shock interval, or can differ in either direction. The two-factor theory explains this in terms of cycles of anxiety, building up as time passes, being reduced by the responses of lever pressing. Herrnstein, however, suggests that the outstanding characteristic of the free-operant avoidance procedure is simply that subjects are better off if they make avoidance responses than if they don't, and this in itself may be a reasonable cause of the behaviour. Perhaps mild negative reinforcers can support routine behaviours without conditioned waves of fear even though stronger stimuli could result in anxiety or stress.
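The timing rules of the free-operant avoidance procedure are easy to lose track of in prose, so a minimal sketch may help. It assumes the example values given in the text (a shock-shock interval of ten seconds and a response-shock interval of eight seconds); the function name, the session length and the treatment of each response as simply resetting the shock timer are assumptions made for illustration, not a description of any particular apparatus.

```python
def shocks_delivered(press_times, shock_shock=10.0, response_shock=8.0, duration=60.0):
    """Return the times (in seconds) at which shocks occur in one session.

    press_times: times of lever presses, in seconds from the start.
    Without responses, shocks arrive every `shock_shock` seconds.
    Each press sets the next shock to `response_shock` seconds after the press.
    """
    presses = sorted(t for t in press_times if 0 <= t < duration)
    shocks = []
    next_shock = shock_shock  # first shock is due one shock-shock interval in
    i = 0
    while True:
        if i < len(presses) and presses[i] < next_shock:
            # A press before the scheduled shock restarts the timer.
            next_shock = presses[i] + response_shock
            i += 1
            continue
        if next_shock >= duration:
            break  # session over before the next shock falls due
        shocks.append(next_shock)
        next_shock += shock_shock  # no response: shocks continue at intervals
    return shocks


print(shocks_delivered([]))                # a subject that never responds
print(shocks_delivered(range(0, 60, 7)))   # a press every seven seconds
```

A subject that never presses receives a shock every ten seconds, whereas one that presses at least once every eight seconds receives none, which is the sense in which Herrnstein describes subjects as simply being better off if they respond.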

The distracting effects of anxiety

One way of measuring the force of anxiety is to look for the disruption of customary behaviours. A standard method of doing this is called the conditional emotional response (CER) procedure. A reliable baseline of behaviour is developed and this is used as the background for assessing the distracting effects of a warning signal. Laboratory animals are kept busy responding on a variable interval schedule for food reinforcement (p. 83). At irregular intervals a signal (a buzzer, say) comes on for about a minute, and is at first ignored. This is evident from the fact that the animal carries on its operant task, perhaps finding and eating food while the signal is present. But if the warning signal is used as a precursor for an electric shock, it quickly becomes a disruptive stimulus; the animal may cease its normal work altogether for the duration of the signal, even though this means missing possible food deliveries. Clearly, although working at the food reinforced task is not punished - the animal may go on working without making things worse - anticipation of the aversive stimulus has suppressed the ongoing behaviour. The incipient shock `puts off' the subject, and this conditioned suppression may still happen after prolonged experience of the schedule. The off-putting effect is a very good index of the association between the signal and the aversive stimulus.

68

Punishment

In view of the desire of everybody to escape from unpleasant situations and associated signals, there have always been reservations and misgivings about inflicting pain or distress on other people, or indeed other animals, as a means of education or reform. Decline in culturally-sanctioned use of corporal punishment can be taken as a dimension of social advance. However fines, imprisonment, threat, insult and rejection are still an intrinsic part of legality and, to a lesser extent, education. The liberal attitude is that using punishment to suppress socially undesirable behaviours is immoral, and also produces undesirable side-effects of stress and aggression. Skinner (1953) and many others have hoped that positive reinforcement, and other means of encouraging good behaviours, could replace the social use of punishment altogether.

For the purposes of experiment punishment can be defined as the reduction of certain behaviour by means of contingent events. Punishment can be looked at as the dark side of positive reinforcement: if responses bring about good stimuli they flourish, but responses that bring about bad stimuli dwindle away. This was the version of mechanical hedonism expressed by Thorndike's original Law of Effect (Ch. 1), in which gain or gratification stamped in causal behaviours while annoyance or loss stamped them out again. Theoretical confusion was introduced by the failure of some weak punishments used by Thorndike and Skinner to produce much dwindling away of persistent behaviours. In Thorndike's case he found that saying `wrong' after students had given a mistranslation did little to prevent them making further mistakes. Skinner was overimpressed by the short-lived nature of response suppression which resulted from slap-back movements of the bar in his conditioning apparatus. It is now perfectly clear that the discouragement of actions by contingent punishment is if anything a stronger influence on behaviour than the incentive effects of positive reinforcers, as far as laboratory work with animals is concerned (Campbell and Church, 1969). The problem with punishment is not that it is always ineffective, but that it has the unpleasant side-effects of stress, anxiety, withdrawal and aggression in the subject.

The basic suppressive effect of punishing stimuli is illustrated in Figure 5.1. Rats with previous experience of pressing a lever on a fixed interval schedule of reinforcement were given electric

69

shocks every time they pressed the lever, for a fifteen-minute period only. Then they were left to respond freely, being given neither food nor shocks. Different groups of these rats had been given different levels of shocks and the rate of response following punishment depended on the voltage. A low voltage shock did not slow down responding appreciably when it was delivered, but reduced the total number of responses given in the next nine daily hours of extinction testing. A high voltage shock stopped response in most of the subjects almost entirely while it was being delivered and for the remaining nine days of the experiment.

Fig. 5.1 Effect of punishment on extinction. (After Boe and Church, 1967: see text for explanation)

This indicates a return to the original conception of reward and punishment as response consequences which have opposite effects on behaviour. Reinforcement strengthens and encourages while punishment on the whole weakens and discourages. Specialized mechanisms and side-effects apply unequally to the reinforcers and punishers, but in many circumstances actions represent a balance between attraction and reluctance built up by favourable and unfavourable outcomes (Mackintosh, 1974). Few responses have universally good consequences, and

70

compensations may be found in apparently unrelieved gloom. Not many behaviours are therefore without some degree of what Miller (1944) called approach-avoidance conflict. Animals drawn towards a goal by food may also be repelled by previous punishments received at the goal, and in such cases may be seen to approach and then withdraw from the goal alternately. Given a choice between two goals, each of which represents a mixture of good and bad consequences, it is reasonable to choose the best mixture. The effects of various mixtures of food reward and electric shocks on choices made by rats have been studied by Logan (1969). The animals had equal lengths of experience with certain rewards and punishments given for running down white or black alleys before having the opportunity of choosing between the two alternative alleys. If given seven pellets of food for the white choice against one pellet for the black, they had a strong preference for the larger amount of food, which was only partly outweighed even by a strong shock. An initially strong preference was also evident for the immediate receipt of three pellets as opposed to an identical amount of food delayed for twelve seconds; however, this preference was more readily counteracted by having to run across the electrified grid before reaching the food, and was reversed when the punishment was at a high intensity. Logan interprets this result to mean that some effects of punishment are symmetrically opposite to those of reward, punishment being said to supply negative incentive which can cancel out the positive incentive due to reinforcement.
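The idea that negative incentive cancels positive incentive can be illustrated with some simple arithmetic. The sketch below is not Logan's model: the delay-discounting rule, the constant in it and the numbers assigned to `weak' and `strong' shock are all hypothetical, chosen only so that the qualitative pattern described above can be followed through. Only the pellet counts and the twelve-second delay come from the text.

```python
# Hypothetical negative incentive values for two shock intensities.
WEAK_SHOCK = 1.0
STRONG_SHOCK = 3.0


def net_incentive(pellets, delay_s=0.0, shock=0.0, delay_constant=0.1):
    """Positive incentive from food minus negative incentive from shock.

    The food incentive is assumed to shrink with delay; this particular
    discounting rule is an illustrative assumption.
    """
    positive = pellets / (1.0 + delay_constant * delay_s)
    return positive - shock


def preferred(alley_a, alley_b):
    """Return 'A' or 'B' according to which alley has the larger net incentive."""
    return "A" if net_incentive(**alley_a) > net_incentive(**alley_b) else "B"


# Seven pellets plus strong shock (A) against one pellet and no shock (B).
print(preferred({"pellets": 7, "shock": STRONG_SHOCK}, {"pellets": 1}))

# Three immediate pellets plus shock (A) against three pellets delayed 12 s (B).
print(preferred({"pellets": 3, "shock": WEAK_SHOCK}, {"pellets": 3, "delay_s": 12}))
print(preferred({"pellets": 3, "shock": STRONG_SHOCK}, {"pellets": 3, "delay_s": 12}))
```

Under these assumed values the large difference in amount survives a strong shock, while the smaller advantage of immediacy survives a weak shock but is reversed by a strong one, which is the pattern the text reports.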

From this point of view, neither the effects of punishment, nor those of reinforcement, can be judged without reference to the other. This fits in with many other results. For instance, the conditioned suppression effects mentioned in the last section depend very much on the degree of positive incentive: suppression is greatest when it does not seriously interfere with the gaining of positive reinforcers.

Furthermore, the effects of punishment are much more pronounced if they are combined with reinforcement, instead of being in opposition to it. One way to do this is to punish response A, at the same time giving plenty of positive reinforcement for response B, where making response B makes it impossible to perform response A as well (Azrin and Holz, 1966).

71

Natural defensive reactions to aversive stimuli

A great focus of attention in recent years has been the degree to which species-specific behaviours change the effects of learning and reinforcement. `Species-specific' is a noncommittal way of discussing factors which might be instinctive or innate. Analysis of naturally-occurring behaviour patterns has made great strides under the influence of ethologists such as Lorenz and Tinbergen and there is a need to integrate ethological results with laboratory findings (Hinde, 1970). It has long been thought that the natural reactions of animals to aversive stimuli, such as running away, panic or `freezing' in a rigid posture, could facilitate or interfere with learning processes, and these effects are now being studied in more detail (Bolles, 1970). Where it is possible to get away from a source of unpleasantness, most animals have an in-built tendency to do so, by running, leaping, flying or any other method at their disposal. It is suggested that the presence of aversive stimuli, whether as negative reinforcers in escape and avoidance learning, or as punishers, will predispose the animal to these natural reactions. If the subject is required to perform a response which is compatible with its first impulses, learning will be facilitated, but if a learning task involves responses which are incompatible with natural reactions, the task will be extremely difficult. An extreme case of this occurs with the much-studied button-pecking response of pigeons. Although pecking is maintained under almost all conditions where food has been present in the immediate environment, and there is an identifiable visual cue, it is virtually impossible to persuade a pigeon to peck a key to avoid an electric shock (Herrnstein and Loveland, 1972). It can be done, but only with shocks of critical intensity, which motivate the response without inducing competing behaviours.

Running or jumping to get away from the location of shocks is on the other hand a very easily-learned response. Although it is quickly learned, it may be difficult to alter by punishment once it has been established. Rats trained to run away from shock in a start box down an electrified alley to a `safe' box will continue to run over the electrified grid, even if the starting box is made `safe', and even if the response has once been allowed to die out when the entire maze has been made `safe'. Dogs trained to jump over a hurdle to escape shocks may be unable to alter this pattern when they are jumping from a safe compartment over to a live one. Natural responses, once learned, are very persistent.

72

Apart from merely running away, hiding or returning to familiar quarters are common responses to dangerous or aversive situations, but have rarely been investigated. A third alternative is the `freezing' response. This is often part of the response pattern in the confined space of the operant-conditioning chamber, and sometimes contributes to the conditioned suppression effect.

Stimulus compatibility. The conditioning of emotional reactions to new stimuli is usually studied with rather artificial cues such as buzzers and flashing lights being associated with electric shocks. There are some natural combinations which allow conditioning to occur more rapidly. For rats, tastes go with sickness so that an animal given an emetic drug will tend to lose interest in the last food tasted before intestinal upset began, even if the last meal was quite some time before the illness (up to several hours). On the other hand, tastes do not go with external factors, and so the external unpleasantness of electric shocks is connected with the place where it happened, rather than with what was being eaten at the time. It is not clear whether this effect is due to previous experience that internal signals go with internal consequences while external signals go with external events, or whether there is some 'wired-in' preference for certain kinds of association. There certainly seem to be strong natural aversions to such things as spiders and snakes; and it has been pointed out that many more people have phobias for such objects as these than have phobias for things like furry white lambs. We should not forget, though, that even painful shocks can become attractive if they are associated often enough with positive reinforcers and pleasant stimuli can become aversive after being paired with fear: initial preferences can often be overcome by learning (Pavlov, 1927).

Fighting back. The best form of defence may be attack, especially if it is too late to retreat. It is well known that animals can be most dangerous when they are frightened or cornered, and that people are often more violent when they are depressed, and more irritable when they are anxious. Aggression and fear are entwined physiologically as the `fight or flight' mode of the autonomic nervous system and in the central control of emotion by the brain (A2). It is not surprising that first reactions to disappointment

73

or fear may be irrational aggression or hate, whether in the case of hostility towards innocent persons after dangerous incidents in motor cars or in the ancient practice of killing the messenger who brought bad news.

The simplest stimulus for aggression may be pain. Animals will substitute attack for escape if another animal is present when shocks are given. Even inappropriate objects like stuffed dolls and rubber balls may elicit threat postures or be scratched and bitten if the floor of an animal's cage is suddenly electrified. After being shocked, a pigeon or rat will move across the cage to attack another animal, and monkeys have been trained to press levers so that a ball is presented for them to bite. Being shocked, therefore, can be said to induce a mood of aggression, and in that mood animals do go to some trouble to engage in attack (Hinde, 1970).

Other aversive situations may also induce aggression. One of the first behavioural categories discussed by Pavlov is the `freedom reflex' by which he meant the struggling and fighting which frequently accompanied the first attempts to restrain the movements of a dog by a harness. Some process of taming is needed before most animals will put up with physical confinement or handling without showing aggression. Loss of freedom of movement because of physical restraint may be paralleled as a source of resentment and aggression by intensive social restraints in human institutions.

Being told what to do, or being prevented from doing things by bureaucratic restrictions or cultural taboos, can perhaps be classified as a form of frustration. Frustration has sometimes been defined as the prevention of a highly-motivated response, though more recently it has been interpreted as the absence or loss of rewards (Amsel, 1972). We regard it as self-evident that if trains are repeatedly cancelled, it is more likely that normally impassive bowler-hatted commuters will show violence towards railway staff, even if the personnel at hand have no responsibility for the cancellations. Somehow the inconvenience and aversiveness of waiting, or the loss of the routine reinforcers of catching a train after getting to the station, brings about aggression, whether or not it is `justified'. Such aggression does not always take the form of violence. Vocal abuse and postural threats are frequent preliminaries to, or substitutes for fighting in human and animal confrontations and even more diverse expressions of aggression are possible. In one experiment designed

74

to induce frustration heavy smokers were kept up all night, not being allowed to sleep, smoke, read or otherwise amuse themselves. They were led to expect an early morning meal which was cancelled at the last moment. As predicted, this produced decidedly uncomplimentary comments directed at the psychologists running the experiment. Further aggression was shown in later paper and pencil tests when a subject produced grotesque drawings of dismembered and disembowelled human bodies, said to represent `psychologists'. The expressions of hostility engendered by disappointment and frustration are more limited in laboratory animals, but the emotional effects of not getting the usual food reinforcer can be strong enough to induce prolonged attacks on another animal of the same species (e.g. Azrin et al., 1966).

It is extremely important to consider the influence of aversive stimuli on aggression as one of the adverse effects of negative reinforcers. But reaction to aversive conditions is only one of the many factors which influence aggression. Competition and aggression are closely interlocked with social dominance, sexual behaviour and defence of home territories in many animal species (Lorenz, 1966). For us, social and cultural conditioning, especially by imitation learning (Ch. 9) is the overriding monitor of individual aggressiveness (Bandura, 1973).

Aversive stimuli may damage your health

It is possible to worry a good deal without getting ulcers, but worry and anxiety, or more generally stress, may be responsible not only for ulcers but also for proneness to heart attacks, lowered resistance to diseases and other health risks, besides endangering mental stability. Some of the conditions which cause this kind of stress may be very complicated, but there is little doubt that intense aversive stimuli can contribute to it. Recent theories and experiments have concentrated on the subsidiary psychological circumstances that might make a certain amount of physical pain more or less `stressful'. The degree of stress is gauged by bodily changes such as loss of weight or ulceration of the stomach, and psychological changes such as inability to learn new problems. Generally, it appears that uncertainty and conflict compound the stressful effects of receiving electric shocks, but the degree of stress is considerably relieved by the experience of being able to `cope' with the aversive stimuli by successful escape or avoidance responses (Weiss, 1971).

75

Other factors such as prolonged confinement and lack of rest may also exacerbate stress. The best-known example is the `executive monkey' experiment (Brady, 1958). Monkeys were paired together on a free-operant avoidance schedule, so that only one monkey, the `executive', could make avoidance responses, but both monkeys received the same shocks. In fact, the executive monkeys worked so hard, making hundreds of responses per hour, that shocks were very infrequent, only about one per hour. The shift-work system was unusually demanding: six hours on then six hours off, all day every day. A few weeks of this produced severe ulceration of the stomach for the `executives' but not for the other monkeys. However, with less rigorous work schedules, being the `executive' may often seem less stressful; it is usually less upsetting to be the driver than the passenger in a badly-driven fast car. Certainly it seems, in experiments with rats, that ability to take evasive action moderates stress. With a different schedule from the `executive monkeys', `executive rats', whose actions determined the shocks received by them and a second animal, had more normal weight gains and far fewer ulcers than their passive partners. In these cases the `executives' did not work excessively, and the passive animals received quite a few shocks. The unpredictability of shocks may increase stress for the passive animal, while achievement of some `safe' periods reduces stress for the executive.

The unpredictability of unpleasant events may not only produce more stress, but may diminish future capacity to deal with more orderly and avoidable aversive stimuli. It has been proposed that exposure to random shocks, which an animal can do nothing to escape or avoid, leads to a state of learned helplessness which prevents further efforts to learn in related situations (Maier et al., 1969). There certainly seem to be cases of `giving up' in experiments where dogs are given inescapable shocks before training in a shuttle-box task. Although this task is usually learned very rapidly (p. 65), complete failure to learn can be found in dogs with previous experience of enforced failure.

Thus, although the picture is far from clear, the intensive use of negative reinforcers and punishing stimuli can be accompanied by risks to physical and emotional health, especially if combined with uncertainty and conflict.

Therapeutic use of aversive stimuli

It may seem odd, in view of the damaging effects of aversive
stimuli discussed in the last two sections, that unpleasant stimuli could ever be employed therapeutically. There are two reasons why therapeutic uses for negative reinforcement and punishment can be found. First, there may be problems such as alcoholism or self-injury which it is felt are serious enough to outweigh the disadvantages of aversive stimulation. Second, not all unpleasant experiences are harmful, and a certain amount of stress may even be found invigorating or interesting, without leading to emotional disaster. A modicum of stress in early life may actually be a prerequisite to later psychological adjustment. In any event, mild negative reinforcers need not always be damaging, and can sometimes be used to bring long-term benefits.

Time-outs to reduce problem behaviours. A standard form of mild punishment which has been used to deal with several kinds of disruptive behaviour in children is termed a time-out. The essential part of this is to remove the child from any possible social reinforcement for problem behaviours such as temper tantrums. Usually the time-out consists of five or ten minutes of social isolation. As a consistent consequence of the offending behaviour, the child is placed alone in his bedroom, or possibly in a room reserved for time-out purposes, and stays there for the minimum period, or for longer if the problem behaviour continues. It has been reported that this technique diminishes otherwise intractable behaviour problems, with no undesirable side-effects. An early case was that of Dicky, who after having had serious eye operations when he was two years old was hospitalized as a childhood schizophrenic when only three (Wolf et al., 1964). A critical problem was that he needed to wear spectacles to safeguard his sight, but did not do so. He was shaped up to wear glasses with food reinforcers (Ch. 3) but developed a habit of taking them off and throwing them across the room about twice a day. To suppress this behaviour, Dicky was simply put in his room for ten minutes if he threw his glasses, and not allowed out if he threw a tantrum. He stopped throwing his glasses after five days. The same time-out procedure was also used initially to reduce the frequency of Dicky's tantrums, which included self-destructive behaviours such as head-banging and face-scratching.

Time-outs in the form of removal from the dining room were used to eliminate food-throwing and food-stealing. After these
and some other training methods for improving bedtime behaviour and socially appropriate speech, performed with the cooperation of the parents, Dicky was able to return home and showed continued progress six months later.

Time-outs may not be very strong aversive stimuli; their effect may be due to loss of positive reinforcers. But there is no hard and fast distinction between situations which are aversive because they signify loss of social rewards and situations that are lonely or unpleasant in themselves. It has been found, though, that social isolation procedures can be very much more effective than the use of differential attention by parents. Wahler (1969) described the treatment of children whose parents had sought psychological help because the children were `oppositional'. That meant they were unresponsive to parental requests or demands and would have tantrums, refuse to go to bed, jump on furniture and so on. Even when the parents were successfully trained to ignore `bad' behaviours completely but reinforce `good' behaviours immediately by giving attention and approval, the behaviour of the children did not improve. The parents were then instructed and supervised in the use of the time-out technique of isolating children in their bedrooms immediately after oppositional behaviours, but giving approval for cooperative activities. This produced dramatic and sustained changes in the children's behaviour.

More severe treatments. Traditional methods of `making the punishment fit the crime' have recently been included in a complex package of techniques applied to persistent behavioural problems such as self-stimulation in autistic children and bedwetting as well as daytime incontinence in retarded or normal children. Abundant positive reinforcement by social encouragement and food rewards for desirable substitute activities accompanies adverse consequences for target behaviours. For instance, programmes for the successful retraining of incontinent adults or children have included some self-correction of accidents - the subject has to wash his own clothes or make his own bed (Azrin et al., 1974).

The most controversial application of punishment and negative reinforcement procedures is the adoption of electric shocks as a stimulus for the training of retarded or autistic children. These are cases where severe treatments are brought into play because there are even more severe behavioural difficulties. If
children have to be kept in strait-jackets because they are likely to break their bones or cause dangerous wounds by self-injuring responses, the use of electric shocks to punish self-injury may arguably be less cruel than might first be thought (Lovaas and Simmons, 1969). It has also been found that autistic children who are completely withdrawn and have virtually no social behaviours can be shaped to some basic social actions such as hugging by the negative reinforcement of escape from electric shocks (Lovaas et al., 1965).

Electrical stimulation and drug-induced nausea have both been used to induce conditioned emotional reactions during aversion therapy (Ch. 2).

Summary and conclusions

Operant conditioning can take the form of learning to get away from frightening or dangerous situations as well as getting closer to positive reinforcers. An act learned or strengthened because it removes or prevents disagreeable sensations is said to be negatively reinforced. A response suppressed or weakened when it is followed by pain or loss of reward is said to be punished. It is necessary to take into account conditioned emotional reactions to stimuli associated with aversive events, since these may influence responses not directly reinforced or punished. Other reactions, such as running away from dangerous situations, or acting aggressively towards other individuals present, may also take place. Strong aversive stimuli which are unpredictable, or cause conflict, contribute to stress. Despite these many side effects, negative reinforcement and punishment have occasional therapeutic applications.
