
POSITIVE REINFORCEMENT DOG TRAINING

 

by Robert Loftus

 

Introduction

Dog training has been around for as long as dogs have been domesticated, but it is only in the last hundred years or so that structured training programs were developed, the most prominent being that outlined in Colonel Konrad Most’s manual "Training Dogs", published in 1910. Many of the methods used in these early training programs are still in use today and successfully train many dogs and owners, but many more dogs and owners fail to adapt to these methods. In recent years, enlightened dog trainers have turned away from physical and harsh training methods and are adopting a more humane and fairer approach to training dogs. These methods are based upon scientifically proven psychological processes of how animals learn.

The methods used to train dogs should depend upon a firm foundation of practical experience and an understanding of how dogs learn using "classical" and "operant" learning. In addition to these, the dog training class instructor may also use other forms of learning to help the dog’s owner learn how to teach their dog, namely "observational" learning (demonstrating) and "cognitive" learning (printed instructions).

 

The relative importance of different forms of learning:

Type of learning    Dog         Human
Classical           High        High
Operant             High        High
Cognitive           Very Low    High
Observational       Low         High

     

Classical learning is responsible for most forms of emotional conditioning (how we feel about something), and it can override other forms of learning. This fact is especially important to remember when dealing with problem behaviour that is driven by negative emotional states in the dog, such as fear, anger and frustration.

 

Useful Definitions

 

Stimulus:

Any event in the environment that provokes a behavioural response in the dog (a sound, a movement, a smell, food).

 

Behaviour:

Behaviour is a series of learned or innate responses to environmental stimuli.

 

Stress:

A physical and psychological response caused by the inability to cope with the environment.

 

Classical Learning:

The forming of an association between two stimuli. It is concerned mainly with involuntary or reflexive behaviour (eye-blink, salivation). Many of our emotions also respond to classical learning.

 

Operant Learning:

Learning in which the likelihood of a behaviour is increased, decreased or suppressed by the consequences that follow it.

If the consequence is:

Reinforcing – Behaviour will increase.

Non-Reinforcing – Behaviour will decrease (extinction).

Punishing – Behaviour may be suppressed.

 

Cognitive Learning:

Learning by insight, processing new and learned information.

(Reading a map, reading a book, solving a problem)

 

Observational Learning:

Learning by observation or imitation. This may also be a form of Cognitive Learning. (Observing the demonstration of a process).

 

Positive Reinforcement Training:

Training in which we establish positive physiological and positive psychological conditions for learning to occur.

 

Reinforcer:

A reinforcer is anything following a behaviour that increases that behaviour and is contingent upon it.

 

Reward:

Something the dog wants that is given, but that is not necessarily contingent upon the performance of a behaviour.

 

Primary Reinforcers:

These are reinforcers that are inherently rewarding.

Examples: Eating food, drinking, sleep, sex.

 

Conditioned Reinforcers:

These are reinforcers that are not inherently rewarding, but receive rewarding properties through a conditioned association with a primary reinforcer. Example: The word "Yes". The "click" sound of a clicker.

 

Positive Reinforcement:

Adding a reinforcer in order to increase the likelihood of a wanted behaviour. Example: Giving a food treat after a wanted behaviour.

 

Negative Reinforcement:

Removing an unwanted or aversive stimulus in order to increase a wanted behaviour. Example: Removing the dog from a fearful or stressful situation when the dog is focused on you.

 

Negative Punishment:

Removing a reinforcer in order to suppress unwanted behaviour.

Example: Removing yourself from the dog’s environment when the dog behaves hyperactively.

 

Positive Punishment:

Adding an aversive in order to suppress unwanted behaviour.

Example: Checking the dog with a choke chain, which causes pain.

 

Note: The use of punishment in dog training can damage or destroy the relationship, mutual respect and trust between the dog and trainer. The consequences are not predictable: punishment may increase the level of fear and stress, and may also cause fear-induced aggression or displacement behaviour. In positive reinforcement training we avoid the use of punishment to suppress unwanted behaviour; instead we focus on teaching and reinforcing the behaviour we want from our dog.

 

 

THE OPERANT MODEL (incl. Extinction)


REINFORCEMENT

POSITIVE REINFORCEMENT (+R): something wanted added

NEGATIVE REINFORCEMENT (-R): something unwanted taken away

Behaviour Increases


NON-REINFORCEMENT

Neutral Consequence, or No Consequence

Behaviour Decreases (Extinction)


PUNISHMENT

POSITIVE PUNISHMENT (+P): something unwanted given

NEGATIVE PUNISHMENT (-P): something wanted taken away

Behaviour Suppressed


 

When using operant learning, our dog’s behaviour is voluntary and our dog is operating on the environment to earn reinforcement.

The consequences for behaviour are defined by how the dog (the operant) sees them, not the trainer.
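
As a rough illustration only, the operant model above can be written as a small lookup in Python. The names Valence, Change and classify_consequence are our own invention for this sketch and do not come from any training standard. Note that "wanted" and "unwanted" must be judged from the dog's point of view, not the trainer's.

```python
from enum import Enum


class Valence(Enum):
    """How the dog (not the trainer) sees the consequence."""
    WANTED = "wanted"
    UNWANTED = "unwanted"
    NEUTRAL = "neutral"


class Change(Enum):
    """What happens to the consequence after the behaviour."""
    ADDED = "added"
    REMOVED = "removed"
    NONE = "none"


def classify_consequence(valence, change):
    """Return (label, expected effect on behaviour) for one consequence."""
    # A neutral consequence, or no consequence at all, is non-reinforcement.
    if change is Change.NONE or valence is Valence.NEUTRAL:
        return ("non-reinforcement", "decreases (extinction)")
    table = {
        (Valence.WANTED, Change.ADDED): ("+R", "increases"),
        (Valence.UNWANTED, Change.REMOVED): ("-R", "increases"),
        (Valence.UNWANTED, Change.ADDED): ("+P", "suppressed"),
        (Valence.WANTED, Change.REMOVED): ("-P", "suppressed"),
    }
    return table[(valence, change)]


# A food treat given after a sit: something wanted is added.
print(classify_consequence(Valence.WANTED, Change.ADDED))
# The trainer leaves the room when the dog is hyperactive: something wanted is removed.
print(classify_consequence(Valence.WANTED, Change.REMOVED))
```

The lookup makes one point of the model easy to see: the same action by the trainer can land in a different box depending on whether the dog wants it or not.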

 

 

 

THE CLASSICAL LEARNING MODEL

(Pavlovian conditioning).

The pairing of a stimulus with a response causes an association whereby the presentation of the stimulus will then elicit the involuntary response. Emotions can also be conditioned with classical conditioning.

 

STIMULUS (Food) → RESPONSE (Salivation)

STIMULUS (Bell) → STIMULUS (Food)

STIMULUS (Bell) → RESPONSE (Salivation)

 

Another example: a child sitting on the ground beside their mother…

 

STIMULUS (Spider on child’s arm) → RESPONSE (Mother screams in fear)

STIMULUS (Mother’s fear) → STIMULUS (Child’s fear)

STIMULUS (Spider) → RESPONSE (Child is fearful)

 

Fear as a conditioned emotional response can become generalised very easily to cause fear of all things crawling and flying, even known harmless ones.

 

Classical learning is also used to attach "cues" to wanted behaviour.

 

STIMULUS (Environmental cue to sit) → RESPONSE (Dog sits)

STIMULUS (Sound "sit") → STIMULUS (Environmental cue to sit)

STIMULUS (Sound "sit") → RESPONSE (Dog sits)

 

Note: The consequence is not part of classical learning, but both classical and operant learning can occur simultaneously.

 

SYSTEMATIC DESENSITISATION:

This is a classical conditioning process for changing a conditioned negative emotional response (usually fear) to a stimulus. The stimulus is first reduced to a level below the threshold of the fear response.

A new association is then formed (counter conditioning) with the fearful stimulus by pairing it with a stimulus that produces a positive emotional response (e.g., food).

 

 

FEAR STIMULUS → FEAR RESPONSE

Fear stimulus (weak) → stress response (weak)

PLEASURE STIMULUS (Food) → Fear stimulus (weak)

Fear stimulus (weak, now a pleasure stimulus) → PLEASURE RESPONSE

 

 

Repeat the process, gradually increasing the fearful stimulus while at the same time maintaining a pleasure response.
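
The repeat-and-increase step can be sketched as a simple loop. This is a toy illustration under our own assumptions, not a training protocol: desensitise, ToyDog, the integer "intensity" scale and the rule of dropping back one step when the dog stops being relaxed are all hypothetical. In real training the trainer reads the dog's body language in place of the is_relaxed check.

```python
def desensitise(start, target, step, is_relaxed, max_trials=100):
    """Raise stimulus intensity toward `target` only while the dog stays relaxed.

    Returns the list of intensities presented, one per trial.
    """
    intensity = start
    history = []
    for _ in range(max_trials):
        history.append(intensity)
        if is_relaxed(intensity):
            if intensity >= target:
                break  # pleasure response held at full intensity
            intensity = min(target, intensity + step)
        else:
            # Assumption: go back one step rather than push through fear.
            intensity = max(start, intensity - step)
    return history


class ToyDog:
    """Toy stand-in for a dog: relaxed below a fear threshold that rises
    by one unit after every relaxed trial paired with food (assumed)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def is_relaxed(self, intensity):
        relaxed = intensity < self.threshold
        if relaxed:
            self.threshold += 1
        return relaxed


history = desensitise(start=1, target=5, step=2,
                      is_relaxed=ToyDog(threshold=2).is_relaxed)
print(history)
```

With these toy numbers the intensity goes up, falls back twice when the dog is pushed too far, and then reaches the target, which is the pattern we expect when the steps are slightly too large for the dog.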

 

 

TRAINING PREPARATION (the Tools)

 

To use positive reinforcement we must decide what PRIMARY or unconditioned reinforcer we are going to use. The most practical to use would be food that the dog likes.

 

FOOD:

· Type of food: the dog must want it.

· Size of pieces of food: must be easy for the dog to eat.

· Handling suitability: easy to carry and use.

· Adjust the dog’s food intake to allow for treats.

 

Decide on what CONDITIONED REINFORCERS we are going to use. Before conditioning can take place there must be a NEUTRAL RESPONSE to them from the dog. The purpose in using a conditioned reinforcer is to be able to instantly "mark" a wanted behaviour as rewarding to our dog. This is a MARKER or COMMUNICATION with the dog that the dog can understand.

 

Use Classical Learning to make the association between the "Marker" and the Food.

 

MARKER → FOOD

 

Repeat a number of times until the marker is conditioned. This will be seen as an instant response from the dog: "where’s my treat?"

Every time you use a conditioned reinforcer you MUST follow it with a treat to maintain the conditioning.

 

Examples:

· The word (sound) "YES" can be conditioned to mark wanted behaviour; it must always be followed by a treat.

· The "CLICK" of a clicker can be conditioned to mark wanted behaviour very precisely; it must always be followed by a treat.

 

IMPORTANT POINTS TO REMEMBER:

When using a Conditioned Reinforcer we must pay great attention to the following.

 

· The TIMING OF MARKER must coincide with the wanted behaviour.

· The RATE OF REINFORCEMENT must be high to maintain interest and focus on learning the wanted behaviour.

· The CRITERIA must allow the dog to win at least 80% of the time.

SHAPING BEHAVIOUR

Most behaviour we see in our dog is in a constant state of change. The rate of change may vary from very small changes to large ones. When we use a marker to tell a dog which behaviour, or which part of a behavioural response, has earned reinforcement, the timing is critical: if we want to change the behaviour by small amounts, a clear signal must be given to the dog. Poor timing can lead to what is known as "lumping", which usually leads to unstable long-term performance of the finished behaviour.

 

To shape behaviour we must use very small changes in the criteria we set for reinforcement (to maintain the 80% rate). This is the key to successful shaping of behaviour. This shaping process is known as using SUCCESSIVE APPROXIMATIONS.

 

 

PRESENT BEHAVIOUR

Small change

        Small change

                     Small change

                                   Small change

                                                 Small change

                                                                Small change

                                                                              Small change

                                                                                             Small change

                                                                                                           Small change

                                                                                                                       TARGET BEHAVIOUR
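
The 80% rule and the small steps in the diagram above can be put together in a short sketch. The function names, the integer "criterion level" and the choice to ease back one step when the dog is winning less than 80% of the time are our own assumptions for illustration. The text only requires that the criteria let the dog win at least 80% of the time.

```python
def success_rate(outcomes):
    """Fraction of recent trials the dog 'won' (True = earned the marker)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def next_criterion(current, outcomes, step=1, target_rate=0.8, floor=0):
    """Suggest the criterion level for the next set of trials.

    Raise by one small step only if the dog is winning at least 80% of
    the time at the current level; otherwise ease back one step
    (an assumed rule, see the note above).
    """
    if success_rate(outcomes) >= target_rate:
        return current + step
    return max(floor, current - step)


# 8 wins out of 10 at level 3: ready for one more small change.
print(next_criterion(3, [True] * 8 + [False] * 2))
# 7 wins out of 10 at level 3: the step was too big, so make it easier.
print(next_criterion(3, [True] * 7 + [False] * 3))
```

Keeping the step at one level per decision is the code version of "very small changes": a large step is exactly what risks a drop below the 80% rate.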

 

 

When we reach our target behaviour we attach a "cue" to it, using classical learning to put the behaviour "on cue". There is now no further need to use the marker for this behaviour, but we should always reinforce wanted behaviour (shaped or otherwise). The value of the reinforcement may vary to reward excellence.

Example: one "yes" followed by lots of treats.

 

ATTACHING A "CUE" TO A BEHAVIOUR:

To form an association between a stimulus "cue" and a response we must first know that the behaviour is going to happen. We then introduce the "cue" immediately prior to, or during the early part of, the behaviour we want to put on "cue".

 

 

 

START TRAINING (Using the Tools)

By using positive reinforcement training you instantly enhance your relationship with your dog. No matter what previous experiences, good or bad, your dog has had, he will develop a positive relationship with you.

 

The training process consists of a simple learning sequence.

 

· Decide on what behaviour you want from your dog.

· Get the behaviour, or the nearest behaviour to it and use your marker to shape it to what you want using Reinforcement and Successive Approximations.

· When you have the behaviour you want, attach a ‘cue’ to that behaviour, then only reinforce "cued" examples of the behaviour to achieve STIMULUS CONTROL of the behaviour.

 

LEARNING SEQUENCE

1. "CUE"

2. BEHAVIOUR with MARKER ("yes")

3. Verbal Praise "Good Boy"

4. Patting and Petting.

5. REINFORCEMENT

By selectively using different combinations of Nos. 3, 4 and 5, we can vary the value of the reinforcement for wanted behaviour.

 

LEARN PATIENCE

The rate of learning a behaviour may vary from dog to dog, depending upon previously learned associations and environmental influences with that behaviour and/or stimulus. This is especially true in a class situation. It is beneficial to keep training sessions short and allow the dog time to absorb newly learned behavioural responses. Always find something done well to reinforce the owner’s effort too.

 

 

NEUTRAL RESPONSE

When a dog displays an unwanted behaviour or one that we want to extinguish, we should NOT reinforce that behaviour by reacting to it, as our reaction may be seen as a positive interaction by the dog. By using a NEUTRAL RESPONSE, ignoring the dog for a count of three seconds, we can change what the dog sees as a possibly rewarding response to a non-rewarding response. Remember, unwanted or ‘bad’ behaviour is only ‘bad’ to us.

 

 

LEAST REINFORCING STIMULUS (LRS):

The LRS is the minimum amount of reinforcement required by the dog to maintain interest in the trainer and the behaviour/tasks being trained. It is interesting to note that the value of an LRS is quantified by the dog, not the trainer. It can range from the mere 'presence' of the trainer in the dog's environment to food treats in a highly distracting one.

Unfortunately, many dog trainers fail to understand that the LRS is a positive response by the trainer, and tend to think of it as the 'removal' of something rewarding to the dog (-P). Such removal will in fact reduce behaviour, so it cannot be an LRS.

 

 

COLLARS

The only collar we should use is a flat leather or webbing one to hold our dog’s registration and identification disc. The purpose of a collar should be as a neutral "fail-safe" device; this means that it should not be used as a controlling device when training.

When a problem exists with a learned and reinforced unwanted behaviour, a collar, or a head collar, may be used to assist the handler during periods of intense unwanted behaviour to manage the problem. We must ensure that the dog receives little or no reinforcement either positive or negative for this behaviour, and certainly no PUNISHMENT. A flat collar attached to a leash is used as a safety device while we teach an alternate wanted behaviour.

 

 

LEASHES

As for using a leash in training, it is best done without one. Our dog wants to be with us because we are reinforcing to be with. If this is not the case, we need to work on our relationship with our dog. Having our dog off leash (in a safe environment) avoids the possibility of unwanted stimuli from the leash affecting training. The leash, when used, should be a neutral "fail-safe" device, used only to ensure our dog’s safety. In a club situation or on the street, leashes are necessary to protect our dog.

 

 

INSTRUCTING

Many training problems exist because of the unwitting reinforcement of unwanted behaviour. The first duty of a class instructor is to ensure the safety and well-being of the dogs and handlers in their care. The next most important job is to start to build a positive ‘happy’ relationship between handler and dog; this will be the foundation for all successful training using positive reinforcement.

 

"HAPPY TRAINING"

 

Robert Loftus 2001
