POSITIVE REINFORCEMENT DOG TRAINING
by Robert Loftus
Introduction
Dog training has been around for as long as dogs have been
domesticated, but it is only in the last hundred years or so that structured
training programs were developed, the most prominent being that outlined in
Colonel Konrad Most’s manual "Training Dogs", published in 1910.
Many of the methods used in these early training programs are still in use
today and successfully train many dogs and owners, but there are
many more dogs and owners that fail to adapt to these methods. In recent years,
enlightened dog trainers have turned away from physical and harsh training
methods and are adopting a more humane and fairer approach to training dogs.
These methods are based upon scientifically proven psychological processes of
how animals learn.
The methods used to train dogs should depend upon a firm
foundation of practical experience and an understanding of how dogs learn using
"classical" and "operant" learning. In addition to these, the dog training class
instructor may also use other forms of learning to
help the dog’s owner learn how to teach their dog, namely
"observational" learning (demonstrating) and "cognitive" learning
(printed instructions).
The relative importance of different forms of learning:
Type of learning | Dog | Human
Classical | High | High
Operant | High | High
Cognitive | Very Low | High
Observational | Low | High
Classical learning is responsible for most forms of emotional
conditioning (how we feel about something), and it can override other forms of
learning. This fact is especially important to remember when dealing with
problem behaviour that is driven by a negative emotional state in the dog,
such as fear, anger or frustration.
Useful Definitions
Stimulus:
Any event in the environment that provokes a behavioural response in the dog
(a sound, a movement, a smell, food).
Behaviour:
Behaviour is a series of learned or innate responses to
environmental stimuli.
Stress:
A physical and psychological response caused by the inability
to cope with the environment.
Classical Learning:
The forming of an association between two stimuli. It is
concerned mainly with involuntary or reflexive behaviour (eye-blink,
salivation). Many of our emotions also respond to classical learning.
Operant Learning:
Learning in which the likelihood of behaviour is increased,
decreased or suppressed, by the consequences that follow it.
If the consequence is:
Reinforcing – Behaviour will increase.
Non-Reinforcing – Behaviour will decrease (extinction).
Punishing – Behaviour may be suppressed.
Cognitive Learning:
Learning by insight, processing new and learned information.
(Reading a map, reading a book, solving a problem)
Observational Learning:
Learning by observation or imitation (observing the demonstration of a
process). This may also be a form of Cognitive Learning.
Positive Reinforcement Training:
Where we establish positive physiological and positive
psychological conditions for learning to occur.
Reinforcer:
A reinforcer is anything following a behaviour that increases
that behaviour and is contingent upon it.
Reward:
Something the dog wants that is given to it, not necessarily
contingent upon the performance of a behaviour.
Primary Reinforcers:
These are reinforcers that are inherently rewarding.
Examples: Eating food, drinking, sleep, sex.
Conditioned Reinforcers:
These are reinforcers that are not inherently rewarding, but
receive rewarding properties through a conditioned association with a primary
reinforcer. Example: The word "Yes". The "click" sound of a
clicker.
Positive Reinforcement:
Adding a reinforcer in order to increase the likelihood of a
wanted behaviour. Example: Giving a food treat after a wanted behaviour.
Negative Reinforcement:
Removing an unwanted or aversive stimulus in order to
increase a wanted behaviour. Example: Removing the dog from a fearful or
stressful situation when the dog is focused on you.
Negative Punishment:
Removing a reinforcer in order to suppress unwanted
behaviour.
Example: Removing yourself from the dog’s environment when
the dog behaves hyperactively.
Positive Punishment:
Adding an aversive in order to suppress unwanted behaviour.
Example: Checking the dog with a choke chain, which causes pain.
Note: The use of punishment in dog training can destroy
or damage the relationship and the mutual respect and trust between the dog and
trainer. The consequences are not predictable: punishment may increase the level of
fear and stress, and may also cause fear-induced aggression or displacement
behaviour. In positive reinforcement training we avoid the use of punishment to
suppress unwanted behaviour. Instead, we focus on teaching and reinforcing the behaviour we
want from our dog.
THE OPERANT MODEL (incl. Extinction)
REINFORCEMENT
POSITIVE REINFORCEMENT (+R): something wanted added
NEGATIVE REINFORCEMENT (-R): something unwanted taken away
Behaviour Increases
NON-REINFORCEMENT
Neutral Consequence or No Consequence
Behaviour Decreases (Extinction)
PUNISHMENT
POSITIVE PUNISHMENT (+P): something unwanted given
NEGATIVE PUNISHMENT (-P): something wanted taken away
Behaviour Suppressed
When using Operant learning our dog’s behaviour is
voluntary and our dog is operating on the environment to earn reinforcement.
The consequences for behaviour are defined by how the dog
(the operant) sees it, not the trainer.
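As a rough illustration only, the four reinforcement and punishment quadrants of the operant model can be written as a small lookup in Python. The function name `operant_quadrant` and its two yes/no inputs are assumptions made for this sketch. Both inputs describe the consequence as the dog sees it, not as the trainer intends it.

```python
from typing import Tuple


def operant_quadrant(wanted_by_dog: bool, added: bool) -> Tuple[str, str]:
    """Classify a consequence and return (quadrant, expected effect).

    wanted_by_dog: True if the dog wants the stimulus, False if it is aversive.
    added: True if the stimulus is given, False if it is taken away.
    """
    if wanted_by_dog and added:
        return ("+R", "behaviour increases")      # something wanted added
    if not wanted_by_dog and not added:
        return ("-R", "behaviour increases")      # something unwanted taken away
    if not wanted_by_dog and added:
        return ("+P", "behaviour suppressed")     # something unwanted given
    return ("-P", "behaviour suppressed")         # something wanted taken away


# A food treat given after a sit: wanted by the dog, and added.
quadrant, effect = operant_quadrant(wanted_by_dog=True, added=True)
```

Non-reinforcement (extinction) is left out of the sketch because nothing is added or taken away in that case.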
THE CLASSICAL LEARNING MODEL (Pavlovian conditioning)
The pairing of a stimulus with a response causes an
association whereby the presentation of the stimulus will then elicit the
involuntary response. Emotions can also be conditioned with classical
conditioning.
STIMULUS (Food) → RESPONSE (Salivation)
STIMULUS (Bell) → STIMULUS (Food)
STIMULUS (Bell) → RESPONSE (Salivation)
Another example: a child sitting on the ground beside its mother…
STIMULUS (Spider on child’s arm) → RESPONSE (Mother screams in fear)
STIMULUS (Mother’s fear) → STIMULUS (Child’s fear)
STIMULUS (Spider) → RESPONSE (Child is fearful)
Fear as a conditioned emotional response can become
generalised very easily to cause fear of all things crawling and flying, even
known harmless ones.
Classical learning is also used to attach "cues" to
wanted behaviour.
STIMULUS (Environmental cue to sit) → RESPONSE (Dog sits)
STIMULUS (Sound "sit") → STIMULUS (Environmental cue to sit)
STIMULUS (Sound "sit") → RESPONSE (Dog sits)
Note: The consequence is not part of classical learning, but
both classical and operant learning can occur simultaneously.
SYSTEMATIC DESENSITISATION:
This is a classical conditioning process for changing a
conditioned negative emotional response (usually fear) to a stimulus. The stimulus is
first reduced to a level below the threshold of the fear response.
A new association with the fearful stimulus is then formed (counter conditioning)
using a stimulus that produces a positive emotional response (e.g.,
food).
FEAR STIMULUS → FEAR RESPONSE
Fear stimulus (weak) → stress response (weak)
PLEASURE STIMULUS (Food) → Fear stimulus (weak)
Fear stimulus (weak) (Now pleasure stimulus) → PLEASURE RESPONSE
Repeat the process, gradually increasing the fearful stimulus
while at the same time maintaining a pleasure response.
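The gradual increase can be pictured as a simple rule for choosing the next stimulus level. The sketch below is only an illustration: the 1 to 10 intensity scale, the function name `next_intensity` and the step size are assumptions, and the rule of backing off one step when the pleasure response is lost is our own reading of "maintaining a pleasure response", not something the model itself specifies.

```python
def next_intensity(current, dog_relaxed, step=1, minimum=1, maximum=10):
    """Pick the next fear-stimulus level on a hypothetical 1-10 scale.

    dog_relaxed: True if the dog still shows the pleasure response
    (for example, happily taking food) at the current level.
    """
    if dog_relaxed:
        # Pleasure response maintained: increase the stimulus a little.
        return min(maximum, current + step)
    # Assumption: if the pleasure response is lost, drop back a step
    # so the stimulus is again below the fear threshold.
    return max(minimum, current - step)


level = 3
level = next_intensity(level, dog_relaxed=True)   # relaxed at 3, move up to 4
level = next_intensity(level, dog_relaxed=False)  # stressed at 4, drop back to 3
```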
TRAINING PREPARATION (the Tools)
To use positive reinforcement we must decide what PRIMARY
or unconditioned reinforcer we are going to use. The most practical to use would
be food that the dog likes.
FOOD:
· Type of food: the dog must want it.
· Size of pieces: they must be easy for the dog to eat.
· Handling suitability: easy to carry and use.
· Adjust the dog’s food intake to allow for treats.
Decide on what CONDITIONED REINFORCERS we are going to
use. Before conditioning can take place there must be a NEUTRAL RESPONSE
to them from the dog. The purpose in using a conditioned reinforcer is to be
able to instantly "mark" a wanted behaviour as rewarding to our dog.
This is a MARKER or COMMUNICATION with the dog that the dog can
understand.
Use Classical Learning to make the association between the
"Marker" and the Food.
MARKER → FOOD
Repeat a number of times until the marker is conditioned. This
will be seen as an instant response from the dog: "where’s my treat?"
Every time you use a conditioned reinforcer you MUST
always follow it with a treat to maintain the conditioning.
Examples:
· The word (sound) "YES" can be conditioned to mark wanted behaviour. It must always be followed by a treat.
· The "CLICK" of a clicker can be conditioned to mark wanted behaviour very precisely. It must always be followed by a treat.
IMPORTANT POINTS TO REMEMBER:
When using a Conditioned Reinforcer we must pay great
attention to the following:
· The TIMING OF MARKER must coincide with the wanted behaviour.
· The RATE OF REINFORCEMENT must be high to maintain interest and focus on learning the wanted behaviour.
· The CRITERIA must allow the dog to win at least 80% of the time.
SHAPING BEHAVIOUR
Most behaviour we see in our dog is in a constant state of
change. The rate of change may vary from very small changes to large ones. When
we use a marker to tell a dog which behaviour, or which part of a
behavioural response, we want, the timing is critical: if we want to change the behaviour
by small amounts, the dog must be given a clear signal. Poor timing can
lead to what is known as "lumping", which usually leads to unstable
long-term performance of the finished behaviour.
To shape behaviour we must use very small changes in the
criteria we set for reinforcement (to maintain the 80% rate). This is the key to
successful shaping of behaviour. This shaping process is known as using SUCCESSIVE
APPROXIMATIONS.
PRESENT BEHAVIOUR → Small change → Small change → Small change → (further small changes) → TARGET BEHAVIOUR
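One way to picture successive approximations is as a loop that only raises the criterion while the dog keeps winning at least 80% of the time. The Python sketch below is a hedged illustration, not a prescribed method: the class name `ShapingSession`, the block of trials used to measure the success rate, and the rule of stepping the criterion back down after a poor block are all assumptions. The criterion could be, for instance, the number of seconds a sit must be held.

```python
from collections import deque


class ShapingSession:
    """Track trials and adjust the criterion in small steps (illustrative only)."""

    def __init__(self, start, target, step, window=10, min_success=0.8):
        self.start = start              # present behaviour
        self.target = target            # target behaviour
        self.step = step                # size of one small change
        self.min_success = min_success  # the 80% rate
        self.criterion = start
        self.trials = deque(maxlen=window)

    def record(self, met_criterion):
        """Record one trial and return the criterion for the next trial."""
        self.trials.append(bool(met_criterion))
        if len(self.trials) == self.trials.maxlen:
            rate = sum(self.trials) / len(self.trials)
            if rate >= self.min_success:
                # Dog is winning often enough: make one small change.
                self.criterion = min(self.target, self.criterion + self.step)
            else:
                # Assumption: a poor block means the step was too big, so ease off.
                self.criterion = max(self.start, self.criterion - self.step)
            self.trials.clear()
        return self.criterion


session = ShapingSession(start=1.0, target=5.0, step=0.5, window=5)
for outcome in [True, True, True, True, False]:
    criterion = session.record(outcome)
# 4 of 5 trials met the criterion, so it moves from 1.0 to 1.5
```

Keeping `step` small relative to the distance between `start` and `target` is what keeps the changes small enough to avoid "lumping".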
When we reach our target behaviour we attach a
"cue" to it using classical learning to put the behaviour "on
cue". There is now no further need to use the marker for this behaviour,
but we should always reinforce wanted behaviour (shaped or
otherwise). The value of the reinforcement may vary to reward excellence,
e.g. one "yes" followed by lots of treats.
ATTACHING A "CUE" TO A BEHAVIOUR:
To form an association between a stimulus "cue" and
a response we must first know that the behaviour is going to happen. We then
introduce the "cue" immediately prior to, or during the early part of, the
behaviour we want to put on "cue".
START TRAINING (Using the Tools)
By using positive reinforcement training you instantly
enhance your relationship with your dog. No matter what previous experiences,
good or bad, your dog has had, he will develop a positive relationship with you.
The training process consists of a simple learning sequence.
· Decide on what behaviour you want from your dog.
· Get the behaviour, or the nearest behaviour to it, and use your marker to shape it to what you want using Reinforcement and Successive Approximations.
· When you have the behaviour you want, attach a "cue" to that behaviour, then only reinforce "cued" examples of the behaviour to achieve STIMULUS CONTROL of the behaviour.
LEARNING SEQUENCE
1. "CUE"
2. BEHAVIOUR with MARKER ("yes")
3. Verbal Praise "Good Boy"
4. Patting and Petting.
5. REINFORCEMENT
By selectively using different combinations of Nos. 3, 4 and 5, we can vary
the value of the reinforcement for wanted behaviour.
LEARN PATIENCE
The rate of learning a behaviour may vary from dog to dog,
depending upon previously learned associations and environmental influences with
that behaviour and/or stimulus. This is especially true in a class situation. It
is beneficial to keep training sessions short and allow the dog time to absorb
newly learned behavioural responses. Always find something the owner has done
well, so that the owner’s effort is reinforced too.
NEUTRAL RESPONSE
When a dog displays an unwanted behaviour or one that we want
to extinguish, we should NOT reinforce that behaviour by reacting to it, as our
reaction may be seen as a positive interaction by the dog. By using a NEUTRAL
RESPONSE, ignoring the dog for a count of three seconds, we can change what the
dog sees as a possibly rewarding response to a non-rewarding response. Remember,
unwanted or ‘bad’ behaviour is only ‘bad’ to us.
LEAST REINFORCING STIMULUS (LRS):
The LRS is the minimum amount of reinforcement required by
the dog to maintain interest in the trainer and the behaviour/tasks being
trained. It is interesting to note that the value of an LRS is quantified by the
dog, not the trainer. It can range from the mere 'presence' of the trainer in
the dog's environment to food treats in a highly distracting one.
Unfortunately many dog trainers seem to misunderstand the
idea of the LRS as a positive response by the trainer, and tend to think
of it as the 'removal' of something rewarding to the dog (-P). That will in
fact reduce behaviour, so it cannot be an LRS.
COLLARS
The only collar we should use is a flat leather or webbing
one to hold our dog’s registration and identification disc. The purpose of a
collar should be as a neutral "fail-safe" device. This means that it
should not be used as a controlling device when training.
When a problem exists with a learned and reinforced unwanted
behaviour, a collar, or a head collar, may be used to assist the handler during
periods of intense unwanted behaviour to manage the problem. We must ensure that
the dog receives little or no reinforcement either positive or negative for this
behaviour, and certainly no PUNISHMENT. A flat collar attached to a leash is
used as a safety device while we teach an alternate wanted behaviour.
LEASHES
Training is best done without a leash.
Our dog wants to be with us because we are reinforcing to be with. If this
is not the case, we need to work on our relationship with our dog. Having our
dog off leash (in a safe environment) avoids the possibility of unwanted
stimuli caused by the leash affecting training. The leash, when used, should be a
neutral "fail-safe" device, used only to ensure our dog’s safety. In
a club situation, or on the street, leashes are necessary to protect our dog.
INSTRUCTING
Many training problems exist because of the unwitting
reinforcement of unwanted behaviour. The first duty of a class instructor is to
ensure the safety and well-being of the dogs and handlers in their care. The
next most important job is to start to build a positive ‘happy’ relationship
between handler and dog. This will be the foundation for all successful training
using positive reinforcement.
"HAPPY TRAINING"
Robert Loftus 2001