Dealing With Errors

There is no perfect way to deal with an "error" in training. Once the dog is doing something other than what you were hoping to see--whether you are building a new behavior or practicing a "known" one--every strategy has potential pitfalls.

Let's look at the main options.

There's punishment. Hurting or scaring a dog is off the table for me--but I might close a door that a dog is walking toward so they can't go through it, or move food away from the edge of the counter so a dog can't reach it, or slow my pace on leash so a dog can't charge forward. (Or I might not . . . keep reading.) Those are examples of "negative punishment," where the consequence for a behavior is the removal of access to positive reinforcement. But even mild penalties can poison cues or dampen the dog's enthusiasm for working with you--particularly if you find yourself penalizing over and over.

And then there is extinction--aka nonreinforcement. You have probably heard the common advice to just reward what you like and ignore what you don't. But it's not quite that simple, and dosage is key: if the rate of reinforcement in your training drops too low for too long, your dog may just leave.

Of course, that's not usually the first thing that happens when you hold out for too much too soon. When a previously reinforced behavior doesn't work, what you will see is other behavior that has worked in similar situations (the technical term for this is "resurgence"). If you don't respond to an error, and the dog has a strong or recent reinforcement history for the thing you were looking for in this situation, then it may shuffle back to the top quickly, and you can reinforce it. But if not, you may see what's called an "extinction burst," a flurry of variability and intensity that might include some behavior you would probably call "frustrated." Maybe in that burst you'll also see the behavior that you were hoping for, but it may come with outriders like snorting, huffing, sneezing, barking, or tippy tappy feet. Both punishment and extinction can even provoke aggression.

Even if your dog is not easily frustrated, you may want to avoid repeated "ignoring" of errors. Say you're training your dog to do both a paw target and a nose target. You are working on the paw target today, but they target with their nose instead. When you don't click the nose target, then they try the paw (resurgence!), which you reinforce. When that kind of thing happens, it's key to watch carefully what they do on the next rep. If they just do the paw again on the next round, you might be on the right path, but if you find yourself repeatedly swiping left on the "wrong" behavior to get to the "right" one, you may be starting to train the dog to do the whole sequence--nose then paw, nose then paw.

Redirection/resetting is often the best of our imperfect options. It keeps the dog engaged in the training session, and you can use it to get the dog into a position from which they are more likely to be successful. But do it too often and you may end up reinforcing the error more than the "right" behavior with your "reset cookies" or your go-to cues. Example: your dog lies down when you give the sit cue, so you toss a treat or give him a hand target to get him up, which you maybe also reinforce with a click and a treat. Do this once or twice, occasionally, and it's not likely to create a real problem. But do it over and over and over again and you are doing a nice job training your dog to lie down on the sit cue.

What you really want is not just a high rate of reinforcement but a high rate of reinforcement for mostly the desired behavior. If you are frequently repeating any of the strategies above, that is not what is happening.
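If you like to see things counted, here is a minimal sketch of that idea in Python. It assumes you keep a simple log of each rep: what the dog did on the cue, and whether a treat followed (reset cookies included). The function name, the log format, and the numbers are all made up for illustration; they are not a standard measure.

```python
from collections import Counter


def reinforcement_share(log, desired):
    """Fraction of delivered treats that followed the desired behavior.

    `log` is a list of (behavior, got_treat) pairs, one per rep.
    This format is a hypothetical convention for this example only.
    """
    reinforced = Counter(behavior for behavior, got_treat in log if got_treat)
    total = sum(reinforced.values())
    return reinforced[desired] / total if total else 0.0


# Ten reps on the sit cue: six downs followed by a reset cookie, four sits.
session_log = [("down", True)] * 6 + [("sit", True)] * 4
print(reinforcement_share(session_log, "sit"))  # 0.4
```

Every rep in that invented session ended with a treat, so the overall rate of reinforcement looks great, but only 40% of the treats followed a sit. That is the pattern to watch for.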

Though I have my obvious preferences, the reality is that I might do any of these imperfect options depending on the specifics of the situation. But so long as you're not hurting or scaring your dog, then what you do at the moment an error is already occurring probably does not matter as much as how often you do it or what you do next.

So, TL;DR, here are some things you might do in the moment when a dog does something other than what you hoped:

  • Remove, diminish, or delay the reinforcer if safety requires, or if the behavior getting reinforced even once would really set things back. But then adjust your plan.
  • Wait a few seconds and see what else the dog offers, if you're pretty sure it will be what you want. If it's what you want, or something that sets the dog up to be more successful next time, reinforce that, but then adjust your plan. (Many trainers like to teach a "default" behavior of looking at you for further instruction--default meaning that it is so heavily reinforced in so many situations that it is likely to resurge first.)
  • Don't give the reward, but neutrally bring them into position to try again, if you think they're likely to get it right. (No guarantees that this won't reinforce the error, but you can take an educated guess.) If they don't, adjust your plan.
  • Just reset promptly, without waiting or dithering about whether you're reinforcing the error--but do it strategically. Do it fast, so you are likely to reinforce less of the "wrong" behavior. And do it so that when the dog looks up again, they are in a better position to start the "right" behavior than they were last time. But then adjust your plan.
  • If you're not able to make these calls in real time, end the training session, give the dog something else to do, and take a longer break while you . . . you guessed it, adjust your plan.

Note the common theme here is "but then adjust your plan." Here are some starting points for thinking about how:

  • How can I tweak the environment to better facilitate the "right" behavior?
  • What might have cued the "wrong" behavior? Do the conditions look like they did when that behavior was reinforced? (I see this often when people try to capture two different behaviors without changing their position, the props, the dog's starting position, or where they are delivering their treats.)
  • Are your criteria too high for the situation? (Spoiler alert: If your rate of reinforcement for the right thing is low, it's a good bet they are. Change the situation, or change your criteria.)
  • If you're cuing the behavior, is your cue unclear, or perhaps not what you think it is?
  • Is there another cue in the environment for a behavior that has been reinforced more than the one you are trying to ask for or teach?
  • Is your reinforcer valuable enough for the situation? (But note that there are many other things to look at before upping treat value! If the dog doesn't understand what to do, he still can't get the reinforcer.)
  • Is the way I'm delivering my reinforcer, or where I'm delivering it, helping or hurting my cause?
  • Does your dog have a competing need (e.g., to eliminate, to eat, to relieve pain or discomfort)?
  • Are there any prerequisite skills or pieces of the desired behavior you need to pop into the dog's repertoire before you can expect them to pop out when you don't reinforce what they tried? What do you want to see resurge?

I may have missed some considerations in trying to recreate my thought process, but the most important question of all is: How are you going to use this error as information to change what you are doing?

What is "Differential Reinforcement of Other Behavior"?

You keep using that word. I do not think it means what you think it means.

At the time of this posting, there were a couple of different explanations of this concept floating around dog-pro social media, all of which perpetuated common misconceptions. Having written a chunk of my thesis about what a differential reinforcement of other behavior (DRO) procedure looks like, why it's thought to work, and what best practices for using it may be, I thought I'd throw this into the mix.

Although DRO stands for "differential reinforcement of other behavior," it does not involve reinforcing "any other behavior" that occurs instead of the unwanted behavior. In fact, it doesn't necessarily involve reinforcing any behavior at all.

In DRO, reinforcement is contingent only on the passage of a certain amount of time without the occurrence of the unwanted behavior.

There are two main flavors of DRO, both of which can be split into subtypes depending on how the time intervals are set (e.g., the same every time or varied from rep to rep).

In "interval DRO," which is more common, the whole interval of time must go by without the behavior that you want to reduce. When time is up, reinforcement is delivered, regardless of what else the learner is doing.

If the learner does the unwanted behavior, typically the time requirement starts over. Sometimes, if the unwanted behavior occurs right as time runs out, extra time is added to avoid reinforcing the unwanted behavior.

In "momentary" DRO, reinforcement is delivered if the behavior is not occurring at the moment that an interval of time ends, regardless of whether the unwanted behavior occurred at any other time during that interval. If the unwanted behavior is occurring, the reinforcer is not delivered. This appears to be mostly used when it would be too hard to monitor the learner for the whole interval (e.g., if you have a classroom with 30 students and can't watch them all at once). It's also probably less effective than interval DRO.
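For readers who think in code, the two flavors can be sketched roughly like this. It is a simplified illustration under several assumptions, not a clinical protocol: the session is chopped into one-second ticks, the interval is the same every time, the clock restarts on any occurrence, and no extra time is added when the behavior happens right at the end. The function names and the sample session are invented for this example.

```python
def interval_dro(occurring, interval):
    """Resetting interval DRO with a fixed interval.

    `occurring[t]` is True if the unwanted behavior happens during second t.
    Returns the seconds at which a reinforcer is delivered: each time
    `interval` consecutive behavior-free seconds have gone by.
    """
    deliveries = []
    clean_seconds = 0
    for t, happening in enumerate(occurring):
        if happening:
            clean_seconds = 0  # any occurrence restarts the clock
        else:
            clean_seconds += 1
            if clean_seconds == interval:
                deliveries.append(t)
                clean_seconds = 0  # start timing the next interval
    return deliveries


def momentary_dro(occurring, interval):
    """Momentary DRO: look only at the last second of each interval."""
    deliveries = []
    for t in range(interval - 1, len(occurring), interval):
        if not occurring[t]:
            deliveries.append(t)
    return deliveries


# A 12-second session in which the unwanted behavior happens at seconds 2 and 9.
session = [False, False, True, False, False, False,
           False, False, False, True, False, False]

print(interval_dro(session, 4))   # [6]
print(momentary_dro(session, 4))  # [3, 7, 11]
```

Notice that neither function looks at what else the learner is doing; delivery depends only on time and the absence of the unwanted behavior. In this made-up session, the momentary version delivers three reinforcers even though the behavior occurred inside two of the three intervals, while the interval version delivers only one.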

Delivering the reinforcer when you see any of multiple acceptable alternative behaviors occurring--which is how I have seen DRO described in training circles sometimes--would be differential reinforcement of alternative behavior (DRA). That's a different procedure, in which reinforcement is contingent on the occurrence of specific behaviors, and withheld or diminished somehow when the unwanted behavior occurs.

Although it's referred to as a differential reinforcement procedure, reinforcement may not always be at work in DRO. There are at least four mechanisms thought to make it go; these are some of the most commonly discussed:

  1. Satiation. Best practices in DRO are to start with very low time criteria, so the start of DRO usually involves reinforcement presented densely, which may temporarily diminish the value of the reinforcer and therefore make behavior whose purpose is to get that reinforcer less likely to occur.

  2. Extinction, or the elimination/weakening of the relationship between the unwanted behavior and the reinforcer that maintains it. What we traditionally think of as extinction is the non-availability of reinforcement for a previously reinforced behavior, and DRO typically involves some of that. But noncontingent reinforcement (giving the reinforcer on a time-based schedule whether or not the behavior occurs) can also weaken the contingency between the behavior and the reinforcer, in essence making the behavior seem unnecessary.
    In fact, as Susan Friedman pointed out once during a joint presentation, DRO without extinction is identical to noncontingent reinforcement. Which is an antecedent intervention--something you do before the behavior occurs--not a reinforcement procedure.

  3. Punishment. Yes, that's right--DRO might run on punishment, particularly if the learner can tell that you are resetting the time. A signal that predicts an upcoming delay to reinforcement can become an aversive stimulus (one a learner will behave to avoid or escape). If something tells the learner that reinforcement has been delayed (a timer is reset) contingent on their behavior, that something can become a conditioned punisher, or punisher by association.

  4. And, finally, DRO may work because of accidental reinforcement of alternative behaviors. Whatever is happening when reinforcement gets delivered might get strengthened. But it takes consistency to develop a contingency (a relationship) between a behavior and a reinforcer, and that doesn't always happen when you're delivering reinforcement contingent on time elapsed. Evidence for reinforcement of other behaviors during DRO is sparse, and in my research and my experience with running DRO with clients, the alternative response that emerged during DRO was not one that was occurring with any regularity when the timer went off/reinforcement was delivered in the early stages.

This is not a complete investigation of DRO--I just wanted to clarify the definition for my fellow pros and other interested parties. But we could also talk about why it's often slower than DRA, why some people call it labor intensive or think it's a procedure to be avoided, the potential advantages of interval vs. momentary, best practices for its use, etc.

You can get more deets in any applied behavior analysis textbook. Here I used Cooper, J. O., Heron, T. E., & Heward, W. L. (2020). Applied behavior analysis (3rd ed.). Pearson.

But here are other sources related to the "mechanisms" section: Jessel, J., Borrero, J. C., & Becraft, J. L. (2015). Differential reinforcement of other behavior increases untargeted behavior. Journal of Applied Behavior Analysis, 48(2), 402–416. https://doi.org/10.1002/jaba.204

Jessel, J., & Ingvarsson, E. T. (2016). Recent advances in applied research on DRO procedures. Journal of Applied Behavior Analysis, 49, 991–995. https://doi.org/10.1002/jaba.323

Langford, J. S., Pitts, R. C., & Hughes, C. E. (2019). Assessing functions of stimuli associated with rich-to-lean transitions using a choice procedure. Journal of the Experimental Analysis of Behavior, 112(1), 97–110. https://doi.org/10.1002/jeab.540

Poling, A., & Ryan, C. (1982). Differential-reinforcement-of-other-behavior schedules: Therapeutic applications. Behavior Modification, 6(1), 3–21. https://doi.org/10.1177/01454455820061001

Rey, C. N., Betz, A. M., Sleiman, A. A., Kuroda, T., & Podlesnik, C. A. (2020). The role of adventitious reinforcement during differential reinforcement of other behavior: A systematic replication. Journal of Applied Behavior Analysis, 53(4), 2440–2449. https://doi.org/10.1002/jaba.678

Thompson, R. H., & Iwata, B. A. (2005). A review of reinforcement control procedures. Journal of Applied Behavior Analysis, 38(2), 257–278. https://doi.org/10.1901/jaba.2005.176-03

Thanks for Barking: Addenda

This 2021 blog post I wrote to elaborate on a 2017 quickie for clickertraining.com continues to be the most read thing on my website. (For the actual steps in the protocol, go to that post. Then come back and read this one.) And I get a lot of heartening messages about how it has helped people live more harmoniously with their dogs. But like most responsible content producers, I also worry a great deal about sending “recipes” out into the wild without enough context--be it through social media or presentations at conferences that are always shorter than I’d like. So here are some additional thoughts that have been percolating over the past couple years.

The number one thing people ask, with concern, about this protocol is: aren't you reinforcing barking? Initially, yes. And maybe forever. It should, however, be less barking, and less intense barking, and if you follow the protocol all the way through, you might get closer to none.

The number two thing people ask is "can I use it for x," and this is related to the number one thing. I think it's important to emphasize that this protocol is designed to address barking at stimuli that surprise both you and the dog, like stuff outside your window, in your apartment hallway, or outside a fence. If the things your dog barks at are not surprises, and you can see them coming and get ahead of barking consistently to reinforce something else, definitely do that instead. Leslie McDevitt's "Look at That" from Control Unleashed is great for this.

This protocol works best with initial preventive management, pretrained skills, and consistency. I don’t know about you, but I don’t want to do any of those things if I don’t have to. To potentially save yourself some work, if you have a brand-new dog and they are barking at stuff outside the window or fence, or your dog just started barking at sounds when you moved . . . then if your neighbors won't murder you, try doing nothing about it for a week and seeing what happens before starting any other type of protocol. I gave this advice to trainer friend Christie Catan when she moved to the country and you can read about it here.

This protocol initially uses food (usually) to reinforce barking: less barking at first, and then, if the protocol is carried out to completion, the reinforcer is used to select a different part of the chain that results. I don’t worry about reinforcing the barking, though, because if you are in a place where you are really motivated to do something about the barking, the barking is occurring regularly, which means something is reinforcing it already. The reinforcer is probably the sound or sight going away, or possibly the way you react to the barking. These explanations need to be ruled out before deciding that the barking is “self-reinforcing,” as I often hear people do.

Because something in the environment is reinforcing (which means "strengthening") the barking, preventive management is key. Without it, the behavior will continue to be reinforced, and you may find yourself weakening or poisoning the cue you are trying to train (read on). Your management may not be perfect, but don’t let perfect be the enemy of better. If you are not prepared to train, do your best to minimize exposure, and by extension, opportunities to keep getting reinforced by the environment for barking. Use window film or white noise, close blinds, close the back door, put slats through or tarp over the chain link fence, etc. You may find this solves your problem without further training, but maybe you also like to be able to look out the bottom half of the windows and still want to move ahead with some training.

This may sound similar to advice you have heard about training a recall--don't put yourself in a position where you'll need it if you don't think it will work, right? Well, a clean recall cue is the main prerequisite for this protocol. And “thank you” is that clean recall cue. It doesn’t have to be “thank you,” but most people have ruined their first recall cue already (by not reinforcing, reinforcing poorly, punishing late arrivals, only using it when it will cost the dog something to come, etc.), and most people haven’t ever said “thank you” to their dog yet, so it usually has no prior associations. It also tends to work as a mindset changer for the humans; it’s hard to say in a mad tone. But you could use anything that is a 100% reliable predictor of stuff your dog loves--even, gasp, your marker signal. A marker is just a cue that means come get your reinforcer.

If your recall isn’t working yet and your management fails, go to your dog and deliver the treats anyway. Drop them, many of them, in a straight line to the ground right past her eyeball. (Don't try to shove them in the dog's mouth; dogs are often less likely to take them that way, and you may get bitten.) When I started this protocol with my own dog Pigeon, my recall cue was not working in this situation, and I hadn’t thought of retraining a new one (“thank you”) yet. When she tore down our gangway to hurl herself snarling at the gate, I hustled after her and stood right behind her, gave the cue, and then fed handfuls of Stella and Chewy’s if she so much as gave me a dirty look. Now, I would feed regardless of whether I got the dirty look. After a few sessions, she was paying much more attention to me when I followed her, both her barking and her visible physiological reactions (e.g., piloerection) got less intense, and I was able to start successfully recalling her out of the barking from farther away.

As with any other recall, I don’t recommend adding punishment--intentional or inadvertent--after the "thank you," as you risk poisoning the cue. If your dog won't take the treats even at the location of barking and you absolutely must bring them away, do it--but then adjust your plan, including preventive management, so you are not routinely hauling them off by the collar after you have said “thank you,” which could turn “thank you” into a signal for avoidance. If you think "thank you" won't work, don't say it before you go get them.

I like to have the treats (and any other reinforcers) that follow “thank you” come in a predictable location so that (a) you don’t have to have treats on or near you all the time and (b) the dog will anticipate the location and start heading there—but if that’s not practical, the location can just be you.

Reinforce the earliest response your dog has to the stimulus. Don’t wait for barking if you can catch the head swivel or ear flick toward the sound.

Don’t reinforce the same type of barking (same look, same sound, pointing in the same direction) if it is offered when there is no stimulus to bark at. I once met a dog who barked at people outside the window, but also appeared to have learned to go around barking at windows even with nothing outside in order to get his person to get his toys out of hiding. That’s the sort of problem you may create if you reinforce the barking when there is no stimulus outside.

If the dog really is barking at surprise stimuli to get treats, congratulations! Many people seem to be happy to just have a nice way to stop barking quickly, before the thing goes away, but to me the coolest part is that now you can use the treats to shape less barking and faster turning. With Pigeon, I had trained up a predictable pattern of start toward the fence, bark a few times--here's the recall!--then turn around and start looking for the treats. So I began to delay the recall until I saw that turn. She started to bark even less and turn faster—skipping ahead to the behavior closest to the reinforcer.

The other things that may be happening are that the aversiveness of the stimulus may be reduced through the pairing with food (which you may not see if you are also routinely hauling your dog away by the collar after the cue); that when the stimulus starts to go away even without the continued barking, it may reveal barking to be unnecessary to remove the stimulus; and that if your response was reinforcing barking, the dog now gets attention or treats from you for less barking or for other behaviors like turning.