What is "Differential Reinforcement of Other Behavior"?

You keep using that word. I do not think it means what you think it means.

At the time of this posting, a couple of different explanations of this concept were floating around dog-pro social media, all of which perpetuated common misconceptions. Having written a chunk of my thesis about what a differential reinforcement of other behavior (DRO) procedure looks like, why it's thought to work, and what best practices for using it may be, I thought I'd throw this into the mix.

Although DRO stands for "differential reinforcement of other behavior," it does not involve reinforcing "any other behavior" that occurs instead of the unwanted behavior. In fact, it doesn't necessarily involve reinforcing any behavior at all.

In DRO, reinforcement is contingent only on the passage of a certain amount of time without the occurrence of the unwanted behavior.

There are two main flavors of DRO, both of which can be split into subtypes depending on how the time intervals are set (e.g., the same every time or varied from rep to rep).

In "interval DRO," which is more common, the whole interval of time must go by without the behavior that you want to reduce. When time is up, reinforcement is delivered, regardless of what else the learner is doing.

If the learner does the unwanted behavior, typically the time requirement starts over. Sometimes, if the unwanted behavior occurs right as time runs out, extra time is added to avoid reinforcing the unwanted behavior.
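
For readers who like to see a contingency spelled out, here is a minimal sketch of resetting interval DRO as described above. It is an illustration only, not a clinical or training tool: the function name, the parameters, and the choice to treat a behavior that lands exactly when time runs out as a reset are all my assumptions for the example.

```python
def interval_dro(interval, session_length, behavior_times):
    """Return the times at which a reinforcer is delivered.

    Hypothetical model of resetting interval DRO:
    - the reinforcer is delivered when `interval` seconds pass
      with no unwanted behavior;
    - any unwanted behavior restarts the clock;
    - a behavior at the exact moment time runs out also restarts
      the clock, so the reinforcer never follows it directly.
    """
    deliveries = []
    timer_start = 0
    events = sorted(behavior_times)
    i = 0
    while True:
        due = timer_start + interval
        if due > session_length:
            break
        if i < len(events) and events[i] <= due:
            # Unwanted behavior before (or right as) time ran out: reset.
            timer_start = events[i]
            i += 1
        else:
            # Whole interval passed without the behavior: reinforce,
            # regardless of what else the learner is doing.
            deliveries.append(due)
            timer_start = due
    return deliveries


# A 60-second session, 10-second interval, unwanted behavior at 25 s and 32 s.
print(interval_dro(10, 60, [25, 32]))  # [10, 20, 42, 52]
```

Notice that nothing in the function asks what the learner is doing at delivery time. The only inputs are the clock and the unwanted behavior.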

In "momentary" DRO, reinforcement is delivered if the behavior is not occurring at the moment that an interval of time ends, regardless of whether the unwanted behavior occurred at any other time during that interval. If the unwanted behavior is occurring, the reinforcer is not delivered. This appears to be mostly used when it would be too hard to monitor the learner for the whole interval (e.g., if you have a classroom with 30 students and can't watch them all at once). It's also probably less effective than interval DRO.
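
A comparable sketch for momentary DRO, under the same caveats: the names are hypothetical, and I am assuming each bout of unwanted behavior can be recorded as a (start, end) pair so that "occurring at the moment" has a concrete meaning.

```python
def momentary_dro(interval, session_length, behavior_episodes):
    """Return the times at which a reinforcer is delivered.

    Hypothetical model of fixed momentary DRO: the learner is checked
    only at the end of each interval. The reinforcer is delivered if
    no episode of unwanted behavior is in progress at that instant.
    """
    deliveries = []
    check = interval
    while check <= session_length:
        occurring = any(start <= check <= end
                        for start, end in behavior_episodes)
        if not occurring:
            deliveries.append(check)
        check += interval
    return deliveries


# Behavior from 12-15 s is never "seen"; behavior from 28-31 s
# overlaps the 30 s check, so that reinforcer is withheld.
print(momentary_dro(10, 40, [(12, 15), (28, 31)]))  # [10, 20, 40]
```

The 12-15 s bout goes unnoticed because no check falls inside it, which is one way to see why this version may be less effective than watching the whole interval.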

Delivering the reinforcer when you see any of multiple acceptable alternative behaviors occurring--which is how I have seen DRO described in training circles sometimes--would be differential reinforcement of alternative behavior (DRA). That's a different procedure, in which reinforcement is contingent on the occurrence of specific behaviors, and withheld or diminished somehow when the unwanted behavior occurs.
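
To make the contrast concrete, here is an equally rough sketch of the DRA contingency. The function name, the event format, and the behavior labels are made up for the example. The point is only that delivery depends on what the learner does, not on the clock.

```python
def dra(events, acceptable_alternatives):
    """Return the times at which a reinforcer is delivered.

    Hypothetical model of DRA: `events` is a list of (time, behavior)
    observations. The reinforcer follows any behavior on the list of
    acceptable alternatives and is withheld for everything else,
    including the unwanted behavior.
    """
    return [time for time, behavior in events
            if behavior in acceptable_alternatives]


observations = [(3, "sit"), (7, "bark"), (12, "go to mat"), (15, "bark")]
print(dra(observations, {"sit", "go to mat"}))  # [3, 12]
```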

Although it's referred to as a differential reinforcement procedure, reinforcement may not always be at work in DRO. There are at least four mechanisms thought to make it go; these are some of the most commonly discussed:

  1. Satiation. Best practices in DRO are to start with very low time criteria, so the start of DRO usually involves reinforcement presented densely, which may temporarily diminish the value of the reinforcer and therefore make behavior whose purpose is to get that reinforcer less likely to occur.

  2. Extinction, or the elimination/weakening of the relationship between the unwanted behavior and the reinforcer that maintains it. What we traditionally think of as extinction is the non-availability of reinforcement for a previously reinforced behavior, and DRO typically involves some of that. But noncontingent reinforcement (giving the reinforcer on a time-based schedule whether or not the behavior occurs) can also weaken the contingency between the behavior and the reinforcer, in essence making the behavior seem unnecessary.
    In fact, as Susan Friedman pointed out once during a joint presentation, DRO without extinction is identical to noncontingent reinforcement. Which is an antecedent intervention--something you do before the behavior occurs--not a reinforcement procedure.

  3. Punishment. Yes, that's right--DRO might run on punishment, particularly if the learner can tell that you are resetting the time. A signal that predicts an upcoming delay to reinforcement can become an aversive stimulus (one a learner will behave to avoid or escape). If something tells the learner that reinforcement has been delayed (a timer is reset) contingent on their behavior, that something can become a conditioned punisher, or punisher by association.

  4. And, finally, DRO may work because of accidental reinforcement of alternative behaviors. Whatever is happening when reinforcement gets delivered might get strengthened. But it takes consistency to develop a contingency (a relationship) between a behavior and a reinforcer, and that doesn't always happen when you're delivering reinforcement contingent on time elapsed. Evidence for reinforcement of other behaviors during DRO is sparse, and in my research and my experience with running DRO with clients, the alternative response that emerged during DRO was not one that was occurring with any regularity when the timer went off/reinforcement was delivered in the early stages.

This is not a complete investigation of DRO--I just wanted to clarify the definition for my fellow pros and other interested parties. But we could also talk about why it's often slower than DRA, why some people call it labor intensive or think it's a procedure to be avoided, the potential advantages of interval vs. momentary, best practices for its use, etc.

You can get more deets in any applied behavior analysis textbook. Here I used Cooper, J. O., Heron, T. E., & Heward, W. L. (2020). Applied behavior analysis (3rd ed.). Pearson.

But here are other sources related to the "mechanisms" section: Jessel, J., Borrero, J. C., & Becraft, J. L. (2015). Differential reinforcement of other behavior increases untargeted behavior. Journal of Applied Behavior Analysis, 48(2), 402–416. https://doi.org/10.1002/jaba.204

Jessel, J., & Ingvarsson, E. T. (2016). Recent advances in applied research on DRO procedures. Journal of Applied Behavior Analysis, 49, 991–995. https://doi.org/10.1002/jaba.323

Langford, J. S., Pitts, R. C., & Hughes, C. E. (2019). Assessing functions of stimuli associated with rich-to-lean transitions using a choice procedure. Journal of the Experimental Analysis of Behavior, 112(1), 97–110. https://doi.org/10.1002/jeab.540

Poling, A., & Ryan, C. (1982). Differential-reinforcement-of-other-behavior schedules: Therapeutic applications. Behavior Modification, 6(1), 3–21. https://doi.org/10.1177/01454455820061001

Rey, C. N., Betz, A. M., Sleiman, A. A., Kuroda, T., & Podlesnik, C. A. (2020). The role of adventitious reinforcement during differential reinforcement of other behavior: A systematic replication. Journal of Applied Behavior Analysis, 53(4), 2440–2449. https://doi.org/10.1002/jaba.678

Thompson, R. H., & Iwata, B. A. (2005). A review of reinforcement control procedures. Journal of Applied Behavior Analysis, 38(2), 257–278. https://doi.org/10.1901/jaba.2005.176-03

Thanks for Barking: Addenda

This 2021 blog post I wrote to elaborate on a 2017 quickie for clickertraining.com continues to be the most read thing on my website. (For the actual steps in the protocol, go to that post. Then come back and read this one.) And I get a lot of heartening messages about how it has helped people live more harmoniously with their dogs. But like most responsible content producers, I also worry a great deal about sending “recipes” out into the wild without enough context--be it through social media or presentations at conferences that are always shorter than I’d like. So here are some additional thoughts that have been percolating over the past couple of years.

The number one thing people ask, with concern, about this protocol is: aren't you reinforcing barking? Initially yes. And, maybe forever. It should, however, be less barking, and less intense barking, and if you follow the protocol all the way through, you might get closer to none.

The number two thing people ask is "can I use it for x," and this is related to the number one thing. I think it's important to emphasize that this protocol is designed to address barking at stimuli that surprise both you and the dog, like stuff outside your window, in your apartment hallway, or outside a fence. If the things your dog barks at are not surprises, and you can see them coming and get ahead of barking consistently to reinforce something else, definitely do that instead. Leslie McDevitt's "Look at That" from Control Unleashed is great for this.

This protocol works best with initial preventive management, pretrained skills, and consistency. I don’t know about you, but I don’t want to do any of those things if I don’t have to. To potentially save yourself some work, if you have a brand-new dog and they are barking at stuff outside the window or fence, or your dog just started barking at sounds when you moved . . . then if your neighbors won't murder you, try doing nothing about it for a week and seeing what happens before starting any other type of protocol. I gave this advice to trainer friend Christie Catan when she moved to the country and you can read about it here.

This protocol initially uses food (usually) to reinforce barking: less barking at first, and then, if carried out to completion, the reinforcer is used to select a different part of the chain that results. I don’t worry about reinforcing the barking, though, because if you are in a place where you are really motivated to do something about the barking, the barking is occurring regularly, which means something is reinforcing it already. The reinforcer is probably the sound or sight going away, or possibly the way you react to the barking. These explanations need to be ruled out before deciding that the barking is “self-reinforcing,” as I often hear people do.

Because something in the environment is reinforcing (which means "strengthening") the barking, preventive management is key. Without it, the behavior will continue to be reinforced, and you may find yourself weakening or poisoning the cue you are trying to train (read on). Your management may not be perfect, but don’t let perfect be the enemy of better. If you are not prepared to train, do your best to minimize exposure and, by extension, opportunities for your dog to keep getting reinforced by the environment for barking. Use window film or white noise, close blinds, close the back door, put slats through or tarp over the chain link fence, etc. You may find this solves your problem without further training, but maybe you also like to be able to look out the bottom half of the windows and still want to move ahead with some training.

This may sound similar to advice you have heard about training a recall--don't put yourself in a position where you'll need it if you don't think it will work, right? Well, a clean recall cue is the main prerequisite for this protocol. And “thank you” is that clean recall cue. It doesn’t have to be “thank you,” but most people have ruined their first recall cue already (by not reinforcing, reinforcing poorly, punishing late arrivals, only using it when it will cost the dog something to come, etc.), and most people haven’t ever said “thank you” to their dog yet, so it usually has no prior associations. It also tends to work as a mindset changer for the humans; it’s hard to say in a mad tone. But you could use anything that is a 100% reliable predictor of stuff your dog loves--even, gasp, your marker signal. A marker is just a cue that means come get your reinforcer.

If your recall isn’t working yet and your management fails, go to your dog and deliver the treats anyway. Drop them, many of them, in a straight line to the ground right past her eyeball. (Don't try to shove them in the dog's mouth; dogs are often less likely to take them that way, and you may get bitten.) When I started this protocol with my own dog Pigeon, my recall cue was not working in this situation, and I hadn’t thought of training a new one, “thank you,” yet. When she tore down our gangway to hurl herself snarling at the gate, I hustled after her and stood right behind her, gave the cue, and then fed handfuls of Stella and Chewy’s if she so much as gave me a dirty look. Now, I would feed regardless of whether I got the dirty look. After a few sessions, she was paying much more attention to me when I followed her, both her barking and her visible physiological reactions (e.g., piloerection) got less intense, and I was able to start successfully recalling her out of the barking from farther away.

As with any other recall, I don’t recommend adding punishment--intentional or inadvertent--after the "thank you," as you risk poisoning the cue. If your dog won't take the treats even at the location of barking and you absolutely must bring them away, do it--but then adjust your plan, including preventive management, so you are not routinely hauling them off by the collar after you have said “thank you,” which could turn “thank you” into a signal for avoidance. If you think "thank you" won't work, don't say it before you go get them.

I like to have the treats (and any other reinforcers) that follow “thank you” come in a predictable location so that (a) you don’t have to have treats on or near you all the time and (b) the dog will anticipate the location and start heading there—but if that’s not practical, the location can just be you.

Reinforce the earliest response your dog has to the stimulus. Don’t wait for barking if you can catch the head swivel or ear flick toward the sound.

Don’t reinforce the same type of (same look, same sound, pointing in the same direction) barking if it is offered when there is no stimulus to bark at. I once met a dog who barked at people outside the window, but also appeared to have learned to go around barking at windows even with nothing outside in order to get his person to get his toys out of hiding. That’s the sort of problem you may create if you reinforce the barking when there is no stimulus outside.

If the dog really is barking at surprise stimuli to get treats, congratulations! Many people seem to be happy to just have a nice way to stop barking quickly, before the thing goes away, but to me the coolest part is that now you can use the treats to shape less barking and faster turning. With Pigeon, I had trained up a predictable pattern of start toward the fence, bark a few times--here's the recall!--then turn around and start looking for the treats. So I began to delay the recall until I saw that turn. She started to bark even less and turn faster—skipping ahead to the behavior closest to the reinforcer.

The other things that may be happening are that the aversiveness of the stimulus may be reduced through the pairing with food (which you may not see if you are also routinely hauling your dog away by the collar after the cue); that when the stimulus starts to go away even without the continued barking, it may reveal barking to be unnecessary to remove the stimulus; and that if your response was reinforcing barking, the dog now gets attention or treats from you for less barking or for other behaviors like turning.

Pro Tips: Strategic Treat Delivery

One thing that distinguishes basic from excellent positive reinforcement training is reinforcer delivery. You may be hearing a lot about this in the dog world lately, including Mary Hunter's brilliant demonstrations of building "reinforcement systems" for more effective teaching (she's cohosting a summit on the topic, registration for which closes today), as well as Hannah Branigan's presentation on how to build what's often called "drive" and Eva Bertilsson and Emelie Johnson Vegh's presentation on "starting points" at last week's ClickerExpo.

Here are a few practical considerations and tips that I originally wrote for my professional students in the Karen Pryor Academy on how to thoughtfully deliver just one of the many types of reinforcers that positive reinforcement trainers can use: food, aka "treats."

The delivery of the reinforcer (e.g., the treat) for the behavior that just happened is also the antecedent for whatever the dog is going to do next.

So consider how to deliver the treat so that when the dog is done eating, they are perfectly in position to offer the next thing you want to see.

Some simple examples:

• Want the dog to be ready to move its feet toward a target again? Deliver the treat away from the target.

• Want the dog to remain in position and offer eye contact after eating? Deliver the treat right to the dog's mouth so the dog’s front feet don’t move.

• Want the dog to immediately walk a few more steps by your side? Deliver the treat at your hip or slightly behind.

• Dog shoots ahead after each treat? Toss the treat for attending to you or walking by your side out into the grass, then walk forward and wait for attention or movement to catch up.

• Want the dog to look at a surprise trigger from a little farther away? Toss the treat for the first glance (or have the dog follow it a few steps) in the opposite direction from the trigger.

Behavior between the marker signal and the treat will also be reinforced—and after a lot of predictable reps, you may start to see it occur in anticipation of the marker instead of after it.

So consider how to deliver the treat to avoid things you don’t want between the marker and the treat, and encourage things you do.

Some simple examples:

• If you don’t want a puppy to jump up your leg between the click and the treat, preload your treat hand and deliver treats quickly to the floor after the click. This leaves no time for jumping, and no reason because it’s wasted effort if the treat is predictably going to appear down low. You'll then be able to move the treats back into your pouch once the puppy is waiting down low.

• If you don’t want a dog to curve in front of you every few steps on a walk, deliver treats from the side of your body that the dog usually walks on, rather than bringing them across your body with the opposite hand.

• If you don’t want your dog to turn toward you after going over a jump, mark as the dog begins the jump and send the treat (or toy) in a straight line ahead of them.

• If you'd like your dog to start to sniff when they see another dog, then mark for looking at the dog and sprinkle your treats on the ground rather than handing them to the dog. Your dog will likely begin to look down on hearing the marker, and may then begin to look down on seeing the other dog.

How you deliver reinforcers can add value to or subtract value from the reinforcer itself.

So consider whether:

• Your delivery might be boring: for instance, offering a treat (or toy--this is a common problem in tug) by holding it still rather than moving it away from the dog.

• Your delivery might be aversive, e.g., shoving food into the dog’s mouth or backing a dog up with the treat.

• Your delivery is getting delayed because the dog didn’t see where you tossed the treat or it went under something.

• Conversely, sniffing around in the grass for a treat might add value in some situations.

• A bowled- or “catch”-type treat delivery might be more exciting than a treat delivered to mouth (I often like this for helping food compete with birds and squirrels!)

You may want to change where you deliver the reinforcer at different points in your training of a behavior.

If you deliver in the same place every time without thinking about it, the behavior is likely to drift in the direction of reinforcement delivery. So be ready to watch behavior and constantly adjust.

Examples:

• If you toss your treat to reset every time you click down, your dog may start to pop up in anticipation of the click, or not lie down all the way. You can fix that a number of ways, including being more careful with the timing of your click, but another way is to begin to deliver the reinforcer in place once the dog understands that the behavior is to put their torso on the floor.

• When I attended Bob Bailey's chicken workshops, one exercise was to shape a chicken to walk a figure-8 around two cones . . . after shaping them to walk an oval around them. I clicked and treated a lot for the first step the chicken took to cross through the center, and after a while the chicken would pause there and glance back at me.

• This can be used for good as well as for evil; if you want to teach a dog to shift its weight back when it reaches a threshold (like a door or curb), mark them for arriving at the threshold and deliver the treat behind them.

Consider teaching the dog how a treat will be delivered in a given session before starting to work on the goal behavior.

• The click is frequently talked about as a conditioned reinforcer, but it may be better thought of as a cue to do whatever behavior is required to collect the treat or other "terminal" reinforcer.

• When cues are not clear, their value as reinforcers is weakened. So make your cues about where reinforcement will be delivered clearer.

• Let’s say you are going to click and toss a treat behind the dog. Start with just that—reinforce the default stand, say, with a click and a clearly telegraphed toss to a spot on the floor.

• Once the dog has that pattern, then add in your cue for the established behavior you want to work on, or start capturing or shaping a new behavior.

Check back later; I'll try to add videos to illustrate these various suggestions. But I wanted to get the post up today in case anyone wants to sign up for the Behavior Explorer event.