All of us can predict the future.
No, we’re not psychic. But we can all learn. At its core, learning is predicting the future based on the past.
When something good or bad unexpectedly happens, you learn what events or cues predict that outcome. Your brain is full of predictive associations. For example, I predict that eating a Godiva truffle will lead to several moments of bliss. I predict that finishing one Godiva truffle will lead me to eat a second Godiva truffle. I predict that eating a second will lead to eating a third, and that eating the whole box will lead to a stomachache. These are things I have learned from experience, and they help me predict the future.
Learning relies on noticing a discrepancy between what you expect and what you get. When something happens exactly as you predicted, you don’t need to learn anything new. You can just keep doing what you’ve been doing. But when something happens unexpectedly, that’s when you need to figure out why it happened so you can change your future behavior.
How does your brain know when something unexpected happens? This is a more complicated problem than it might seem.
First, your brain needs to know what you’re expecting. For example, I expect that chocolate will taste good. That’s a no-brainer.
Second, your brain needs to know what’s actually happening. Like, whether I’m currently savoring the chocolatey flavor in my mouth or suffering from stomach pains.
Finally, your brain can compare these two things—your expectations vs. real life—to determine whether something unexpected has happened.
Neurons that like to be surprised
So how does your brain actually make this comparison? This is a question that neuroscientists have long pursued.
Studies performed in the 80s and 90s, culminating in a seminal Science paper in 1997,1 identified a cluster of neurons in the brain that seem to compare your expectations with real life: their activity reflects the difference between what you expect and what you get. These neurons release dopamine and reside in a brain region called the ventral tegmental area (VTA), which forms part of the reward system of your brain.
The VTA dopamine neurons fire a lot when you receive an unexpected reward, like getting a free doughnut with your morning coffee from the barista who thinks you’re cute. They’d fire even more if you got two free doughnuts. They don’t fire nearly as much when the reward is expected, like buying your usual morning doughnut for its normal price. And dopamine neuron firing is actually suppressed when you expect a reward and don’t get it, like getting to work and realizing that you accidentally left your doughnut on the subway.2
These dopamine neurons therefore seem poised to serve as a teaching signal: they tell your brain when something unexpectedly good (free doughnut!) or bad (MIA doughnut) has happened. That signal helps you learn to more accurately predict the future so you can do things to maximize your reward, like flirting with the barista or taking better care of your doughnut while in transit.
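The way this teaching signal drives learning can be sketched with the classic Rescorla-Wagner update rule, which long predates this study (the numbers and learning rate below are my own illustration, not anything from the papers discussed here): each time something happens, your expectation shifts toward the actual outcome by a fraction of the prediction error.

```python
# Sketch of learning from prediction errors (Rescorla-Wagner style).
# Illustrative only -- the values and learning rate are made up.

def update_expectation(expected, actual, learning_rate=0.5):
    """Nudge the expectation toward the actual outcome by a
    fraction of the prediction error (actual - expected)."""
    prediction_error = actual - expected
    return expected + learning_rate * prediction_error

expected = 0.0  # day one: you expect no free doughnut
for day in range(5):
    # a free doughnut shows up every day
    expected = update_expectation(expected, actual=1.0)

# After a few repetitions the expectation approaches 1.0,
# so the surprise (prediction error) shrinks toward zero.
print(round(expected, 3))
```

Notice that once the expectation matches reality, the prediction error is zero and learning stops, exactly the "nothing unexpected, nothing to learn" logic described above.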
We’ve known about these VTA dopamine neurons for a while now, but it wasn’t clear how their special properties arise. How do they compare the value of the actual vs. expected reward?
In a recent Nature paper from Nao Uchida’s lab at Harvard,3 researchers set out to determine how the VTA dopamine neurons make this comparison. First, they wanted to determine what kind of computation is actually occurring: the dopamine neurons could be performing subtraction (actual reward minus expected reward) or division (actual reward divided by expected reward).
Subtraction and division often give you similar results—in both cases the neurons fire more when you get a bigger reward or when the reward is more surprising. But subtraction and division yield very different results in some cases, and they have different implications for how the circuit might work.
Uchida’s group designed careful experiments to distinguish between these possibilities. Basically, they had to vary both the surprisingness and the value of the reward. To do so, they provided water drops as a reward to thirsty mice. Sometimes the water drop could be predicted by a smell that turned on just beforehand, which meant that it was an expected reward. The water drops also varied in size, representing different reward values.
By recording how dopamine neurons responded to rewards of different sizes and expectations, the researchers showed that these neurons are performing subtraction: how strongly they fire corresponds with the actual reward size minus the expected reward. Subtraction is what most neuroscientists had long assumed was going on, but it had never actually been proven. Using subtraction means that the difference between an expected and unexpected reward is always the same regardless of the reward size, which seems useful in a world where rewards can span a wide range.
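A toy comparison makes it clear why the two computations come apart (the reward values here are my own arbitrary illustration, not data from the paper): give a mouse one unit more water than it expected, in both a small-reward and a large-reward context.

```python
# Toy comparison of subtractive vs divisive prediction errors.
# Reward values are arbitrary illustrations, not data from the paper.

def subtractive_error(actual, expected):
    return actual - expected

def divisive_error(actual, expected):
    return actual / expected

# A small and a large reward, each 1 unit bigger than expected.
small_sub = subtractive_error(actual=2.0, expected=1.0)    # 1.0
large_sub = subtractive_error(actual=11.0, expected=10.0)  # 1.0
# Subtraction: the surprise signal is identical in both cases.

small_div = divisive_error(actual=2.0, expected=1.0)    # 2.0
large_div = divisive_error(actual=11.0, expected=10.0)  # 1.1
# Division: the same 1-unit surprise looks huge against a small
# expectation and tiny against a large one.
```

Under subtraction, the surprise signal for a 1-unit windfall is constant no matter the baseline; under division, it shrinks as rewards get bigger. That divergence is what let the researchers tell the two computations apart.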
Circuit models for subtraction
Next, Uchida’s group got to the more interesting part: how does this subtraction actually happen?
One model for how the dopamine neurons could perform subtraction is by receiving input from two different sets of neurons—one encoding the actual reward and the other encoding the expected reward.
In this model, getting an actual reward would activate one set of neurons, which would excite the dopamine neurons. The expectation of reward would activate a different set of neurons, which would inhibit the dopamine neurons. This inhibitory input is what achieves the subtraction: the more you expect a reward, the less the dopamine neurons fire.
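This two-input model can be sketched in a few lines (my simplification of the circuit diagram, not code from the study; the input strengths are made-up illustrative values): the dopamine neuron's response is the excitatory "actual reward" drive minus the inhibitory "expected reward" drive.

```python
# Minimal sketch of the two-input subtraction circuit.
# Input strengths are illustrative values, not measurements.

def dopamine_response(actual_reward_input, expected_reward_input):
    """Excitatory input carries the actual reward; inhibitory input
    carries the expectation. Their difference is the firing change."""
    excitation = actual_reward_input
    inhibition = expected_reward_input
    return excitation - inhibition

print(dopamine_response(1.0, 0.0))  # unexpected reward: strong firing
print(dopamine_response(1.0, 1.0))  # fully expected reward: no change
print(dopamine_response(0.0, 1.0))  # expected reward omitted: suppression
```

The three cases mirror the doughnut scenarios above: surprise doughnut, routine doughnut, and doughnut left on the subway.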
Okay, so that’s a nice model and all, but where the heck are we going to find these mystery input neurons?
Previous studies had identified a set of inhibitory neurons in the VTA that connect to the dopamine neurons.4 These neurons release GABA, a neurotransmitter that suppresses the firing of other neurons. A few years ago, Uchida’s group found that the VTA GABA neurons encode the expected value of a reward: they fire depending on how strongly you expect a reward, but they don’t care about whether you actually get it.5
So basically, the GABA neurons are perfect candidates for providing inhibitory input to the dopamine neurons to get them to subtract the expected reward value. To test this model, Uchida’s group manipulated the GABA neurons and recorded how the dopamine neurons changed.
First, the researchers artificially activated the GABA neurons. If their hypothesis was right, this should make the mice expect more water reward than usual, which should make dopamine neurons fire less when they get the same reward as always. That’s exactly what they saw.
Second, the researchers silenced the GABA neurons. This should make the mice expect less reward than usual, which should make the dopamine neurons fire more when they get the normal reward. Indeed, the dopamine neurons were activated more strongly.
Finally, Uchida’s group tested whether manipulating the GABA neurons could actually affect the behavior of the mice. Activating the GABA neurons should make a mouse expect more reward—but if it doesn’t actually get more reward, then it’ll be disappointed and will learn to start expecting less reward. So once the artificial activation is over, the mouse should act like it’s expecting a crappy reward even though the actual reward is the same as always.
You might be wondering, how can researchers tell how much reward a mouse is expecting? It’s not like they can read its mind. But they can measure how much the mouse licks the water spout just before the water reward is delivered. Mice lick more when they think they’re going to get lots of water; they don’t lick much when they’re expecting a meager reward.
Uchida’s group found that activating the GABA neurons produced exactly the effect they expected: the mice licked less in anticipation of the same reward. This shows that the GABA neurons not only encode the expected reward value, but in fact actively modulate the behavior of animals. That’s pretty cool.
Behind the scenes of dopamine neuron firing
Overall, I think this study is significant for a few reasons. First, it confirms the long-standing assumption that VTA dopamine neurons subtractively compute the difference between the actual and expected reward value.
Second, this study identifies the VTA GABA neurons as the source of the subtractive input to the dopamine neurons. I feel like I’ve been constantly hearing about the VTA dopamine neurons and their special firing properties ever since I was a fledgling neuroscientist, but it’s always as if they just exist in a vacuum. I rarely hear about the broader circuits in which they function. So I appreciate that this paper sheds light on the mechanism that gives rise to their special properties. It’s like getting to see what’s behind the scenes at your favorite TV show.
There’s still a lot we don’t know, like how the GABA neurons receive information about reward size, which neurons are affected by dopamine, and how the whole circuit works together to ultimately enable us to learn. This study contributes a small but important chapter to the story of how VTA dopamine neurons compare our expectations to real life, a comparison that lies at the heart of learning.
References and Notes:
1. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science 275:1593-1599 (1997).
2. Yes, I realize that this post contains a disproportionate number of chocolate- and doughnut-themed examples.
3. Eshel N, Bukwich M, Rao V, Hemmelder V, Tian J, Uchida N. Arithmetic and local circuitry underlying dopamine prediction errors. Nature 525:243-246 (2015).
4. Omelchenko N, Sesack SR. Ultrastructural analysis of local collaterals of rat ventral tegmental area neurons: GABA phenotype and synapses onto dopamine and GABA cells. Synapse 63:895-906 (2009).
5. Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature. 482:85-88 (2012).