A little more than a decade ago, Mike Mendl developed a new test for gauging a laboratory rat’s level of happiness. Mendl, an animal welfare researcher in the veterinary school at the University of Bristol in England, was looking for an objective way to tell whether animals in captivity were suffering. Specifically, he wanted to be able to measure whether, and how much, disruptions in lab rats’ routines—being placed in an unfamiliar cage, say, or experiencing a change in the light/dark cycle of the room in which they were housed—were bumming them out.
He and his colleagues explicitly drew on an extensive literature in psychology that describes how people with mood disorders such as depression process information and make decisions: They tend to focus on and recall more negative events and to judge ambiguous things in a more negative way. You might say that they tend to see the proverbial glass as half-empty rather than half-full. “We thought that it’s easier to measure cognitive things than emotional ones, so we devised a test that would give us some indication of how animals responded under ambiguity,” Mendl says. “Then, we could use that as a proxy measure of the emotional state they were in.”
First, they trained rats to associate one tone with something positive (food, of course) and a different tone with something negative (hearing an unpleasant noise). They also trained them to press a lever upon hearing the good tone. Then, for the test, they’d play an intermediate tone and watch how the animals responded. Rats have great hearing, and the ones whose cage life wasn’t disturbed were pretty good judges of where the new tone fell between the other two sounds. If it was closer to the positive tone, they’d hit the lever, and if it was closer to the negative one, they’d lay off. But the ones whose routine had been tweaked over the preceding two weeks judged this auditory information more negatively. Essentially, their negative responses bled into the positive half of the sound continuum.
Since Mendl published his so-called judgment bias task in 2004, it’s been shown to work in at least 15 other species, including dogs, sheep, bees, and even us humans. Some scientists—Mendl included—have begun to ask whether there’s a role for it beyond animal welfare. Considering that it probes one of the core clinical features of depression, could it be used to evaluate the efficacy of much-needed new medicines for that condition?
Drug discovery in neuroscience has hit a wall, with just 1 in 10 drugs tested in the final stage of clinical trials reaching the finish line of approval. With very few exceptions, no new types of drugs for mind disorders have been approved for decades. You might think drugs fail because they’re found to be toxic, but most die in clinical trials because they aren’t shown to work. Trace that back to the root of the problem, and one big stumbling block along the drug development pathway is the point where animal tests—most of them done in rodents—wrongly predicted the drugs would work.
“We have lots of experience with this—15 to 20 years of failure,” says Ricardo Dolmetsch, the global head of neuroscience at the Novartis Institutes for Biomedical Research. “I can name 14 or 15 examples [of tested drugs] that were just fantastic in animals and did not do anything at all in humans.”
Even as these failures have accrued, neuroscientists armed with increasingly potent tools for pinpointing the genes that play a role in psychiatric disorders and the brain circuits those genes control are getting closer to understanding the pathologies of these illnesses. As drug companies—which had largely abandoned or strongly curtailed their efforts in neuroscience and mental health over the past several years—begin to dip their toes back into the water, it seems a fitting time to ask whether modeling aspects of the human mind in rodents is even possible.
One word explains why testing neuropsychiatric drugs in animal models is hard, and that word is language. If we want people to tell us how they feel, we ask them. Animals, of course, have to show us—and it turns out some of our widely used methods for guiding them to do so haven’t been that great. That’s particularly true for depression. How do we know a rat is depressed?
An experiment called the “forced swim test,” or the “Porsolt test” after its inventor, Roger Porsolt, has been widely used since the late 1970s, not least by pharmaceutical companies and drug regulators.
It’s a remarkable story. Before the mid-20th century, treatments for mental or psychiatric disorders consisted primarily of psychotherapy or interventions like sleep cures, insulin shock therapy, surgeries such as lobotomy, or electrical brain stimulation—most prominently, electroconvulsive therapy. Quite suddenly, spurred by the accidental discovery of an antipsychotic drug called chlorpromazine in 1952, these conditions were re-imagined as chemical imbalances that could be corrected with a well-designed pill.
These new compounds had their first runs in institutionalized patients. Medicinal chemists went on a synthesizing frenzy, riffing off compounds that had seemed effective in the hopes of adjusting potency and side-effect profiles, or of further expanding the cornucopia of psychoactive drugs. Soon, companies began freely giving out early-stage compounds to academic researchers who were up for observing how animals that ingested these novel chemical entities behaved.
By the 1970s, companies were deeply invested in conducting their own behavioral testing, primarily in rodents. Anti-anxiety medicines were big sellers, and there was a handful of ways to screen for them; for example, seeing whether experimental compounds could boost a rat’s interest in exploring an unfamiliar environment, or its willingness to engage in a behavior that it had been conditioned to avoid. It’s hard to say whether or how closely those tasks reflected anxiety as experienced by humans. Certainly, though, such drug testing drew on the relatively new fields of ethology and behaviorism, which generally assumed that behavioral principles gleaned from laboratory animals broadly applied to people.
For depression, however, as well as for conditions like psychosis, the tests weren’t very good because they relied too strongly on pharmacology. Give a rat a drug known to induce a state that seems to have features of the disorder, and then see if an experimental compound reverses the effect. The problem was, this system was inherently rigged to find drugs that worked by the same mechanisms that the inducing agents did.
At the time, Porsolt was working at Synthelabo (later acquired by Sanofi), a French pharmaceutical company. While conducting a water maze experiment, he noticed that his rats had a propensity to just stop swimming in the middle of the task. He found it curious, and it reminded him of the work of another researcher, Martin Seligman. A few years earlier, Seligman had found that dogs trapped in adverse situations from which they couldn’t escape eventually stopped trying—a phenomenon he termed “learned helplessness.” What Porsolt observed with his rats looked similar.
Porsolt soon designed the forced swim test, which made its debut in a two-page report in Nature in 1977. It’s very easy to perform. Researchers place a mouse or rat in a beaker of water from which there is no exit. Invariably, after a few minutes, it stops struggling to escape and simply hangs in the water, immobile. Animals given antidepressant drugs before undergoing the procedure a second time, Porsolt reported, struggle longer before apparently succumbing to what he poetically called “behavioral despair.”
Pretty much right away, academic researchers studying depression and pharmaceutical companies developing new medicines began using it in full force. “There’s not a single dossier of a newly introduced antidepressant in the last 20 years where they have not used the swimming test,” Porsolt says now. “It’s become the standard test.”
The assay gave what you’d call the correct answer for the early antidepressants available in the 1970s, on which Porsolt validated it—that is, the drugs that kept the animals struggling longer also relieved depression in people. And by most accounts it worked great in predicting efficacy for the first selective serotonin reuptake inhibitors, specifically Prozac (fluoxetine), which was approved in the United States in 1987.
Porsolt concedes that attributing despair to the animals was a hugely anthropomorphic leap. But he doesn’t see it as a problem. “You know, I’m a pragmatist,” he says. “There’s nothing wrong with engaging in anthropomorphism, provided you put it to the test.”
What convinced Porsolt that his assay was the real deal was that the antidepressants available then tended to make animals sluggish, yet in the context of the test they had the opposite effect. “That was the first big surprise—that you give the classical antidepressants of the time—tricyclics like imipramine—at doses which otherwise are sedative, and the animals become active again,” he says. Because inducing this state didn’t require pre-administering drugs, as earlier tests had, he believed his behavioral assay could in theory identify antidepressant effects in any type of chemical compound.
There is little to quibble with in the initial models. Neurobiology was in its infancy, and the “animal assays were really smart,” says Steven E. Hyman, director of the Stanley Center for Psychiatric Research at the Broad Institute of the Massachusetts Institute of Technology and Harvard, and director of the National Institute of Mental Health from 1996 to 2001. But when psychiatric drugs went mass-market in the 1980s, companies doubled down on the strategy of relying on simple behavioral tests, like the forced swim test, to screen new compounds.
For a while, Hyman says, the drugs improved in terms of safety and side effects. Their efficacy, however, generally didn’t, and it soon became clear that behavioral tests didn’t help identify new types of chemical compounds. Yet companies kept turning the same animal model crank. “They were the accepted models and they were quick and easy to do,” says Mark Tricklebank, who founded and, until a few years ago, directed Eli Lilly’s Centre for Cognitive Neuroscience, an industry and academic partnership to improve animal models of cognition. “Too much focus on results and deadlines tends to push people to worry only about collecting data and not its quality,” he says.
Today, 30 years after Prozac arrived on the market, it’s remarkable how few novel types of antidepressants have been found. (Other psychiatric conditions, like schizophrenia, have fared no better.) The fact is, the forced swim test is a poor stand-in for depression. There’s just no way to conclude why rats or mice cease swimming in the beaker. “It may be that those are actually the wise rodents, because they’re conserving energy once they realize they’re not drowning,” says Hyman. “If you give them imipramine or a drug like it and they struggle longer, why is that better?” Emma Robinson, a psychopharmacologist at the University of Bristol, agrees. The Porsolt test may have been key to the development of Prozac and second-generation antidepressants, she says, but, “to be honest, I don’t think we know what the forced swim test is measuring. It’s given a lot of false data.”
A big part of the difficulty in judging what such behavioral tests in animals do and don’t reveal about the human mind comes down to what we might call human errors of implementation. Take another ubiquitously used behavioral test, the Morris water maze, in which researchers release a rat or mouse into a pool of water, then time how long it takes to find a submerged platform to stand on. Normally, over several trials the animal gets quicker at putting something solid beneath its feet, revealing its use of spatial memory.
But the test has also been adopted as a stand-in for clinically relevant memory loss, like the kind experienced in Alzheimer’s disease—even though there is no evidence that it is applicable there. “It’s a measure of fear-based, fear-motivated escape, which is of very little relevance to the disorders for which it’s regularly used,” says Joseph Garner, a neuroscientist at Stanford University who studies animal models. In fact, notes Garner, although the Morris water maze is one of the most widely used tests in behavioral research, a 2007 study found that an animal’s performance correlates strongly with visual acuity, suggesting it is as much a test of vision as of memory.
Behavioral tests used in pain research provide another good example. One standard measure for whether an analgesic is effective in a mouse is how quickly it withdraws its paw from a heat lamp. Reflexively withdrawing from heat, though, is very different from the debilitating pain that generally troubles people—which tends to be chronic rather than acute, and to come from within rather than from outside the body. If a drug can treat one type of pain in an animal, says Jeffrey Mogil, a pain researcher at McGill University in Montreal, that doesn’t mean it will work for the other types in humans. “It’s a mismatch of the symptoms in humans and the choice of symptoms in animals,” Mogil says. This dissociation has bedeviled the search for novel pain medicines, but he explains that we shouldn’t be surprised. “We use that test because it’s convenient for us, and reliable.”
With all these problems, a growing cadre of researchers says that the use of animal models in psychiatry needs a major rethink because the kinds of behavioral tests the field has relied on to probe rodent minds simply don’t match up to the human mind. Behaviors in rodents and humans have been shaped by very different evolutionary trajectories, notes Hyman, and assuming they are supported by the same brain circuitry simply because they appear similar to us is “in the same intellectual ballpark as [classifying] insects, birds, and bats together because they all fly.”
Does that mean that it’s impossible to come up with tasks that do allow researchers to compare an animal’s mental state directly with a human’s? Perhaps not. Over the past few decades, neuropsychiatrists have developed standardized batteries of human cognitive tasks for probing processes like attention and impulsivity, with the aim of better understanding the cluster of symptoms that come together in any given disease. Because a strong focus was placed on tasks that don’t rely on language, the field was able to build on that work by “reverse-translating” them—basically, recreating them as closely as possible in animal models.
Meanwhile, advances in neuroimaging—both in humans and in rodents—mean it’s possible to make sure that the same brain regions and the same circuits are engaged in the human and the animal model. “If we see the same neural circuits involved in the rat as in the human, or if some particular task or drug strengthens communication between different parts of the brain, then we know we have a translatable task,” says Holly Moore, a neuroscientist at Columbia University who uses animal models to study the neural basis of schizophrenia. “We just now have the imaging chops to do that.”
A few years ago, a grant from Pfizer launched Robinson—the University of Bristol psychopharmacologist—on an effort to build on Mendl’s rat happiness task and develop one more suitable for drug testing. Rather than having the animals judge the similarity between different tones, Robinson has them dig for food, a task more relevant to their lives. She trains them to associate a specific digging material—say, sawdust, or sand—with a food reward. When asked to choose between the two types of digging material later, their choice is colored by whether or not they were having a good day, by lab rat standards, when they were trained on it. Her lab has already begun to explore how to compare these rats’ measured mental states with those of people trained to do a human version of such a task. But Robinson admits there’s a lot left to do to determine what such a match-up really means.
For now, nobody can say for sure whether all of this activity will produce novel medicines for people who need them. In schizophrenia, there have already been some positive effects for the field, Moore says. “I see the literature moving toward a more thoughtful approach to behavioral tasks and more widely questioning assumptions underlying both the human and animal research,” she says. “I do know we won’t be wasting as much time as we have been.”
It could be that the way forward is not to abandon rodent behavioral tests, but to refine them. Garner argues that for researchers who are well-versed in rat or mouse behavior, there is no a priori impediment to designing studies in which rodent cognitive faculties are directly compared with human ones. Many behaviors are in fact evolutionarily conserved, he says, and brain imaging or other techniques can be used to ensure that the same neural circuits are engaged across species. Even if the approach works, however, it’s unclear whether drug companies will follow that route, since such tests would most likely be more complicated and time-intensive—and more expensive.
Novartis, for one, is taking a different route. The company plans to test new drugs in rodents. But rather than futz with behavioral tests that make assumptions about rodent minds and human diseases, it will use the animals just to determine that a drug hits the cells or brain regions it is intended for. As for testing whether or not a drug treats some component of a psychiatric disorder, Novartis is going back to the future—that is, straight to humans. Dolmetsch guesses that other companies in psychiatry are doing the same. Some of the companies’ leads will come from developing better versions of compounds like ketamine, serendipitously found to have psychiatric effects in humans. Others will come from dissecting neural circuits in people with rare mutations that point to some mechanism underlying brain and mind diseases.
“I think studying animal behavior is still valuable for its own sake,” says Dolmetsch. “It’s just not necessarily the best way of modeling psychiatric diseases.”
Alla Katsnelson is a freelance science writer with a doctorate in neuroscience. She lives in Northampton, Massachusetts.