Can the path of a child’s life—things like their future grade point average—be predicted using computer models?
In theory, this idea isn’t outlandish. In today’s digital world, algorithms are often trained to predict the health outcomes of patients, or how likely someone is to pay back their loans. So a team of researchers wondered whether this sort of analysis could help predict—and eventually buffer—future hardships of children, particularly from less-resourced families.
To that end, the scientists analyzed data on more than 4,000 American families gathered over 15 years, beginning at a child’s birth, including information about the children, their parents, schools, and the stability of their environments. The researchers took the data from the first nine years and attempted to predict six key academic and personal outcomes for the kids when they turned 15.
But things like academic success and family hardship are, it appears, harder to forecast than many other targets of computer-driven prediction. In 2020, the team published their findings. “To our surprise, the predictions were not very accurate,” says Ian Lundberg, a sociologist at the University of California, Los Angeles, who was one of the co-authors of the 2020 paper. “So this left us wondering: Why?”
In a new study, Lundberg and his colleagues delved into why the earlier study failed to forecast the kids’ outcomes accurately, zeroing in on GPA. They reconnected with 40 families who were tracked in the dataset and interviewed them extensively to learn more about the nuances of their lives that were missed in the numbers. The findings, published earlier this year in Proceedings of the National Academy of Sciences, suggest that the shortcomings in predicting outcomes weren’t just about a lack of data or computational limits. Rather, there is a fundamental limit on our ability to foretell the complexities of life.
“To actually go back and try to understand the reasons for not performing well, especially on the outliers … is the really innovative part of the study,” says Ramina Sotoudeh, a sociologist at Yale University who was not involved with the new research.
This failure of prediction can be attributed to two main sources, the study authors note. The first is something called irreducible error. An example is an unexpected event in a child’s adolescent years, such as a parent’s death, that can’t be foreseen from a factor like income, says Lundberg. “In that case, there’s really no machine learning or computational methods that can make prediction better,” he adds.
The second is learning error: errors within an algorithm’s learning process. The kinds of outcomes the scientists were trying to measure (grades, grit, eviction, family hardships) are influenced by a lot of different variables, which can form patterns that an algorithm can learn and then use to predict an outcome. But when there are too many variables, algorithms can sometimes learn the wrong pattern, says Lundberg. This type of learning error shrinks as more individuals are added to the dataset. But for long-term longitudinal studies like this, it’s difficult to get more than a few thousand people to participate. “It’s a fundamental problem of [studying] complexity,” says Sotoudeh.
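For readers who want to see the two error sources side by side, here is a toy simulation. It is not the study’s method or data: the single predictor, the straight-line relationship, and the names `NOISE_VAR`, `TRUE_SLOPE`, `fit_line`, `learning_error`, and `expected_error` are all assumptions made up for illustration. It fits a line to simulated families and separates the part of the prediction error that shrinks with more families (learning error) from the part that never does (irreducible error).

```python
import random

NOISE_VAR = 1.0   # assumed variance of unforeseeable shocks (irreducible error)
TRUE_SLOPE = 2.0  # made-up "true" effect of the single predictor


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def learning_error(n_train, reps=100, seed=0):
    """Average squared gap between the fitted line and the true line.

    The true line is y = TRUE_SLOPE * x. For a new case with x ~ N(0, 1),
    the expected squared gap is intercept**2 + (slope - TRUE_SLOPE)**2.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0, 1) for _ in range(n_train)]
        ys = [TRUE_SLOPE * x + rng.gauss(0, NOISE_VAR ** 0.5) for x in xs]
        intercept, slope = fit_line(xs, ys)
        total += intercept ** 2 + (slope - TRUE_SLOPE) ** 2
    return total / reps


def expected_error(n_train):
    """Expected squared prediction error: irreducible part plus learning part."""
    return NOISE_VAR + learning_error(n_train)


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(n, round(learning_error(n), 4), round(expected_error(n), 4))
```

In this sketch the learning error falls toward zero as the number of simulated families grows, while the total error never drops below `NOISE_VAR`. Real social data have far more variables and messier relationships, so the numbers here say nothing about the actual study.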
The new findings also highlight the value of qualitative research—conducting interviews and talking with human beings can yield insights that a quantitative approach can’t. Some qualitative observations made by sociologists, such as how people interact and form relationships, are hard to translate into a number, says Sotoudeh, and could also be influencing outcomes.
For Lundberg, diving into the qualitative side of this research was eye-opening, even though he had spent a lot of time studying the original dataset. The new study, as an example, highlights the story of “Bella” (a pseudonym for a study participant). She had a stable childhood and, by the variables that the researchers measured, seemed to be headed for a high GPA in her teens. But between ages 9 and 15, Bella’s father died unexpectedly, and her mother descended into depression; Bella started struggling academically and socially because of factors that weren’t captured in the data.
There is a strong desire to be able to anticipate what’s going to happen in the future, whether for yourself or for your loved ones, and we often do this using our own human judgment, says Lundberg. “There’s growing interest in the idea that computers might be able to help us do that more accurately … But the most important implication of our study is that we should not default to the belief that all outcomes are going to become predictable with increasing computational power,” he says.
“The answer is not always more data,” echoes Sotoudeh. “Social outcomes, they’re unpredictable and they’re complex. And we just have to make peace with this unpredictability.”
Lead image: Jorm Sangsorn / Shutterstock