Understanding How Machine Learning Impacts Survey Research


Surveys allow us to ask fundamental questions about people’s internal states (emotions, expectations, opinions, etc.) that often cannot be learned from observational data alone. Through sampling, survey statisticians are able to select respondents in a way that closely reflects a population’s composition. For decades, high-quality statistical estimates generated with these methods have transformed the landscape of public policy, business, and social science research. As we move into a more digital age, survey research is now facing several foundational challenges, including declining response rates, rising survey costs, and distrust in official statistics.

Given these challenges, machine learning offers ways to enhance survey operations by providing complementary perspectives through discovery, prediction, and optimization. During the inaugural BigSurv18 conference, we discussed machine learning applications for survey research; presentations ranged from model-assisted coding of open-ended survey responses to machine-learning-inspired imputation methods. My colleagues and I had the opportunity to present on two different lines of research that use machine learning to augment survey operations.

While I believe that the cross-pollination of ideas between survey research and machine learning is a fruitful area of interdisciplinary research, the collaboration is still in its infancy. From what I’ve seen at this year’s BigSurv18 conference, I am deeply encouraged by the excitement of researchers in both disciplines to learn more about each other’s work and by the commitment of both groups to finding areas of common purpose. Though some have said the coming of Big Data is the end of surveys, I believe that advances in technology and new methods will only amplify our ability to continue learning about each other by asking the right questions of the right people.