Peter Baumgartner, of RTI’s Center for Data Science, combines data science and design thinking to build software that solves analytic problems. Mr. Baumgartner’s expertise is in computational social science, focusing on natural language processing (NLP), machine learning (ML), data visualization, and open-source technologies to solve problems with data. While at RTI, he has leveraged his technical expertise on various projects, extracting value from unstructured data sources including social media, online news articles, and verbatim text from open-ended survey questions.
Mr. Baumgartner applies his expertise in ML and NLP across multiple domains. He is currently the lead data scientist on an R21 grant from NIDA to understand how people self-treat opioid withdrawal symptoms, using over 3.5 million narrative reports from reddit.com. For this project he engineered a named entity recognition pipeline to identify substances and their effects, which included creating an annotation scheme, labeling data, and training a model that achieved strong performance. He also works on the Survey of Earned Doctorates, developing an approach that uses state-of-the-art NLP models to analyze open-ended survey responses about the impact of COVID-19 on respondents' experiences. Past work includes cross-disciplinary collaboration on the NC DHHS-funded Rapid COVID-19 Hospital Capacity Scenario Modeling and Forecasting project, where he was responsible for a 72 percent increase in the runtime speed of an agent-based model at a time when rapid forecasts were essential for decision-making. He also led the development and evaluation of a custom warrant management software application, built in partnership with the Greensboro, North Carolina, Police Department, which helped the agency prioritize warrant service based on agency policy.
He has also developed and released several open-source tools during his tenure at RTI. Gobbli is a Python package that makes training state-of-the-art NLP models easier. Another is a machine learning model and coding tool that allows researchers and analysts to classify verbatim criminal offense texts into the National Corrections Reporting Program (NCRP) offense code classification. A third, PushshiftRedditDistiller, is a Julia library that allows other researchers to download, extract, and filter datasets from an archive of content from reddit.com.
Before joining RTI in 2015, Mr. Baumgartner worked as a consultant at Deloitte in its Advanced Analytics and Modeling practice. While there, he worked on several analytics initiatives in insurance underwriting, workforce analysis, veteran career services, health care and life sciences, and business development.