Classification Scoring for Cleaning Inconsistent Survey Data

Thissen, M. (2017). Classification Scoring for Cleaning Inconsistent Survey Data. International Journal of Data Engineering, 7(1), 1-14. http://www.cscjournals.org/library/manuscriptinfo.php?mc=IJDE-122

Abstract

Data engineers are often asked to detect and resolve inconsistencies within data sets. For some data sources with problems, there is no option to ask for corrections or updates, and the processing steps must do their best with the values in hand. Such circumstances arise in processing survey data, in constructing knowledge bases or data warehouses [1] and in using some public or open data sets.

The goal of data cleaning, sometimes called data editing or integrity checking, is to improve the accuracy of each data record and by extension the quality of the data set as a whole. Generally, this is accomplished through deterministic processes that recode specific data points according to static rules based entirely on data from within the individual record. This traditional method works well for many purposes. However, when high levels of inconsistency exist within an individual respondent's data, classification scoring may provide better results.
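As an illustration of the traditional approach described above, the following is a minimal sketch of deterministic, record-local cleaning. The field names (`vital_status`, `date_of_death`, `age`) and the rules themselves are hypothetical and are not taken from the paper; they only show that each rule inspects a single record and recodes it the same way every time.

```python
from typing import Any, Dict


def clean_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Apply static rules that look only at values inside this one record.

    All field names and rules here are hypothetical illustrations.
    """
    cleaned = dict(record)  # work on a copy; leave the input untouched

    # Rule 1 (assumed): a reported date of death implies a deceased subject
    # when the vital status itself was left blank.
    if cleaned.get("vital_status") is None and cleaned.get("date_of_death"):
        cleaned["vital_status"] = "deceased"

    # Rule 2 (assumed): an implausible age is recoded to missing.
    age = cleaned.get("age")
    if age is not None and not (0 <= age <= 120):
        cleaned["age"] = None

    return cleaned


example = {"vital_status": None, "date_of_death": "2016-03-01", "age": 245}
result = clean_record(example)
```

Rules of this kind cannot weigh conflicting answers against each other, which is the situation the abstract identifies as a candidate for classification scoring.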

Classification scoring is a two-stage process that makes use of information from more than the individual data record. In the first stage, population data is used to define a model, and in the second stage the model is applied to the individual record. The author and colleagues turned to a classification scoring method to resolve inconsistencies in a key value from a recent health survey. Drawing records from a pool of about 11,000 survey respondents for use in training, we defined a model and used it to classify the vital status of the survey subject, since in the case of proxy surveys, the subject of the study may be a different person from the respondent. The scoring model was tested on the next several months' receipts and then applied on a flow basis during the remainder of data collection to the scanned and interpreted forms for a total of 18,841 unique survey subjects. Classification results were confirmed through external means to further validate the approach. This paper provides methodology and algorithmic details and suggests when this type of cleaning process may be useful.
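The two stages can be sketched as follows. The abstract does not give the scoring algorithm, so this example substitutes a simple naive Bayes style log-odds score as a stand-in; the indicator names (`proxy_reports_death`, `answered_current_health`), the smoothing constant, and the threshold are all assumptions made for illustration, and the training rows are made-up toy data rather than survey results.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Indicators = Dict[str, bool]


def fit_scoring_model(training: List[Tuple[Indicators, bool]],
                      smoothing: float = 1.0) -> dict:
    """Stage 1: derive per-indicator weights from population (training) data."""
    n_pos = sum(1 for _, label in training if label)
    n_neg = len(training) - n_pos
    names = sorted({name for indicators, _ in training for name in indicators})
    pos_counts, neg_counts = Counter(), Counter()
    for indicators, label in training:
        for name in names:
            if indicators.get(name):
                (pos_counts if label else neg_counts)[name] += 1
    weights = {}
    for name in names:
        # Smoothed rate at which the indicator is present in each class.
        p_pos = (pos_counts[name] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_counts[name] + smoothing) / (n_neg + 2 * smoothing)
        weights[name] = (math.log(p_pos / p_neg),
                         math.log((1 - p_pos) / (1 - p_neg)))
    prior = math.log((n_pos + smoothing) / (n_neg + smoothing))
    return {"prior": prior, "weights": weights}


def score_record(model: dict, indicators: Indicators) -> float:
    """Stage 2: apply the fitted model to one individual record."""
    score = model["prior"]
    for name, (w_present, w_absent) in model["weights"].items():
        score += w_present if indicators.get(name) else w_absent
    return score


def classify(model: dict, indicators: Indicators, threshold: float = 0.0) -> bool:
    """Return True when the score favors the positive class (here: deceased)."""
    return score_record(model, indicators) > threshold


# Toy training data (invented): label True means the subject is deceased.
training = (
    [({"proxy_reports_death": True, "answered_current_health": False}, True)] * 3
    + [({"proxy_reports_death": False, "answered_current_health": True}, False)] * 3
    + [({"proxy_reports_death": True, "answered_current_health": True}, True)]
    + [({"proxy_reports_death": False, "answered_current_health": False}, False)]
)
model = fit_scoring_model(training)

# An internally inconsistent record: both indicators are present.
inconsistent = {"proxy_reports_death": True, "answered_current_health": True}
is_deceased = classify(model, inconsistent)
```

The point of the sketch is the division of labor: the weights come from many records, so an inconsistent individual record is resolved by how strongly each of its answers is associated with the outcome across the population, not by a fixed rule.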
