Improved decision making for water lead testing in U.S. child care facilities using machine-learned Bayesian networks
Mulhern, R. E., Kondash, A. J., Norman, E., Johnson, J., Levine, K., McWilliams, A., Napier, M., Weber, F., Stella, L., Wood, E., Lee Pow Jackson, C., Colley, S., Cajka, J. C., MacDonald Gibson, J., & Hoponick Redmon, J. (2023). Improved decision making for water lead testing in U.S. child care facilities using machine-learned Bayesian networks. Environmental Science & Technology, (46). https://doi.org/10.1021/acs.est.2c07477
Tap water lead testing programs in the U.S. need improved methods for identifying high-risk facilities to optimize limited resources. In this study, machine-learned Bayesian network (BN) models were used to predict building-wide water lead risk in over 4,000 child care facilities in North Carolina according to maximum and 90th percentile lead levels from water lead concentrations at 22,943 taps. The performance of the BN models was compared to common alternative risk factors, or heuristics, used to inform water lead testing programs among child care facilities including building age, water source, and Head Start program status. The BN models identified a range of variables associated with building-wide water lead, with facilities that serve low-income families, rely on groundwater, and have more taps exhibiting greater risk. Models predicting the probability of a single tap exceeding each target concentration performed better than models predicting facilities with clustered high-risk taps. The BN models' Fβ-scores outperformed each of the alternative heuristics by 118-213%. This represents up to a 60% increase in the number of high-risk facilities that could be identified and up to a 49% decrease in the number of samples that would need to be collected by using BN model-informed sampling compared to using simple heuristics. Overall, this study demonstrates the value of machine-learning approaches for identifying high water lead risk that could improve lead testing programs nationwide.
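For readers unfamiliar with the Fβ-score used above to compare the BN models with the heuristics, the following is a minimal sketch of the standard definition, computed from confusion-matrix counts. The function name `f_beta`, the choice of β = 2, and the facility counts in the usage lines are illustrative assumptions; the abstract does not state the β value or the underlying counts used in the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from confusion-matrix counts.

    tp: high-risk facilities correctly flagged
    fp: low-risk facilities flagged as high-risk
    fn: high-risk facilities that were missed
    beta > 1 weights recall more heavily than precision
    (beta = 2 is an assumed value, not taken from the study).
    """
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not data from the paper).
heuristic_score = f_beta(tp=8, fp=8, fn=2, beta=2.0)
balanced_score = f_beta(tp=8, fp=2, fn=2, beta=1.0)
print(round(heuristic_score, 3), round(balanced_score, 3))
```

With β greater than 1, a missed high-risk facility lowers the score more than an unnecessary test does, which is one plausible reason to prefer Fβ over plain accuracy when screening for lead exposure.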