This post was originally published June 8, 2020 on Towards Data Science.
In a racist society, it is not enough to be non-racist [data scientists]. We must be antiracist [data scientists] — Angela Davis
Data scientists are data stewards. We collect data, store data, transform data, visualize data, and ultimately impact how data are used. In our data-driven world, we have found ourselves with the responsibility to use data to tell stories and effect change.
But with this responsibility, it is not enough for us non-Black data scientists to simply not be racist. It is not enough for us to sit behind our computer screens to write code and feel angry but not take action after the deaths of George Floyd, Breonna Taylor, Ahmaud Arbery, and too many other Black individuals. It is not enough for us to acknowledge the racist systems that continue to exist in the United States but not actively do anything about them.
As Angela Davis said, “In a racist society, it is not enough to be non-racist. We must be antiracist.” Both non-racists and antiracists recognize that racism and white supremacy are wrong. Antiracists are those who take action to do something about it.
As non-Black data scientists, we must be antiracist data scientists. We must take responsibility for our power and privilege. We must confront the ways in which data and algorithms have been used to perpetuate racism, and eliminate racist decisions and algorithms in our own work. We must recognize that our field is lacking diversity (only ~3% of data scientists identify as Black) and contribute to pathways that change this. Being antiracist data scientists isn’t a one-time decision or something we will ever fully achieve, but instead a commitment we make each day, now and in the future, towards building a more equal world.
Here are 5 steps we can take to get started:
To be antiracist data scientists, we must take the steps to be antiracist individuals. Being antiracist is different for white people than it is for people of color. As written in this toolkit by the National Museum of African American History and Culture: “For white people, being antiracist evolves with their racial identity development. They must acknowledge and understand their privilege, work to change their internalized racism, and interrupt racism when they see it. For people of color, it means recognizing how race and racism have been internalized, and whether it has been applied to other people of color.” This excerpt from The Racial Healing Handbook by Dr. Anneliese Singh is a great place to start as it walks through the six responsibilities that individuals can take in the ongoing process to be antiracist: Read, Reflect, Remember, Risk, Rejection, and Relationship Building.
To white readers specifically who have begun to acknowledge privilege and are looking to Read and Reflect — before burdening Black, Indigenous, or People of Color (BIPOC) friends with requests for reading resources or conversation, start with the many resource lists that are currently available online, such as here and here, and reach out to white friends who are also on this journey for conversation.
As data scientists, we use data to answer questions, solve problems, and (hopefully) have a positive impact. But history has repeatedly shown that good intentions are not enough. Data and algorithms have been used to perpetuate racism and racist societal structures. It is imperative that we educate ourselves about these realities and the uneven effects they have had on Black lives*. This list is meant as a starting point and is by no means exhaustive; we must continue to learn from, contribute to, and amplify research and reporting on this work in our efforts to confront these challenges.
News Articles: Racial Bias in a Medical Algorithm Favors White Patients Over Sicker Black Patients; Many Facial-Recognition Systems Are Biased, Says US Study; Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks; As Cameras Track Detroit’s Residents, a Debate Ensues Over Racial Bias; Facebook’s ad-serving algorithm discriminates by gender and race; How community members in Ramsey County stopped a big-data plan from flagging students as at-risk
Lectures: Big Data, Technology, and the Law; Algorithmic Justice: Race, Bias, and Big Data; Legitimizing True Safety (which includes discussion of facial recognition and how police surveillance is currently being used against Detroit residents accused of violating social distancing orders)
Books (consider purchasing from a Black bookstore): Algorithms of Oppression: How Search Engines Reinforce Racism (Safiya Noble); Artificial Unintelligence: How Computers Misunderstand the World (Meredith Broussard); Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor (Virginia Eubanks); Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech (Sara Wachter-Boettcher); Weapons of Math Destruction (Cathy O’Neil)
Experts to Follow: Nasma Ahmed (Digital Justice Lab); Alvaro Bedoya (Visiting Professor of Law at Georgetown University and Founding Director of the Center on Privacy and Technology); Meredith Broussard (Associate Professor at NYU); Joy Buolamwini (MIT Media Lab); Max Clermont (Senior Political Advisor to Holyoke Mayor Alex Morse); Teresa Hodge (Co-founder and CEO of R3 Technologies); Tamika Lewis (Fellow at Data Justice Lab); Yeshimabeit Milner (Co-founder and Executive Director, Data for Black Lives); Tawana Petty (Non-Resident Fellow at the Digital Society Lab and Director of Detroit Community Technology Project); Rashida Richardson (Director of Policy Research at AI Now); Samuel Sinyangwe (Co-founder of Campaign Zero); Latanya Sweeney (Professor of Government and Technology in Residence at Harvard University, Director of the Data Privacy Lab)
Organizations to Follow: Data & Society; AI Now; Digital Civil Society Lab; Center on Privacy and Technology; Data for Black Lives; Campaign Zero; Digital Equity Laboratory; Data Justice Lab
*While this post specifically focuses on antiracist support for Black individuals, there is also a long history of data-driven discrimination related to ethnicity, gender, sexuality, and other demographic attributes, which I encourage readers to learn more about as well.
As antiracist data scientists, we must commit to taking action every day in our own work to eliminate racist decisions and algorithms. There is no one checklist that will accomplish this, but I have found myself regularly applying a series of questions to the data science projects that I contribute to. Portions of these questions come from a 2018 lecture I attended titled “The Data You Have and the Questions You Ask It” by Logan Koepke, a Senior Policy Analyst at Upturn.
If the answers to these questions reveal underlying racism, we must speak out and challenge the status quo.
There is a growing body of research on technical approaches to addressing race in algorithms in a way that considers fairness. Simply leaving race out of an algorithm and claiming “fairness through unawareness” is unacceptable: an algorithm that does not include race as a predictor is not necessarily unbiased, because other variables can act as proxies for race. Instead, data scientists should explicitly consider the sensitivity of algorithms to race. This article provides an introduction to algorithmic fairness, including the concepts of Demographic Parity, Equalized Odds, and Predictive Rate Parity, and tools that can be used to reduce disparity during pre-processing, training, and post-processing. This article illustrates how to explore Demographic Parity using SHAP, an explainable AI tool. The report Exploring Fairness in Machine Learning for International Development by the MIT D-Lab explores in considerable detail how to integrate fairness into a machine learning project. For additional learning, use this free online textbook and these videos: Google Machine Learning Crash Course Fairness in ML; 2017 Tutorial on Fairness in Machine Learning; 21 Fairness Definitions and Their Politics.
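To make the three fairness concepts above a little more concrete, here is a minimal sketch in plain Python. It is not taken from any of the linked resources. The function names (`safe_div`, `group_rates`, `gap`), the group labels "A" and "B", and the toy labels and predictions are all hypothetical and chosen only for illustration. It reads Demographic Parity as comparing selection rates across groups, Equalized Odds as comparing true positive and false positive rates, and Predictive Rate Parity as comparing precision, which is one common interpretation rather than the only one.

```python
from collections import defaultdict


def safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def group_rates(y_true, y_pred, group):
    """Compute per-group confusion-matrix rates for binary labels and predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, g in zip(y_true, y_pred, group):
        # "t"/"f": was the prediction correct? "p"/"n": was it positive or negative?
        cell = ("t" if truth == pred else "f") + ("p" if pred == 1 else "n")
        counts[g][cell] += 1

    rates = {}
    for g, c in counts.items():
        total = sum(c.values())
        rates[g] = {
            # Demographic Parity compares this rate across groups.
            "selection_rate": safe_div(c["tp"] + c["fp"], total),
            # Equalized Odds compares both of these rates across groups.
            "tpr": safe_div(c["tp"], c["tp"] + c["fn"]),
            "fpr": safe_div(c["fp"], c["fp"] + c["tn"]),
            # Predictive Rate Parity (as read here) compares precision across groups.
            "precision": safe_div(c["tp"], c["tp"] + c["fp"]),
        }
    return rates


def gap(rates, metric):
    """Largest difference in a metric between any two groups."""
    values = [r[metric] for r in rates.values()]
    return max(values) - min(values)


# Hypothetical toy data: the predictions never "see" the group label,
# yet the rates can still differ by group.
group = ["A"] * 8 + ["B"] * 8
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]

rates = group_rates(y_true, y_pred, group)
gaps = {m: gap(rates, m) for m in ("selection_rate", "tpr", "fpr", "precision")}

for g, r in sorted(rates.items()):
    print(g, r)
print("gaps:", gaps)
```

In this toy example every gap is nonzero even though the group label is never an input to the predictions, which is the kind of disparity that "fairness through unawareness" would fail to surface. On a real project, the group labels, the choice of metric, and what counts as an acceptable gap all need to be decided with the affected communities and the context in mind, not by the code.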
The 2020 Harnham US Data and Analytics Report found that only 3% of Data and Analytics professionals identified as Black, with even fewer in leadership positions. This is unacceptable, particularly as we (non-Black data scientists) continue to use data collected from Black communities and write algorithms that affect them.
To push the organizations we work for and the data science community at large to change, we must commit to:
It is no secret that data science is a lucrative field with a mean annual salary of approximately $100,000. Since we were not born knowing data science, many of us have likely entered this field thanks to robust educational experiences. As antiracist data scientists, we must recognize that we live in a racist society where educational opportunities are distributed unequally. Since data science impacts everyone, we must commit to using the financial resources we’ve received for our work to support educational experiences that increase diversity in the data science workforce (and make this lucrative field more accessible) as well as data awareness for everyone.
Set up recurring monthly donations to Black-led and community-driven organizations contributing to data awareness, data collection, and data visualization of timely issues such as police violence. Organizations to consider include:
Set up recurring monthly donations to support data science and tech programs that serve Black students. While it may be tempting to volunteer for teaching opportunities, it can be extremely powerful for BIPOC students to learn from BIPOC data scientists. Consider financially supporting programs such as:
In 2016, Google completed research highlighting the role that community colleges can play and the challenges they face in creating a pathway to increased diversity in computer science. Community colleges generally have substantially smaller financial requirements than universities for starting a scholarship, and these scholarships can go a long way. Reach out to the financial aid office at your local community college to get started today.
Many HBCUs have existing or new data science programs including:
Reach out to these programs directly to learn more.
We cannot stand by as the decisions we make as data stewards continue to cause irreparable harm to the Black community. I’ve committed to the steps in this post while knowing that the work will not end so long as racism continues to exist.
I hope you join me.
I welcome feedback and additional contributions.
Thanks to the friends, colleagues, and family members who provided feedback on the draft of this post, and to the many role models who have provided guidance on my journey thus far. The work continues.