Trump’s Census Bureau is looking at changing how it adjusts the data it gives to states for drawing legislative districts after the 2020 Census, citing privacy concerns.
Under the 1965 Voting Rights Act, the Census Bureau is required to provide detailed counts that include information on race and ethnicity, but advances in computer science mean researchers could potentially use those numbers to reconstruct information about individual households and businesses.
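The kind of "database reconstruction" attack at issue can be illustrated with a toy example (the block, its residents, and the published tables below are invented for illustration): when a block is small enough and enough cross-tabulations are published, brute force can recover every resident's record.

```python
from itertools import product

# Hypothetical published tables for a tiny block of 2 residents:
TOTAL = 2        # total population
VOTING_AGE = 1   # residents 18 or older
RACE_A = 1       # residents reporting race "A"
VA_AND_A = 1     # voting-age residents of race "A" (a cross-tabulation)

# Each candidate record: (is_voting_age, race)
candidates = list(product([0, 1], ["A", "B"]))

# Brute-force every possible pair of records and keep only those
# consistent with all of the published totals.
consistent = [
    combo for combo in product(candidates, repeat=TOTAL)
    if sum(r[0] for r in combo) == VOTING_AGE
    and sum(1 for r in combo if r[1] == "A") == RACE_A
    and sum(1 for r in combo if r[0] == 1 and r[1] == "A") == VA_AND_A
]
# Only one combination of residents fits: one voting-age person of
# race "A" and one minor of race "B" -- the records are fully exposed.
```

Here four published numbers pin down both residents exactly; real blocks are larger, but the same enumeration idea scales with more computing power.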
“The Census Bureau has constrained its ability to protect confidentiality,” John Abowd, the bureau’s chief scientist, wrote in a paper he co-authored.
One example of the potential privacy risk is the Netflix Prize, a now-defunct contest the company ran to improve its predictions of how subscribers would rate movies. A closeted lesbian mother and others sued, saying the supposedly anonymized database the company released was enough to identify users.
A census parallel might be researchers compiling a list of people in the country who aren’t citizens, if the citizenship question, which is being challenged in court, is included on the 2020 census.
Abowd and co-author Ian Schmutte, an associate economics professor at the University of Georgia, wrote that providing exact numbers for small areas called block groups undermines the Census Bureau’s ability to protect confidentiality.
Allison Riggs, senior attorney for the Southern Coalition for Social Justice in Durham, N.C., said federal law requires the Census Bureau to provide the exact numbers for redistricting. States need this information to redraw the boundaries of precincts.
“You can’t do redistricting without block-level data,” Riggs said. “You’d have states suing them.”
For decades, the Census Bureau has tried to ensure anonymity with measures such as altering some records and suppressing information, techniques known as statistical disclosure limitation. These methods distort data in ways that aren’t always clear to researchers and can lead to faulty findings.
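One common disclosure-limitation technique is cell suppression: withholding any published count that falls below a threshold, so that very small groups can't be singled out. A toy sketch (the threshold, function name, and data are invented for illustration):

```python
def suppress_small_cells(table, threshold=5):
    """Replace any count below the threshold with None.

    This mimics the suppression style of statistical disclosure
    limitation: small cells are withheld entirely, which protects
    individuals but leaves gaps that can skew downstream analysis.
    """
    return {cell: (count if count >= threshold else None)
            for cell, count in table.items()}

blocks = {"block 1001": 42, "block 1002": 3, "block 1003": 17}
published = suppress_small_cells(blocks)
# block 1002's count of 3 would be withheld from the published table
```

The distortion Abowd and Schmutte describe follows directly: a researcher who drops or guesses at the suppressed cells is working with systematically altered data.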
The Census Bureau is now looking at applying differential privacy, a framework developed by computer scientists including Cynthia Dwork, now a Harvard professor, and researcher Frank McSherry, which is supposed to balance the needs for privacy and accuracy.
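Differential privacy works by adding carefully calibrated random noise to each published statistic, with a parameter epsilon controlling the trade-off: smaller epsilon means stronger privacy but noisier numbers. A minimal sketch of the standard Laplace mechanism for a count query (the function name and parameters are illustrative, not the bureau's actual implementation):

```python
import math
import random

def dp_count(true_count, epsilon):
    """Release a count with epsilon-differential privacy via the
    Laplace mechanism. A population count has sensitivity 1 (adding
    or removing one person changes it by at most 1), so Laplace
    noise with scale 1/epsilon suffices."""
    u = random.random() - 0.5                 # uniform on [-0.5, 0.5)
    scale = 1.0 / epsilon
    # Inverse-transform sample from a Laplace(0, scale) distribution
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# The published block population wobbles around the true value;
# halving epsilon doubles the typical size of that wobble.
noisy = dp_count(1200, epsilon=0.5)
```

Unlike suppression, every number gets published, and the noise distribution is known exactly, so researchers can account for it; the cost is that small-area counts are never exact.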
Wisconsin researchers found the potential for harm in such trade-offs when they applied privacy protections to a public database of patients on warfarin, a blood thinner that can prevent strokes but, at the wrong dosage, can cause them or lead to uncontrolled bleeding, and then looked at the effect on patient outcomes.
The researchers found that in simulated clinical trials, patients were at greater risk of strokes, bleeding and death when effective privacy controls were applied.
The Census Bureau’s Data Stewardship Executive Policy Committee will decide what privacy levels should be applied to the 2020 Census. Researchers are raising alarms about potential accuracy problems, but the Census Bureau is not seeking comment on the possible impact on redistricting.