
Over and over, the pandemic has reinforced the reality of racial disparities in the U.S. health system. But that story remains difficult to see in the data, which is still inconsistently collected and reported across the country.

On Wednesday, a coalition of researchers and advocates launched a tool they hope will fill some of those gaps: the Health Equity Tracker, a portal that collects, analyzes, and makes visible data on some of the inequities entrenched in U.S. medicine.


“For far too long it’s been ‘no data, no problem,’” said Nelson Dunlap, chief of staff at the Satcher Health Leadership Institute at Morehouse School of Medicine, which developed the tool with funding and resources from Gilead Sciences, the Annie E. Casey Foundation, and the CDC Foundation. By making existing data on racial health disparities accessible, the tracker aims to empower local advocates to drive change in their communities, and to inspire action to fill holes in data that are themselves reinforced by structural racism. In the tracker’s display, 38% of federally collected Covid-19 cases report unknown race and ethnicity.

Those gaps are exemplified by the winding path the group had to take to access its Covid-19 data. At its inception, the tracker used state-reported race and ethnicity data collected by the Atlantic’s Covid Tracking Project — a foundation-funded, volunteer-driven effort that, in the absence of a strong, public-facing federal data effort, became a de facto data authority after launching in March 2020; it started tracking race and ethnicity the next month.

It wasn’t until late 2020 that the Centers for Disease Control and Prevention started releasing detailed Covid-19 case surveillance data to a limited pool of applicants, a county-level database that drives the Health Equity Tracker today. “We got access to it in late December, early January,” said Larry Adams, a fellow working on the project. “Which was serendipitous because Covid Tracking Project stopped publishing their data in early March.”


Today, the tracker includes the 26 million lines from that restricted CDC database, each of which represents a single Covid-19 patient — including their state and county, race and ethnicity, sex, age, whether they were hospitalized, and whether they died. It combines that information with state-level health insurance and poverty data from the American Community Survey, and details on diabetes and COPD prevalence from America’s Health Rankings.

Pairing these resources allows users to easily identify patterns across a limited number of health determinants and outcomes. Critically, the tracker’s county-level Covid-19 data makes it a cinch to visualize to what degree Covid-19 has disproportionately impacted communities of color, by comparing the share of total Covid-19 cases against a group’s share of the population.

[Image: Health Equity Tracker]

In New Haven County, Connecticut, for example, the Hispanic or Latino population accounted for 31% of Covid-19 cases but just 18% of the population; Black or African American community members accounted for 16% of cases while making up 13% of the population.
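The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the tracker's own code: the function name `representation_ratio` is hypothetical, and the only inputs are the New Haven County shares quoted in this article.

```python
def representation_ratio(case_share: float, population_share: float) -> float:
    """Ratio of a group's share of cases to its share of the population.

    A value above 1 means the group is overrepresented among cases.
    """
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return case_share / population_share


# Shares quoted in the article for New Haven County, Connecticut (percent):
# (share of Covid-19 cases, share of population).
new_haven = {
    "Hispanic or Latino": (31, 18),
    "Black or African American": (16, 13),
}

# Hypothetical helper applied to each group; not the tracker's actual method.
ratios = {
    group: representation_ratio(cases, population)
    for group, (cases, population) in new_haven.items()
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}x share of cases relative to share of population")
```

A ratio near 1 would mean a group's cases roughly track its population share. The article does not say how records with unknown race and ethnicity enter these shares, so ratios like these are best read as a rough indicator rather than a precise estimate.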

The CDC’s centralized resource could be more powerful for researchers given the scope of its data, but it is difficult for the public to access and still has major gaps. In the New Haven data, half of all records reported unknown race and ethnicity. And there are roughly 7 million cases missing from seven states that haven’t provided sufficient data disaggregated by race and ethnicity to the CDC. On the tracker’s national Covid-19 maps, those states — Louisiana, Mississippi, Missouri, New Hampshire, North Dakota, Texas, and Wyoming — appear in grey.

That missing and unlabeled data doesn’t erase the mountains of evidence for the structural inequities baked into the health system. Still, it prevents the kind of detailed epidemiological work that can prove powerful in unseating those problems. “We have no idea whether the missing data is distributed according to the rest of the population,” said Adams. “I think all of the data deserves deep inspection.”

Dunlap agreed: “That’s why we’re doing what we’re doing.”

Reporting at the federal level is just one slice of the problem. A Satcher analysis of Covid Tracking Project data published this month found that while almost all states started reporting some race and ethnicity data after the August deadline set by the Trump administration to do so, the proportion of records missing the data only dropped from 29% to 23% between April and November 2020.

“Sometimes people don’t collect it; they just don’t ask,” said Adams. “Sometimes a provider can just check a box on behalf of the patient. Sometimes the patient refuses to answer — and there are legitimate reasons somebody might not want to answer.”

Common standards for reporting race and ethnicity come from the Office of Management and Budget, which uses five race categories and two ethnicity categories. But those checkboxes don’t come close to representing every American’s self-identified race and ethnicity. “For very high level aggregations, I think the current OMB standards work,” said Janet Hamilton, executive director of the Council of State and Territorial Epidemiologists. “But I think the OMB standards often do not have the granularity that I think are meaningful for individuals.”

Sometimes there simply isn’t enough bandwidth to collect and report data. “In a provider’s office you can never not include time and resources” when adding another step to administrative work, said Hamilton, “but I also think the infrastructure itself is not well set up to facilitate the reporting process.” Race and ethnicity is recorded frequently in electronic medical records, for example — but when a lab order is placed through the EMR, Hamilton explained, that data doesn’t automatically carry over. Labs are a critical reporting path for epidemiological information.

Solutions to those problems don’t come easy. Hamilton said the federal government could provide more incentives to encourage reporting, more of a carrot than a stick. “What we don’t want, when we talk about sticks, is that the stick becomes something that impedes care,” said Hamilton. “What we don’t want is for someone to say, ‘Okay, we’re not going to offer services because this data is hard for us to collect.’”

The impact of Covid-19 on individual health will long outlive the public emergency, and the tracker will aim to highlight the disparities in those long-term outcomes.

“I think the most important thing to do right now is to not lose focus on Covid,” Adams said. “The CDC only tracks cases, hospitalizations, and deaths; there’s no indication of long Covid, there’s no indication of disability.”

Eventually, the pandemic will end, and the virus may settle in next to other endemic sources of disease. Then, the tracker’s developers hope to bring in other sources of data, from the American Community Survey and elsewhere, to provide a fuller picture of the country’s health outcomes, and the social, economic, and political factors driving them.

Dunlap is looking ahead to incorporating more data into the tracker that can shine light on the relationship between local leadership and community health, or political determinants of health.

“I think that the policy decisions that were made at the height of the pandemic and will continue to be made provide the most clear roadmap for a way out of these times of crisis,” said Dunlap. “If we can get that right, we’ve got a model to move forward.”

  • Collecting this data is great, and it’s something we should always be doing.

    However, this passage…

    “…the tracker’s county-level Covid-19 data makes it a cinch to visualize to what degree Covid-19 has disproportionately impacted communities of color, by comparing the share of total Covid-19 cases against a group’s share of the population.”

    . . .makes a gigantic and rather suspect statistical leap.

    For example, how statistically significant is it that 16% of Covid cases in New Haven came from the Black community, while accounting for 13% of the population? That fairly narrow difference may simply be random, or a function of other non-racial issues like median age of the Black population, incidence of obesity, asthma, hypertension, and, unfortunately but realistically, the prevalence of illegal drug use.

    Even a sizable percentage difference in the Hispanic community may be a function of factors other than racial discrimination (the real, sometimes unspoken, subtext of stories like this), and may instead reflect inherent biological susceptibility DUE to ethnicity.

    Again, gather the data, that’s great. But let’s be truly sophisticated about its actual implications.
