Pandemic data about Latinos has been severely limited due to inaccurate categorizations of race and ethnicity as well as outreach hobbled by language and other research barriers. But small changes to data collection methods could have a big impact on recognizing and addressing long-standing health disparities in the Latino community.
Latinos are one of the groups most impacted by COVID-19, according to multiple studies. Dr. Kathleen Page, associate professor in the division of infectious diseases at the Johns Hopkins School of Medicine, asserts that there are major complications with collecting data that accurately reflects Latino communities. We need to expand and refine our demographic definitions and deliver more inclusive outreach so that better data can help us meaningfully reduce health disparities in Latino communities.
There were times when “Hispanic” was considered a race even though there are people of many races who are part of the Hispanic ethnicity. There has been progress on that front, primarily in separating race from ethnicity. The biggest remaining problem is that we are a very diverse group, so while racial categorization may be obvious for some, the majority don't fit clearly into “Black,” “White,” or “Asian.” We just saw data from the 2020 census where more than 50% of Hispanics classify their race as “other.” That can be problematic because nobody knows what “other” means.
We also use the same categories in our studies because we're required to do so by our funders, but participants get really confused. Many of the Latinos I work with are of indigenous background. In Latin America they may be called mestizo, which is a mix of indigenous and white, but that is not a category in U.S. surveys. Maybe they would consider themselves Native Americans, as they are from the Americas, but we recognize that “Native American” in the United States has a very specific definition, which does not apply to people of Mayan, Quechua, or Aymara descent. There's definitely room for improvement and expansion in those labels.
Most of the time, you'll see “non-Hispanic Whites,” “non-Hispanic Blacks,” and “Hispanics,” a category that includes Hispanic Blacks, Hispanic Whites, Hispanic Asians, and Hispanic Others. Within the Latino community there are huge disparities, due not only to socioeconomics but also to race. There's well-documented evidence that Latinos and Latinas of African descent have worse health outcomes by some measures. I work a lot with foreign-born Latinos, and one parameter that they feel comfortable sharing is their country of origin, which can provide some nuanced information. The risks for some conditions vary between people who are from Mexico, Puerto Rico, Central America, etc., but it still doesn't get to that issue of race and how it drives disparities within the Latino community.
Country of origin is a very useful data point that we need to collect. It's not difficult; we just need to actually ask people that question. Questions about visa status are harder to get answers to. That data is important because undocumented immigrants are one of the most marginalized groups globally. There are sensitivities around disclosure of documentation status, especially if there's uncertainty about where this data is going to be shared and who can access it. There are confidentiality protections for research subjects, but even the perceived risk of that data being shared can dampen participation. My approach is often not to ask directly about documentation status, but to try to discern who's most vulnerable through other markers like socioeconomic status and English proficiency.
The CDC regularly publishes data on the risk of infection, death, and hospitalization from COVID-19, and the risk of infection in Latinos is generally much higher than in Whites. While there were even larger discrepancies for Latinos in the early stages of the pandemic, the difference is still profound. We also know that one of the most marginalized groups of Latinos that have suffered the most during this pandemic is low-income essential workers. Often, they are immigrants with limited English proficiency, living in crowded housing, without access to economic and housing assistance. Those markers, like who was low income but not eligible for a stimulus check, give you a sense that these people are mostly undocumented immigrants, but we don't have national-level data proving that.
If you're going to address disparities in the Latino community, the more you can integrate different levels of highly granular data, the better. It's critical, though, to remember that big data often systematically excludes some groups. Medicaid data, which is used so effectively in many studies, doesn't include data from undocumented immigrants, who are ineligible for Medicaid. You have to keep those data gaps in mind. Then, to complement the data you have, you need to do community-engaged work, paying special attention to ensuring that people are getting access at all levels. You can't perform outreach in a format that only a few people can utilize. It's like when the vaccine rolled out and the only way to get a vaccine was through websites and forms that not everyone could understand. The same thing is true when we collect data.
Data, although limited, has been incredibly useful in revealing what's really going on with healthcare. A lot of this data is geographically coded, so we can use it to identify hotspots of transmission and high risk, but those are also good markers of areas where there are structural inequities. We can use that data to identify places where we should be looking at other outcomes like diabetes, heart attacks, or violence, and addressing some of the broader aspects of the health disparities agenda. At this point in the pandemic, the hotspots may be more reflective of areas with low vaccination rates due specifically to vaccine hesitancy, which may or may not be correlated directly with the presence of health disparities. But earlier pandemic data can really serve as a roadmap for locating and addressing health outcomes in high-risk areas apart from COVID-19.