Pandemic Data Outlook

Collecting Comorbidity Data to Identify COVID-19 Correlations

Little data on comorbidities is currently reported publicly. Comorbidity data is critical to designing better treatments, crafting health policy and response strategies, and aiding in patient care for those infected by SARS-CoV-2. If they aren’t already, states should begin producing, using, and sharing this data.

Beth Blauer, Associate Vice Provost, JHU
October 4, 2021

Understanding a COVID-19 patient’s pre-existing health conditions has been an essential component of navigating the pandemic response. Whether the task is understanding who is most at risk of hospitalization or death, prioritizing vaccine distribution, or communicating with the public, comorbidity data is critical. A comorbidity is simply the simultaneous presence of more than one disease or condition in a patient. Physicians and patients are most interested in comorbidities that increase one’s risk of severe illness or death from COVID-19, as diabetes illustrates. Reports show that about 10.5% of Americans are diabetic,1 yet 15.7% of COVID-19 deaths are associated with diabetes.2 This points to a clear increased risk that should inform public behavior, especially for people who live with diabetes.
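The diabetes comparison can be made concrete with a line of arithmetic. The short Python sketch below uses only the two percentages quoted above; the function name `representation_ratio` is ours, chosen for illustration. The result is a crude measure of over-representation among deaths, not an adjusted relative risk, since it ignores age and other confounders.

```python
# Figures quoted in the text (both are percentages).
diabetes_share_of_population = 10.5    # % of Americans who are diabetic
diabetes_share_of_covid_deaths = 15.7  # % of COVID-19 deaths associated with diabetes


def representation_ratio(share_of_deaths, share_of_population):
    """Return how over- or under-represented a group is among deaths.

    A value above 1 means the group accounts for a larger share of
    deaths than of the general population. This is an unadjusted ratio.
    """
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_deaths / share_of_population


ratio = representation_ratio(
    diabetes_share_of_covid_deaths, diabetes_share_of_population
)
print(f"Diabetes is over-represented among COVID-19 deaths by a factor of {ratio:.2f}")
```

On these figures, people with diabetes appear among COVID-19 deaths about one and a half times as often as their share of the population would suggest.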


The Centers for Disease Control and Prevention currently lists the following as comorbidities that increase a patient’s likelihood of developing severe COVID-19 leading to hospitalization or death:

Chronic kidney disease, chronic lung disease, dementia or another neurological condition, diabetes, Down syndrome, heart conditions including hypertension, HIV, an immunocompromised state, liver disease, excess weight or obesity, pregnancy, sickle cell disease, organ or stem cell transplants, stroke or cerebrovascular disease, or substance use disorders including current and former cigarette smokers.3

That list may seem overwhelming, but it’s reflective of the current state of research on the SARS-CoV-2 virus: there is still so much we do not know. Data on patient comorbidities are crucial to scientific research and development of new treatment options. The better we know how the virus works, the more likely we are to defeat it.

Due to the absence of unified data management systems in the United States, much of what is known about COVID-19 comorbidities comes not from active patient records, but from death certificate data from the National Center for Health Statistics (NCHS). There is also some comorbidity data in the CDC’s Case Surveillance Data Set, but this is a binary variable indicating only whether the patient had any comorbidities, which prevents analysis of the impact of any specific condition. The other source of comorbidity information is individual researchers performing retrospective studies on patients who have already left the hospital. These studies involve isolated hospitals and medical systems and are complicated by the disconnected nature of electronic medical records and the proprietary nature of patient data. Medical research has historically been performed in this manner, which is hyper-focused, repetitious, and limited in scope.

For example, cardiovascular comorbidities have been shown to correlate with increased risk of being put on a ventilator or dying from COVID-19 based on separate retrospective patient studies in Wuhan,4 New York City,5 and Los Angeles6 among others. There are now even studies summarizing and aggregating the data from the separate cohort studies.7 While these physicians and scientists are performing incredible work, this system for procuring comorbidity data from isolated health systems to make local conclusions that can then be compared to other studies is overly complex and time-consuming. The limiting step is data.

As of now, the only regularly updated, national source of COVID-19 comorbidity data is the CDC. They provide COVID-19 death data disaggregated by 22 comorbidity categories.2 The CDC produces this dataset based on the release of official death records to the NCHS, which then processes, codes, and tabulates the data before releasing it to the public. This data offers the best comprehensive view of COVID-19 comorbidity data we have in the United States, but the CDC itself admits that this data has significant lag and is incomplete due to how different states process and report official death records.

Some states have circumvented the lag that official recordkeeping causes at the CDC and have done an incredible job throughout this pandemic providing real-time data disaggregated by demographics. Comorbidity data should be another prong of this data collection and reporting. Besides being faster than the CDC, states can provide comorbidity data for COVID-19 cases and hospitalizations as well as deaths. State dashboards have been rich data sources throughout this pandemic, and they would be well-suited to providing this data. States should also consider how comorbidity data can shape public health policymaking more broadly.


Utah’s dashboard is an excellent example of a state providing highly detailed COVID-19 comorbidity data (a partial view is shown above).8 In addition to comorbidity data for COVID-19 cases, hospitalizations, and deaths, the state also provides the data disaggregated by race and ethnicity. The more data and cross-categorization that is accessible, the easier it becomes to analyze this disease. Utah is a great model, but unless other states also begin collecting and providing the same data, the data can only be analyzed as an isolated study of COVID-19 in Utah.

That brings us to the constant issue with public health data in the United States – standardization. States have been unable to consistently define demographic categories, vaccine status, and booster shot data, which are all data types with far fewer options and choices than comorbidities. A standard set of comorbidity options will need to be established and used by all states. The CDC and NCHS utilize the “ICD-10” codes, which are disease categorizations established by the World Health Organization and utilized throughout the medical field, making them excellent standards to employ nationwide. It is best to standardize prior to data collection and reporting as rebuilding the systems is much more difficult than building them correctly from the start, as we have seen with other COVID-19 data streams.
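To illustrate why a shared code set matters, here is a minimal Python sketch of how differently worded state labels could be mapped onto ICD-10 categories before being combined. The lookup table, the two state reports, and all counts are hypothetical and invented for illustration; only the ICD-10 category codes themselves (E11, I10, N18, E66, J44) are real. An actual system would rely on coded medical records, not free-text matching.

```python
from collections import Counter

# Hypothetical lookup from free-text labels to ICD-10 categories.
# The codes are real ICD-10 categories; the label variants are made up.
ICD10_BY_LABEL = {
    "type 2 diabetes": "E11",
    "hypertension": "I10",
    "high blood pressure": "I10",
    "chronic kidney disease": "N18",
    "ckd": "N18",
    "obesity": "E66",
    "copd": "J44",
}


def to_icd10(label):
    """Return the ICD-10 category for a label, or None if it is unknown."""
    return ICD10_BY_LABEL.get(label.strip().lower())


def harmonize(state_counts):
    """Re-key one state's counts by ICD-10 category.

    Returns the re-keyed counts plus any labels that could not be mapped.
    """
    totals = Counter()
    unmapped = []
    for label, count in state_counts.items():
        code = to_icd10(label)
        if code is None:
            unmapped.append(label)
        else:
            totals[code] += count
    return totals, unmapped


# Illustrative, invented reports from two states using different wording.
state_a = {"Hypertension": 120, "Type 2 diabetes": 80, "CKD": 15}
state_b = {"high blood pressure": 95, "COPD": 40, "Kidney trouble": 7}

combined = Counter()
leftovers = []
for report in (state_a, state_b):
    totals, unmapped = harmonize(report)
    combined.update(totals)  # Counter.update adds counts together
    leftovers.extend(unmapped)

print(dict(combined))
print(leftovers)
```

In this toy example the two hypertension labels merge cleanly under I10, while "Kidney trouble" cannot be placed and is left out of the totals. That is the kind of loss that standardizing before collection is meant to avoid.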

This data is important, but how it is collected and reported will determine how useful it truly is for defining a global understanding of COVID-19. A national unified data management system with clear and immediately useful definitions would remove the need to discuss standardization, as medical records would have the same options across the country. Analysis would also be simplified as much of the bureaucratic red tape preventing researchers from utilizing larger cohorts would be removed. This type of system exists outside of the United States, and it will be a major boon to combatting future public health crises if we begin working towards a unified medical record system now.

However, we currently have a disjointed, federated system whose confines we must work within. Therefore, the burden lies on the individual states once again to take the initiative to record this data and share it with the public. Decision-making is highly influenced by this data. Policy makers need to know if they represent populations that are at increased risk due to comorbidities to design more effective relief and mitigation strategies. There may also be conditions unlisted as risk factors due to the smaller sample sizes and incomplete data of studies already performed. Making this data easily available for researchers will also boost the speed and efficacy of COVID-19 research. While we do not have much data on COVID-19 and comorbidities right now, we have the capacity to collect, share, and analyze that data to home in on the function and true danger of SARS-CoV-2.


  1. Centers for Disease Control and Prevention, National Diabetes Statistics Report, Atlanta, GA, Centers for Disease Control and Prevention, U.S. Dept of Health and Human Services, 2020.
  2. National Center for Health Statistics, Weekly Updates by Select Demographic and Geographic Characteristics, Updated 22 September 2021. (Accessed 27 September 2021).
  3. National Center for Immunization and Respiratory Diseases, People with Certain Medical Conditions, 20 April 2021. (Accessed 27 September 2021).
  4. F. Zhou, T. Yu, R. Du, G. Fan, Y. Liu, Z. Liu, J. Xiang, Y. Wang, B. Song, X. Gu, L. Guan, Y. Wei, H. Li, X. Wu, J. Xu, S. Tu, Y. Zhang, H. Chen, B. Cao, Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study, Lancet 395(10229) (2020) 1054-1062.
  5. T. Kuno, M. Takahashi, R. Obata, T. Maeda, Cardiovascular comorbidities, cardiac injury, and prognosis of COVID-19 in New York City, American Heart Journal 226 (2020) 24-25.
  6. T.S. Chang, Y. Ding, M.K. Freund, R. Johnson, T. Schwarz, J.M. Yabu, C. Hazlett, J.N. Chiang, A. Wulf, UCLA Health Data Mart Working Group, D.H. Geschwind, M.J. Butte, B. Pasaniuc, Prior diagnoses and medications as risk factors for COVID-19 in a Los Angeles Health System, medRxiv (2020) 2020.07.03.20145581.
  7. J. Sabatino, S. De Rosa, G. Di Salvo, C. Indolfi, Impact of cardiovascular risk profile on COVID-19 outcome. A meta-analysis, PLOS ONE 15(8) (2020) e0237131.
  8. Utah Department of Health, COVID-19 Cases by Age and Sex, Updated 27 September 2021. (Accessed 28 September 2021).

Comorbidity case data taken from Utah’s Coronavirus Dashboard.8


Beth Blauer is the Associate Vice Provost for Public Sector Innovation and Executive Director of the Centers for Civic Impact at Johns Hopkins. Blauer and her team transform raw COVID-19 data into clear and compelling visualizations that help policymakers and the public understand the pandemic and make evidence-based decisions about health and safety.