From the start of the pandemic, Johns Hopkins University and some media organizations recognized that the national public health infrastructure was not well equipped to track COVID-19 data in near real time. These data pioneers rose up to provide the public and policymakers with high-quality, trustworthy information to help keep people safe and inform critical decisions.
On Nov. 14, we convened a panel of our peers who have provided accurate, timely public health data throughout the COVID-19 pandemic: Dr. Lauren Gardner, founder of the COVID-19 Global Map that supports the JHU Coronavirus Resource Center; Robinson Meyer, co-founder of the Covid Tracking Project at The Atlantic; Ian Hodgson, data reporter from the Tampa Bay Times; and Archie Tse, graphics director for the New York Times. These distinguished panelists stepped up to provide the world with the reliable, openly accessible data necessary for decision-making throughout the rapidly changing COVID-19 pandemic. They undertook this effort in the absence of adequate government intervention and amid a confusing landscape of federal, state, and local reporting practices. In all likelihood, the pandemic would have been even more deadly without the work of these panelists.
The panel’s COVID-19 data pioneers stepped up because there were clear and critical data gaps left by governments across the globe, particularly here in the United States. The Covid Tracking Project began as a weeklong endeavor to discern how many COVID-19 tests had been administered in the United States because no one was reporting negative tests. Its early results showed that in the first week of March 2020, only around 2,000 tests had been administered, revealing how underprepared the nation was to expand the testing infrastructure needed to fight the spread of COVID-19. Archie and the team at the New York Times produced some of the most reliable national data on cases in long-term care facilities (LTCFs) and prisons, becoming the first to report that one-third of COVID-19 deaths were occurring in LTCFs (shown below).
The New York Times also stepped up to provide national county-level data on cases and deaths when no one else was providing it. Lauren Gardner and her team became the go-to resource for global, and then national, COVID-19 case and death data as they manually aggregated information from authoritative sources around the world. The ability of Lauren’s team and these other organizations to make executive decisions quickly without the federal government’s bureaucracy and official reporting requirements allowed for quick and effective action.
Our pandemic data efforts, while rewarding, have been time-consuming and draining. All of our panelists reported that their team members were consistently working 18+ hour days, seven days a week. The Global Map, which was the brainchild of Ensheng Dong in Lauren Gardner’s lab, has now expanded into the Coronavirus Resource Center (CRC), which employs a diverse team of data scientists, programmers, engineers, communications specialists, and public health experts and provides around-the-clock data updates and insights.
The Covid Tracking Project was unique in being a volunteer effort, with people contributing their time and skill, often while on sabbatical or leave from their regular jobs, out of dedication to improving public health. At the New York Times, what started as a graphic of individual COVID-19 cases on a map of China evolved into a project with a 77-member team, initially drawn from interns, college journalists, and freelance reporters. They submitted daily data acquisition requests to health departments across the country in the pursuit of complete, granular data. Ian Hodgson and other reporters at the Tampa Bay Times often had to threaten or actually pursue legal action to receive data from the government of Florida. Acquiring all this rich data required the most dedicated and talented individuals, to whom we are all immeasurably grateful.
We all agreed that our work would have been less strenuous had the United States employed rigorous standards for public health data across all states and territories. New York Times team members called individual county offices regularly to have definitions clarified and datasets properly labeled. Ian and the Tampa Bay Times observed the highest quality data and metadata coming out of Florida at the beginning of the pandemic, but authorities later changed data definitions, reduced the release of data, and reclassified pandemic data as non-essential to public health, allowing it to be held back.
The CRC has managed data collection from over 200 countries with constantly changing temporal and spatial granularity for data (shown in part above), which limits the efficacy of automation and forces us to rely on manual anomaly detection. The Covid Tracking Project, which focused on testing definitions (a constant source of stress here at the CRC), had to reach out to state testing sites and health authorities to clarify the types of tests and testing data available. These talented data experts had immense difficulties navigating byzantine state reporting policies, and none of it would have been necessary had there been a stronger federal policy for data standards.
The audience was interested in how we plan to transition from a pandemic to an endemic disease. How and when do we start wrapping up our independent data collection efforts if the situation is less urgent? The Covid Tracking Project shut down on March 7, 2021, to allow its volunteers to rest and to encourage the federal government to take over the process. Ian argued that our jobs are necessary as long as health departments are underfunded and understaffed, making data collection and analysis a low priority in a crisis. As he showed in the image below, based on reporting by his Tampa Bay Times colleagues Steve Contorno and Neil Bedi, Florida's health departments have some of the lowest staffing and lowest pay in the United States despite their critical role in pandemic data.
Archie insisted that the job of independent data collectors isn’t done until the government steps up and can be relied upon to prioritize data collection enough to provide the same quality of publicly accessible data that our organizations have produced. Lauren is assisting with archiving the CRC data, but sees the need for these public health data systems to exist in perpetuity for non-COVID applications. Academia and the media have continuing roles to play in public health, but we all long for a day when COVID-19 data is not in daily headlines. I am encouraged to know that these passionate data pioneers are committed to serving the global community. I hope that the insight and outlook of this panel can be carried forward in the continued work of the Pandemic Data Initiative as we work to better prepare our data systems for the next public health crisis. This panel was an honor to convene, and I encourage you to both watch the recording of this forum and pre-register for our next exciting forum on Jan. 21.