Dr. Lauren Gardner explains the difficulties in acquiring public health data during the COVID-19 pandemic and gives her perspective on how public health data systems should be maintained and democratized moving forward.
Dr. Lauren Gardner, director of the Center for Systems Science and Engineering at Johns Hopkins, had worked with public health data systems related to infectious disease outbreaks for many years prior to the COVID-19 pandemic. She and her team built the COVID-19 global map that is the keystone of the Johns Hopkins Coronavirus Resource Center. Governments, media outlets, and members of the public from around the globe have used the map to make data-driven decisions throughout the pandemic. This has generated an extremely high-value public health data set, which Dr. Gardner and the CRC have made publicly available to help inform and engage the global community during this crisis. Challenges in gathering the data have made it clear that everyone involved in public health must band together to maintain newly established data ties and build a stronger data infrastructure in anticipation of the next major crisis.
The challenges that we faced this past year with data collection efforts are not specific to COVID-19 at all. I've been working with infectious disease data for over a decade, and this whole field is really data poor, especially in the emerging stages of an outbreak. However, high-quality data is imperative to be able to conduct useful risk assessment and make evidence-based decisions in real time. This recognized data gap is why we started the COVID-19 global dashboard effort in the first place.
For measles, in particular, we had a lot of similar challenges. We wanted to know where measles outbreaks were happening and where there were likely to be new hotspots. However, there was no centralized data set of measles cases available for the US at the county or state level. We had to go state by state to collect the data ourselves from many disparate sources.
Furthermore, when we do these data collection efforts, they aren’t transferable because the data is not provided in any systematic way. Therefore, when we want to update our risk models (say, for future years), we have to go around and re-collect all the same data for a different year, which essentially means starting from scratch. For COVID-19, states are currently providing data in a better format than for measles, with designated webpages and dashboards, but I'm not optimistic that these will become standing data infrastructures moving forward.
It’s not the time to slow things down. The states should be using this experience as an opportunity to build sustainable infrastructures that can continue to provide data at more regular intervals moving forward. However, they seem to be doing the opposite. Concerningly, when states shift from daily to weekly reporting, it becomes harder to compare them with states that are continuing to release updates every day. They think that they're rounding out this experience and closing down shop, but COVID is so far from over in so many parts of the world. With the borders opening and so much transmission still happening, there is still a substantial risk posed within and to the US. States that shut down reporting will eventually have to rebuild and start up again, either when COVID-19 flares up or when there's another pandemic or other public health crisis. States definitely need to be moving towards a more reliable and consistent reporting infrastructure, rather than scaling back.
It is inappropriate to sell public health data during a public health crisis. We never considered monetizing this data. We collected this data so it could be made available to anyone who needed it to make better decisions, reduce harm, and mitigate risk.
At first, the data was open for use by education, public health, and research institutions, but it was restricted from for-profit applications by corporations. However, early in 2020 we relaxed those requirements and opened it up for anyone to use for any application with an attribution license. We did this because there were many one-off for-profit use cases proposed to us, such as planning office re-openings, that we felt were justified uses of the data, and we were giving permission one at a time. This approach bothered me since smaller organizations couldn’t get their special-case requests heard. This is why we opened up the license.
It should never have to be done this way again. The type of data we collected and shared clearly needs to exist, but it cannot be provided by building a system from scratch in real time. There needs to be an infrastructure in place that systematically centralizes data, and required reporting standards to complement it. Then, when new viruses emerge, they can be easily incorporated into these standing systems that are centralized, democratized, and public.
Another thing I learned was how hungry the general public was for information. We built the global tracking map thinking that our small research community of infectious disease modelers would love to have access to this type of data. We never foresaw how much it would be valued and relied upon by the general public.