Expert Insight

Q&A: Lessons in Pandemic Data from Institutional Research

Johns Hopkins has supported COVID-19 research since the beginning of the pandemic, but researchers still face challenges in data collection and collaboration. Dr. Shruti Mehta has served as a principal investigator on multiple COVID-19 research projects and shares how these experiences can inform future pandemic data collaborations to improve the process for researchers, participants, policymakers, and the public.

Author:
Joshua E. Porterfield, PhD
November 3, 2021

When the COVID-19 pandemic arose, academics across disciplines began investigating all aspects of the pandemic. Dr. Shruti Mehta, a professor of epidemiology at the Bloomberg School of Public Health, has been a co-lead on multiple COVID-19 projects, including Pandemic Pulse, COVID Long, and C-FORWARD. These experiences with data collection and analysis in the midst of an ever-changing public health landscape have given her a perspective on pandemic research that should inform future pandemic preparedness as well as ongoing COVID-19 studies.

Can you describe the origins of the Pandemic Pulse study?

We were funded by the Johns Hopkins COVID-19 Research Response Fund, which funds C-FORWARD, a large community-based cohort study. Pandemic Pulse was a smaller effort that spun off from that because we were interested in measuring real-time behavior changes. As restrictions were starting to ease, we wanted to measure mobility and how much people were engaging in non-essential activities. Did behaviors differ across groups and geographies, and how did this influence transmission?

“We recognized a data gap in that there was a lot of data being collected on symptoms and testing, but not much about upstream behaviors.”

For example, cell phone data was being analyzed to identify mobility patterns, but what it misses is the ability to link mobility to demographic characteristics or geographies and, in turn, to SARS-CoV-2 positivity. We hypothesized that if we could monitor behaviors over time and see where there was a change in the amount or type of activities that people were participating in, we could potentially predict a spike in cases.

What were your major findings regarding behavior and testing positivity?

Interestingly, what we saw was that movement and activity patterns were comparable across the 10 diverse states that we surveyed, but we found differences within demographic groups. In each state, 80-85% of the population was doing almost nothing apart from going to the grocery store and maybe visiting friends and family; however, there was a small group of young, predominantly white individuals who were highly active, and those were the individuals who were significantly more likely to test positive for SARS-CoV-2. In terms of testing, by asking questions of everyone and sampling without respect to symptoms or testing, we have actually been able to get an estimate of COVID-19 undercounting. Even at the height of the pandemic, only 50% of people who reported that they had symptoms or an exposure actually got a test. From there, we were able to demonstrate barriers to testing.

“Without behavioral data that you can analyze by demographic and geographic characteristics, you're missing a critical piece of the puzzle.”

What lessons did you learn from Pandemic Pulse and how have they influenced your other studies?

There is a balance between collecting data rapidly and ensuring things like representativeness. We have tried to find that balance across the different studies I have been involved with. It is critical for the public to have access to data, but whoever is analyzing it needs to understand where it's coming from, what the biases are, and what the limitations are, so that inferences aren't drawn that those limitations can't support. With Pandemic Pulse we had these goals of making data available, making summaries available in real time, and sharing with health departments. We did what we could, but we struggled to keep up with the pace that was needed. We learned later that, instead of making the full data set available in a dashboard, a story format with key findings and messages worked better. We did this around the holidays with a message about how travel and participation in non-essential activities were associated with high levels of positivity.

For COVID Long we capitalized on some of the things that we had learned already with Pandemic Pulse to improve the process, looking once again at data gaps. There are many studies suggesting that some percentage of people still have symptoms post-infection, but most of those samples were coming from patients in clinical care, hospitalized patients, or people with severe COVID-19. We wanted a larger, potentially more representative sample of individuals who have had COVID-19 to get an estimate of burden as well as symptomatology, since, even with the data that was out there when we started, there was no established case definition.

On the other end of these rapid online assessments is the C-FORWARD study, which in some ways uses the gold-standard methods for arriving at a representative sample. We took all of the census block groups in Baltimore, grouped them by race/ethnicity and poverty, selected a sample from within these groups, and reached out to targeted households to get what we call a "population representative sample." But it has taken months. The challenge is the time that it takes to do that type of sampling correctly. And then even with all of our careful selection, we know that the people who participate are sometimes different: people who may have had COVID-19 or who are more concerned about their health. We are always worried about groups that are more difficult to engage and about minority populations being underrepresented. We have worked to engage the community and to compensate participants for their time to support participation across groups. The thing that we've seen with COVID-19 studies is that some of those populations have been repeatedly tapped and overburdened. They're becoming understandably frustrated, and it means we need to take a better look at our approaches.
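For readers curious what that kind of stratified design looks like in practice, the sketch below shows one minimal way to draw census block groups by stratum using Python and pandas. The column names, poverty-rate cut points, and per-stratum sample size are illustrative assumptions, not the actual C-FORWARD sampling plan.

```python
# A minimal sketch of a stratified block-group sample, assuming a table of
# Baltimore census block groups with hypothetical columns 'geoid',
# 'majority_race_eth', and 'poverty_rate'. Cut points and sample sizes are
# illustrative only.
import pandas as pd

def draw_stratified_sample(block_groups: pd.DataFrame,
                           per_stratum: int = 5,
                           seed: int = 0) -> pd.DataFrame:
    """Group block groups into race/ethnicity-by-poverty strata,
    then randomly sample block groups within each stratum."""
    df = block_groups.copy()
    # Band the poverty rate so each stratum combines race/ethnicity and poverty.
    df["poverty_band"] = pd.cut(df["poverty_rate"],
                                bins=[0.0, 0.1, 0.2, 1.0],
                                labels=["low", "medium", "high"],
                                include_lowest=True)
    # Sample up to `per_stratum` block groups from every stratum that exists.
    sampled = (df.groupby(["majority_race_eth", "poverty_band"], observed=True)
                 .apply(lambda g: g.sample(n=min(per_stratum, len(g)),
                                           random_state=seed))
                 .reset_index(drop=True))
    # Households would then be recruited from within the sampled block groups.
    return sampled
```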

How do we improve collaboration and the interconnectivity of data?

Unfortunately, we're not inherently set up to do this. Academics, particularly at Hopkins, are entrepreneurs; we work in our small groups, and we do it well. The university tried by setting up research groups, giving out funds, and encouraging people to work together, but it was still challenging. The pandemic demanded that things happen quickly, and it was impossible to integrate all that was going on and prevent duplication. Moreover, systems are not in place to support researchers in doing this kind of work quickly. Everyone tried to make it faster. NIH tried to have expedited reviews. The IRBs tried to review more frequently, but then who's reviewing the IRB protocols? Overextended faculty.

“I am still amazed at how much was accomplished in a short time, but I can’t help but think of how much more we could have done if we had a system that supported the speed that was needed.”

While it is in some ways impossible to be fully prepared, looking forward, what we need is some sort of pandemic response unit. This pandemic is not over, and there will be others. We need a dedicated pandemic response effort that is always working in this area, so that there is at least a small group that doesn't have to pivot its work and that provides some central organizational structure. I think we learned a lot from the rapid surveys that we and other groups put out. It would be great to have behavioral surveillance be a core part of the response. This is not new; the CDC does this annually for HIV surveillance. While it may not be possible to fully integrate efforts and be 100% prepared, I think we have learned a lot for the next time!

Joshua E. Porterfield, PhD

Dr. Joshua E. Porterfield, Pandemic Data Initiative content lead, is a writer with the Centers for Civic Impact. He is using his PhD in Chemical and Biomolecular Engineering to give an informed perspective on public health data issues.