Understanding COVID-19 Through Data

COVID-19 has been raging for months now. While some countries have passed the worst phase, others are only now approaching the peak. The number of confirmed cases worldwide continues to grow, and countries like Germany, the UK, the US, and China are hurrying to create and test vaccines. But there are still many questions to be answered.

A recurring question heard everywhere is, "Who is infected?" 

We have turned to Antonio Fernandez Anta, Research Professor at IMDEA Networks Institute in Madrid and speaker at J on The Beach 2020, to help answer this.

Antonio, together with an international team of scientists, has launched a study called the CoronaSurveys project, which aims to estimate the real number of COVID-19 infections. They are reaching people worldwide through the Twitter page @CoronaSurveys to get the responses needed for their study. Every day, the team analyses the data collected, updates its estimates, and publishes them on the project webpage.

JOTB: So, Antonio, how did you come up with the idea for your research? 

Antonio: I was listening to the number of infected cases that were being mentioned all over the news. You know, newspapers, television, everywhere, and I knew that what they were reporting were only the confirmed cases, which was by no means the actual number of cases, because they didn’t have enough laboratories and tests to account for every single infected person in the country, particularly in Spain. So, I was thinking, "How can I help?" and I was exploring multiple options.

JOTB: What is the Corona Survey? 

Antonio: It ended up being a simple Twitter survey. I woke up on March 13, and I said, "Look, I can ask people, 'are you infected or not,' but if I do that I’m going to get only a few hundred responses, in the best of cases, and that’s not going to tell me anything that's statistically significant". So, instead of asking people about their health, I asked people about the health of the people they know. I launched on Twitter [the question] "How many people do you know that are infected with COVID-19?" 

JOTB: How are you able to use the data you collect to estimate for a whole country? 

Antonio: You don’t really know how many people a person knows, but there is this result in social science, called the Dunbar number, which says that each of us, on average, connects with about 150 people. So, using that, if you tell me that you know 15 people who are sick, and I make the assumption that you know 150 people, that means 10% of the population should be infected. That's the rule of thumb we've been using.
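The rule of thumb Antonio describes can be sketched in a few lines of Python. This is only an illustration of the arithmetic in his answer, not the team's actual estimation code: the function name, the pooling of several responses into one ratio, and the sample responses are all assumptions made for the example.

```python
# Assumed average number of contacts per respondent (the Dunbar number
# mentioned in the interview).
DUNBAR_NUMBER = 150


def estimate_infected_fraction(known_cases, contacts_per_person=DUNBAR_NUMBER):
    """Estimate the infected fraction from indirect survey responses.

    known_cases: one entry per respondent, giving how many infected
    people that respondent says they know.
    Pooling responses as total cases / total assumed contacts is an
    assumption of this sketch, not something stated in the interview.
    """
    if not known_cases:
        raise ValueError("need at least one survey response")
    total_cases = sum(known_cases)
    total_contacts = contacts_per_person * len(known_cases)
    return total_cases / total_contacts


# The example from the interview: one respondent who knows 15 sick people.
print(f"{estimate_infected_fraction([15]):.1%}")  # 10.0%

# Hypothetical responses from three people, pooled into a single estimate.
print(f"{estimate_infected_fraction([0, 3, 0]):.2%}")
```

Pooling matters because a single answer is very noisy; combining many answers averages over respondents, although it still relies on the assumption that everyone knows roughly 150 people.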

JOTB: How is your data different from what is out there? 

Antonio: The data available right now, which is the number of confirmed cases, greatly underestimates the number of infected people.

JOTB: Is this data just for Spain? 

Antonio: It started as a very simple experiment in Spain. A week later, we had deployed similar experiments in Italy, Portugal, the UK, the US, and Cyprus. Now, we are two months down the river, and today or tomorrow we are going to deploy another survey, which will be translated into 56 languages and is going to collect regional information from probably 100 countries.

Antonio explains that ideally, contributors would return to the surveys periodically, maybe once a week or so, especially if there is a change in the number of infected people they know. This is so that the team can see how the number of people infected has progressed. The idea is to keep updating the numbers and analysing the trends. 

The team believes the study can be especially useful in countries where there isn't much data available, or where such data is not very reliable. 

According to Antonio, they have had positive results when applying the survey in Ukraine. They found that the survey results for that country and the number of infections estimated from the number of deaths disclosed were very inconsistent, most probably due to a lack of data. They found that by applying a reverse algorithm, they could correctly estimate the actual number of deaths related to COVID-19.

The people behind CoronaSurveys are all volunteers who believe this information is something governments can use to make future decisions.

All the data they collect is openly available on GitHub, and the team encourages people to download it and play around with it. Antonio even jokes that they wouldn't mind if someone wanted to organize the data.

You can contribute your data anonymously to help improve this study by finding your region and filling out a short survey. You can also learn more about Antonio and his team, and stay up to date with the CoronaSurveys results, by following the project's social media.