How many Finns have really been infected with COVID-19?

It is still unknown how many more people have been affected by COVID-19 than the numbers of confirmed cases suggest. This key question is tackled in the study published in the International Journal of Epidemiology. This study introduces a new demographic scaling model for estimating the total numbers of COVID-19 infections in a straightforward manner for countries worldwide.

Applied to Finnish data, the demographic scaling model estimates that more than 14 000 people have already been infected with COVID-19, which is about one-and-a-half times the number of confirmed cases, 9 682, as of September 27, 2020. In the early stages of the pandemic, the confirmed numbers understated the true numbers even more. For example, on May 16, 2020, the number of confirmed cases was 6 286, whereas the estimated true number was almost twice as large, 12 150.
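The ratios quoted above follow directly from the figures in this paragraph. As a quick check, the short Python sketch below recomputes them. It is not part of the study's R code, and the function and variable names are illustrative. "More than 14 000" is treated as exactly 14 000, so the September ratio is a lower bound.

```python
# Figures for Finland as quoted in the text above.
snapshots = {
    "2020-05-16": {"confirmed": 6_286, "estimated": 12_150},
    # "more than 14 000" is treated as 14 000, so this ratio is a lower bound
    "2020-09-27": {"confirmed": 9_682, "estimated": 14_000},
}


def underascertainment_ratio(estimated, confirmed):
    """Estimated infections per confirmed case."""
    return estimated / confirmed


for date, figures in snapshots.items():
    ratio = underascertainment_ratio(figures["estimated"], figures["confirmed"])
    print(f"{date}: about {ratio:.2f} estimated infections per confirmed case")
```

The May ratio comes out close to two and the September ratio close to one and a half, which matches the wording in the text.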

“Put into relative terms, 0.3% of the 5.5 million Finns are estimated to have been affected by COVID-19 so far,” says university lecturer Christina Bohk-Ewald from the Center for Social Data Science at the University of Helsinki.

Even though the estimated number of infections is markedly higher than the number of confirmed cases, the course of the coronavirus pandemic has been comparatively mild in Finland so far. This is shown when comparing the number of estimated infections in Finland with those in the U.K., Italy, France, and Spain, as of September 27, 2020. In this sample of five European countries, the U.K. is the country with the most people who have been infected with COVID-19, 1.6 million, while Spain is the country with the largest COVID-19 infection prevalence, 3%.

“Across these five countries, the estimated numbers of COVID-19 infections are, on average, three times larger than the numbers of confirmed cases,” says Bohk-Ewald.

The practical example of implementing control measures in good time, in order to effectively prevent the pandemic from spreading further, shows how critically important the true numbers of COVID-19 infections are for decision makers.

“If travel restrictions were based on the true numbers of COVID-19 infections, they would probably be imposed much earlier than if they were based on confirmed cases. Good estimates of the numbers of COVID-19 infections could perhaps help to save valuable time,” Bohk-Ewald states.

Many people suspect that the coronavirus pandemic is far more widespread than commonly known, and that the confirmed cases are just a lower bound on the true number of people who have actually been infected with COVID-19.

Publication: This article has been accepted for publication in the International Journal of Epidemiology, published by Oxford University Press. (A preprint is available on medRxiv: doi.org/10.1101/2020.04.23.20077719)

The R source code of the demographic scaling model is freely available here:
https://github.com/christina-bohk-ewald/demographic-scaling-model

Further information:
Christina Bohk-Ewald, University Lecturer for Applied Data Science and Demography, Center for Social Data Science, University of Helsinki, christina.bohk-ewald@helsinki.fi

Christian Dudel, Research Scientist, Max Planck Institute for Demographic Research, dudel@demogr.mpg.de

Mikko Myrskylä, Professor for Social Data Science, Center for Social Data Science, University of Helsinki & Director, Max Planck Institute for Demographic Research, Germany, mikko.myrskyla@helsinki.fi

Key question and background about the demographic scaling model

Why is it needed? 
Existing seroprevalence studies for COVID-19 have largely relied on samples that are not representative of the total population, and population-representative studies are only slowly becoming available.

Existing approaches to estimate the spread of COVID-19 typically rely on complex statistical methodology and often have high data demands, which makes them less applicable in many countries.

What is it?
The demographic scaling model estimates the total numbers and prevalence of COVID-19 infections based on minimal data requirements: COVID-19-related deaths, infection fatality rates (IFRs), and life tables, all of which are freely available online.

As many countries lack IFR estimates, we scale them from a reference country based on remaining lifetime in order to better match the specific context in a target population with respect to age structure, health conditions, and medical services.
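The two steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R implementation. It assumes that infections in each age group are estimated as deaths divided by the IFR, and that a reference IFR is transferred to the target population by matching ages with equal remaining life expectancy. The function names, the linear interpolation, and all input numbers are hypothetical placeholders, not real data.

```python
import numpy as np


def scale_ifr(ages, ex_target, ex_reference, ifr_reference):
    """Transfer reference-country IFRs to a target population.

    For each target age, find the reference age with the same remaining
    life expectancy, then read off the reference IFR at that age.
    Linear interpolation is a simplifying assumption of this sketch.
    """
    # np.interp needs increasing x-values; remaining life expectancy
    # falls with age, so both arrays are reversed.
    matched_age = np.interp(ex_target, ex_reference[::-1], ages[::-1])
    return np.interp(matched_age, ages, ifr_reference)


def estimate_infections(deaths_by_age, ifr_by_age):
    """Infections per age group = deaths / infection fatality rate."""
    return deaths_by_age / ifr_by_age


# Hypothetical placeholder inputs (not real data).
ages = np.array([45.0, 55.0, 65.0, 75.0, 85.0])           # age-group midpoints
ex_reference = np.array([36.0, 27.0, 19.0, 12.0, 6.0])    # reference life table
ex_target = np.array([38.0, 29.0, 21.0, 13.5, 7.0])       # target life table
ifr_reference = np.array([0.001, 0.004, 0.013, 0.04, 0.10])
deaths_target = np.array([2.0, 8.0, 30.0, 90.0, 210.0])
population_target = 1_000_000

ifr_target = scale_ifr(ages, ex_target, ex_reference, ifr_reference)
infections = estimate_infections(deaths_target, ifr_target)

print("scaled IFRs:", np.round(ifr_target, 4))
print("estimated infections:", int(round(infections.sum())))
print("estimated prevalence:", round(infections.sum() / population_target, 4))
```

In this sketch a target population with a longer remaining lifetime at a given age is assigned the IFR of a younger reference age, so its scaled IFRs are lower and its estimated infections per death are higher.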

The demographic scaling model is built on two key assumptions:

1. The COVID-19-related death counts are fairly accurately recorded.
2. The infection fatality rates from a reference country are fairly accurately recorded and become applicable in a target population through proper scaling based on remaining life expectancy.

These two key assumptions may only partially hold at the moment.
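To see why these assumptions matter, it can help to look at how an estimate of the form deaths / IFR reacts when either input is off. The Python sketch below is only an illustration of that arithmetic, assuming the simple deaths-over-IFR relationship. The numbers are hypothetical placeholders and the function name is not taken from the published code.

```python
def infections(deaths, ifr):
    """Estimated infections under the simple relation deaths / IFR."""
    return deaths / ifr


# Hypothetical placeholder inputs (not real data).
recorded_deaths = 300
assumed_ifr = 0.01

baseline = infections(recorded_deaths, assumed_ifr)
print(f"baseline estimate: {baseline:.0f} infections")

# Assumption 1 weakened: only a share of COVID-19 deaths is recorded.
for recorded_share in (1.0, 0.8, 0.6):
    true_deaths = recorded_deaths / recorded_share
    estimate = infections(true_deaths, assumed_ifr)
    print(f"recorded share {recorded_share:.0%}: {estimate:.0f} infections")

# Assumption 2 weakened: the true IFR differs from the scaled IFR.
for ifr_factor in (0.5, 1.0, 2.0):
    estimate = infections(recorded_deaths, assumed_ifr * ifr_factor)
    print(f"true IFR = {ifr_factor} x assumed: {estimate:.0f} infections")
```

Under this relation, unrecorded deaths push the true number of infections above the estimate in direct proportion, while an IFR that is too low inflates the estimate and one that is too high deflates it.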

Why is it great?
To the best of our knowledge, none of the other existing models has provided a broadly applicable data-based approach that estimates COVID-19 infections using only a few inputs, and that takes into account cross-country differences in age structures, health conditions, and health care systems.

The demographic scaling model provides good estimates of the total numbers and prevalence of COVID-19 infections in many countries. It facilitates the timely monitoring of the spread of the COVID-19 pandemic and shows in applications that the coronavirus pandemic is more widespread than commonly known. The demographic scaling model could even provide input data for more complex epidemiological models.

The answers were given by Christina Bohk-Ewald.