Many people suspect they’ve been infected with COVID-19 by now, despite the fact that only 0.5% of the UK’s population has actually been diagnosed with it. Similar numbers have been reported in other countries. Exactly how many people have actually had it, however, is unclear. There is also uncertainty around what proportion of people who get COVID-19 die as a result, though many models assume it is around 1%.

We believe there has been over-confidence in the reporting of infection prevalence and fatality rate statistics when it comes to COVID-19. Such statistics fail to account for uncertainties in the data and the possible explanations for them. In our new paper, published in the Journal of Risk Research, we developed a computer model that took these uncertainties into account when estimating COVID-19 fatality rates. And we saw a very different picture.

Our model, called a Bayesian Network, allows us to combine multiple data sources and assess how sensitive the infection prevalence and fatality rates are to two dominating sources of uncertainty.

One is the accuracy of serological (antibody) testing, which depends crucially on our ability to measure whether an individual has antibodies. We account for factors such as the false positive and false negative rates of manufacturers’ test kits.
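
To see why test accuracy matters so much, here is a minimal sketch using the standard Rogan-Gladen correction, which adjusts an observed positive rate for a test's sensitivity and specificity. This is not the Bayesian Network from the paper, and the positive rate, sensitivity and specificity values below are hypothetical, chosen only for illustration.

```python
def adjusted_prevalence(observed_positive_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from an observed positive rate.

    sensitivity = 1 - false negative rate; specificity = 1 - false positive rate.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (observed_positive_rate + specificity - 1.0) / denominator
    # Clamp to [0, 1]: sampling noise can push the raw estimate outside it.
    return min(1.0, max(0.0, estimate))


# Hypothetical survey: 5% of samples test positive.
for specificity in (0.99, 0.98, 0.95):
    estimate = adjusted_prevalence(0.05, sensitivity=0.90, specificity=specificity)
    print(f"specificity {specificity:.0%}: adjusted prevalence {estimate:.2%}")
```

With these made-up inputs, a small change in the assumed false positive rate moves the adjusted prevalence a long way: at 95% specificity, a 5% observed positive rate is consistent with almost no true infections at all.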

We also take account of the reliability of fatality data. This is important because the fatality rate, the probability of death for a patient infected with COVID-19, is defined as the death count divided by the number of infected people in the community. If either of these variables is uncertain, any policy decisions based on the resulting fatality rate will themselves be unreliable, and potentially dangerous.
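
The effect of uncertainty in both the numerator and the denominator can be sketched with a simple Monte Carlo simulation. This is a deliberately crude stand-in for a Bayesian Network, not the model from the paper: it assumes deaths and infections are each known only to lie within a range and are equally likely anywhere in it. The ranges below are hypothetical and the function name is illustrative.

```python
import random
import statistics


def simulate_fatality_rate(deaths_range, infections_range, draws=10_000, seed=0):
    """Return the 2.5th percentile, median and 97.5th percentile of deaths / infections."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rates = sorted(
        rng.uniform(*deaths_range) / rng.uniform(*infections_range)
        for _ in range(draws)
    )
    lower = rates[int(0.025 * draws)]
    upper = rates[int(0.975 * draws)]
    return lower, statistics.median(rates), upper


# Hypothetical region: 800 to 1,200 deaths, 100,000 to 400,000 infections.
low, median, high = simulate_fatality_rate((800, 1_200), (100_000, 400_000))
print(f"fatality rate: {median:.2%} (95% interval {low:.2%} to {high:.2%})")
```

Even with these modest assumed ranges, the possible fatality rates span from 0.2% to 1.2%, which is why a single point estimate can be misleading.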

Both of these factors are much more uncertain than is being reported. When we accounted for them in our model, we discovered high community infection rates in many regions across the world. For Kobe, Japan, our model suggested that over 800 times more people have had COVID-19 than have been reported. For England and Wales, the figure is 28 times more.

As for the fatality rate, the team from Imperial College in the UK, which is advising the UK government, has previously estimated this number to be 1%. But this is uncertain. The team states that its model “relies on fixed estimates of some epidemiological parameters such as the infection fatality rate…”, while also acknowledging that “amidst the ongoing pandemic, we rely on death data that is incomplete, with systematic biases in reporting, and subject to future consolidation”.

When we adjusted for these uncertainties, we discovered that the fatality rate estimates are most likely to be in the range 0.3%-0.5% for the countries/regions we considered.

Although not covered in our study, we also applied our model to New York City data. Here the “actual” NYC fatality count is stated as 23,430, with an estimated fatality rate of 1.4%. But when the data is fed into our model, the estimate for the fatality rate can be adjusted down to between 0.6% and 1.3% – potentially half of the official figure.

#### Uncertainties in death numbers

So how could we account for these uncertainties? Each country calculates deaths differently – which is a problem to begin with. And, in many countries, the “actual” fatality count is estimated by adding together three figures: confirmed deaths, where COVID-19 appears on the death certificate alongside a positive COVID-19 test result; deaths where COVID-19 is on the death certificate but no test took place; and a statistical estimate of “excess deaths” (how many more deaths there are believed to have been than normal).

For example, in New York City the “actual” fatality count is the sum of 13,156 confirmed deaths, where COVID-19 appears on the death certificate alongside a positive COVID-19 test result, 5,126 deaths where COVID-19 is on the death certificate but no test took place, and 5,148 excess deaths. But we don’t actually know whether some of these people died “of” or “with” COVID-19. Many of these deaths are labelled “actual” when they are in fact highly uncertain.
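
The New York City figures can be checked with simple arithmetic. The sketch below is not the model from the paper. It assumes the stated 1.4% fatality rate was obtained by dividing the 23,430 total by a single infection estimate, backs out that implied number of infections, and then shows how the rate changes if only some categories of death are counted. The variable and function names are illustrative.

```python
# Death counts for New York City as quoted in the article.
CONFIRMED = 13_156  # on death certificate, with a positive test
PROBABLE = 5_126    # on death certificate, no test took place
EXCESS = 5_148      # statistical estimate of excess deaths

OFFICIAL_RATE = 0.014  # stated fatality rate of 1.4%


def fatality_rate(deaths, infections):
    """Fatality rate as defined in the article: deaths divided by infections."""
    return deaths / infections


total_deaths = CONFIRMED + PROBABLE + EXCESS
# Assumption: the official rate used total_deaths as its numerator.
implied_infections = total_deaths / OFFICIAL_RATE

scenarios = {
    "confirmed only": CONFIRMED,
    "confirmed + untested": CONFIRMED + PROBABLE,
    "all three categories": total_deaths,
}
print(f"total deaths: {total_deaths:,}")
print(f"implied infections: {implied_infections:,.0f}")
for label, deaths in scenarios.items():
    print(f"{label}: {fatality_rate(deaths, implied_infections):.2%}")
```

Under that assumption, counting only test-confirmed deaths gives a rate of about 0.8%, and adding the untested death-certificate cases gives about 1.1%. How the uncertain categories are treated therefore changes the headline figure substantially, even before any uncertainty in the number of infections is considered.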

What’s more, excess deaths are often calculated by comparing against the preceding five years, excluding years with “bad” influenza seasons. This is a problem, because leaving out the worst years lowers the baseline and so makes the excess look larger. Also, COVID-19 may be accelerating deaths that were imminent. And if the effects of lockdown are preventing people with serious conditions such as strokes and heart attacks from accessing healthcare, and they are dying as a result, there is a risk that counting them as “excess deaths” due to COVID-19 has contributed to serious overestimation.

#### Herd immunity?

This sort of research is worth considering when debating whether we are close to herd immunity, or whether a “second wave” of the virus is likely. Taking Sweden as an example, antibody studies show COVID-19 was much more prevalent, at 7% a few weeks ago, than confirmed cases suggested at that time. However, this is still far from the 65% assumed to guarantee herd immunity. If Sweden has not reached herd immunity and has not mandated a lockdown, why are its death numbers not increasing?

One controversial explanation that we didn’t account for in our study is the existence of “antibody dark matter” that does not show up in antibody testing but nevertheless offers some protection against the virus.

The immune system involves two types of white blood cell: T cells and B cells. But only B cells produce antibodies. Studies suggest that immunity might develop more rapidly from previous infections “similar” to COVID-19, such as SARS-v1, via T cells rather than B cells. This means many people may have had coronavirus but not developed antibodies – leading to an underestimation of the number of infections, including in our model.

So while one recent study claimed that about 10% of the population of England and Wales may have been infected, the real number could be even higher.

Clearly, we cannot fully trust statistics on death and infection rates until we get more accurate data and feed it into a model such as ours.