Coronavirus and the numbers that clarify or obfuscate

Written by Dr David Pendleton | 12.10.2020

In the pandemic, our lives are dominated by numbers, especially when they change unexpectedly. The numbers of cases, deaths and hospitalisations, together with the reproduction number (R), are meant to inform us about the impact of the virus on the nation’s health. The figures for employment, furloughs and bankruptcies, along with stock market indices and exchange rates, inform us about its impact on the economy.

Or do they? You might imagine that it should not be too difficult to know how many cases there are in each country, but that is not the case and, without good data, even inspired leadership is reduced to guesswork. The problem is that the numbers grabbing the headlines are not always the most representative of the situation we are in. The number of ‘cases’ reported each day is actually the number of ‘detected’ cases. We know this is an underestimate of the actual number of cases in each country because many people with the virus are asymptomatic and blissfully unaware they have it. We do not know how many of these there are, let alone how many have it, feel dreadful and self-isolate until it goes away.

A worsening situation?

As the government increases the number of tests, so it raises the number of detected cases, and it looks as though the country is getting sicker. It is not, or at least not necessarily. But the international league tables still move the UK up the table as the number of detected cases rises, giving the impression of a worsening situation. In addition, there is the matter of who is getting tested and where. We are doing more tests in hotspots so we can track, trace and isolate those who could pass on the disease to others. Understandable as this is, it gives the pessimistic impression that the country is getting more infected because the number of ‘cases’ is rising. How we collect the data should depend on what we want to use it for.

#1

We need data that can be generalised to the whole population, so that we have a good idea of the spread of the virus across the nation. That means testing a sample designed to be broadly representative of the population as a whole. From such a sample, we can track the rises and falls in the prevalence of the infection across the nation.
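As a rough illustration of what such a sample can tell us, the sketch below estimates prevalence from a hypothetical random sample and attaches a 95% Wilson score interval. The sample size and number of positives are invented for the example, and the calculation assumes a simple random sample and a perfectly accurate test, neither of which holds exactly in practice.

```python
import math


def prevalence_estimate(positives: int, sample_size: int, z: float = 1.96):
    """Return (point estimate, lower, upper) for the share of people infected.

    Uses the Wilson score interval; z = 1.96 corresponds to roughly 95%.
    Assumes a simple random sample and ignores test error.
    """
    if sample_size <= 0 or not 0 <= positives <= sample_size:
        raise ValueError("need 0 <= positives <= sample_size and sample_size > 0")
    p = positives / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)) / denom
    )
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Invented figures: 50 positives among 10,000 randomly sampled people.
estimate, lower, upper = prevalence_estimate(positives=50, sample_size=10_000)
print(f"Estimated prevalence: {estimate:.2%} (95% interval {lower:.2%} to {upper:.2%})")
```

The interval is the useful part: it shows how much the estimate could move through sampling chance alone, which a raw count of detected cases cannot show.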

#2

We need to add continual monitoring of the positive test rate, i.e. how many of every 100,000 tests are positive. Changes in this rate tell us something about how R is changing.
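To make the distinction between raw counts and the positive test rate concrete, here is a minimal sketch comparing two weeks. The figures are invented, not real UK data, and the function name is chosen purely for illustration.

```python
def positives_per_100k(positives: int, tests: int) -> float:
    """Positive results per 100,000 tests carried out."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return positives * 100_000 / tests


# Invented figures: testing doubles between the two weeks.
week_1 = {"tests": 100_000, "positives": 2_000}
week_2 = {"tests": 200_000, "positives": 4_000}

for label, week in (("Week 1", week_1), ("Week 2", week_2)):
    rate = positives_per_100k(week["positives"], week["tests"])
    print(f"{label}: {week['positives']} detected cases, {rate:.0f} positives per 100,000 tests")
```

In this made-up example the number of detected cases doubles while the positive test rate does not move, which is the pattern we would expect if testing had expanded but the underlying situation had not changed.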

#3

In parallel, we can test, trace and isolate in hotspots for the purpose of intervention there. But we should not add those numbers to the figures from the random sample, because that causes confusion and once again we won’t know what the numbers mean. Unless we keep these numbers separate, we won’t know what to do differently, collectively or individually.

#4

The data need to be current: delays distort the picture. There are many different sources of information about the spread of the virus, but even the best, such as the ONS and some universities, use different methods, so comparisons are difficult. There are also delays in publishing the data and, because hotspots exist, national averages can be misleading.
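The case for keeping hotspot testing apart from the representative sample can be shown with a little arithmetic. The sketch below uses invented figures and hypothetical names; it simply pools the two sets of results and compares the blended rate with each source on its own.

```python
def positive_rate(positives: int, tests: int) -> float:
    """Share of tests that came back positive."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return positives / tests


# Invented figures for illustration only.
random_sample = {"tests": 10_000, "positives": 50}   # representative of the nation
hotspot = {"tests": 10_000, "positives": 800}        # targeted at a known outbreak

pooled = positive_rate(
    random_sample["positives"] + hotspot["positives"],
    random_sample["tests"] + hotspot["tests"],
)

print(f"Random sample: {positive_rate(**random_sample):.2%}")
print(f"Hotspot:       {positive_rate(**hotspot):.2%}")
print(f"Pooled:        {pooled:.2%}")
```

The pooled figure sits between the two and describes neither the nation nor the hotspot. It also shifts whenever the mix of targeted and random testing changes, even if nothing about the virus has changed.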

We all need the best data

As for overseas countries that are placed on the UK government’s ‘red’ list when their number of cases rises above 20 per 100,000 people, how are we to interpret these data? We don’t know who has been tested, or why, or on what basis. We do not know if this is a reasonable estimate of the prevalence of the infection in the country or what changes in the number mean, yet severe damage has been caused to the hospitality and transportation industries on this basis. Maybe we have little choice but to act cautiously in this way to protect us all, but information is supposed to reduce uncertainty. We all need the best data, better reported, to clarify and guide rather than obfuscate. Otherwise, we and our leaders are likely to make poor decisions about health, the economy and other matters.