Across the world, there is a desperate attempt to measure the cost of the Covid-19 pandemic, and one such measure is excess deaths. In countries where robust infrastructure exists to register every death promptly, reliable and verifiable estimates of excess deaths are publicly available. However, in many countries with inadequate birth and death registration infrastructure, estimates of excess deaths are subject to uncertainty and shrill politicking. Even with sophisticated statistical methodology, unreliable data lead to imprecise and speculative estimates. Unfortunately, the overzealousness of social and news media has led to the politicisation of excess deaths under these circumstances. The latest to join this fray is the World Health Organization (WHO).
Existing literature shows that methods for estimating deaths when data are not yet available tend to fail spectacularly as hard data become available. This has serious implications for estimates of deaths in developing countries and for overall global estimates. Complex mathematical models with simplistic assumptions become inaccurate and difficult to interpret. No matter how urgent the compulsion to publish simple estimates of Covid-19 deaths, complex models should not take precedence over high-quality data.
In the Indian context, provisions of the Registration of Births and Deaths Act of 1969 require every death to be registered within 21 days of the event. The Civil Registration System (CRS), which is “defined as a unified process of continuous, permanent, compulsory and universal recording of the vital events and characteristics thereof, as per legal requirements in the country”, is the repository of all registered births and deaths in the country. The CRS reports data at the national, state, and district levels. In the context of the pandemic, the CRS has become the go-to place for estimating excess deaths in India. Registered deaths during the pandemic are compared to an average of registered deaths before the pandemic (baseline estimates) to produce estimates of excess deaths.
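The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration of the arithmetic, not the method of any particular study: the function name `excess_deaths` and all the counts are hypothetical and are not CRS figures.

```python
from statistics import mean

def excess_deaths(pandemic_deaths, pre_pandemic_deaths):
    """Registered deaths in a pandemic year minus the pre-pandemic average."""
    baseline = mean(pre_pandemic_deaths)  # the baseline estimate
    return pandemic_deaths - baseline

# HYPOTHETICAL registered deaths, in thousands, for illustration only.
pre_pandemic = [7000, 7200, 7400]
pandemic_year = 8000

print(excess_deaths(pandemic_year, pre_pandemic))
```

The result is only as good as the two inputs: if either the baseline years or the pandemic year are registered incompletely, the difference inherits that error.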
However, careful analysis of the CRS death data has repeatedly revealed serious shortcomings. For example, for 2019 (before the pandemic), researchers C Rao et al. showed that the CRS data on deaths (7.64 million) undercounted the number of dead by 2.28 million. The undercount was systematically more severe for the elderly (above 60 years) and for children (under five years), who accounted for 56 per cent and 30 per cent, respectively, of the additional deaths. Not surprisingly, they also found that adjustments in the states of Bihar, Jharkhand, Madhya Pradesh, Maharashtra, Rajasthan, and Uttar Pradesh accounted for 75 per cent of the additional deaths.
This implies that CRS death data are not a reliable source, particularly for the pre-pandemic baseline against which registered deaths during the pandemic are compared, unless adjustments are made for sex, age, and location. It is important to reiterate that in 2019, the CRS reported an overall registration of 7.64 million deaths, which was 92 per cent of the overall deaths estimated by the Sample Registration System (SRS). However, according to these researchers, after adjustments were made for age, sex, and location, the total death count for 2019 was 9.92 million. Therefore, the overall level of registration (LOR), or completeness of death data, after adjustments for age, sex, and location was 77 per cent, which was 15 percentage points lower than what was reported by the CRS.
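The arithmetic behind these 2019 figures can be checked directly. The short Python sketch below uses only the numbers quoted in this article; the variable names are ours and purely illustrative.

```python
# Figures quoted in the article for 2019, in millions of deaths.
registered = 7.64      # deaths registered in the CRS
unregistered = 2.28    # additional deaths estimated by Rao et al.
reported_lor = 0.92    # level of registration reported by the CRS

# Total deaths after adjusting for age, sex, and location.
adjusted_total = registered + unregistered

# Completeness of registration measured against the adjusted total.
adjusted_lor = registered / adjusted_total

# Gap between reported and adjusted completeness, in percentage points.
gap_points = (reported_lor - adjusted_lor) * 100

print(f"Adjusted total: {adjusted_total:.2f} million")
print(f"Adjusted completeness: {adjusted_lor:.0%}")
print(f"Gap: {gap_points:.0f} percentage points")
```

Rounded, this reproduces the 9.92 million total, the 77 per cent completeness, and the 15-point gap.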
Furthermore, when the data are disaggregated by sex, the researchers report that the completeness of death data was 81 per cent for males and 72 per cent for females. Given that the level of registration of death data in the CRS was much lower in earlier years, ranging from 75 per cent in 2015 to 85 per cent in 2018, one would expect the bias or undercount of the death data in the CRS to be much larger for those years.
The bottom line from this research is that using the CRS death data for the pre-pandemic period as the baseline without adjusting for age, sex, and location would lead to exaggerated numbers of excess deaths. Because the registration level has not been uniform over the years, an unadjusted baseline understates pre-pandemic deaths, and some of the apparent excess simply reflects better registration. Unfortunately, researchers, journals and journalists have chosen to completely ignore this fact. The rush to pronounce damning evidence of loss of lives from the pandemic is leading to one set of “guesstimates” after another.
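A stylised Python example shows how improving registration alone can manufacture excess deaths. It is a sketch under loudly stated assumptions, not a reconstruction of any published estimate: the completeness levels for 2015, 2018 and 2019 are the reported figures quoted in this article, but the true death count (a constant 10 million a year) and the pandemic-year completeness are hypothetical.

```python
from statistics import mean

# Completeness of death registration quoted in the article for three years.
completeness = {2015: 0.75, 2018: 0.85, 2019: 0.92}

# HYPOTHETICAL: true deaths held constant at 10 million a year, so the
# true excess in the pandemic year is zero by construction.
TRUE_DEATHS = 10.0  # millions

# Registered deaths are the true deaths scaled by each year's completeness.
registered = {year: TRUE_DEATHS * c for year, c in completeness.items()}

# ASSUMPTION: pandemic-year registration stays at the 2019 level.
pandemic_completeness = 0.92
pandemic_registered = TRUE_DEATHS * pandemic_completeness

# Naive excess: pandemic-year registrations minus the unadjusted baseline.
naive_excess = pandemic_registered - mean(list(registered.values()))

# Adjusted excess: scale each count up by its completeness before comparing.
adjusted_baseline = mean([registered[y] / completeness[y] for y in completeness])
adjusted_excess = pandemic_registered / pandemic_completeness - adjusted_baseline

print(f"Naive excess: {naive_excess:.2f} million")
print(f"Adjusted excess: {abs(adjusted_excess):.2f} million")
```

Under these assumptions the naive comparison reports about 0.8 million excess deaths even though no additional person has died, while the adjusted comparison comes out at essentially zero.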
Another source of data that researchers have used to estimate excess deaths is the household survey, such as the CVoter tracker survey. Even though it is a national survey carried out daily using computer-assisted telephonic interviews, its primary purpose is to track perceptions of governance, media, and other social indicators. The sampling methodology and the questionnaire are not designed to collect death data from households. A reliable source of death data in India is the Sample Registration System (SRS), a large-scale demographic survey. It covers more than 8 million people across all states and Union Territories. Its primary purpose is to produce birth and mortality rates at the national and state levels. Unfortunately, the SRS survey has not been carried out during the pandemic.
In contrast, the CVoter tracker survey covers 0.14 million adults, and its death numbers are based on self-reported data from telephonic surveys with no on-field verification, which makes it a simplistic and imprecise way to elicit data on deaths. In addition, the low response rate raises fundamental concerns about non-response bias, which is not easily quantifiable. Researchers have assumed no behavioural change in response to survey questions over time. Heightened media coverage and overall fear and interest levels during waves of the pandemic would, on the contrary, make it likely that people's responses varied. For example, people would be significantly more sensitive to events and surroundings during a wave than during normal times. These simplistic assumptions make the estimates of excess deaths highly questionable.
The lack of accurate data on deaths has led to intense speculation and politicisation. The truth of the matter is that even before the pandemic, India did not have the infrastructure for collecting robust death data in real time. The core issue is not whether the numbers are right or wrong, but that no matter how sophisticated the statistical methodology, there is no substitute for high-quality data. From 2015 to 2019, due to massive digitisation efforts, the level of death registration improved drastically, from 75.3 per cent to 92 per cent across India. However, this remains a work in progress with several shortcomings: a staggering 2.28 million deaths (approximately 23 per cent of total deaths) were not accounted for in the CRS death data even in 2019. The situation was considerably worse in earlier years, when levels of death registration were lower.
The pandemic has provided a window of opportunity to invest heavily in building a robust and reliable infrastructure that collects timely data on vital statistics, such as births, deaths and migrations. This should be a project of national importance and deemed an urgent priority, requiring the complete cooperation of the central and state governments. Such an infrastructure would become the cornerstone of public health in India.
The writer is Vice President, Observer Research Foundation