# Transmission parameters of coronavirus disease 2019 in South Asian countries

## Abstract

### Objectives

This study aimed to estimate the transmission parameters, effective reproduction number, epidemic peak, and future exposure of coronavirus disease 2019 (COVID-19) in South Asian countries.

### Methods

A susceptible-exposed-infected-recovered-death (SEIRD) model programmed with MATLAB was developed for this purpose. Data were collected (up to June 28, 2021) from the official webpage of the World Health Organization and from the Center for Systems Science and Engineering at Johns Hopkins University. The model was simulated to measure the primary transmission parameters. The reproduction number was measured using the next-generation matrix method.

### Results

The primary transmission rate followed an exponential Gaussian process regression. India showed the highest transmission rate (0.037) and Bhutan the lowest (0.023). The simulated epidemic peaks matched the reported peaks, thereby validating the SEIRD model. The simulation was carried out up to December 31, 2020 using the reported data till June 9, 2020.

### Conclusion

The information gathered in this research will be helpful for authorities to prevent the transmission of COVID-19 in the subsequent wave or in the future.

**Keywords:** Compartmental model; COVID-19; Epidemic peak; Next generating matrix; Reproduction number; South Asia

## Introduction

South Asia consists of 8 countries: India, Bangladesh, Pakistan, Afghanistan, Nepal, Bhutan, Maldives, and Sri Lanka. This is the world's most densely populated geographical region, where almost one-fourth of the world's population lives. Since its emergence in Wuhan in December 2019, coronavirus disease 2019 (COVID-19) has spread to 219 countries and territories in the world [1]. As of June 28, 2021, the world had 180,817,269 confirmed cases and 3,923,238 deaths. Like Europe and America, South Asian countries are also experiencing a continued COVID-19 outbreak, especially India, Pakistan, and Bangladesh. Hence, COVID-19 has become a significant health concern for the region. Among the South Asian countries, the first confirmed COVID-19 case was found in Nepal on January 23, 2020 [1]. Afghanistan detected its first confirmed case on March 24, 2020. By June 28, 2021, the highest number of cumulative confirmed cases (30,316,897) had been recorded in India and the lowest number of cases (2,062) in Bhutan. The highest recovery rate from COVID-19 cases was found in Bhutan (99.94%), with 1 death, whereas in Afghanistan, 62,288 patients had recovered out of 115,751 confirmed COVID-19 cases [2]. The recovery rate of all countries has been above 95%, with the exception of Afghanistan (93%). On March 13, 2020, India reported the first death from COVID-19 among all South Asian Association for Regional Cooperation (SAARC) countries. In total, 7,750 deaths were recorded in India as of June 9, 2020, corresponding to the highest rate of all SAARC countries. The highest case fatality rate was found in Afghanistan (6.5%), followed by Pakistan (2.4%) and Bangladesh (1.7%).

The above statistics indicate that India was at high risk for COVID-19, in terms of the total number of confirmed cases, followed by Pakistan, Bangladesh, and Afghanistan, whereas the other 4 countries might be considered safe from COVID-19. A detailed study on the transmission rate and early reproduction number is necessary to understand the situation better. Estimating the transmission parameter is required to predict future transmission, the peak of the epidemic, and the impact of the government's action. Researchers have focused on the transmissibility of COVID-19 in China, Japan, Europe, and America, while less attention has been given to South Asian countries. However, the poor health facilities, high population density, and poor literacy rate of developing countries underscore the importance of studying disease transmission in these areas. Hence, detailed research on the transmissibility and probable future of the disease in this region is necessary to investigate the current scenario and potential danger.

Mathematical modeling [3−7] plays a crucial role in estimating the transmission, reproduction number, and prediction of infectious diseases. A few studies have focused on the interactions of individual behavior [5] and social distancing [6] with disease spread, while others have concentrated on disease control by manipulating the numerical values of social distancing parameters [7]. The effective reproduction number (Re), which is a crucial parameter to measure the transmissibility of the disease, defines the number of secondary cases produced by 1 infected individual from the total population of the region [8]. When Re>1, the number of active infected cases will increase; hence, there will be an epidemic. The disease will be endemic when Re=1, while Re<1 indicates a decline in the number of cases, meaning that there is no possibility of an epidemic.

On January 23, 2020, the WHO estimated the reproduction rate of COVID-19 as between 1.4 and 2.5. Later, Liu et al. [9] reported the mean and median estimates of *R*_{0} to be 3.29 and 2.79 in their review of 12 published papers. A review of 50 published articles by Rahman et al. [10] found that the mean basic reproduction number was 2.71 and the median was 2.73, with an interquartile range (IQR) of 1.73 and a range of 0.32 to 6.47, in countries including Italy, Iran, South Korea, Singapore, Japan, Israel, Algeria, Brazil, and China. In the Middle East, the estimated *R*_{0} was found to be 2.60 to 7.41, with a mean of 3.76, a median value of 3.51, and an IQR of 1.16 from a susceptible-infected-recovered model [11]. Using an exponential growth rate and a time-dependent method, Yuan et al. [12] estimated the real-time effective reproduction numbers of COVID-19 in Italy (3.27), France (6.32), Spain (5.08), and Germany (6.07). Zhuang et al. [13] assessed the basic reproduction number in Italy (2.6 or 3.3) and South Korea (2.6 or 3.2) using a stochastic model for different starting dates. Epidemiological modeling was used to estimate the basic reproduction number in China [14] as 6.47, Japan [15] as 2.50 (before voluntary event cancellation) or 1.88 (after voluntary event cancellation), and Italy [16]. The variation in previous research is not surprising, as the transmission of COVID-19 depends upon environmental factors, countries' control measures, people's behavior, hospital facilities, social distancing [17], and preventive measures [18]. The impact of social distancing on COVID-19 transmission in South Korea has been studied [17]. Preventive measures are crucial to prevent and control the rapid spread of COVID-19 [18].

To the best of the authors' knowledge, there has been limited research to date comparing the transmissibility of COVID-19 across the South Asian region. A comparative analysis of the transmissibility of COVID-19 in this region is therefore necessary. To that end, this paper presents an analysis using a detailed compartmental modeling approach to investigate the transmissibility of COVID-19 in South Asia. A susceptible-exposed-infected-recovered-death (SEIRD) model was developed for this purpose. The effective reproduction number, infection rate, cure rate, and death rate of COVID-19 in all 8 countries of the South Asian region were evaluated using the developed model.

Although most of the papers found in the literature used a constant transmission rate, this study modeled transmissibility as a dynamic phenomenon. The model described in this paper considered most compartments from susceptibility to recovery or death. Moreover, this model integrated a machine learning algorithm with the compartmental model, enabling the simulation results to closely mirror the reported data.

## Materials and Methods

### Data Collection

The WHO [19] and Center for Systems Science and Engineering at Johns Hopkins University [1] were the main sources of data used in this study. We collected data from those sources and used them in our model after matching them with each other. The data collection period was from the date of the first infection to June 28, 2021. Data on confirmed (infected), cured, and death cases of the first wave were used for the simulation.

### Development of the SEIRD Model

According to our model, we divided the population (*N*) of a particular country into 5 compartments: susceptible (S, vulnerable to COVID-19 infection), exposed (E, latent individual or asymptomatic infectious), infected (I, symptomatic infected), recovered (R, immune to COVID-19), and death (D, death due to COVID-19). The details of the SEIRD model are described below.

During the pandemic, we considered the total population of each particular region as constant, leading to the equation N=S+E+I+R+D at each time t. The infectivity parameters, *β*_{1} and *β*_{2}, control the rate of transmission. In this model, *β*_{1} represents the probability of infection per exposure when a susceptible individual (S) has contact with an infected person (I) and becomes a latent exposed individual (E), while *β*_{2} represents the corresponding rate per exposure when a susceptible individual (S) has contact with an exposed individual (E) and thereby becomes another exposed individual (E). A detailed diagram is shown in Figure 1. Since the probability of contact between susceptible and exposed individuals is higher than that between susceptible and infected individuals, we assume that *β*_{2}=5*β*_{1} [15]. The incubation rate, α, is the rate at which latent individuals become infectious (the average duration of incubation is 1/α).

In the development of the model, a number of assumptions were considered, as summarized below.

• Births and natural deaths (excluding deaths due to COVID-19) during the epidemic were not considered.

• This paper did not consider external influences, such as weather, herd immunity, or vaccination, on the outbreak.

• During the forecast period, mobility, behavior, and social distancing are considered to evolve in the same manner as from the first date of infection to June 9, 2020.
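
As a hedged illustration of the compartment structure described above, the following is one standard way to write an SEIRD system with two infectivity parameters. It is a sketch under stated assumptions, not necessarily the exact system used in this study: mass-action mixing scaled by *N* is assumed, and, reading the phrase "death and recovery rates γ and λ" in order, γ is taken as the death rate and λ as the recovery (cure) rate.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta_1 S I}{N} - \frac{\beta_2 S E}{N},\\[2pt]
\frac{dE}{dt} &= \frac{\beta_1 S I}{N} + \frac{\beta_2 S E}{N} - \alpha E,\\[2pt]
\frac{dI}{dt} &= \alpha E - (\gamma + \lambda)\, I,\\[2pt]
\frac{dR}{dt} &= \lambda I,\\[2pt]
\frac{dD}{dt} &= \gamma I.
\end{aligned}
```

Summing the five right-hand sides gives zero, which agrees with the constant-population assumption N=S+E+I+R+D and with the exclusion of births and natural deaths.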

### Numerical Model

Once the transmission parameters were estimated, we calculated the infection, cure, and death rates using the iterative technique explained below.

Considering

which can be further written in a simple form

where,

Discretizing the time variable as t = n ∆t, we derived the following form of

where,

The infection, cure, and death rates were calculated using equation (9).
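
To make the time-discretized iteration concrete, the following Python sketch advances an SEIRD system with a forward-Euler step, t = n ∆t. It is an illustration only: the study used MATLAB, and the right-hand sides here are an assumed mass-action form built from the parameters described in the text (*β*_{1}, *β*_{2}=5*β*_{1}, α, and constant cure and death rates). The function name `simulate_seird`, its argument names, and the numerical values in the usage lines are hypothetical.

```python
def simulate_seird(beta1, alpha, cure_rate, death_rate, s0, e0, i0,
                   rec0=0.0, dead0=0.0, days=100, dt=1.0, beta_ratio=5.0):
    """Forward-Euler integration of an assumed SEIRD system.

    beta1 may be a float or a function of time t (in days) returning a float,
    so a time-varying transmission rate can be supplied.
    beta2 is taken as beta_ratio * beta1 (the text assumes beta2 = 5 * beta1).
    Returns a list of (t, S, E, I, R, D) tuples, one per time step.
    """
    n_total = s0 + e0 + i0 + rec0 + dead0  # constant total population N
    s, e, i, r, d = float(s0), float(e0), float(i0), float(rec0), float(dead0)
    history = [(0.0, s, e, i, r, d)]
    steps = int(round(days / dt))
    for n in range(steps):
        t = n * dt
        b1 = beta1(t) if callable(beta1) else beta1
        b2 = beta_ratio * b1
        # New exposures from contact with infected (I) and exposed (E) people
        new_exposed = (b1 * i + b2 * e) * s / n_total
        ds = -new_exposed
        de = new_exposed - alpha * e
        di = alpha * e - (cure_rate + death_rate) * i
        dr = cure_rate * i
        dd = death_rate * i
        s, e, i, r, d = (s + dt * ds, e + dt * de, i + dt * di,
                         r + dt * dr, d + dt * dd)
        history.append(((n + 1) * dt, s, e, i, r, d))
    return history


# Illustrative values only; these are not estimates from the study.
trajectory = simulate_seird(beta1=0.03, alpha=0.2, cure_rate=0.05,
                            death_rate=0.002, s0=999_990, e0=10, i0=0)
final_t, S, E, I, R, D = trajectory[-1]
print(f"day {final_t:.0f}: active infected = {I:.1f}")
```

Forward Euler is the simplest choice and keeps the update rule easy to read; a smaller `dt` or a higher-order integrator would reduce discretization error.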

### Effective reproduction number

We used the next-generation matrix method to calculate the effective reproduction number [20], as explained below:

Consider *X*_{1} (*E,I*) to be the group of exposed and infected individuals and *X*_{2} (*S,R,D*) to be the group of susceptible, recovered, and dead individuals.

Let *f*(*X*_{1},*X*_{2}) and *v*(*X*_{1},*X*_{2}) be the vectors for new infection parameters and other parameters, respectively. Assuming *N*≈*S*, then,

The maximum eigenvalue of *FV*^{-1} is

Hence, the expression of *R*_{0} is
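
As a hedged sketch of how this calculation can proceed, suppose the infected subsystem *X*_{1}=(*E*,*I*) has the new-infection vector *f* and transition vector *v* shown below. These forms are assumptions consistent with the parameters described in the text (*β*_{1}, *β*_{2}, α, and the constant rates γ and λ), and only the sum γ+λ enters the result, so it does not depend on which of the two is the death rate.

```latex
f = \begin{pmatrix} \dfrac{(\beta_1 I + \beta_2 E)\,S}{N} \\[6pt] 0 \end{pmatrix},
\qquad
v = \begin{pmatrix} \alpha E \\[2pt] -\alpha E + (\gamma+\lambda) I \end{pmatrix}.

% Jacobians with respect to (E, I), evaluated with N \approx S:
F = \begin{pmatrix} \beta_2 & \beta_1 \\ 0 & 0 \end{pmatrix},
\qquad
V = \begin{pmatrix} \alpha & 0 \\ -\alpha & \gamma+\lambda \end{pmatrix},
\qquad
V^{-1} = \begin{pmatrix} \dfrac{1}{\alpha} & 0 \\[6pt] \dfrac{1}{\gamma+\lambda} & \dfrac{1}{\gamma+\lambda} \end{pmatrix}.

FV^{-1} = \begin{pmatrix}
\dfrac{\beta_2}{\alpha} + \dfrac{\beta_1}{\gamma+\lambda} & \dfrac{\beta_1}{\gamma+\lambda} \\[6pt]
0 & 0
\end{pmatrix}
\quad\Longrightarrow\quad
R_0 = \rho\!\left(FV^{-1}\right) = \frac{\beta_2}{\alpha} + \frac{\beta_1}{\gamma+\lambda}.
```

Under these assumptions, the first term counts infections caused during the exposed stage (mean duration 1/α) and the second counts infections caused during the symptomatic stage (mean duration 1/(γ+λ)). A lower cure rate lengthens the symptomatic stage and therefore raises the reproduction number.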

### Parameter Estimation and Model Calibration

Parameter estimation is the most crucial part of the SEIRD model. To estimate the parameters, we used publicly available reported data from the first day of infection for a particular country up to June 9, 2020 during the first phase. The transmission rate parameters *β*_{1} and *β*_{2} depend on government actions such as lockdown, shutdown, social distancing, and migration, which change considerably during the pandemic. Treating *β*_{1} and *β*_{2} as constant would therefore call into question the accuracy of the model. The death and recovery rates γ and λ are considered constant as they depend on the population's immunity, health facilities, and management of the country, which do not change substantially within a certain time. We used the reported values of *I(t),E(t),R(t)*, and *D(t)* in the SEIRD model and carried out a regression analysis to determine the value of *β*_{1}(t).
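
One way to obtain a time series of *β*_{1} from reported compartment values, before any regression is applied, is to invert a discretized exposed-compartment equation at each time step. The Python sketch below does this under assumptions that are not stated in the text: a forward-Euler update with mass-action mixing, *β*_{2}=5*β*_{1}, and availability of S, E, and I at every step. The function name `estimate_beta1` and its arguments are hypothetical, and the procedure actually used in the study may differ.

```python
def estimate_beta1(S, E, I, population, alpha, dt=1.0, beta_ratio=5.0):
    """Back out beta1 at each step from an assumed discretized E equation:

        E[n+1] = E[n] + dt * (beta1 * (I[n] + beta_ratio * E[n]) * S[n] / N
                              - alpha * E[n])

    S, E, I are equal-length sequences of compartment sizes.
    Returns a list of len(E) - 1 estimates; an entry is None where there are
    no infectious contacts and beta1 therefore cannot be identified.
    """
    estimates = []
    for n in range(len(E) - 1):
        # Contact term multiplying beta1 in the assumed model
        force = (I[n] + beta_ratio * E[n]) * S[n] / population
        if force <= 0:
            estimates.append(None)
            continue
        # Inflow into E implied by the observed change plus the outflow alpha*E
        inflow = (E[n + 1] - E[n]) / dt + alpha * E[n]
        estimates.append(inflow / force)
    return estimates
```

The resulting noisy daily series is the kind of input to which a smoother such as a regression model could then be fitted.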

The regression analysis was carried out using linear regression [21], different support vector machine (SVM) models, and Gaussian process regression (GPR). The root mean square error, R-squared (R^{2}), and residual plots (Figure 2) indicated that the exponential GPR provided the best fit for the primary transmission rate in Bangladesh. Figure 2 illustrates the primary transmission rate (SEIRD model) fit results with regression through the robust linear, fine Gaussian SVM, quadratic Gaussian SVM, and exponential GPR methods. The regression results for South Asian countries obtained using the exponential GPR are shown in Figure 3. In both figures, the blue dotted line represents the result of the SEIRD model, while the red line shows the result of the regression methods. Figures 4 and 5 present the probability distribution and boxplot of the primary transmission rate, respectively. Figure 5 implies that India had the highest disease transmission rate among the South Asian countries, followed by Nepal, Bangladesh, and Afghanistan. The transmission rates of Pakistan and Maldives were almost the same. The lowest transmission rates were found in Bhutan and Sri Lanka. The mean (from a normal probability distribution) and median (from a boxplot) of the primary transmission rate are summarized in Table 1.
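
The two numerical criteria used above to compare regression models, the root mean square error and R^{2}, can be computed as in the following Python sketch. The function names `rmse` and `r_squared` and the sample values are illustrative and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant observations")
    return 1.0 - ss_res / ss_tot


# Made-up transmission-rate values and fitted values for two candidate models.
model_output = [0.040, 0.036, 0.033, 0.031, 0.030]
fit_a = [0.039, 0.036, 0.034, 0.031, 0.030]
fit_b = [0.034, 0.034, 0.034, 0.034, 0.034]
for name, fit in (("fit_a", fit_a), ("fit_b", fit_b)):
    print(name, rmse(model_output, fit), r_squared(model_output, fit))
```

A lower root mean square error and an R^{2} closer to 1 both indicate a closer fit; residual plots add a visual check that the errors show no systematic pattern.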

## Results and Discussion

Figure 6 shows the time evolution of the effective reproduction number in the 8 countries of the South Asian region. In each graph, the reproduction number is plotted along the y-axis and the day along the x-axis. The estimated reproduction numbers (mean values) with 95% confidence intervals are shown in Table 1. The boxplot of the effective reproduction number of different countries is shown in Figure 7. Figures 8−15 show the simulation results (with 95% confidence intervals) together with the reported data. The reported data are plotted with the red dotted line, the blue line represents the simulation results, and the shaded region indicates the 95% confidence interval of the simulation data. In Figures 8−15, almost all the reported data lie within the 95% confidence interval of the simulation data. This agreement supports the ability of our SEIRD model to estimate the disease parameters and the severity of the outbreak.

Among the South Asian countries, Sri Lanka showed the lowest reproduction number (1.83±0.26), followed by Maldives (1.97±0.14). In contrast, the highest reproduction number was found in Nepal (5.63±0.62), followed by Afghanistan (5.28±0.16). The reproduction number of Bangladesh was 3.14±0.13, which was higher than that of India (2.08±0.09) and Pakistan (2.08±0.16). Surprisingly, despite having the lowest number of infected cases, Bhutan showed a higher reproduction number (3.51±0.3) than the other countries, except for Nepal and Afghanistan. The reason for this inconsistency is the low cure rate of the country (0.009). Similarly, although the cumulative number of infected cases was lower in Nepal than in Bangladesh, India, Pakistan, and Afghanistan, Nepal had the highest reproduction number. This is also due to the low cure rate of the country (0.006).

As shown in Figure 6, all countries experienced a higher reproduction number at the early stage of the epidemic, followed by a significant decline. Government actions to control the outbreak, reduction of population mobility, and public awareness were primarily responsible for this decline. Bangladesh showed a gradual increase during the first 50 days after the first infection and a stable reproduction number thereafter. Starting 125 days after the first infection, India experienced a decreasing pattern of the reproduction number. The effective reproduction number of Pakistan was above 4 in the first few days of the outbreak and declined to 2 by day 25. The number then increased again, plateaued, and decreased once more. Afghanistan started with a reproduction number above 6, which then steadily decreased. Similarly, Nepal started with a high value of the reproduction number, and then showed a decreasing curve from approximately day 25 to day 125. Initially, the reproduction number of Bhutan was fairly stable. However, it started declining roughly 50 days into the outbreak. Maldives experienced an early increase and a significant decay after almost 30 days. Sri Lanka showed a stable reproduction number for the first 30 days of the outbreak, then a notable descent until almost day 100, followed by a gradual increase.

### Prediction of Epidemic Size and Peak Analysis

In this section, we made a forecast up to December 2020 from early COVID-19 data and compared the scenario with the reported data. Two main factors, the epidemic peak time (EPT) and epidemic size (ES), were considered for the comparison. The EPT was considered to be the time needed to reach the highest number of active infected cases, and the number of active infected cases at the EPT was considered the ES. The forecast for COVID-19 in South Asian countries is shown in Figure 16. The EPT and the active number of infection cases obtained from the SEIRD model and reported data are summarized in Table 2. This forecast assumed that the transmission rate would follow the same trend from June 9, 2020 onward as in the early days of infection in a particular country.

According to the simulation, Pakistan reached the epidemic peak 132 days after the first infection. For India, Bangladesh, and Afghanistan, the simulated epidemic peaked after 164, 159, and 227 days, respectively.

In the reported data, Bangladesh showed an epidemic peak at 163 days after the first infection. For India, the epidemic reached its peak after 233 days, and the active number of infections started to fall thereafter. In Pakistan, the epidemic peak was reached at 141 days after the first infection. A gradual decline in the number of active infected cases then occurred until early October, after which a probable second wave of the virus started. Afghanistan showed an early epidemic peak, just 84 days after the first infection, which was much earlier than our numerical result. The country showed a significant reduction of active infected cases up to September 2020 and a minor increase in October 2020.
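
The definitions of EPT and ES translate directly into a short calculation on a daily series of active infected cases. The Python sketch below assumes that active cases equal cumulative confirmed cases minus cumulative recovered cases and deaths, and that index 0 corresponds to the date of the first infection. The function names and the sample numbers are hypothetical.

```python
def active_cases(confirmed, recovered, deaths):
    """Daily active infected cases from cumulative series (assumed definition)."""
    return [c - r - d for c, r, d in zip(confirmed, recovered, deaths)]


def epidemic_peak(active):
    """Return (EPT, ES): the day index of the maximum number of active
    infected cases and that maximum. Ties resolve to the earliest day."""
    if not active:
        raise ValueError("active-case series is empty")
    peak_day = max(range(len(active)), key=active.__getitem__)
    return peak_day, active[peak_day]


# Made-up cumulative counts for a seven-day toy series.
confirmed = [1, 5, 12, 20, 26, 29, 30]
recovered = [0, 0, 2, 6, 14, 21, 26]
deaths = [0, 0, 0, 1, 1, 2, 2]
ept, es = epidemic_peak(active_cases(confirmed, recovered, deaths))
print(f"EPT = day {ept}, ES = {es} active cases")
```

Applying the same function to both the simulated and the reported active-case series gives directly comparable EPT and ES values.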

## Conclusion

This paper presents a compartmental model for analyzing the current trend and predicting the epidemic peak and ES for South Asian countries. The results from the simulation showed a good fit with the reported data. Though the predictions of stochastic models are critical for practical purposes, the predictions of the EPT were quite close to the reported peaks. The highest transmission rate was found in India, while the second-highest was in Bangladesh. From the simulation results, countries like India, Bangladesh, Pakistan, and Afghanistan were at serious risk of COVID-19 during the first phase of the pandemic. Although disease transmission was relatively low in Maldives and Sri Lanka, it is apparent that the cure rate in the region was much lower than elsewhere in the world.

### Limitations and Future Recommendations

The current model used in this research considered 5 compartments: susceptibility, exposure, infection, recovery, and death. Including other compartments, such as quarantine, hospitalization, and intensive care, might enhance the model and research outcomes. Initially, some countries lacked test facilities, COVID-19 dedicated hospitals, and COVID-19 specialists. These factors were not considered in the study. Furthermore, including government actions might also improve the model. This study only considered data from the first wave, and there is scope for further research using data from the second and third waves for some countries.

## Notes

**Ethics Approval**

We did not collect or publish any data relevant to human or animal bodies. We used data from the World Health Organization. The requirement for informed consent was waived because of the retrospective nature of this study.

**Conflicts of Interest**

The authors have no conflicts of interest to declare.

**Funding**

This research was funded by the Shahjalal University of Science and Technology, Sylhet, Bangladesh.

**Availability of Data**

The data generated or analyzed during this study are included in this published article. Other data may be requested from the corresponding author.

**Authors’ Contributions**

Mridul Sannyal, who was a student at Shahjalal University of Science and Technology, developed the detailed model and MATLAB code, performed the analysis, and prepared the final manuscript. Abul Mukid Mohammad Mukadess, who is a Professor at Shahjalal University of Science and Technology, designed the study, developed the conceptual and mathematical model, and prepared the manuscript's first draft.