
Chart usage

The chart shows the evolution of COVID-19 for:

  1. Total number of cases (left axis)
  2. New cases (right axis)
  3. Total number of deaths (right axis)
  4. New deaths (right axis)

The top part of the graph allows you to:

Data Source

The data come from the European Centre for Disease Prevention and Control (ECDC). The ECDC publishes daily statistics on the COVID-19 pandemic, not just for Europe but for the entire world. Links: ECDC website, data.

Predictions

The total number of COVID-19 cases is predicted using the Verhulst (logistic) model, originally developed for population growth modelling. We assume the total number of COVID-19 cases follows:

\[f(t)=K \frac{1}{1+e^{-r(t - t_0)}}\]

\(\cdot \ t_{0}=\) the time of the sigmoid’s midpoint, where \(f(t_0) = K/2\)
\(\cdot \ K=\) the capacity, here the maximum number of people infected
\(\cdot \ r=\) the logistic growth rate or steepness of the curve
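
To make the role of each parameter concrete, here is a minimal sketch that evaluates the logistic curve above. The function name `logistic` and the parameter values are illustrative assumptions, not values taken from the chart.

```python
import math

def logistic(t, K, r, t0):
    """Verhulst (logistic) curve: K / (1 + exp(-r * (t - t0)))."""
    return K / (1.0 + math.exp(-r * (t - t0)))

# Hypothetical parameter values, chosen only for illustration.
K, r, t0 = 10_000.0, 0.2, 30.0

half = logistic(t0, K, r, t0)            # at the midpoint the curve equals K / 2
plateau = logistic(t0 + 100, K, r, t0)   # far past the midpoint it approaches K
print(half, plateau)
```

A larger `r` makes the transition around `t0` steeper, while `K` only rescales the curve vertically.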

Model fitting

The model is fitted to minimise the mean squared error (MSE):

\[(t_0^{\star}, K^{\star}, r^{\star}) \in \underset{t_0, K, r}{\operatorname{argmin}} \ \frac{1}{n}\sum_{i=1}^{n}\big(y_i - f(t_i)\big)^2\]

When the capacity \(K\) is known, the model can be fitted with a linear regression by transforming \(f(t)\) into \(g(t) = \operatorname{logit}\left(\frac{f(t)}{K}\right)\), which is linear in \(t\):

\[\operatorname{logit}\left(\frac{1}{1+e^{-r(t - t_0)}}\right)=\ln \left(\frac{1}{e^{-r(t - t_0)}}\right)= r (t - t_0) = r t - r t_0\]
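
As an illustration of the known-\(K\) case, the sketch below applies the logit transform to synthetic, noise-free data and recovers \(r\) and \(t_0\) with an ordinary least-squares line (slope \(r\), intercept \(-r t_0\)). The data and the "true" parameter values are assumptions made up for the example.

```python
import numpy as np

# Hypothetical "true" parameters used to generate noise-free synthetic data.
K, r_true, t0_true = 10_000.0, 0.2, 30.0
t = np.arange(0, 60, dtype=float)
y = K / (1.0 + np.exp(-r_true * (t - t0_true)))

# logit(p) = ln(p / (1 - p)) with p = y / K is linear in t.
p = y / K
g = np.log(p / (1.0 - p))

# Degree-1 least-squares fit: g(t) = slope * t + intercept.
slope, intercept = np.polyfit(t, g, deg=1)

r_hat = slope                 # slope is r
t0_hat = -intercept / slope   # intercept is -r * t0
print(r_hat, t0_hat)
```

With real data the transform is only defined while \(0 < y_i < K\), so observations at or above the assumed capacity would have to be excluded.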

Here, \(K\) is unknown, so the model needs to be fitted with numerical optimisation. Still, under reasonable conditions, optimal parameters exist (see paper).

The Nelder-Mead optimisation algorithm is used to find the optimum parameters.
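
A minimal sketch of such a fit, assuming SciPy's `scipy.optimize.minimize` with `method="Nelder-Mead"` as the optimiser. The synthetic data, the noise level, and the starting point `x0` are assumptions for illustration; the code behind the chart may differ.

```python
import numpy as np
from scipy.optimize import minimize

def logistic(t, K, r, t0):
    """Verhulst (logistic) curve."""
    return K / (1.0 + np.exp(-r * (t - t0)))

def mse(params, t, y):
    """Mean squared error between observations y and the logistic curve."""
    K, r, t0 = params
    return np.mean((y - logistic(t, K, r, t0)) ** 2)

# Synthetic observations (hypothetical parameters plus Gaussian noise).
rng = np.random.default_rng(0)
t = np.arange(0, 60, dtype=float)
y = logistic(t, 10_000.0, 0.2, 30.0) + rng.normal(0.0, 50.0, size=t.size)

# Rough initial guess: capacity above the last observations, mild growth
# rate, midpoint in the middle of the observation window.
x0 = np.array([2.0 * y.max(), 0.1, t.mean()])

result = minimize(mse, x0, args=(t, y), method="Nelder-Mead",
                  options={"maxiter": 5000})
K_hat, r_hat, t0_hat = result.x
print(K_hat, r_hat, t0_hat, result.fun)
```

Nelder-Mead needs no gradients, which is convenient here, but it only finds a local optimum, so the result can depend on the starting point.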

Parameters distribution

The distribution of the optimal parameters is estimated with a bootstrap method. The sampling distribution used for the bootstrap is linear, giving more weight to recent values than to older ones. The central prediction curve is obtained with the median parameters, and the upper and lower curves are obtained with the first and third quartiles of the parameters.
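
The sketch below shows one possible reading of this procedure: observations are resampled with replacement, with a probability that grows linearly with time, and the model is refitted on each resample. The weighting scheme, the number of resamples, the helper `fit`, and the synthetic data are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def logistic(t, K, r, t0):
    """Verhulst (logistic) curve."""
    return K / (1.0 + np.exp(-r * (t - t0)))

def fit(t, y):
    """Fit (K, r, t0) by minimising the MSE with Nelder-Mead."""
    x0 = np.array([2.0 * y.max(), 0.1, t.mean()])
    res = minimize(lambda p: np.mean((y - logistic(t, *p)) ** 2), x0,
                   method="Nelder-Mead", options={"maxiter": 2000})
    return res.x

# Synthetic observations (hypothetical parameters plus Gaussian noise).
rng = np.random.default_rng(0)
t = np.arange(0, 60, dtype=float)
y = logistic(t, 10_000.0, 0.2, 30.0) + rng.normal(0.0, 50.0, size=t.size)

# Linear sampling weights: the i-th observation is drawn with probability
# proportional to i + 1, so recent points are picked more often.
weights = np.arange(1, t.size + 1, dtype=float)
weights /= weights.sum()

n_boot = 100  # number of bootstrap resamples (kept small for speed)
samples = np.empty((n_boot, 3))
for b in range(n_boot):
    idx = rng.choice(t.size, size=t.size, replace=True, p=weights)
    samples[b] = fit(t[idx], y[idx])

# Per-parameter first quartile, median, and third quartile.
q1, median, q3 = np.percentile(samples, [25, 50, 75], axis=0)
print(q1, median, q3)
```

Taking quartiles parameter by parameter is a simplification: it ignores correlations between \(K\), \(r\) and \(t_0\) across resamples.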

Figure: Parameters distribution

Future work

The Verhulst model is unstable during the exponential growth phase of the virus, and it tends to underestimate growth during that phase.

New growth modeling

The model consists of two growth modes for the virus:

  1. Exponential growth: \(f(t) = e^{r_1t}\)
  2. Logistic growth: \(f(t)=K \frac{1}{1+e^{-r_2(t - t_0)}}\)

The total number of cases in the final model is:

\[f(t) = (1-y(t)) \ e^{r_1t} + y(t) \ \big(\ a + K \frac{1}{1+e^{-r_2(t - t_0)}} \big)\]

\(\cdot \ y(t)= 0\) during the exponential growth, 1 otherwise.

This model takes into account the phases before and after containment measures through the \(y(t)\) variable. The difficulty lies in estimating the time \(t_1\) at which this variable changes from \(0\) to \(1\).

To preserve the continuity of both the function and its derivative at \(t=t_1\) (\(y(t_1)_-=0\) and \(y(t_1)_+=1\)), the following conditions must hold:

  1. \[f(t_1)_- = f(t_1)_+ \implies a = e^{r_1 t_1} - K \frac{1}{1+e^{-r_2(t_1 - t_0)}}\]
  2. \[f'(t_1)_- = f'(t_1)_+ \implies K= \frac{r_1}{r_2} \ e^{r_1 t_1} \ \frac{ 1+e^{ - r_2 (t_1 - t_0) }}{1 - \frac{1}{1+e^{-r_2(t_1 - t_0)}}}\]

Figure: Difference between models
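
The two continuity conditions can be checked numerically. The sketch below computes \(a\) and \(K\) from \(r_1\), \(r_2\), \(t_0\) and \(t_1\), then compares values and finite-difference slopes on both sides of \(t_1\). The function names and parameter values are hypothetical, chosen only for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def two_phase_params(r1, r2, t0, t1):
    """Return (a, K) that make f and f' continuous at t1."""
    s = sigmoid(r2 * (t1 - t0))
    growth = math.exp(r1 * t1)
    # Derivative continuity: r1 * e^{r1 t1} = K * r2 * s * (1 - s)
    K = (r1 / r2) * growth / (s * (1.0 - s))
    # Value continuity: e^{r1 t1} = a + K * s
    a = growth - K * s
    return a, K

def two_phase(t, r1, r2, t0, t1):
    """Exponential growth before t1 (y = 0), shifted logistic from t1 on (y = 1)."""
    if t < t1:
        return math.exp(r1 * t)
    a, K = two_phase_params(r1, r2, t0, t1)
    return a + K * sigmoid(r2 * (t - t0))

# Hypothetical parameter values.
r1, r2, t0, t1 = 0.25, 0.15, 40.0, 25.0
h = 1e-4

left = two_phase(t1 - h, r1, r2, t0, t1)
mid = two_phase(t1, r1, r2, t0, t1)
right = two_phase(t1 + h, r1, r2, t0, t1)

slope_left = (mid - left) / h     # one-sided slope just before t1
slope_right = (right - mid) / h   # one-sided slope just after t1
print(mid, slope_left, slope_right)
```

With these constraints, \(a\) and \(K\) are no longer free parameters: only \(r_1\), \(r_2\), \(t_0\) and \(t_1\) remain to be estimated.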