Explaining how Rstan estimates parameter using ODE and data.!

You’re right:

summary(seir.mcmc, pars = pars2)
$summary
                   mean    se_mean         sd 2.5%   25% 50% 75%    97.5%      n_eff
pred_cases[1]     9.921   3.364555   16.91999    0  0.00   3  12   58.025  25.289766
pred_cases[20]   16.980   5.300890   30.06587    0  1.00   5  21   98.000  32.169926
pred_cases[50]   43.576   4.799368   62.49209    0  4.00  20  58  216.000 169.543827
pred_cases[100] 496.266 364.258350 1095.62496    0 29.75 131 427 3640.075   9.047005
                    Rhat
pred_cases[1]   1.066488
pred_cases[20]  1.054154
pred_cases[50]  1.023725
pred_cases[100] 1.167377

$c_summary
, , chains = chain:1

                 stats
parameter            mean        sd 2.5% 25%  50%    75%   97.5%
  pred_cases[1]    13.188  16.73094    0   1  6.5  18.25  58.050
  pred_cases[20]   21.290  31.25610    0   2 10.0  28.00  95.150
  pred_cases[50]   52.400  67.29259    0   7 28.5  75.00 234.625
  pred_cases[100] 183.878 245.49327    0  17 86.5 239.00 891.925

, , chains = chain:2

                 stats
parameter            mean        sd 2.5%   25%  50%    75%   97.5%
  pred_cases[1]    13.984  21.55098    0  1.00  6.0  17.00  84.150
  pred_cases[20]   23.550  35.91232    0  3.00 11.0  29.00 114.050
  pred_cases[50]   50.848  73.35778    0  5.00 22.0  66.25 260.575
  pred_cases[100] 187.754 257.81339    0 20.75 94.5 252.00 953.675

, , chains = chain:3

                 stats
parameter            mean        sd 2.5%   25% 50%    75%   97.5%
  pred_cases[1]    12.248  16.61308    0  1.00   7  16.25  52.575
  pred_cases[20]   21.622  32.00293    0  2.75  10  28.00 112.525
  pred_cases[50]   49.824  65.23545    0  6.00  26  67.00 226.675
  pred_cases[100] 190.102 252.19928    0 23.00  84 261.00 923.025

, , chains = chain:4

                 stats
parameter             mean           sd 2.5%   25% 50%  75%   97.5%
  pred_cases[1]      0.264    0.6285652    0   0.0   0    0    2.00
  pred_cases[20]     1.458    2.3586782    0   0.0   1    2    8.00
  pred_cases[50]    21.232   28.4398986    0   2.0  11   30  105.05
  pred_cases[100] 1423.330 1862.8492735    2 160.5 701 1957 6649.55

That’s what I thought. In one of your first traceplots, the log-posteriors were stable per chain but quite far apart (-1425 vs -1300).

I’ve got another problem with the model, but I suspect its on my end.

I simulated some data and tried to fit the model. However, all of the parameters (except for the initial states) drift into weird places, with the posterior predictive plots looking nothing like the prior predictive plots or the simulated data.

See e.g. the observations (blue dots) vs quantiles of predictions (left=posterior, right=prior):

Edit: With model and gq blocks:

model {
  append_row(
      unconstrained_initial_states,
      append_row(unconstrained_ode_params, unconstrained_other_params)
  ) ~ normal(prior_locations, prior_scales);
  if (!simulate_only){
    observed_states ~ neg_binomial_2(incidence, 1/phi_inv);
  }
}

generated quantities {
  int predicted_states[no_timesteps];
  if (no_timesteps > 0){
     predicted_states = neg_binomial_2_rng(incidence, 1/phi_inv);
  }
}

1 Like

Indeed, the fact the lp__ varies from chain to chain suggests this isn’t a case of non-identifiability (or if it’s that, there’s also something else going on). Per the trace plots, the initial points also don’t determine where the chain converge.

I’ll keep tinkering.

Alright, I’m stupid, I don’t think there’s much wrong with the model, I fixed the problem on my side…

Maybe the data just aren’t that great? With a population of roughly 200M, the reported cases do not surpass 1000?

Thank you everyone for your precious time and effort here, trying to help me out. I appreciate. @Funko_Unko ,from what you said in regards to the limited data?, does it mean that the parameters can not be estimated?
The data is for Nigeria, gotten from World Health Organization(WHO) website, at the onset of COVID-19 last year. Have been trying to estimate those parameters, yet no head way. I am not yet good with Bayern and coding.

Hm, well I don’t know what to say.

I think you would need more, better and different data and a better model to be able to capture the pandemic across the whole of Nigeria sufficiently.

First of all, the sir model only predicts a decrease in new cases if the herd immunity threshold has been reached, which it can not have been.

But even if the sir model were accurate, it would imply a huge amount of underreporting of cases, and I don’t know how this would impact your ability to estimate the parameters.

1 Like