Unable to retrieve parameters from a dynamic model

Yes, sure! The model is based on the simulation models DSSAT (Jones et al., 2003) and APSIM (Keating et al., 2003), which are widely used in agriculture but fully deterministic (Fig. 1 from Salmerón and Purcell, 2016, is also useful). With the current model I wanted to recreate the subroutine that simulates phenology, but make it hierarchical: instead of a separate deterministic dynamic model per soybean genotype, a single hierarchical dynamic model that covers many genotypes at once. I mention this background because most of the parameters are known from the previous simulation models, although I may have exaggerated the uncertainty in the priors, expecting the model to rely more on the data. Perhaps this is not right and stronger priors are needed?

I know the lower and upper limits of the parameters, and if we were to put uniform priors on them these would be:

target += uniform_lpdf(b_dOptFiFl | 8, 16);
target += uniform_lpdf(b_dOptFlSd | 11, 17);
target += uniform_lpdf(b_dOptSdPm | 29, 38);
target += uniform_lpdf(b_Pcrit | 11.5, 14.5);
target += uniform_lpdf(b_Psen | 0.13, 0.35);
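
To compare these limits with the normal priors I originally used, here is a quick check in R (the midpoint/quarter-range heuristic below is just one rough way of turning known limits into a normal prior, not something taken from DSSAT or APSIM):

  # Known lower/upper limits for each parameter
  bounds <- data.frame(
    par   = c("b_dOptFiFl", "b_dOptFlSd", "b_dOptSdPm", "b_Pcrit", "b_Psen"),
    lower = c(8, 11, 29, 11.5, 0.13),
    upper = c(16, 17, 38, 14.5, 0.35)
  )

  # Rough heuristic: center a normal on the midpoint and set the sd to a
  # quarter of the range, so about 95% of the prior mass lies within the limits
  bounds$mu <- (bounds$lower + bounds$upper) / 2
  bounds$sd <- (bounds$upper - bounds$lower) / 4

  # Fraction of prior mass inside the limits (~0.95 by construction)
  bounds$mass_in <- pnorm(bounds$upper, bounds$mu, bounds$sd) -
    pnorm(bounds$lower, bounds$mu, bounds$sd)
  bounds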

In summary, perhaps the original priors that I proposed are too vague?

Jones, J.W., Hoogenboom, G., Porter, C.H., Boote, K.J., Batchelor, W.D., Hunt, L.A., Wilkens, P.W., Singh, U., Gijsman, A.J., Ritchie, J.T., 2003. The DSSAT cropping system model. European Journal of Agronomy 18, 235–265. https://doi.org/10.1016/S1161-0301(02)00107-7

Keating, B.A., Carberry, P.S., Hammer, G.L., Probert, M.E., Robertson, M.J., Holzworth, D., Huth, N.I., Hargreaves, J.N., Meinke, H., Hochman, Z., 2003. An overview of APSIM, a model designed for farming systems simulation. European Journal of Agronomy 18, 267–288.

Salmerón, M., Purcell, L.C., 2016. Simplifying the prediction of phenology with the DSSAT-CROPGRO-soybean model based on relative maturity group and determinacy. Agricultural Systems 148, 178–187. https://doi.org/10.1016/j.agsy.2016.07.016

Thanks for the details and the paper references. The normal priors do look a bit wide based on what you told me about the upper and lower limits.

When you have enough data, say tens of thousands to hundreds of thousands of points (someone smarter probably has a better number here), the data can overwhelm the priors. With the smaller data set you have here, I think you need stronger priors.
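
To make that concrete, here is a toy R sketch (a conjugate normal example of my own, nothing to do with your phenology model) showing how the posterior mean shifts from the prior toward the data as n grows:

  # Toy conjugate example: prior theta ~ normal(mu0, tau),
  # data y_i ~ normal(theta, sigma). The posterior mean is a
  # precision-weighted average of the prior mean and the data mean.
  posterior_mean <- function(ybar, n, mu0 = 12, tau = 1, sigma = 2.5) {
    w <- (n / sigma^2) / (n / sigma^2 + 1 / tau^2)  # weight given to the data
    w * ybar + (1 - w) * mu0
  }
  posterior_mean(ybar = 14, n = 5)      # ~12.9: the prior still matters
  posterior_mean(ybar = 14, n = 10000)  # ~14.0: the data overwhelms the prior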

So maybe something like

  target += normal_lpdf(b_dOptFiFl | 12, 0.5);
  target += normal_lpdf(b_dOptFlSd | 16, 0.5);
  target += normal_lpdf(b_dOptSdPm | 34, 0.5);
  target += normal_lpdf(b_Pcrit | 13, 0.5);
  target += normal_lpdf(b_Psen | 0.15, 0.1);

  target += normal_lpdf(sigma | 0, 5);

and run the model

mod <- stan(file = "./soy_dev_winter_1.stan", data = datastan, seed = 1234,
            ## I've chosen these pars since I'm not interested in the missing Y
            cores = 6, chains = 4, iter = 4000, warmup = 2000,
            control = list(adapt_delta = 0.99, max_treedepth = 12, stepsize = 0.01),
            pars = c("b_dOptFiFl", "b_dOptFlSd", "b_dOptSdPm",
                     "b_Pcrit", "b_Psen", "sigma"))
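
One more thing worth checking with these priors is how much of their mass actually falls inside the limits you quoted. For b_Psen in particular (a quick R check against your limits of 0.13 and 0.35):

  # Mass of the suggested normal(0.15, 0.1) prior inside the quoted limits
  pnorm(0.35, 0.15, 0.1) - pnorm(0.13, 0.15, 0.1)  # ~0.56
  # Mass below zero, where Psen presumably cannot go
  pnorm(0, 0.15, 0.1)                              # ~0.07

If those limits are hard physical constraints, you could also declare them as lower/upper bounds on the parameters in the Stan parameters block, which truncates the prior accordingly.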

I'd also add in the diagnostics from the weirdfishes blog here.
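
The basic checks boil down to something like this in rstan (a minimal sketch; the blog post goes into more depth):

  library(rstan)

  # Sampler-level diagnostics: divergences, treedepth saturation, E-BFMI
  check_hmc_diagnostics(mod)

  # Convergence and efficiency: Rhat close to 1, n_eff not too small
  fit_summary <- summary(mod)$summary
  fit_summary[, c("n_eff", "Rhat")]

  # Trace plots for the parameters of interest
  traceplot(mod, pars = c("b_Pcrit", "b_Psen", "sigma"))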


Thanks @Ara_Winter. Yes, they are wide indeed. I was expecting the model to extract more from the data, but perhaps the priors were too vague. Okay, I’ll check your link!

Edit: regarding your numbers… right, and perhaps the amount of missing data is an issue too?

Oh! Yeah, if you are missing data that can make it more difficult to recover your parameters.

EDIT: With simulated data that I've added noise to, and intentionally dropped samples from to mimic real data, I will typically try to get close to the parameters I want. I am slightly lazy when it comes to making simulated data, so my choices aren't really best practices ;)
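
For what it's worth, the lazy pattern looks roughly like this (a generic R sketch with a made-up deterministic model, noise level, and dropout rate, not your phenology code):

  set.seed(42)

  # Hypothetical 'truth': a known parameter and a simple deterministic model
  theta_true <- 12
  x <- seq(0, 30, length.out = 200)
  y_clean <- theta_true * log1p(x)  # stand-in for the dynamic model output

  # Add observation noise, then intentionally drop samples to mimic real data
  y_noisy <- y_clean + rnorm(length(x), sd = 2)
  keep <- sort(sample(seq_along(x), size = 150))  # keep 75%, drop 25%
  sim <- data.frame(x = x[keep], y = y_noisy[keep])

  # Fit the Stan model to 'sim' and check how close the posterior
  # for theta gets to theta_true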


@Alan_Severini I wanted to check in and see if the more specific priors were working. I downloaded the papers you linked to. I will read through those next week.


Thanks @Ara_Winter! Sorry for the delay. I look forward to hearing what you think of those papers.

I tried narrowing the priors as you suggested:

target += normal_lpdf(b_dOptFiFl | 12, 0.5);
target += normal_lpdf(b_dOptFlSd | 16, 0.5);
target += normal_lpdf(b_dOptSdPm | 34, 0.5);
target += normal_lpdf(b_Pcrit | 13, 0.5);
target += normal_lpdf(b_Psen | 0.15, 0.1);
target += normal_lpdf(sigma | 0, 1);

But after running:

mod <- stan(file = "./soy_dev.stan", data = datastan, seed = 1234,
            control = list(adapt_delta = 0.99, max_treedepth = 12, stepsize = 0.01),
            pars = c("b_dOptFiFl", "b_dOptFlSd", "b_dOptSdPm",
                     "b_Pcrit", "b_Psen", "sigma"))

I get the following, which shows that b_Psen is much smaller than expected from its prior and the data:

Inference for Stan model: soy_dev.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

                  mean se_mean    sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
b_dOptFiFl[1]    12.01    0.01  0.50    10.98    11.67    12.01    12.33    13.00  9214    1
b_dOptFlSd[1]    16.00    0.00  0.50    15.03    15.66    16.00    16.34    16.99 10163    1
b_dOptSdPm[1]    32.15    0.01  0.55    31.03    31.78    32.15    32.51    33.24  6610    1
b_Pcrit[1]       13.61    0.01  0.62    12.42    13.15    13.63    14.14    14.59  2760    1
b_Psen[1]         0.06    0.00  0.07     0.00     0.01     0.02     0.08     0.26  2221    1
sigma             2.55    0.00  0.11     2.35     2.48     2.55     2.62     2.77  1991    1
lp__          -1420.73    0.52 17.21 -1455.90 -1431.70 -1420.58 -1409.11 -1387.35  1111    1

Samples were drawn using NUTS(diag_e) at Mon Jan 27 12:02:40 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

This is what the pairs look like:
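
A plot like it can be reproduced along these lines (the parameter selection here is illustrative):

  # Pairs plot of the parameters in question; rstan's pairs method
  # marks divergent transitions in red
  pairs(mod, pars = c("b_Pcrit", "b_Psen", "sigma"))

  # Energy diagnostic, relevant to the sigma vs energy__ question below
  library(bayesplot)
  mcmc_nuts_energy(nuts_params(mod))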

Why are b_Pcrit and b_Psen behaving like that? And what does sigma correlating with energy__ tell us about the parameterization of the model?

Thanks in advance!

That is super weird! Let me check my notes. Maybe someone else has an idea about those L shapes.
