Dear all,

I am new to Stan and currently working with 3 level hierachical growth model (non-centered slope & intercept) to estimate plant growth (rate) in different chemical concentrations. The longitudinal data are obtained by repeated measurements (plant length over a 14 days) nested within plant that are nested within different laboratories, that in turn are nested within six different chemical concentrations. Data includes1512 observations (plant length) nested in 378 plants that are nested within 6 different laboratories in 6 chemical concentrations. In the modeling part, I would like to see if the plants growth are linear or exponential growth. Since it is easier to fit exponential model by taking logarithm at both sides and fitting with the existing linear model stan code, it’s what I started with. I am doing it with the code from the appendix in the paper, Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Included, with the Stan code :-

data {

int<lower=0> Nh; //no. of observations

int<lower=0> Ni; //no. of plants

int<lower=0> Nj; //no. pesticide concentrations of all laboratories

int<lower=0> Nk; //no. pesticide concentrations

int<lower=1> Plant4Day[Nh]; //level-2 cluster (Plant) for each observation

int<lower=1> ConcPerLab4Plant[Ni]; //level-3 cluster (concentrations of all labs) for each plant

int<lower=1> Conc4Lab[Nj]; //level-4 cluster (concentrations) for each lab

int X_hijk[Nh];

real Y_hijk[Nh];

}

parameters {

real beta_0k[Nk]; //Slope

real beta_1k[Nk]; //Intercept

real u_0jkm[Ni]; //random effects of slope at plant-level

real u_0jk[Nj]; //random effects of slope at laboratory-level

real u_1jkm[Ni]; //random effects of intercept at plant-level

real u_1jk[Nj]; //random effects of at laboratory-level

real<lower=0,upper=100> sigma_u0jkm;

real<lower=0,upper=100> sigma_u0jk;

real<lower=0,upper=100> sigma_u1jkm;

real<lower=0,upper=100> sigma_u1jk;

real<lower=0,upper=1> sigma_y;

}

transformed parameters {

real beta_0jkm[Ni];

real beta_0jk[Nj];

real beta_1jkm[Ni];

real beta_1jk[Nj];

for (j in 1:Nj) {

beta_0jk[j] = beta_0k[Conc4Lab[j]] + u_0jk[j];

beta_1jk[j] = beta_1k[Conc4Lab[j]] + u_1jk[j];

}

for (i in 1:Ni) {

beta_0jkm[i]=beta_0jk[ConcPerLab4Plant[i]]+u_0jkm[i];

beta_1jkm[i]=beta_1jk[ConcPerLab4Plant[i]]+u_1jkm[i];

}

}

model {

u_0jk ~ normal(0, sigma_u0jk);

u_0jkm ~ normal(0, sigma_u0jkm);

u_1jk ~ normal(0, sigma_u1jk);

u_1jkm ~ normal(0, sigma_u1jkm);

for (h in 1:Nh) {

Y_hijk[h] ~ normal(beta_0jkm[Plant4Day[h]]* X_hijk[h] + beta_1jkm[Plant4Day[h]], sigma_y);

}

}

generated quantities {

vector[Nh] log_lik;

for (h in 1:Nh) {

log_lik[h] = normal_lpdf(Y_hijk[h] |beta_0jkm[Plant4Day[h]]* X_hijk[h] + beta_1jkm[Plant4Day[h]], sigma_y);

}

}

However, I have a small query related to model comparison via PSIS-LOO . I’m hopping if someone can shed some light on the issue described as below:-

i) I have checked visually how well are the two models suit the data. At this moment I am interested at the first level fitting. When comparing linear model to exponential/log-linear model, the models on PSIS-LOO reveals an estimated difference in elpd of 3651.4. Positive in the estimated difference of elpd indicate the expected predictive accuracy for the second model is higher. Which means, exponential/log-linear model is preferred. However, when looking at the plots for how two models fit, the linear model seems to fit better than the exponential/log-linear model. Is that mean, I am actually not comparing apple to apple (due to the logarithm in the log-linear model) via PSIS-LOO? Or, did I make any mistake with the code. Note, the plot shown the model fitting of one lab. There are another 5 plots for the other labs which also look better fit in linear than exponential/log-linear model.

ii) In case PSIS-LOO isn’t the correct method here, any suggestion which model comparison method I could use?

Figure 1: linear growth model

Figure 2: log-linear/exponential growth model (logarithm at dependent variable)

Thanks in advance for any suggestions.