# The samples of parameter in the model sometimes stick to a constant when doing dynamic linear modeling

``````data {
int<lower=1> N;  // total number of observations
int Y[N];  // response variable
int<lower=1> K;  // number of population-level effects
matrix[N, K] X;  // population-level design matrix
// row_vector[K-1] Xdummy[N];  // population-level design matrix
// data for group-level effects of ID 1
int<lower=1> groupmax;  // number of grouping levels
// int<lower=1> M_1;  // number of coefficients per level
int<lower=1> group[N];  // grouping indicator per observation
}
transformed data {
int Kc = K - 1;
matrix[N, Kc] Xc;  // centered version of X without an intercept
vector[Kc] means_X;  // column means of X before centering
for (i in 2:K) {
means_X[i - 1] = mean(X[, i]);
Xc[, i - 1] = X[, i] - means_X[i - 1];
}
}
parameters {
vector[K-1] beta1;  // population-level effects
real beta0;  // temporary intercept for centered predictors
real alpha[groupmax];
real<lower=0> sigma;
}
model {
sigma ~ gamma(0.1,0.01);
target += student_t_lpdf(beta1 | 7, 0, 2.5);
target += student_t_lpdf(beta0 | 7, 0, 2.5);

target += normal_lpdf(alpha[1] | 0, 0.1);
target += normal_lpdf(alpha[2] | 0, 1/sigma);
for (k in 3:groupmax){
alpha[k] ~ normal(2*alpha[k-1]-alpha[k-2],1/sigma);
}

for(n in 1:N) {
Y[n] ~ bernoulli(inv_logit(beta0 + alpha[groupmax+1-group[n]] +  Xc[n]*beta1));
}
}
``````

tempdata.RData (961 Bytes)

Figure 1 is the model I try to fit. Figure 2 is the example traceplot. File 1 is the example dataset used to generate figure 2.
Here is the r command with the seed to generate figure 2
fit=rstan::sampling(logisticmodel_rand,
data = dataran, chains = 1, refresh=0, warmup = 2500, iter = 5000,seed=93)
traceplot(fit)

Question: I am a beginner. As you can see in figure 2, the value of parameters sticks to a constant after the warm-up phase. This will lead to a problem when making the inference. This happens for some seeds. May I ask how to deal with the dynamic linear model using stan?

Thank you very much for anyone making comments!

The R version is 4.1.2
The rstan version is 2.21.3

That can happen for many reasons, but itâ€™s likely a symptom of some underlying issue with the model, or at least the model given the data (e.g. too little, or more likely too much data for the mode to handle). If you can have tried it with different data sets, or with smaller values of K for instance, you may get an idea of whether itâ€™s a structural problem or not.

Another possibility, given that for at least some part of the chain (and I infer from your comment for other entire chains) that doesnâ€™t happen, it can be that the optimization of the algorithm parameters is not being adequate during the warmup phase and itâ€™s getting stuck in some parameter regions â€“ that could happen for complicated models because the chain started in a weird place and couldnâ€™t get out of it during warmup, so the metric and step size are not good values for the rest of the chain. if these parameter values are far from what you get on other chains, and the likelihood hasnâ€™t converged to higher probability spaces, itâ€™s an indication of the problem.

You may change the algorithm tuning parameters, like `adapt_delta`, so it takes longer to find the optimal values, but may explore better the probability space, or run a longer warmup and chain, or both.

There are probably other causes, but it depends on the â€śsymptomsâ€ť you are getting.

Hi Caesoma,

The K in this example equals to 2. I have tried to use jag. Jag works well on this model. Thank you very much.

Ziyan

Do you mean the software JAGS? Because that uses Gibbs sampling, which is very different, and â€śworking wellâ€ť may only be true in the sense of you not getting the error you were with Stan,. HMC is a much more powerful sampler, so itâ€™s unlikely that it will underperform if working as intended. Iâ€™d say there are two possibilities here:

1. The problem is simple enough that Gibbs sampling actually works well (HMC should produce a similar output);
2. The Gibbs sampler generates a trace that doesnâ€™t have obvious problems, but itâ€™s actually a poor sample from the posterior.

You may not be able to distinguish between the two cases just from simple trace diagnostics, so it may be worth trying to make HMC work â€“ depending on the applications a reviewer may be inclined to reject the use of an outdated sampler and require a state-of-the-art method.

Yes JAGS. I am willing to use Stan but when sampling from this hierarchical model something strange happened like what I asked. I tried to debug this issue but did not succeed. All other models in my thesis are sampled using Stan and works well. I would like to keep my thesis consistently using Stan. Would you mind help download the data file I have attached (temp.RData) and sample from the model by using the seed to see whether such problem can be avoided? I can not find a way of solving this problem at this moment. Maybe I need to reparameterize this model?

Thank you very much
Caesoma

Have you tried to make the prior on `sigma` more restrictive? Something like `gamma(0.1, 0.1)` would contain `sigma` a bit more. It looks like whatâ€™s happening is the sampler getâ€™s stuck in a region where sigma is large and the dynamics on `alpha` get stuck. If sigma is large, `alpha2= 0` and `alpha[k] = 2 * alpha[k-1] - alpha[k-2]` thus `alpha[k] = alpha[1]`.

1 Like

Thank you very much.

I have rewritten the Stan code which seems avoids this problem. I am testing the result and will update tomorrow.

Best wishes
Ziyan

Thereâ€™s nothing â€śwrongâ€ť with the Gibbs sampler or JAGS/BUGS, but it would be a fair question from a reviewer/examiner if thereâ€™s something odd about the model that it wonâ€™t work with HMC when other models do (and whether Gibbs is actually working).
I cannot run and check your model myself, but if itâ€™s an important piece of your thesis and you are worried about the robustness of the results Iâ€™d strongly suggest fiddling a bit with it and trying to fix it.

There are a few things you could relatively easily do:

1.Try to restrict the space of parameter by constraining parameters or using stronger priors (making sure you are able to justify the constraints);
2. Run longer warmup and change `adapt_delta` to try and make sure the algorithm parameters are well-optimized;
3. Try to start the chains from regions of higher probability (but making sure they are different initial points to better assess convergence);
4. Simulate data and see how the inference behaves under known parameters;
5. Simplify the model and scale up to see when problems start arising.

Here is the Stan code that works better. There is no sickness now.

``````data {
int<lower=1> N;  // number of patients
int<lower=1> K;  // number of treatment arms
int<lower=1> groupmax;  // number of time points
int<lower=1, upper=K> X[N];  // predictor variable
int<lower=0, upper=1> Y[N];  // response variable
int<lower=1> group[N];
}

parameters {
real beta_0;  // intercept
vector[K] beta;  // coefficients for predictor variable
vector[groupmax] alpha;  // random effects for each time point
real<lower=0> gamma;  // precision parameter for alpha
}

model {
// priors
beta_0 ~ normal(0, 1.8);
beta ~ normal(0, 1.8);
alpha[1] ~ normal(0,0.01);
alpha[2] ~ normal(0, sqrt(1/gamma));
alpha[3:groupmax] ~ normal(2*alpha[2:(groupmax-1)] - alpha[1:(groupmax-2)], sqrt(1/gamma));
gamma ~ gamma(0.1, 0.01);

// likelihood
// for(n in 1:N) {
//   y[n] ~ bernoulli(inv_logit(beta_0 + alpha[groupmax+1-group[n]] +  x[n]*beta));
// }

for (n in 1:N) {
vector[K] pi;
for (k in 1:K) {
pi[k] = inv_logit(beta_0 + beta[k] * (X[n] == k) + alpha[groupmax-group[n]+1]);
}
Y[n] ~ bernoulli(pi[X[n]]);
}
}

``````