Hello,
I am new to using Stan and would appreciate some help and advice with a model I am trying to fit. The data consists of a set of scores per individual at different ages. I want to fit a linear model where score is predicted by age in each individual but where the intercept and slope per individual are both positive. I’ve set this up using a lognormal prior on slope and intercept, with gamma priors on the lognormal parameters.
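In symbols, for individual $k$ with observations indexed by $i$, the model I have in mind is roughly
$$
y_{k,i} \sim \textrm{normal}(\alpha_k + \beta_k \, \textrm{age}_{k,i},\ \sigma), \qquad
\alpha_k \sim \textrm{lognormal}(a_1, a_2), \qquad
\beta_k \sim \textrm{lognormal}(b_1, b_2),
$$
with gamma priors on $a_1, a_2, b_1, b_2$, and $\sigma$.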
I’d appreciate any comments on this model and suggestions for alternative formulations, especially ones that could result in more efficient and stable model fits.
On small simulated datasets I am getting large numbers of divergent transitions.
data {
  int<lower=0> N;          // # observations
  int<lower=0> K;          // # of individuals
  vector[N] y;             // observations
  vector[N] age;           // age
  array[K] int s;          // group sizes
}
parameters {
  real<lower=0> mu[N];
  real<lower=0> beta[K];
  real<lower=0> alpha[K];
  real<lower=0> a1;
  real<lower=0> a2;
  real<lower=0> b1;
  real<lower=0> b2;
  real<lower=0> sigma;
}
model {
  // priors
  alpha ~ lognormal(a1, a2);
  beta ~ lognormal(b1, b2);
  a1 ~ gamma(2, 1);
  a2 ~ gamma(2, 1);
  b1 ~ gamma(2, 1);
  b2 ~ gamma(2, 1);
  sigma ~ gamma(2, 1);
  // likelihood
  int pos;
  pos = 1;
  for (k in 1:K) {
    segment(y, pos, s[k]) ~ normal(alpha[k] + beta[k] * age[pos:(pos + s[k] - 1)], sigma);
  }
}
Hi, @jlm1973 and welcome to the Stan forums!
There are several issues with your model and I’m glad you inquired here rather than getting frustrated at it not working. We usually recommend that you start with the simplest possible thing that can do anything and then build up from there.
- Remove the definition of mu. It’s not used in the model and not given a prior, so the posterior is improper (doesn’t integrate to a finite number) and hence it can’t be sampled.
- There’s no reason to constrain a1 and b1 to be positive. They can be below 0 for positive-constrained variables. That is, sampling alpha ~ lognormal(a1, a2) is equivalent to alpha_raw ~ normal(a1, a2); alpha = exp(alpha_raw); so you get a positive value even if a1 < 0. You may need values less than zero if the values are small. Enforcing a1 > 0 forces alpha to be larger than you may want: the median value of lognormal(a1, a2) is exp(a1), and that may be too high if you restrict a1 > 0. See the sketch just below this point.
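For example, you could declare the hyperparameters like this; note that the gamma(2, 1) priors only have support on positive values, so once a1 and b1 are unconstrained they need priors on the whole real line, and the normal(0, 1) priors here are just placeholders, not a recommendation:
parameters {
  real a1;             // location of log(alpha); may be negative
  real<lower=0> a2;    // scale must stay positive
  real b1;             // location of log(beta); may be negative
  real<lower=0> b2;
  // ... other parameters ...
}
model {
  a1 ~ normal(0, 1);   // placeholder prior now that a1 is unconstrained
  b1 ~ normal(0, 1);   // placeholder prior now that b1 is unconstrained
  // ... rest of the model ...
}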
- The centered priors on alpha and beta are going to be problematic. You instead want to define things this way:
parameters {
  real<offset=a1, multiplier=a2> alpha_raw;
  real<offset=b1, multiplier=b2> beta_raw;
}
transformed parameters {
  real alpha = exp(alpha_raw);
  real beta = exp(beta_raw);
}
model {
  alpha_raw ~ normal(a1, a2);
  beta_raw ~ normal(b1, b2);
}
- I’m assuming you meant to increment pos in the loop, because as it’s defined, every segment starts from position 1, so for each k you are re-modeling the first s[k] entries of y rather than individual k’s own observations, which is going to lead to the wrong posterior variance estimates. Rather than trying to vectorize like this, I’d suggest just looping over the positions in y first to make sure you get the model correct, though I’m not sure how you’re lining the n up with the k. A corrected version of the segment loop is sketched below.
Dear Bob,
Thank you very much for your reply. It is very much appreciated and has saved me a lot of time. I fixed the errors you pointed out, applied all your suggestions, and it has made a huge difference.
One follow-on question: now that alpha_raw and beta_raw have independent normal priors, how could that be extended to a bivariate normal prior so as to model their correlation as well?
You can do it directly. In two dimensions efficiency’s not much of an issue, so you can just have this:
row_vector[2] mu = [a1, b1];
row_vector[2] alpha_beta = [alpha_raw, beta_raw];
matrix[2, 2] Sigma = [[square(a2), rho * a2 * b2],
                      [rho * a2 * b2, square(b2)]];
alpha_beta ~ multi_normal(mu, Sigma);
You’ll need to declare the correlation rho as a parameter:
real<lower=-1, upper=1> rho;
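Putting those pieces into blocks, a minimal sketch (still with a single alpha and beta as in the earlier snippet; in your model alpha_raw and beta_raw would be per-individual and the multi_normal statement would apply to each individual’s pair) might look like:
parameters {
  real a1;
  real<lower=0> a2;
  real b1;
  real<lower=0> b2;
  real<lower=-1, upper=1> rho;               // correlation between alpha_raw and beta_raw
  real<offset=a1, multiplier=a2> alpha_raw;
  real<offset=b1, multiplier=b2> beta_raw;
}
transformed parameters {
  real alpha = exp(alpha_raw);
  real beta = exp(beta_raw);
}
model {
  matrix[2, 2] Sigma = [[square(a2), rho * a2 * b2],
                        [rho * a2 * b2, square(b2)]];
  [alpha_raw, beta_raw] ~ multi_normal([a1, b1], Sigma);
  // ... hyperpriors and likelihood as before ...
}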