Trouble sampling from Poisson model with random effects


#1

Hi I’m trying to create a model that emulates the Salm: extra - Poisson variation in dose - response study from OpenBUGS. Having trouble with 2 things, my transformed parameter contains log mu however I cant log the left hand side in a transform statement so ive had to try and work around it.

log(mu[i , j]) <- alpha + beta * log(x[i] + 10) + gamma * x[i] + lambda[i , j] is how the statement written in OpenBUGS. You can take a look down here

Also there seems to be a problem when I try and sample from my model, lots of errors come up. My error log is at the bottom

My data and model look like:

c1 = c(0,15,21,29)
c2 = c(10,16,18,21)
c3 = c(33,16,26,33)
c4 = c(100,27,41,69)
c5 = c(333,33,38,31)
c6 = c(1000, 20,27,42)
df = data.frame(c1,c2,c3,c4,c5,c6)
dose = df[1,1:6]
j = length(dose)
i = length(df[2:4,1])
obsmatrix = data.frame(c(15,21,29),
c(16,18,21),
c(16,26,33),
c(27,41,69),
c(33,38,31),
c(20,27,42))

  data{
    int<lower=1> n;     // number of doses
    int<lower=1> p;     // number of plates for each dose
    vector[n]x;        // doses
    int y[p,n];     // responses
  }
  
  
  parameters{
    matrix[p,n] lambda;
    matrix[p,n] mu;
    real<lower=0,upper=100> alpha;
    real<lower=0,upper=100> beta;
    real<lower=0,upper=100> gamma;
    real tau;
  }
  
  transformed parameters{
  matrix[p,n] lmu;
  lmu = log(mu);
  for(i in 1:n){  
    for(j in 1:p){
  
      lmu[i,j] = alpha + beta * log(x[i] + 10) + gamma * x[i] + lambda[i,j];
    }}
  }
  
  model{
  for(i in 1:n){  
    for(j in 1:p){
    
      y[i,j] ~ poisson(mu[i,j]);
      lambda[i,j] ~ normal(0,tau);
      }}
    alpha ~ normal(0,1000);
		beta ~ normal(0,1000);
		gamma ~ normal(0,1000);
		tau ~ gamma(0.001, 0.001);
     }

When I try to sample from my model I get

[595] “Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:”
[596] “Exception thrown at line 24: []: accessing element out of range. index 4 out of range; expecting index to be between 1 and 3; index position = 1lambda”
[597] “If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,”
[598] “but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.”
[599] “Rejecting initial value:”
[600] " Error evaluating the log probability at the initial value."
[601] "Initialization between (-2, 2) failed after 100 attempts. “
[602] " Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.”
[603] "Error in sampler$call_sampler(args_list[[i]]) : "
[1] “error occurred during calling the sampler; sampling not done”

I’ve done a lot of fiddling around with parameters and ranges and still cant seem to fix it. Any help is appreciated!


#2

Just looking at syntax stuff here:

for(i in 1:n){  
  for(j in 1:p){
    y[i,j] ~ poisson(mu[i,j]);
    lambda[i,j] ~ normal(0,tau);

The error:

[596] "Exception thrown at line 24: []: accessing element out of range. index 4 out of range; expecting index to be between 1 and 3; index position = 1lambda"

Is happening because i (the first index of lambda) goes from 1:n and j (the second index of lambda) goes from 1:p, but lambda is declared like:

matrix[p,n] lambda;

So it’s like the indices are backwards? (I’m assuming n and p are different)

Does this help?


#3

edit: nvm im so tired im saying stupid things now.

gonna give it a try thanks

2nd edit:

Sampling is running! Have to optimize a bit more but Im happy thats its sampling well. Thanks for picking out my error. Also did a bit of re-arranging of which lines are where. Will post final model once its fixed


#4

Thanks for your help. My final model seems to be sampling relatively accurately

Out of curiosity, is there a different way to handle a transformed variable with a log on the left side other than exponentiating?

  data{
    int<lower=1> n;     // number of doses
    int<lower=1> p;     // number of plates for each dose
    vector[n] x;        // doses
    int y[p,n];     // responses
  }
  
  
  parameters{
    matrix[p,n] lambda;
    real alpha;
    real beta;
    real gamma;
    real tau;
  }
  
  transformed parameters{
  matrix[p,n] mu;
  real sigma;
  for(i in 1:p){  
    for(j in 1:n){
  
      mu[i,j] = exp(alpha + beta * log(x[j] + 10) + gamma * x[j] + lambda[i,j]);
    }}
    sigma = 1/sqrt(tau);
  }
  
  model{
  alpha ~ normal(0,10000);
	beta ~ normal(0,10000);
	gamma ~ normal(0,100000);
	tau ~ gamma(0.001, 0.001);
  for(i in 1:p){  
    for(j in 1:n){
      lambda[i,j] ~ normal(0,tau);
    
      y[i,j] ~ poisson(mu[i,j]);
     }}
     }

#5

Good to hear things are at least running. I had a look at the model. I’m not a stats guru, so take what I say with a grain of salt, but here yah go:

For the log question, in general this isn’t the correct way to do a log normal (possibly need a Jacobian adjustment depending on the rest of the model, see section 20.3 and Chapter 33 in the manual for more info), but it is valid Stan syntax to say:

log(y) ~ normal(mu, sigma);

This little blurb isn’t in your model but it’s a valid log-on-the-left-side statement (from User-Defined Transforms in the Stan manual).

I think there’s a flaw in the original model that could also be affecting your question about the log stuff.

If you do:

transformed parameters{
  matrix[p,n] lmu;
  lmu = log(mu);
  for(i in 1:n){  
    for(j in 1:p){
  
      lmu[i,j] = alpha + beta * log(x[i] + 10) + gamma * x[i] + lambda[i,j];
    }
  }
}

You aren’t actually changing mu, and, from your second model, I suspect that’s what you were going for. If you set lmu = log(mu), and then set lmu = 0, you’re just overwriting the value in lmu – this does not effect the value of mu. Does that make sense?

Also, the priors here are seem to indicate a really wide range of parameters in your model:

alpha ~ normal(0,10000);
beta ~ normal(0,10000);
gamma ~ normal(0,100000);
tau ~ gamma(0.001, 0.001);

Usually folks try to get their parameters formulated/scaled in a way to avoid having huge things (alpha, beta, and gamma) and small things (tau, and hence lambda) mixed together. It can make the numerics way easier on Stan.

Hopefully that helps!


#6

Stan doesn’t support the log(x) <- y form you get in BUGS/JAGS if that’s what you’re asking—it’s far too limited an idiom for us to want to support it in Stan (e.g., it only works for monotonic invertible functions). In Stan, exp(x) is the only way to exponentiate a value.

You probably don’t want to carry over those terrible priors from BUGS/JAGS practice. We have both a Wiki page (on stan-dev/stan) and advice in the manual for better priors. Those fat priors don’t do what people think they do (i.e., convey little information)—they turn out to be very strong as Andrew Gelman’s been writing about for 10 years. Stan also doesn’t use conjugacy, so you can just parameterize a prior on the scale sigma directly. It’s much easier to understand what’s going on that way.

You can make this all run a lot faster by vectorizing, e.g.

mu[i,j] = exp(alpha + beta * log(x[j] + 10) + gamma * x[j] + lambda[i,j]);

can be

mu[i] = exp(alpha + beta * (log(x) + 10) + gamma * x + lambda[i];

The sampling for lambda can be

to_vector(lamba)  ~ normal(0, tau);

But you really might need to use the non-centered parameterization where you sample

to_vector(lambda_raw) ~ normal(0, 1);
lambda = lambda_raw * tau;

And then you want to vectorize the Poisson,

for (i in 1:p)
  y[i] ~ poisson(mu[i]);

You can also use the R syntax mu[i, ] rather than just mu[i], but the former is a bit faster.