Help implementing a Gamma model (rstan)

Hello. I am having a lot of trouble trying to get a Gamma model working with some data.

A while ago, I wrote myself a simple tutorial model to learn how gamma models worked (shown below).

data {
  int<lower=1> n;         // Number of observations
  vector[n] y;            // Response variable
  vector[n] x;            // Predictor variable
}

parameters {
  real a;         
  real b;                
  real<lower=0> shape;    
}

transformed parameters {
  vector[n] mu; 
  vector[n] beta; /
  for(i in 1:n){
    mu[i] = exp(a + b * x[i]); 
    beta[i] = shape / mu[i]; . 
  }
}

model {
  // Likelihood
  y ~ gamma(shape, beta); 
  // Priors
  a ~ normal(0, 2);
  b ~ normal(0, 2);
  shape ~ gamma(0.01, 0.01);
}

This model correctly recovers values from fake data, so I think it is working as intended.

N <- 100
x <- runif(N, -1, 1)
a <- 0.5
b <- 1.1
y_true <- exp(a + b * x)
shape <- 10
y <- rgamma(N, rate = shape / y_true, shape = shape)

fake_data_a <- data.frame(x=x,y=y)

So, I’ve been trying to run a model specified in the same way with some real data. I.e.

data {
  int<lower=1> n;       
  vector[n] G_value;            
}

parameters {
  real a;
  real<lower=0> shape;  
}

transformed parameters {
  vector[n] mu; 
  vector[n] beta; 
  for(i in 1:n){
    mu[i] = exp(a); 
    beta[i] = shape / mu[i]; 
  }
}

model {
  G_value ~ gamma(shape, beta); 
  a ~ normal(0,5);
  shape ~ gamma(0.01, 0.01);
}

However, no matter what I try the model never samples. Irrespective of the predictors that are included. Above I simplified it as much as possible. It always produces errors like this:

Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value

I’m sure it will be something quite obvious, but I can’t figure out what is going wrong. Does anybody have any suggestions?

For example, is there anything wrong with how I have setup either of my models?

Thanks in advance.

Hi. This is a simple model, hence stan shouldn’t have any issues with initialization. I would check that the data you are feeding is actually positive.

Also, for performance, most of the stan functions are optimized for vectorization. If data follows the same distribution, you can define the data as arrays and feed scalar values for the parameters. The optimized code, with the addition of checking for positive data would be

data {
  int<lower=1> n;       
  array[n] real<lower=0> G_value;            
}

parameters {
  real a;
  real<lower=0> shape;  
}

transformed parameters {
  real mu = exp(a); 
  real beta = shape / mu; 
}

model {
  G_value ~ gamma(shape, beta); 
  a ~ normal(0,5);
  shape ~ gamma(0.01, 0.01);
}
1 Like

Thank you very much for your reply. You were right, there were a few 0s in the data which was stopping the model running. Removing them let everything work as intended.

Thanks also for posting a vectorised version of the model. I’ve been able to expand it to include predictors without needing the for loop, which has sped things up a lot.