Adding a seemingly simple constraint is breaking my model

Hi all,

I’m trying to fit a basic version of a more complex model and am running into trouble constraining a (pair of) parameter(s) in the model. The R script I’m using is:

library(rstan)
library(ggplot2)

true_fun = function(a_1, a_2, tau, gamma, x){
    lambda = numeric(length(x))
    lambda[x <= gamma] = exp(a_1 * (gamma - x[x <= gamma]) ^ 2 + tau)
    lambda[x > gamma] = exp(a_2 * (x[x > gamma] - gamma) ^ 2 + tau)
    y = rpois(length(x), lambda)
}

a_1 = -0.05
a_2 = -0.001
gamma = 15
tau = 3

set.seed(1)
df = c()
n_obs = 5
for(i in seq_len(n_obs)){
    x = 1:100
    y = true_fun(a_1, a_2, tau, gamma, x)
    df = rbind(df, data.frame(x = x, y = y, obs = rep(i, length(x))))
}

ggplot(df, aes(x = x, y = y, color = as.factor(obs))) + geom_point()

stan_d <- list(y = df$y, 
               x = df$x,
               n = nrow(df))

set.seed(2)
minit1 <- stan("Code/model.stan",
              data = stan_d, 
              chains = 1, 
              iter = 2)

set.seed(3)
mfit1 <- stan(fit = minit1, 
             iter = 1250,
             warmup = 250,
             data = stan_d, 
             chains = 4, 
             cores = 4)

plot(mfit1, pars = c('alpha', 'tau', 'gamma'))
traceplot(mfit1, pars = c('alpha', 'tau', 'gamma'))

where model.stan is:

data {
  int<lower = 1> n;      
  vector[n] x;
  int<lower = 0> y[n];
}

parameters {
  real alpha[2];
  real<lower = 0> tau;
  real<lower = 0> gamma;
} 

transformed parameters {
  real<lower = 0> mu[n];
  
  vector[n] ID; // indicator variable for whether x_i < gamma
  for(i in 1:n){
    if (x[i] <= gamma) {
      ID[i] = 1;
    } else {
      ID[i] = 0;
    }
    mu[i] =  exp((alpha[1] * (gamma - x[i]) ^ 2 + tau) * ID[i] + (alpha[2] * (x[i] - gamma) ^ 2 + tau) * (1 - ID[i]));
  }

}

model {
  alpha ~ normal(0, 1);
  gamma ~ uniform(2, 99); //fixed to 2nd and (n-1)th (ordered) observations of x
  tau ~ normal(0, 1);
  y ~ poisson(mu);
}

The example data is being simulated under the assumed data generating mechanism. For this particular model, it only makes sense for alpha to be negative. Unfortunately, when I try to constrain it either via

  real<upper = 0> alpha[2];

or

  real<lower = 0> alpha[2]; 

and changing

  mu[i] =  exp((-alpha[1] * (gamma - x[i]) ^ 2 + tau) * ID[i] + (-alpha[2] * (x[i] - gamma) ^ 2 + tau) * (1 - ID[i]));

I’m struck with the “Initialization between (-2, 2) failed after 100 attempts.” error. Additionally, adding an appropriate init argument does not resolve the issue (as expected). If you actually fit the model with unconstrained alpha per the code above and examine the traceplots, 3 of the chains appear to converge quickly to the expected values, while the fourth chain is doomed to non-sense due to positive values for alpha. I don’t quite understand why adding such a seemingly simple constraint to this model is completely breaking it. Any input would be greatly appreciated!

Thank you!

EDIT: (Steve) fixed code formatting

Hello

I am wondering if the issue might not be with the prior on alpha which is symmetrical around 0. So it seems to be in a conflict with the sign contraint on alpha. Maybe you can use a lognormal distribution for alpha instead ? (don’t know if that make sense here)

You have a list of supported positive distribution here: 16 Positive Continuous Distributions | Stan Functions Reference

Hi Edvon,

There are a couple of Stan programming issues that I am pretty sure you can improve but I also think that you cannot specify the model in the way you are trying to set it up.

For the programming:

  • If you use gamma ~ uniform(2, 99), you need to declare real<lower=2, upper=99> gamma in the parameter block
  • For numerical stability you can use y ~ poisson_log(log_mu) where log_mu[i] = (alpha[1] * (gamma - x[i]) ^ 2 + tau) * ID[i] + (alpha[2] * (x[i] - gamma) ^ 2 + tau) * (1 - ID[i])

Conceptually, I think what you have is very similar to a change point model which has it’s own section in the Stan User Guide: 7.2 Change point models | Stan User’s Guide. The naive implementation requires a discrete parameter, in this case ID but Stan does not accept discrete parameters. (You sort of get around it programming wise but I believe it still messes up the HMC algorithm). The Stan implementation requires marginalization.

You can simplify the specification of mu a bit and than I think the link with a change point model is clearer. At least it was for me. The code below will not work, it’s just to illustrate the simplification. The marginalization is more complicated than this.

if (x[i] <= gamma) {
   log_mu[i] = alpha[1] * (gamma - x[i])^2 + tau;
} else {
   log_mu[i] = alpha[2] * (gamma - x[i])^2 + tau;
}
   
2 Likes

fyi doing

real<lower=0> alpha;
// ...
alpha ~ normal(0, 1);

Will give a positive half-normal prior

1 Like