Difficulty modelling a piece-wise normal likelihood

It seems like HMC struggles to recover the parameters of a piece-wise density, even one as simple as the one below.

f(x)=\begin{cases} \mathcal{N}(0,\,1), & x\le 0\\ \mathcal{N}(0,\,0.25), & x>0 \end{cases}
# Simulate from the piece-wise density: keep the non-positive draws
# from N(0, 1) and the positive draws from N(0, 0.25)
n1 <- rnorm(1000)
n2 <- rnorm(1000, 0, 0.25)
n1 <- n1[n1 <= 0]
n2 <- n2[n2 > 0]
y_n <- c(n1, n2)
test_data1 <- within(list(), {
  N <- length(y_n)
  y <- y_n
})
mod_bqrnn <- stan_model("piece_wise.stan")
initf1 <- function() list(mu = c(0, 0), sigma = c(1, 0.25))
test_fit <- sampling(mod_bqrnn, data = test_data1, init = initf1,
                     algorithm = "HMC", refresh = 0)
data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  // Naive piece-wise likelihood: no truncation correction
  for (i in 1:N) {
    if (y[i] <= 0) {
      y[i] ~ normal(mu[1], sigma[1]);
    } else {
      y[i] ~ normal(mu[2], sigma[2]);
    }
  }
}
Posterior means:

      mu[1]       mu[2]    sigma[1]    sigma[2] 
 -0.7077208   0.2022929   0.5782205   0.1481777 

Is there some way to improve its performance?

The piece-wise density effectively creates two truncated distributions. If you tell Stan that you have truncated distributions, you can recover the parameters.
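Conditioning each component on its sign renormalizes it into a truncated normal; the negative branch, for instance, has density

f(y \mid y\le 0)=\frac{\frac{1}{\sigma_1}\,\phi\!\left(\frac{y-\mu_1}{\sigma_1}\right)}{\Phi\!\left(\frac{0-\mu_1}{\sigma_1}\right)}\ ,\quad y\le 0

The T[, 0] and T[0, ] statements in the model below add exactly this CDF normalization term to the log density.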

data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  for (i in 1:N) {
    if (y[i] <= 0) {
      y[i] ~ normal(mu[1], sigma[1]) T[, 0];  // upper-truncated at 0
    } else {
      y[i] ~ normal(mu[2], sigma[2]) T[0, ];  // lower-truncated at 0
    }
  }
}
Inference for Stan model: test.
4 chains, each with iter=500; warmup=250; thin=1; 
post-warmup draws per chain=250, total post-warmup draws=1000.

           mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
mu[1]     -0.02    0.01 0.22  -0.34  -0.17  -0.05   0.09   0.49   258 1.00
mu[2]     -0.02    0.00 0.06  -0.18  -0.06  -0.01   0.02   0.07   326 1.01
sigma[1]   1.01    0.01 0.09   0.86   0.95   1.00   1.06   1.24   251 1.01
sigma[2]   0.25    0.00 0.03   0.21   0.23   0.25   0.26   0.31   334 1.01
lp__     870.62    0.11 1.57 866.71 869.86 871.04 871.77 872.45   209 1.01


Thank you very much! How long did it take you to run this code, though? With the truncation present, the sampling suddenly became unbelievably inefficient. Maybe I should first split the vector at the truncation knot and vectorize the two parts?

  • Warmup took about 13 seconds for 250 iterations, and sampling took 2 seconds for 250 iterations. I think in this example you can get away with 250 warmup iterations (perhaps because of the good initialisation) and increase the number of sampling iterations depending on how much precision you need.
  • Splitting and vectorizing would be a good idea, but I am not sure that you can vectorize the T operator; see the sketch after this list.
  • I think you can declare the split data with bounds, e.g. vector<upper=0>[Nneg] yneg, and then write yneg ~ normal(mu[1], sigma[1]) as an alternative.
    EDIT:
  • There are also no priors. I think this will run much smoother with some reasonable priors.
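For what it's worth, here is a minimal sketch of the split-and-vectorize idea, with some weakly informative priors added as suggested above. It assumes the data are split at the knot on the R side (e.g. yneg <- y_n[y_n <= 0]; ypos <- y_n[y_n > 0]) and passed in as Nneg, Npos, yneg, ypos; those names are mine, not from the models above. Since the truncation point is a constant, each CDF normalization term can be added once per branch with target += rather than once per observation, which avoids the per-observation T[...] overhead:

data {
  int<lower=0> Nneg;            // number of observations <= 0
  int<lower=0> Npos;            // number of observations > 0
  vector<upper=0>[Nneg] yneg;   // non-positive part of y
  vector<lower=0>[Npos] ypos;   // positive part of y
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  // Weakly informative priors (an assumption; tune to your problem)
  mu ~ normal(0, 5);
  sigma ~ normal(0, 2.5);

  // Vectorized truncated likelihoods: the normalizing constants
  // normal_lcdf / normal_lccdf at the fixed knot 0 are added once
  // per branch instead of once per observation
  yneg ~ normal(mu[1], sigma[1]);
  target += -Nneg * normal_lcdf(0 | mu[1], sigma[1]);
  ypos ~ normal(mu[2], sigma[2]);
  target += -Npos * normal_lccdf(0 | mu[2], sigma[2]);
}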

I am experiencing a really strange problem. With the same code and more sampling iterations, I expected a better estimate, but I am getting fairly biased estimates. Did I code anything wrong?

n1 <- rnorm(1000)
n2 <- rnorm(1000, 0, 0.25)
n1 <- n1[n1 <= 0]
n2 <- n2[n2 > 0]
y_n <- c(n1, n2)
test_data1 <- within(list(), {
  N <- length(y_n)
  y <- y_n
})
mod_bqrnn <- stan_model("piece_wise.stan")
initf1 <- function() list(mu = c(0, 0), sigma = c(1, 0.25))
test_fit <- sampling(mod_bqrnn, data = test_data1, init = initf1,
                     iter = 1000, warmup = 250, algorithm = "HMC")
data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  for (i in 1:N) {
    if (y[i] <= 0) {
      y[i] ~ normal(mu[1], sigma[1]) T[, 0];
    } else {
      y[i] ~ normal(mu[2], sigma[2]) T[0, ];
    }
  }
}
Per-chain summary (chain 4):

parameter          mean         sd         2.5%          25%          50%          75%       97.5%
  mu[1]      0.21077200 0.37046433  -0.22628419  -0.02341318   0.14445668   0.31524014   1.1784420
  mu[2]      0.04119068 0.03731571  -0.04084605   0.01659133   0.04628801   0.06784184   0.1003024
  sigma[1]   1.05327782 0.12909491   0.87502752   0.96603528   1.03307975   1.10134776   1.3910572
  sigma[2]   0.21777376 0.01746593   0.18835739   0.20520857   0.21709193   0.22866561   0.2549169

I did some further experimentation, and warmup gets stuck pretty easily. I only realised after your latest response that you are using algorithm = "HMC". I think the sampling problems arise because plain vanilla static HMC is just less robust than NUTS. If you use the default NUTS sampler, you should not experience any problems.
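For example (a sketch based on the call above), simply dropping the algorithm argument makes rstan fall back to its default NUTS sampler:

# Same call as before, but with the default sampler (NUTS)
test_fit <- sampling(mod_bqrnn, data = test_data1, init = initf1,
                     iter = 1000, warmup = 250)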

True. Thank you for all the help!