Difficulty modelling a piece-wise normal likelihood

It seems like HMC struggles to recover the parameters of a piece-wise density, even one as simple as the one below.

f(x)=\begin{cases} \mathcal{N}(0,\,1), & x\le 0\\ \mathcal{N}(0,\,0.25), & x>0 \end{cases}
# Simulate from the piece-wise density: keep the non-positive draws
# from N(0, 1) and the positive draws from N(0, 0.25)
n1 <- rnorm(1000)
n2 <- rnorm(1000, 0, 0.25)
n1 <- n1[n1 <= 0]
n2 <- n2[n2 > 0]
y_n <- c(n1, n2)
test_data1 <- within(list(), {
  N <- length(y_n)
  y <- y_n
})
mod_bqrnn <- stan_model("piece_wise.stan")
initf1 <- function() list(mu = c(0, 0), sigma = c(1, 0.25))
test_fit <- sampling(mod_bqrnn, data = test_data1, init = initf1,
                     algorithm = "HMC", refresh = 0)
data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  // Naive piece-wise likelihood: no truncation correction
  for (i in 1:N) {
    if (y[i] <= 0) {
      y[i] ~ normal(mu[1], sigma[1]);
    } else {
      y[i] ~ normal(mu[2], sigma[2]);
    }
  }
}
Posterior means:

      mu[1]       mu[2]    sigma[1]    sigma[2] 
 -0.7077208   0.2022929   0.5782205   0.1481777 

Is there some way to improve its performance?

The piece-wise density effectively creates two truncated distributions. If you tell Stan that you have truncated distributions, you can recover the parameters.
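Conditioning each component on its sign renormalizes it into a truncated normal; the negative branch, for instance, has density

f(y \mid y\le 0)=\frac{\frac{1}{\sigma_1}\,\phi\!\left(\frac{y-\mu_1}{\sigma_1}\right)}{\Phi\!\left(\frac{0-\mu_1}{\sigma_1}\right)}\ ,\quad y\le 0

The T[, 0] and T[0, ] statements in the model below add exactly this CDF normalization term to the log density.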

data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  for (i in 1:N) {
    if (y[i] <= 0) {
      y[i] ~ normal(mu[1], sigma[1]) T[, 0];  // upper-truncated at 0
    } else {
      y[i] ~ normal(mu[2], sigma[2]) T[0, ];  // lower-truncated at 0
    }
  }
}
Inference for Stan model: test.
4 chains, each with iter=500; warmup=250; thin=1; 
post-warmup draws per chain=250, total post-warmup draws=1000.

           mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
mu[1]     -0.02    0.01 0.22  -0.34  -0.17  -0.05   0.09   0.49   258 1.00
mu[2]     -0.02    0.00 0.06  -0.18  -0.06  -0.01   0.02   0.07   326 1.01
sigma[1]   1.01    0.01 0.09   0.86   0.95   1.00   1.06   1.24   251 1.01
sigma[2]   0.25    0.00 0.03   0.21   0.23   0.25   0.26   0.31   334 1.01
lp__     870.62    0.11 1.57 866.71 869.86 871.04 871.77 872.45   209 1.01


Thank you very much! How long did it take you to run this code, though? With the truncation present, the sampling suddenly became unbelievably inefficient. Maybe I should first split the vector at the truncation knot and vectorize the two parts?

  • Warmup took about 13 seconds for 250 iterations, and sampling took 2 seconds for 250 iterations. I think in this example you can get away with 250 warmup iterations (perhaps because of the good initialisation) and increase the number of sampling iterations depending on how much precision you need.
  • Splitting and vectorizing would be a good idea, but I am not sure that you can vectorize the T operator; see the sketch after this list.
  • I think you can declare the split data with bounds, e.g. vector<upper=0>[Nneg] yneg, and then write yneg ~ normal(mu[1], sigma[1]) as an alternative.
    EDIT:
  • There are also no priors. I think this will run much smoother with some reasonable priors.
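For what it's worth, here is a minimal sketch of the split-and-vectorize idea, with some weakly informative priors added as suggested above. It assumes the data are split at the knot on the R side (e.g. yneg <- y_n[y_n <= 0]; ypos <- y_n[y_n > 0]) and passed in as Nneg, Npos, yneg, ypos; those names are mine, not from the models above. Since the truncation point is a constant, each CDF normalization term can be added once per branch with target += rather than once per observation, which avoids the per-observation T[...] overhead:

data {
  int<lower=0> Nneg;            // number of observations <= 0
  int<lower=0> Npos;            // number of observations > 0
  vector<upper=0>[Nneg] yneg;   // non-positive part of y
  vector<lower=0>[Npos] ypos;   // positive part of y
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  // Weakly informative priors (an assumption; tune to your problem)
  mu ~ normal(0, 5);
  sigma ~ normal(0, 2.5);

  // Vectorized truncated likelihoods: the normalizing constants
  // normal_lcdf / normal_lccdf at the fixed knot 0 are added once
  // per branch instead of once per observation
  yneg ~ normal(mu[1], sigma[1]);
  target += -Nneg * normal_lcdf(0 | mu[1], sigma[1]);
  ypos ~ normal(mu[2], sigma[2]);
  target += -Npos * normal_lccdf(0 | mu[2], sigma[2]);
}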

I am experiencing a really strange problem. With the same code and more sampling iterations, I expected a better estimate, but I am getting fairly biased estimates. Did I code anything wrong?

n1 <- rnorm(1000)
n2 <- rnorm(1000, 0, 0.25)
n1 <- n1[n1 <= 0]
n2 <- n2[n2 > 0]
y_n <- c(n1, n2)
test_data1 <- within(list(), {
  N <- length(y_n)
  y <- y_n
})
mod_bqrnn <- stan_model("piece_wise.stan")
initf1 <- function() list(mu = c(0, 0), sigma = c(1, 0.25))
test_fit <- sampling(mod_bqrnn, data = test_data1, init = initf1,
                     iter = 1000, warmup = 250, algorithm = "HMC")
data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  vector[2] mu;
  vector<lower=0>[2] sigma;
}

model {
  for (i in 1:N) {
    if (y[i] <= 0) {
      y[i] ~ normal(mu[1], sigma[1]) T[, 0];
    } else {
      y[i] ~ normal(mu[2], sigma[2]) T[0, ];
    }
  }
}
Per-chain summary (chain 4):

parameter          mean         sd         2.5%          25%          50%          75%       97.5%
  mu[1]      0.21077200 0.37046433  -0.22628419  -0.02341318   0.14445668   0.31524014   1.1784420
  mu[2]      0.04119068 0.03731571  -0.04084605   0.01659133   0.04628801   0.06784184   0.1003024
  sigma[1]   1.05327782 0.12909491   0.87502752   0.96603528   1.03307975   1.10134776   1.3910572
  sigma[2]   0.21777376 0.01746593   0.18835739   0.20520857   0.21709193   0.22866561   0.2549169

I did some further experimentation, and warmup gets stuck pretty easily. I only realised after your latest response that you are using algorithm = "HMC". I think the sampling problems arise because plain vanilla static HMC is just less robust than NUTS. If you use the default NUTS sampler, you should not experience any problems.
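For example (a sketch based on the call above), simply dropping the algorithm argument makes rstan fall back to its default NUTS sampler:

# Same call as before, but with the default sampler (NUTS)
test_fit <- sampling(mod_bqrnn, data = test_data1, init = initf1,
                     iter = 1000, warmup = 250)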

True. Thank you for all the help!