Reciprocal truncated normal distribution seems to work well but the SBC hists are terrible

Hi all,
I’m fitting a truncated reciprocal distribution to simulated data. It’s a weird distributions, but it has been argued that it’s useful for reaction times and I anyways want to use it for a tutorial. It requires a truncation and change of variables (Jacobian). They are fairly simple, but maybe I have silly mistake there?

library(cmdstanr)
library(SBC)
library(extrDistr)
## Generate data from a truncated reciprocal normal dist
N <- 40
mu <- .002
sigma <- .0004
rt <- 1/rtnorm(N, mu, sigma, a = 0)

# code to fit the distribution
recnormal_code <- '
data {
  int<lower = 1> N;
  vector[N] RT;
}
parameters {
  real mu_s;
  real<lower = 0> sigma_s;
}
transformed parameters {
  //aux pars so that it samples from pars that are not that small
  real mu = mu_s / 1000;
  real sigma = sigma_s / 1000;
}
model {
  target += normal_lpdf(mu_s | 2, 2);
  target += normal_lpdf(sigma_s | 4, 2)
            - normal_lccdf(0 | 4, 2);
  target += normal_lpdf(1 ./ RT | mu, sigma)
     - N * normal_lccdf(0 | mu, sigma)
    - sum(2 * log(RT));
}
'

The thing is that I recover true values quite ok, see below:

cmdstan_model <- cmdstanr::cmdstan_model(stan_file = write_stan_file(recnormal_code) )
fit_rec <- cmdstan_model$sample(
  data = list(RT= rt, N= N),
  seed = 123,
  chains = 4,
  parallel_chains = 4,
  refresh = 500
)
fit_rec$summary(variables = c("mu_s", "sigma_s"))
## # A tibble: 2 × 10
##   variable  mean median     sd    mad    q5   q95  rhat ess_bulk ess_tail
##   <chr>    <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
## 1 mu_s     2.04   2.04  0.0564 0.0555 1.95  2.13   1.00    3124.    2653.
## 2 sigma_s  0.350  0.346 0.0422 0.0409 0.289 0.427  1.00    2998.    2475.
fit_rec$summary(variables = c("mu", "sigma"))
# A tibble: 2 × 10
##   variable     mean   median        sd       mad       q5      q95  rhat ess_bulk ess_tail
##   <chr>       <dbl>    <dbl>     <dbl>     <dbl>    <dbl>    <dbl> <dbl>    <dbl>    <dbl>
## 1 mu       0.00204  0.00204  0.0000564 0.0000555 0.00195  0.00213   1.00    3124.    2653.
## 2 sigma    0.000350 0.000346 0.0000422 0.0000409 0.000289 0.000427  1.00    2998.    2475.

Here I use the new package SBC. And the results are quite bad. I tried to do SBC by hand and I also get horrible results:


# gen function:
recnormal_gen_single <- function(N){  # N is the number of data points we are generating
  mu_s <- rnorm(1, 2, 2)
  sigma_s <- rtnorm(1, 4, 2, a = 0)
  mu <- mu_s / 1000
  sigma <- sigma_s / 1000
  RT <- rtnorm(N, mu, sigma, a = 0)
  list(
    variables = list(
      mu_s = mu_s,
      sigma_s = sigma_s),
    generated = list(
      N = N,
      RT = RT)
  )
}

set.seed(54882235)
n_sims <- 100  # Number of SBC iterations to run

recnormal_generator <- SBC_generator_function(recnormal_gen_single, N = 40)
recnormal_dataset <- generate_datasets(
  recnormal_generator,
  n_sims)
cmdstan_model <- cmdstanr::cmdstan_model(recnormal_rt)
recnormal_backend <- SBC_backend_cmdstan_sample(
  cmdstan_model, iter_warmup = 1000, iter_sampling = 1000, chains = 2)

results <- compute_SBC(recnormal_dataset, recnormal_backend)

plot_rank_hist(results)

I attach the plot

Any ideas of what’s going on?

Best,
Bruno

2 Likes

Maybe the bug is not in the Stan code, but in the generator?

Unless I am missing something, the generator code differs between the “one simple generated dataset”, where you have rt <- 1/rtnorm(N, mu, sigma, a = 0) while in the generator for SBC you have just RT <- rtnorm(N, mu, sigma, a = 0), i.e. the reciprocal is missing.

1 Like

I can’t believe how I had such a dumb mistake.
I checked everything, even if the SBC algorithm and plots were working properly!

2 Likes

No worries , this can happen to anybody.

1 Like