A modified Stan model estimates only after a laptop reboot

I have run into a mysterious phenomenon. I am using the development version of CmdStanR (0.4.0.9000) with CmdStan 2.27.0, on an M1 Mac running macOS Big Sur, with R 4.1 and a development version of RStudio.

This is the issue: the first time I compile and run the Stan model, everything works fine.

If I edit the Stan code even slightly, say modifying a single prior by increasing a standard deviation by 1 unit, and recompile and run, I get an error indicating “no chains finished successfully.”

But the oddest thing is that when I return the Stan code to its original state, I continue to get the same warning/failure message. However, if I reboot the MacBook Pro, I am able to successfully estimate either the modified Stan model or the original, whichever one I compile first.

Unless you think it is helpful, I will not provide a detailed explanation of the data and model, but I am providing all the R and Stan code for the simplest model. (There may be questions about why this model, but that is not relevant here - I am trying to understand why I need to reboot my laptop in order to successfully fit a new model.)


Here is the Stan model:

data {
  
  int<lower=0> N;
  int<lower=1, upper=2> a[N];    // intervention 1
  int<lower=1, upper=2> b[N];    // intervention 2
  int<lower=1, upper=4> ab[N];   // interaction 1 & 2
  vector[N] y;
  
}

parameters {
  
  vector[1] t_a_raw;
  vector[1] t_b_raw;
  vector[3] t_ab_raw;
  
  real<lower=0> sigma;
  
}

transformed parameters {
  
  // constrain parameters to sum to 0
  
  vector[2] t_a = append_row(t_a_raw, -t_a_raw);
  vector[2] t_b = append_row(t_b_raw, -t_b_raw);
  vector[4] t_ab = append_row(t_ab_raw, -sum(t_ab_raw));
  
  // yhat
  
  vector[N] yhat;

  for (i in 1:N){
    yhat[i] = t_a[a[i]] + t_b[b[i]] + t_ab[ab[i]];
  }
}


model {
  
  sigma ~ cauchy(0, 1);
  
  t_a ~ normal(0, 3);
  t_b ~ normal(0, 3);
  t_ab ~ normal(0, 3);

  y ~ normal(yhat, sigma);

}

And here is the R code which generates the data and calls Stan:

library(simstudy)
library(cmdstanr)
library(caret)
library(ggpubr)

b_1 <- c(-3, 3)
b_2 <- c(-1, 1)
b_12 <- matrix(c(-.5/3, -.5/3, -.5/3, .5), nrow = 2)

d1 <- defData(varname = "a", formula = ".5;.5", variance = "1;2", dist = "categorical")
d1 <- defData(d1, varname = "b", formula = ".5;.5", variance = "1;2", dist = "categorical")

d1_a <- defDataAdd(varname = "y", formula = "mu", variance = 4, dist = "normal")

set.seed(107)
dd <- genData(1000, d1)
dd[, mu := b_1[a] + b_2[b] + b_12[a, b], keyby = id]
dd <- addColumns(d1_a, dd)

dsum <- dd[, mean(y), keyby = .(a, b)]

p1 <- ggplot(data = dd, aes(x = factor(a), y = y)) +
  geom_jitter(height = 0, width = .1) +
  facet_grid(.~b) +
  theme(panel.grid = element_blank()) +
  ylim(-12, 12)

p2 <- ggplot(data = dsum, aes(x = factor(a), y = V1)) +
  geom_point(aes(color = factor(b)), size = 2) +
  geom_line(aes(group= factor(b), color = factor(b))) +
  theme(panel.grid = element_blank()) +
  ylim(-5, 5)

ggarrange(p1, p2)

summary(lm(y~factor(a)*factor(b), data = dd))

# Bayesian analysis

dt_to_list <- function(dx) {
  
  dx[, a_f := factor(a)]
  dx[, b_f := factor(b)]
  
  dv <- dummyVars(~ a_f:b_f , data = dx, n = c(2,2))
  dp <- predict(dv, dx )
  
  N <- nrow(dx)                               ## number of observations 
  a <- dx[, a]                                ## level of intervention 1 (1 or 2)
  b <- dx[, b]                                ## level of intervention 2 (1 or 2)
  ab <- as.vector(dp %*% c(1:4))              ## interaction cell index (1 to 4)
  
  y <- dx[, y]
  
  list(N=N, a=a, b=b, ab=ab, y=y)
}

mod <- cmdstan_model("Programs/two_factor.stan")

fit <- mod$sample(
  data = dt_to_list(dd),
  seed = 27261,
  refresh = 0,
  chains = 1L,
  parallel_chains = 1L,
  iter_warmup = 500,
  iter_sampling = 2500
)

fit$summary(variables = c("t_a", "t_b", "t_ab"))

Any insights would be much appreciated.

What happens if you force recompile without modifying the Stan file at all? Edited to add: what about if you call sample() again without recompiling?

Very good questions. If I reboot, compile and sample, and call sample() again without compiling, the model estimates just fine both times - even if I have generated a different data set for each sample. However, if I force recompile without modifying the Stan file at all (before rebooting), the estimation fails as if I have changed the Stan file. So, it appears that recompiling without rebooting is doing something funky.

Wow that’s weird. Just in case, try running cmdstanr::rebuild_cmdstan() in a fresh R session. If the problem persists, then a few more questions:

  1. What are the contents of your cmdstan’s make/local (assuming you’ve installed cmdstan via cmdstanr this file is probably on a path like /Users/YOUR_USER_NAME/.cmdstanr/cmdstan-2.27.0/make/local)?
  2. If you rename your .stan file and then compile the newly named model, can you sample that? More generally, after compiling one model, can you compile and sample any other models at all?
  3. When you run cmdstan_model(), a unix executable should appear in the same directory where your .stan model definition file is. If you delete this file, then recompile, can you sample?

I would also advise checking the temp folder. For example, if you manually remove all previously compiled files, does the recompilation succeed? Otherwise, you can use cmdstan from bash directly and check whether the problem is with cmdstanr or with cmdstan.

ps: just curious, when the chain fails, any messages in the terminal? Usually, the program gives the reason why it failed

Before trying to answer your questions, I rebuilt cmdstan.

(1) The file /Users/YOUR_USER_NAME/.cmdstanr/cmdstan-2.27.0/make/local exists, but it has no contents.

(2) If I rename the .stan file and compile the newly named model, I can indeed sample that. I also went ahead and created a much simpler regression model, and I am observing the same behavior.

(3) And yes, if I delete the executable file and recompile, I can indeed sample. (As a workaround, this is a little annoying, but is better than having to reboot every time I compile!)

Yes - it turns out that if I remove the previously compiled file, I can recompile and sample successfully. I would definitely like to see if I can get a more meaningful message indicating what the problem is - how would I run directly in cmdstan from the terminal?

Is there anything at all funny about the folder where your .stan file resides that might cause R not to have privileges to overwrite files there? Do you get the same problem if you save the .stan file to your desktop?

See here:

You can skip the part about installing cmdstan.

There doesn’t seem to be anything funny about that directory - it is on my local hard drive. I do have the same problem with the files on my Desktop.

Youch. Regardless of what happens with cmdstan run from terminal (which would probably be useful information), it’s time for me to pass you up the food chain @rok_cesnovar @jonah @ahartikainen

A workaround for now would be to add a line to cmdstan_model that deletes the existing executable before compiling. Let me know if you’d like help with that, but really it would be much better to get to the bottom of this!
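
A minimal sketch of that workaround, assuming the executable sits next to the .stan file and shares its name without the extension (which matches what was described earlier in this thread). The helper name remove_stale_exe is hypothetical, and the cmdstan_model() call is left commented out so the snippet runs even without CmdStan installed:

```r
# Hypothetical helper: delete a previously compiled executable so the next
# compile starts from scratch. Assumes the executable lives beside the .stan
# file and shares its base name (with ".exe" appended on Windows).
remove_stale_exe <- function(stan_file) {
  exe <- sub("\\.stan$", "", stan_file)
  if (.Platform$OS.type == "windows") {
    exe <- paste0(exe, ".exe")
  }
  if (file.exists(exe)) {
    file.remove(exe)
  }
  invisible(exe)
}

stan_file <- "Programs/two_factor.stan"
remove_stale_exe(stan_file)
# mod <- cmdstanr::cmdstan_model(stan_file)  # then compile as usual
```

Calling the helper right before cmdstan_model() keeps the workaround in one place, so it is easy to drop once the underlying cause is found.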

Yes - it would be great to get to the bottom of this, particularly since it just started happening with this recent project. I’ve not had this problem before, though admittedly my setup is constantly changing as I try to make things work with the M1 chip.

When I have a chance, I will see how things go executing directly in cmdstan. Thanks so much for your help, and I hope we can figure this out.

The first debugging step is to compile the model in bash and see if everything works.

Can you check which gcc to see if you have multiple compilers?

Here are the results of which gcc: /usr/bin/gcc. There appears to be a single compiler.

This might be interesting - I have two cmdstan directories: /Users/keith/cmdstan and /Users/keith/.cmdstanr/cmdstan-2.27.0.

When I run make examples/bernoulli/bernoulli in /Users/keith/.cmdstanr/cmdstan-2.27.0, everything is fine - no errors. But when I run make examples/bernoulli/bernoulli in /Users/keith/cmdstan, I get error messages during compilation.

What happens when you try to recompile and then sample the model from the .cmdstanr directory?

It seems to work fine when I recompile and then sample while in the .cmdstanr/cmdstan-2.27.0 directory.

Maybe you have outdated cmdstan in that user folder. Have you upgraded your laptop recently?

Also, you probably want to remove the cmdstan in that user folder or rebuild it (make clean-all etc.)

Yeah - I did update recently to cmdstan 2.27.0. I removed that old version - but, of course, the original issue still remains.

Instead of using the bernoulli example, I am using a simple regression example that I created and that works in the RStudio environment. However, when I put the Stan code and JSON data in a directory called /Users/keith/.cmdstanr/cmdstan-2.27.0/examples/ksg, things don’t work so well.

It appears that I can compile the file:

keith@180madphmlt052 cmdstan-2.27.0 % make examples/ksg/simple_regression                                                               
make: `examples/ksg/simple_regression' is up to date.
keith@180madphmlt052 cmdstan-2.27.0 % rm examples/ksg/simple_regression.hpp
keith@180madphmlt052 cmdstan-2.27.0 % rm examples/ksg/simple_regression    
keith@180madphmlt052 cmdstan-2.27.0 % make examples/ksg/simple_regression  

--- Translating Stan model to C++ code ---
bin/stanc  --o=examples/ksg/simple_regression.hpp examples/ksg/simple_regression.stan

--- Compiling, linking C++ code ---
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS         -c -include-pch stan/src/stan/model/model_header.hpp.gch -x c++ -o examples/ksg/simple_regression.o examples/ksg/simple_regression.hpp
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare -D_REENTRANT -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include   -O3 -I src -I stan/src -I lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.9 -I stan/lib/stan_math/lib/boost_1.75.0 -I stan/lib/stan_math/lib/sundials_5.7.0/include    -DBOOST_DISABLE_ASSERTS               -Wl,-L,"/Users/keith/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/keith/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"      examples/ksg/simple_regression.o src/cmdstan/main.o        -Wl,-L,"/Users/keith/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb" -Wl,-rpath,"/Users/keith/.cmdstanr/cmdstan-2.27.0/stan/lib/stan_math/lib/tbb"   stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_idas.a stan/lib/stan_math/lib/sundials_5.7.0/lib/libsundials_kinsol.a  stan/lib/stan_math/lib/tbb/libtbb.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc.dylib stan/lib/stan_math/lib/tbb/libtbbmalloc_proxy.dylib -o examples/ksg/simple_regression
rm -f examples/ksg/simple_regression.o

But, when I try to sample, things go awry:

keith@180madphmlt052 ksg % ./simple_regression sample data file=regression.data.json
method = sample (Default)
  sample
    num_samples = 1000 (Default)
    num_warmup = 1000 (Default)
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.80000000000000004 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        metric_file =  (Default)
        stepsize = 1 (Default)
        stepsize_jitter = 0 (Default)
id = 0 (Default)
data
  file = regression.data.json
init = 2 (Default)
random
  seed = 4208440880 (Default)
output
  file = output.csv (Default)
  diagnostic_file =  (Default)
  refresh = 100 (Default)
  sig_figs = -1 (Default)
  profile_file = profile.csv (Default)

Exception: mismatch in number dimensions declared and found in context; processing stage=data initialization; variable name=N; dims declared=(); dims found=(1) (in 'examples/ksg/simple_regression.stan', line 2, column 2 to column 17)

For completeness, here is the Stan code:

data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}

parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}


model {
  y ~ normal(alpha + beta*x, sigma);
}

generated quantities {
  real y_rep[N] = normal_rng(alpha + beta*x, sigma);
}

And here is the JSON data file:

{
  "N": [10],
  "x": [0.2605, 0.9917, 0.831, 0.2237, 0.4458, 0.1094, 0.565, 0.1395, 0.7674, 0.3644],
  "y": [8.173, 14.7943, 5.9667, -2.2663, 1.2737, 1.6466, 7.1887, 3.8055, 4.7056, -0.598]
}

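Your data file has an invalid format: N should not be a list. The model declares N as a scalar int, so the JSON needs "N": 10 rather than the one-element array "N": [10]. That is what the exception is reporting (dims declared=(); dims found=(1)).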
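
For illustration, here is a sketch of how the same data could be written from R so that N ends up as a scalar. It assumes cmdstanr's write_stan_json() is available, and the output file name is simply the one used in the terminal session above. The call is guarded so the snippet still runs without cmdstanr installed:

```r
# Data for simple_regression.stan. N is a single integer, because the Stan
# program declares `int<lower=0> N;` (a scalar, not an array).
x <- c(0.2605, 0.9917, 0.831, 0.2237, 0.4458, 0.1094, 0.565, 0.1395, 0.7674, 0.3644)
y <- c(8.173, 14.7943, 5.9667, -2.2663, 1.2737, 1.6466, 7.1887, 3.8055, 4.7056, -0.598)
stan_data <- list(N = length(x), x = x, y = y)

# write_stan_json() writes length-one vectors as JSON scalars, so the file
# should contain "N": 10 rather than "N": [10].
if (requireNamespace("cmdstanr", quietly = TRUE)) {
  cmdstanr::write_stan_json(stan_data, file = "regression.data.json")
}
```

Deriving N from length(x) rather than typing it by hand also keeps the count and the vectors from drifting apart.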
