Error running Stan model with rstan 2.21 in Windows

Hi!

I recently upgraded to R 4.0.2 with the help of this post:

Everything seemed to be working fine. Then earlier today I upgraded rstan to 2.21. I ran into a problem with Error in compileCode(...) which is described in the topic below, and I was able to solve that using @bgoodri’s solution.

Now when I run a couple of specific Stan models that worked in rstan 2.19, I get errors as soon as I start sampling. These models are several variations on a hidden Markov model. The error I get is as follows:

Error in unserialize(socklist[[n]]) : error reading from connection
Error in serialize(data, node$con, xdr = FALSE) : error writing to connection

When I try to sample from the same models with cores = 1, R crashes instantly (R encountered a fatal error - the session was terminated). This problem seems to be model-specific: if I run the 8 Schools model, or just example(stan_model, package = "rstan", run.dontrun = TRUE), then I get warnings but no errors, and sampling proceeds as expected.

Do you have any advice on what I could try to get this working again?

I would try to run it with cores = 1 using Rterm rather than RStudio or the R GUI. Also, you should make note of what is tempdir() before running the code since there may be more information about the problem written to one of the .txt files in that directory.
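
A minimal sketch of that check, assuming any diagnostic files carry a .txt extension and sit directly in the session's temporary directory. The helper name list_temp_logs is hypothetical and not part of rstan:

```r
# Hypothetical helper (not part of rstan): list .txt files in a directory,
# defaulting to this session's temporary directory.
list_temp_logs <- function(dir = tempdir()) {
  list.files(dir, pattern = "\\.txt$", full.names = TRUE)
}

# Note the directory before sampling: a crash ends the session,
# and a new session gets a different tempdir().
message("Temporary directory: ", tempdir())
print(list_temp_logs())
```

Running list_temp_logs() on the noted path from a fresh session after the crash shows whether anything was written there.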

Hi Ben, thanks for these suggestions. Both in R GUI and Rterm I have the same problem. As soon as I start sampling with cores = 1 R crashes and exits. There are no .txt files in the tempdir().
Running with cores = 4 gives the error mentioned above. Then the sampler just hangs (i.e. the terminal freezes and doesn’t return control to me).

By the way, compilation proceeds without errors but gives some warnings:

1: In system2(CXX, args = ARGS) : '""' not found
2: In file.remove(c(unprocessed, processed)) :
  cannot remove file 'C:\Users\mista\AppData\Local\Temp\Rtmpc5ViUv\file39cc16c1a40.stan', reason 'No such file or directory'

Not sure if that is useful information. Is there anything else I could try?

Can you post a link to the Stan code?

This is the Stan code. I’ll also include an R script to generate fake data and reproduce the problem (although you’ll have to change the path to the Stan file). I know it’s a lot of code - hopefully it’s still possible to find something useful.

Thanks for your help!

// Gaussian Hidden Markov Model

data {
  int<lower=1> T; // number of periods
  int<lower=1> N_states; // number of potential states
  vector[T] y; // observations
}

transformed data {
  vector<lower=0>[N_states] theta_prior = rep_vector(5.0, N_states); // prior for transition probs
}

parameters {
  // transition parameters
  simplex[N_states] theta[N_states]; // transition probabilities
  // theta[r, s] is the probability of a transition from state r to state s
  
  // emission parameters (observation model)
  vector[N_states] mu;
  positive_ordered[N_states] sigma; // ordered to break multi-modality
}

model {
  vector[N_states] alpha[T]; // posterior state probabilities:
  // alpha[t] is the vector of log unnormalized joint probabilities of 
  // the states at time t and the observations UP TO time t
  // so alpha[t, n] = log P(states[t] == n AND observations[1:t] == y[1:t])
  
  // priors
  // priors for emission parameters
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);
  
  // initial state probability
  
  for (n in 1:N_states) {
    alpha[1, n] = normal_lpdf(y[1] | mu[n], sigma[n]);
  }
  
  // transition probability priors
  for (n in 1:N_states) {
    theta[n] ~ dirichlet(theta_prior);
  }
  
  //likelihood
  // marginalize state probabilities
  {
    real accum[N_states];
    for (t in 2:T) {
      for (s in 1:N_states) {
        for (r in 1:N_states) {
          accum[r] = alpha[t - 1, r] + log(theta[r, s]);
        }
        alpha[t, s] = normal_lpdf(y[t] | mu[s], sigma[s]) + log_sum_exp(accum);
      }
    }
    
  }
  
  target += log_sum_exp(alpha[T]);
  
}

generated quantities {
  vector[N_states] naive_probs[T]; // probabilities only from observations, not from time-dependence
  vector[N_states] forward_probs[T]; // alpha in 'standard notation'
  vector[N_states] backward_probs[T]; // backward probabilities; beta in 'standard notation'
  // backward_probs[t, s] is the probability
  // of state s given all observations (t+1):T
  vector[N_states] smoothed_probs[T]; // product of alpha and beta in 'standard notation'
  int<lower=1, upper=N_states> best_path[T]; // Viterbi path - highest probability path
  real logp_best_path; // log probability of best path
  
  for (t in 1:T) {
    for (n in 1:N_states) {
      naive_probs[t, n] = normal_lpdf(y[t] | mu[n], sigma[n]);
    }
    naive_probs[t] -= log_sum_exp(naive_probs[t]);
  }
  
  
  // inferred state probabilities using forward algorithm
  for (n in 1:N_states) {
    forward_probs[1, n] = normal_lpdf(y[1] | mu[n], sigma[n]);
  }
  
  forward_probs[1] -= log_sum_exp(forward_probs[1]); // normalize immediately
  
  {
    real accum[N_states];
    for (t in 2:T) {
      for (s in 1:N_states) {
        for (r in 1:N_states) {
          accum[r] = forward_probs[t - 1, r] + log(theta[r, s]);
        }
        forward_probs[t, s] = normal_lpdf(y[t] | mu[s], sigma[s]) + log_sum_exp(accum);
      }
      forward_probs[t] -= log_sum_exp(forward_probs[t]); // normalize after every step
    }
  }
  
  
  // forward-backward algorithm
  {
    real accum[N_states];
    
    backward_probs[T] = rep_vector(- log(N_states), N_states); // initialization
    
    for (t in 1:(T - 1)) {
      for (s in 1:N_states) {
        for (r in 1:N_states) {
          accum[r] = backward_probs[T - t + 1, r] + log(theta[s, r]) + normal_lpdf(y[T - t + 1] | mu[r], sigma[r]);
        }
        backward_probs[T - t, s] = log_sum_exp(accum);
      }
      backward_probs[T - t + 1] -= log_sum_exp(backward_probs[T - t + 1]); // normalize after every step
    }
    backward_probs[1] -= log_sum_exp(backward_probs[1]);
    
    for (t in 1:T) {
        smoothed_probs[t] = forward_probs[t] + backward_probs[t];
        smoothed_probs[t] -= log_sum_exp(smoothed_probs[t]);
    }
    
  }
  
  //viterbi i.e. max-sum algorithm
  {
    vector[N_states] omega[T]; // omega[t, i] is the log probability of states up to t - 1,
    // state t being equal to i, and observations y[1:t]
    int backtrack_states[T, N_states]; // keeping track of best path so far
    // backtrack_states[t, i] is the state at time t - 1 that is part of the highest probability
    // path to state i at time t.
    
    for (i in 1:N_states) {
      omega[1, i] = normal_lpdf(y[1] | mu[i], sigma[i]);
    }
    
    omega[1] -= log_sum_exp(omega[1]);
    
    for (t in 2:T) {
      for (i in 1:N_states) {
        omega[t, i] = negative_infinity();
        
        for (j in 1:N_states) {
          real logp;
          logp = omega[t - 1, j] + log(theta[j, i]) + normal_lpdf(y[t] | mu[i], sigma[i]);
          
          if (logp > omega[t, i]) {
            backtrack_states[t, i] = j;
            omega[t, i] = logp;
          }
        }
      }
      omega[t] -= log_sum_exp(omega[t]);
    }
    
    logp_best_path = max(omega[T]);
    
    for (n in 1:N_states){
      if (omega[T, n] == logp_best_path) {
        best_path[T] = n;
      }
    }
    for (t in 1:(T - 1)) {
      best_path[T - t] = backtrack_states[T - t + 1, best_path[T - t + 1]];
    }
    
  }
  
  
  // finally, transform to probabilities
  naive_probs = exp(naive_probs);
  forward_probs = exp(forward_probs);
  backward_probs = exp(backward_probs);
  smoothed_probs = exp(smoothed_probs);
  
}

R script

library(rstan)

transition_process <- function(current_state,
                               possible_states = 1:2,
                               transition_probs = matrix(0.5,
                                                         nrow = length(possible_states),
                                                         ncol = length(possible_states))) {
  
  stopifnot(nrow(transition_probs) == length(possible_states))
  
  stopifnot(nrow(transition_probs) == ncol(transition_probs)) # must be square matrix
  
  # every row of the transition matrix must sum to 1 (up to floating-point tolerance)
  stopifnot(all(abs(rowSums(transition_probs) - 1) < 1e-8))
  
  new_state <- base::sample(possible_states, size = 1, prob = transition_probs[current_state, ])
  
  return(new_state)
}

observation_process <- function(current_state, state_mean, state_sd) {
  # draw one observation from the Gaussian emission of the current state
  observation <- rnorm(1, mean = state_mean[current_state], sd = state_sd[current_state])
  return(observation)
}

possible_states <- 1:2
transition_probs <- matrix(c(0.8, 0.2, 0.7, 0.3), nrow = 2, ncol = 2, byrow = TRUE)
state_mean <- c(0, 3)
state_sd <- c(0.5, 1)

# generate fake data
set.seed(12)
T <- 1000

states <- numeric(T)
states[1] <- possible_states[1]
observations <- numeric(T)
observations[1] <- observation_process(current_state = states[1],
                                       state_mean = state_mean,
                                       state_sd = state_sd)

for (t in 2:T) {
  states[t] <- transition_process(current_state = states[t - 1],
                                  possible_states = possible_states,
                                  transition_probs = transition_probs)
  observations[t] <- observation_process(current_state = states[t],
                                         state_mean = state_mean,
                                         state_sd = state_sd)
}

stan_data_hmm <- list(T = T,
                      N_states = length(possible_states),
                      y = observations)

hmm <- stan_model("path/to/stanmodel")

fit_hmm <- sampling(hmm, data = stan_data_hmm, iter = 10000, chains = 4, cores = 1)
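
For anyone who wants to sanity-check the likelihood outside Stan, here is a rough R sketch of the forward recursion from the model block. It is only an illustration under the same conventions as the Stan code (theta[r, s] is the probability of moving from state r to state s, and there is no explicit initial state distribution). The function names are made up for this example, and it does not address the crash itself:

```r
# Numerically stable log(sum(exp(x))).
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Marginal log-likelihood via the forward recursion, mirroring the model block.
hmm_forward_loglik <- function(y, theta, mu, sigma) {
  n_states <- length(mu)
  # alpha[n] = log density of y[1] under state n
  alpha <- dnorm(y[1], mean = mu, sd = sigma, log = TRUE)
  for (t in seq_along(y)[-1]) {
    alpha_new <- numeric(n_states)
    for (s in seq_len(n_states)) {
      # sum over previous states r of exp(alpha[r]) * theta[r, s], on the log scale
      alpha_new[s] <- dnorm(y[t], mean = mu[s], sd = sigma[s], log = TRUE) +
        log_sum_exp(alpha + log(theta[, s]))
    }
    alpha <- alpha_new
  }
  log_sum_exp(alpha)
}

# Example call with the simulation settings from the script above.
theta <- matrix(c(0.8, 0.2, 0.7, 0.3), nrow = 2, ncol = 2, byrow = TRUE)
hmm_forward_loglik(c(0.1, 2.9, -0.2), theta, mu = c(0, 3), sigma = c(0.5, 1))
```

With a single state and theta = matrix(1), the result reduces to sum(dnorm(y, mu, sigma, log = TRUE)), which makes for a quick correctness check.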

This runs for me on Linux. The new parser complains about variables “maybe being uninitialized”, which can be avoided by changing line 69 to

int best_path[T] = rep_array(0, T); // Viterbi path - highest probability path

and line 129 to

int backtrack_states[T, N_states] = rep_array(0, T, N_states); // keeping track of best path so far

Try that, although I sort of doubt it will fix the sampling error you are encountering on Windows.

Hi, @bgoodri thanks for these suggestions. Sadly (as you expected) they don’t solve the problem. I can get the model to run on my Mac as well, just not on my Windows machine. The same holds for variations of this model, although (like I mentioned before) simple models like the 8 Schools model seem to work fine.

An unrelated question: whenever I go to edit Stan files in RStudio on my Mac, I now get a warning that reads sh: clang++ -mmacosx-version-min=10.13: command not found. Every time I go from the console to the editor window this warning pops up again. Do you know what causes it and how I can fix that?

That’s fixed on GitHub. If it bothers you, you can temporarily do

remotes::install_github("stan-dev/rstan", ref = "develop", subdir = "rstan/rstan")

But we still need to get to the bottom of the Windows thing before we can upload rstan to CRAN again.

Yeah stancjs has that flag on by default (stanc does not). Will fix that tomorrow.

It would be fixed once standalone functions are merged, but this should get fixed before that.

Thanks! That makes sense. I’ll wait patiently - but please let me know if I can test anything on my side or help in any other way.

Hi @bgoodri, the new version fixes the command not found warning but introduces a couple of new ones. These are almost surely related to the fact that I have folders with spaces in their names (e.g. Working Directory/Stan Practice/HMMPractice/...). The warnings I receive look like this:

clang: error: no such file or directory: 'Working'
clang: error: no such file or directory: 'Directory/Stan'
clang: error: no such file or directory: 'Practice/HMMpractice'

Can I do something about that?

Try

remotes::install_github("stan-dev/rstan", ref = "develop", subdir = "rstan/rstan")

That gives the following error on my Windows machine:

Downloading GitHub repo stan-dev/rstan@develop
stan-dev-rstan-0eebee1/StanHeaders/inst/include/libsundials: Can't create '\\\\?\\C:\\Users\\mista\\AppData\\Local\\Temp\\RtmpMnML7N\\remotes2d0c3be628f9\\stan-dev-rstan-0eebee1\\StanHeaders\\inst\\include\\libsundials'
stan-dev-rstan-0eebee1/StanHeaders/inst/include/src: Can't create '\\\\?\\C:\\Users\\mista\\AppData\\Local\\Temp\\RtmpMnML7N\\remotes2d0c3be628f9\\stan-dev-rstan-0eebee1\\StanHeaders\\inst\\include\\src'
stan-dev-rstan-0eebee1/StanHeaders/inst/include/stan: Can't create '\\\\?\\C:\\Users\\mista\\AppData\\Local\\Temp\\RtmpMnML7N\\remotes2d0c3be628f9\\stan-dev-rstan-0eebee1\\StanHeaders\\inst\\include\\stan'
tar.exe: Error exit delayed from previous errors.
"C:\PROGRA~1\Git\cmd\git.exe" clone --depth 1 --no-hardlinks --recurse-submodules --branch develop https://github.com/stan-dev/stan.git C:\Users\mista\AppData\Local\Temp\RtmpMnML7N\remotes2d0c3be628f9/stan-dev-rstan-0eebee1/rstan/rstan/../../StanHeaders/inst/include/upstream
Error: Failed to install 'rstan' from GitHub:
  Command failed (128)
In addition: Warning messages:
1: In utils::untar(tarfile, ...) :
  ‘tar.exe -xf "C:\Users\mista\AppData\Local\Temp\RtmpMnML7N\file2d0c5da1305b.tar.gz" -C "C:/Users/mista/AppData/Local/Temp/RtmpMnML7N/remotes2d0c3be628f9"’ returned error code 1
2: In system(full, intern = TRUE, ignore.stderr = quiet) :
  running command '"C:\PROGRA~1\Git\cmd\git.exe" clone --depth 1 --no-hardlinks --recurse-submodules --branch develop https://github.com/stan-dev/stan.git C:\Users\mista\AppData\Local\Temp\RtmpMnML7N\remotes2d0c3be628f9/stan-dev-rstan-0eebee1/rstan/rstan/../../StanHeaders/inst/include/upstream' had status 128

It works on my Mac, so it’s most likely some specific set-up on my machine (or a Windows problem).

Same error for me.

With whatever rstan you have, does your compilation utilize -march=native or similar?

My current makevars.win is

CXX11FLAGS=-O3 -mtune=native -march=native -Wno-unused-variable -Wno-unused-function -Wno-ignored-attributes -Wno-deprecated-declarations 
CXX14FLAGS=-O3 -mtune=native -march=native -Wno-unused-variable -Wno-unused-function -Wno-ignored-attributes -Wno-deprecated-declarations 
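
A quick way to check this from within R, as a sketch: the helper name find_march_native is hypothetical, and the pattern assumes the flags are set on lines such as CXX11FLAGS= or CXX14FLAGS= like the ones shown here. tools::makevars_user() is used to locate the user-level Makevars file:

```r
# Hypothetical helper: return the compiler-flag lines of a Makevars file
# that mention -march=native.
find_march_native <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # keep only assignments to CXXFLAGS, CXX11FLAGS, CXX14FLAGS, ...
  flag_lines <- grep("^\\s*CXX[0-9]*FLAGS\\s*[+:]?=", lines, value = TRUE)
  flag_lines[grepl("-march=native", flag_lines, fixed = TRUE)]
}

# tools::makevars_user() gives the path of the user Makevars file,
# or character(0) if none exists.
mv <- tools::makevars_user()
if (length(mv) == 1) print(find_march_native(mv))
```

An empty result means no matching flag line sets -march=native in that file.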

Can you try this:

Please recompile everything Stan related as needed.

My makevars.win now reads:

CXX14FLAGS=-O3 -mtune=native -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 
CXX11FLAGS=-O3 -mtune=native -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2

I have StanHeaders 2.21.0-5 installed from CRAN and tried to install rstan from github via

remotes::install_github("stan-dev/rstan", ref = "develop", subdir = "rstan/rstan")

This results in the following error, still:

Downloading GitHub repo stan-dev/rstan@develop
stan-dev-rstan-0eebee1/StanHeaders/inst/include/libsundials: Can't create '\\\\?\\C:\\Users\\paulb\\AppData\\Local\\Temp\\RtmpikHlkL\\remotes19201e724170\\stan-dev-rstan-0eebee1\\StanHeaders\\inst\\include\\libsundials'
stan-dev-rstan-0eebee1/StanHeaders/inst/include/src: Can't create '\\\\?\\C:\\Users\\paulb\\AppData\\Local\\Temp\\RtmpikHlkL\\remotes19201e724170\\stan-dev-rstan-0eebee1\\StanHeaders\\inst\\include\\src'
stan-dev-rstan-0eebee1/StanHeaders/inst/include/stan: Can't create '\\\\?\\C:\\Users\\paulb\\AppData\\Local\\Temp\\RtmpikHlkL\\remotes19201e724170\\stan-dev-rstan-0eebee1\\StanHeaders\\inst\\include\\stan'
tar.exe: Error exit delayed from previous errors.
"C:\PROGRA~1\Git\cmd\git.exe" clone --depth 1 --no-hardlinks --recurse-submodules --branch develop https://github.com/stan-dev/stan.git C:\Users\paulb\AppData\Local\Temp\RtmpikHlkL\remotes19201e724170/stan-dev-rstan-0eebee1/rstan/rstan/../../StanHeaders/inst/include/upstream
Error: Failed to install 'rstan' from GitHub:
  Command failed (128)
In addition: Warning messages:
1: In utils::untar(tarfile, ...) :
  ‘tar.exe -xf "C:\Users\paulb\AppData\Local\Temp\RtmpikHlkL\file1920129f44e5.tar.gz" -C "C:/Users/paulb/AppData/Local/Temp/RtmpikHlkL/remotes19201e724170"’ returned error code 1
2: In system(full, intern = TRUE, ignore.stderr = quiet) :
  running command '"C:\PROGRA~1\Git\cmd\git.exe" clone --depth 1 --no-hardlinks --recurse-submodules --branch develop https://github.com/stan-dev/stan.git C:\Users\paulb\AppData\Local\Temp\RtmpikHlkL\remotes19201e724170/stan-dev-rstan-0eebee1/rstan/rstan/../../StanHeaders/inst/include/upstream' had status 128

That looks like a different issue, unrelated to compilation… Ben?

I can confirm I get this exact same error, with Makevars as described by @paul.buerkner. I’m using StanHeaders 2.21.0-6 that I found in one of the other threads on Windows problems over here.