Persistent Windows Issues with RStudio

betanalpha · July 22, 2019, 10:28am

I continue to encounter issues in my courses where RStudio crashes on some Windows computers when running models like the one included below. The problems do not arise on Mac or Linux, or with PyStan on Windows. They seem to be limited to Windows 10, but having Windows 10 is not sufficient to trigger them. Machines with 16 GB of RAM (and no other processes starving the system of memory) still crash. One possible commonality is Windows 10 Enterprise?

I have system specs for two machines that were causing problems, if that might help.

Any thoughts?

data {
  int<lower=1> N; // Number of observations
  
  int<lower=1> N_location; // Number of location groups
  int<lower=1, upper=N_location> location[N]; // Location group assignments

  int<lower=1> N_method; // Number of method groups
  int<lower=1, upper=N_method> method[N]; // Method group assignments
  
  vector[N] x; // Covariates
  int y[N];    // Binary variates
}

parameters {
  real mu_alpha;             // Location intercepts population mean
  real<lower=0> sigma_alpha; // Location intercepts population standard deviation
  vector[N_location] alpha_location_tilde; // Non-centered location intercepts

  real mu_beta; // Location slopes population mean
  real<lower=0> sigma_beta; // Location slopes population standard deviation
  vector[N_location] beta_location_tilde; // Non-centered location slopes
}

transformed parameters {
  // Recentered intercepts for each location group
  vector[N_location] alpha_location = mu_alpha + sigma_alpha * alpha_location_tilde;
  // Recentered slopes for each location group
  vector[N_location] beta_location = mu_beta + sigma_beta * beta_location_tilde;
}

model {
  mu_alpha ~ normal(0, 0.5);           // Prior model
  sigma_alpha ~ normal(0, 2);       // Prior model
  alpha_location_tilde ~ normal(0, 1); // Non-centered hierarchical model

  mu_beta ~ normal(0, 0.5);           // Prior model
  sigma_beta ~ normal(0, 2);       // Prior model
  beta_location_tilde ~ normal(0, 1); // Non-centered hierarchical model

  // Observational model
  y ~ bernoulli_logit(beta_location[location] .* x + alpha_location[location]);
}

rok_cesnovar · July 22, 2019, 10:44am

Could you share a minimal example that fails, including the data? Or does it crash on compile? I have access to a few Windows 10 machines (Enterprise/Education and Pro). Maybe we can replicate it.

betanalpha · July 22, 2019, 10:52am

It compiles but crashes sometime during runtime.

Try the below using the Stan program I pasted above with the attached data file.

library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

data <- read_rdump('multilevel_logistic_regression.data.R')
fit <- stan(file='one_level_ncp.stan', data=data, seed=4938483,
            control=list(adapt_delta=0.85), chains=1)

multilevel_logistic_regression.data.R (4.2 KB)

Charles_Driver · July 22, 2019, 10:57am

I get similar behaviour on some machines when I have -march=native in my makevars.

betanalpha · July 22, 2019, 11:14am

Good call – the heterogeneity may very well be controlled by the processor type and hence tickle a -march=native issue.

rok_cesnovar · July 22, 2019, 11:29am

I tried it on a Windows 10 Enterprise machine with an i5 CPU. I am running R 3.6.1 and Rtools 3.5. It worked with and without the -march flag. I will investigate on other machines. Do post the configurations; they might help track this down.

betanalpha · July 22, 2019, 11:41am

lukas-rokka · July 22, 2019, 11:57am

I also had RStudio crashing when using the -march=native and calling multi_normal().

tjmahr · July 22, 2019, 1:49pm

I had the crash with this model too on Windows 10 Pro (Version 1903) with R 3.6.1, rstan 2.19.2, StanHeaders 2.18.1-10, RStudio 1.2.1555, and Rtools version 3.5.0.4. My Makevars:

CXX14=$(BINPREF)g++ -O2 -march=native -mtune=native
CXX14FLAGS=-O3 -march=native -mtune=native
CXX11FLAGS=-O3 -march=native -mtune=native

betanalpha · July 22, 2019, 3:27pm

But not the particular function I included above?

djvanness · July 22, 2019, 4:31pm

I think this issue comes down to the Rtools package using a version of the mingw compiler that was built before some of the latest generation Intel processors were released. On my machine, which uses Skylake processors, I had to use -march=Broadwell to get models to compile. Has anyone tried using the experimental Rtools 4.0?

rok_cesnovar · July 22, 2019, 4:48pm

I was finally able to reproduce this on a third machine with an i7 CPU (I also got nothing on an AWS instance with Windows Server). It's a fresh install of R 3.6.1, RStudio and Rtools, with -march=native set.

Sometimes (almost always after an RStudio restart) it displayed 1/2000 and crashed the R session. Upon restarting the R session (but not RStudio) it ran fine. Is this like what you were seeing @betanalpha ?

One thing I am seeing is that -march=native causes the model to run for twice as long when it doesn't crash (1.4 seconds compared to 0.7 seconds without the flag). So that might already be a sign of trouble.

EDIT: no issue without the flag
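
Since the failure kills the whole R session, one way to compare runs with and without the flag is to launch each attempt in a separate Rscript process and inspect its exit status. This is only a sketch: the helper name run_in_child is hypothetical (it is not part of rstan), and it assumes that setting LOCAL_CPPFLAGS inside the child process, as done elsewhere in this thread, is enough to control the flag.

```r
# Hypothetical helper (not part of rstan): run R code in a child process so
# that a crash does not take down the interactive session.
run_in_child <- function(code_lines, cppflags = "") {
  script <- tempfile(fileext = ".R")
  on.exit(unlink(script))
  # Set the compiler flags inside the child before any model is compiled.
  header <- sprintf("Sys.setenv(LOCAL_CPPFLAGS = %s)", deparse(cppflags))
  writeLines(c(header, code_lines), script)
  rscript <- file.path(R.home("bin"), "Rscript")
  # With stdout/stderr discarded, system2() returns the child's exit status.
  system2(rscript, shQuote(script), stdout = FALSE, stderr = FALSE)
}

# Trivial check that exit codes come through; swap in the stan() call from
# earlier in the thread to exercise a real model.
status_ok  <- run_in_child("invisible(1 + 1)")
status_bad <- run_in_child("quit(save = 'no', status = 3)",
                           cppflags = "-march=native")
```

A non-zero status from the child would then indicate a crashed or failed run, and repeating the call several times gives a rough crash frequency for each flag setting.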

lukas-rokka · July 22, 2019, 5:26pm

Tested your function now and it did not crash.

But I was not able to recreate the crash I had a few weeks ago either. Since then I’ve updated both R (3.5.x to 3.6.1) and Stan (2.18.x to 2.19.2). I think my Windows version (17763), RStudio (1.2.1335) and Rtools (3.5.0.4) are still the same.

betanalpha · July 22, 2019, 8:36pm

The “run for 1/2000 then crash” was definitely observed, but those people didn’t get anything to work by restarting.

Looks like the -march=native flag is definitely the culprit. Thanks everyone!

betanalpha · July 22, 2019, 8:36pm

Do you have ‘-march=native’ in your makevars?
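
For anyone unsure how to check, the following sketch scans the user-level Makevars files for any -march setting. The helper name find_march_flags is hypothetical, and the two paths are an assumption about where the files usually live (~/.R/Makevars.win and ~/.R/Makevars); a site-wide Makevars or environment variables would not be caught.

```r
# Hypothetical helper: return the non-comment lines that set an -march flag.
find_march_flags <- function(lines) {
  lines <- sub("#.*$", "", lines)                 # strip comments
  hits <- grepl("-march=", lines, fixed = TRUE)   # literal match
  trimws(lines[hits])
}

# Assumed locations of the user-level Makevars files.
candidates <- path.expand(c("~/.R/Makevars.win", "~/.R/Makevars"))
for (path in candidates[file.exists(candidates)]) {
  cat(path, ":\n", sep = "")
  print(find_march_flags(readLines(path, warn = FALSE)))
}

# Example on in-memory lines rather than a real file.
example_hits <- find_march_flags(c(
  "CXX14FLAGS=-O3 -march=native -mtune=native",
  "# CXX11FLAGS=-O3 -march=native",
  "CXX14=$(BINPREF)g++ -O2"
))
```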

ahartikainen · July 22, 2019, 8:52pm

On Windows (PyStan), the mingw-w64 compiler would sometimes crash (divide-by-zero, and maybe some other function where the output should have been nan).

Have you tried the latest dev version of Rtools?

lukas-rokka · July 23, 2019, 8:42am

I did try it both with ‘-march=native’ and without. It did not crash when I ran it yesterday, but I tried it again today and now it crashes every time when using ‘-march=native’!

I am attaching my program below, which also usually crashes with ‘-march=native’. When I rewrite it to use multi_normal_cholesky instead of multi_normal, it does work with ‘-march=native’.

library(rstan)

Sys.setenv(LOCAL_CPPFLAGS = '-march=native')
#Sys.setenv(LOCAL_CPPFLAGS = '-O3 -march=native -mtune=native')

code = '
data {
  int<lower=0> N; 
  matrix[N, 4] xy;
  real r1; 
  real r2; 
  real r3; 
  real r4; 
}

transformed data {
  matrix[N*2, N*2] Omega = diag_matrix(rep_vector(1., N*2));
  vector[N*2] xy_mu = append_row(xy[, 1], xy[, 2]);
  vector[N*2] xy_sigma = append_row(xy[, 3], xy[, 4]);
  
  Omega = [
    [1,  r1, r2, r2,    r3, r4, r4, r4],
    [r1, 1,  r1, r2,    r4, r3, r4, r4],
    [r2, r1, 1,  r1,    r4, r4, r3, r4],
    [r2, r2, r1, 1,     r4, r4, r4, r3],
    
    [r3, r4, r4, r4,    1,  r1, r2, r2],
    [r4, r3, r4, r4,    r1, 1,  r1, r2],
    [r4, r4, r3, r4,    r2, r1, 1,  r1],
    [r4, r4, r4, r3,    r2, r2, r1, 1]];
    
  print("Omega "); for (i in 1:N*2) print(Omega[i, ]);

}

parameters {
  vector[N*2] xy_par;
}

model {
  xy_par ~ multi_normal(xy_mu, quad_form_diag(Omega, xy_sigma));
}
'

xy = matrix(c(
  c(-20, -10, 0, 17),  # x_mu
  c(60, 52, 43,  18),  # y_mu
  c(1, 1, 1, 1),       # x_sigma
  c(1, 1, 1, 1)        # y_sigma
  ), ncol=4)

model <- stan_model(model_code = code)
fit_xy <- sampling(model, chains = 1, seed=65445, 
                   data=list(N=4, xy=xy, r1=0.0, r2=0.0, r3=-0.0, r4=-0.0))
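
The rewritten program was not posted, so the following is only a guess at what the multi_normal_cholesky version might look like. The names code_cholesky and L_Sigma are introduced here for illustration. Because the covariance depends only on data, its Cholesky factor is computed once in transformed data; this assumes the r1 to r4 values yield a positive-definite matrix, which holds for the all-zero values used above.

```r
# Sketch of a Cholesky-based variant; only the covariance handling differs
# from the program above. Declarations stay at the top of the block, as the
# Stan versions discussed in this thread expect.
code_cholesky <- '
data {
  int<lower=0> N;
  matrix[N, 4] xy;
  real r1;
  real r2;
  real r3;
  real r4;
}

transformed data {
  matrix[N*2, N*2] Omega;
  matrix[N*2, N*2] L_Sigma;  // Cholesky factor of the covariance (new)
  vector[N*2] xy_mu = append_row(xy[, 1], xy[, 2]);
  vector[N*2] xy_sigma = append_row(xy[, 3], xy[, 4]);

  Omega = [
    [1,  r1, r2, r2,    r3, r4, r4, r4],
    [r1, 1,  r1, r2,    r4, r3, r4, r4],
    [r2, r1, 1,  r1,    r4, r4, r3, r4],
    [r2, r2, r1, 1,     r4, r4, r4, r3],

    [r3, r4, r4, r4,    1,  r1, r2, r2],
    [r4, r3, r4, r4,    r1, 1,  r1, r2],
    [r4, r4, r3, r4,    r2, r1, 1,  r1],
    [r4, r4, r4, r3,    r2, r2, r1, 1]];

  L_Sigma = cholesky_decompose(quad_form_diag(Omega, xy_sigma));
}

parameters {
  vector[N*2] xy_par;
}

model {
  xy_par ~ multi_normal_cholesky(xy_mu, L_Sigma);
}
'
```

It can be compiled and sampled the same way as the original, by passing code_cholesky to stan_model(model_code = ...) and reusing the sampling() call above.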

ahartikainen · July 23, 2019, 8:58am

Will it crash if you set init to something reasonable and no warm-up?
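
For reference, a run with explicit initial values and no warm-up could be set up roughly as follows for the program above. This is a sketch under assumptions: starting xy_par at the prior means is one guess at what "reasonable" means here, run_without_warmup is a hypothetical wrapper, and iter = 200 is an arbitrary choice.

```r
# Data matrix copied from the program above.
xy <- matrix(c(
  c(-20, -10, 0, 17),  # x_mu
  c(60, 52, 43, 18),   # y_mu
  c(1, 1, 1, 1),       # x_sigma
  c(1, 1, 1, 1)        # y_sigma
), ncol = 4)

# One list of initial values per chain; here xy_par starts at the prior means
# (an assumption about what a reasonable starting point is).
init_list <- list(list(xy_par = c(xy[, 1], xy[, 2])))

# Hypothetical wrapper: `model` is the object returned by stan_model().
# warmup = 0 skips adaptation, so all iterations are sampling iterations.
run_without_warmup <- function(model, data, init, iter = 200) {
  rstan::sampling(model, data = data, chains = 1, seed = 65445,
                  init = init, warmup = 0, iter = iter)
}
```

With the compiled model from above, the call would be run_without_warmup(model, list(N = 4, xy = xy, r1 = 0, r2 = 0, r3 = 0, r4 = 0), init_list).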

lukas-rokka · July 23, 2019, 10:59am

Yes, it does. Here is a simpler program that often reproduces the crash, although sometimes it does not crash. The only factor I’m sure about is ‘-march=native’.

library(rstan)

Sys.setenv(LOCAL_CPPFLAGS = '-march=native')

code = '
data {
  int<lower=0> N;
  int<lower=0> K;
  matrix[N, K] y;
}

parameters {
  vector[K] mu;
  vector<lower=0.0>[K] sigma;
  corr_matrix[K] Omega;
}

model {
  Omega ~ lkj_corr(2);
  for (i in 1:N) {
    y[i, ] ~ multi_normal(mu, quad_form_diag(Omega, sigma));
  }
}
'

N = 100
K = 4
mu = 1
sigma = 1

y = MASS::mvrnorm(n = N, mu=rep(mu, K), Sigma=diag(rep(sigma, K)))

model <- stan_model(model_code = code)
fit <- sampling(model, chains = 1, seed=65445, data=list(N=N, K=K, y=y))
