R crashes when increasing number of studies from 4 to 5, in a regression random effect meta-analysis

Data
There are N studies, each study i has n_i participants and k_i successes. For each study, we also have two covariates x_1 and x_2.

Model 1
I try to fit a random effect regression model, assuming binomial distribution of the data. We define x as a matrix of dimensions N\times 2, whose columns are the two covariates x_1 and x_2.
k_i \sim \mathcal{Binomial}(p_i,n_i)
\text{logit}(p_i)=\theta + \theta_i + x \cdot \beta
\theta_i \sim \mathcal{N}(0,\sigma)

Prior
p_{init} \sim \mathcal{Beta}(1,10)
\theta=\text{logit}(p_{init})
\sigma \sim \mathcal{Exp}(\sigma_1)
\beta \sim \mathcal{N}(0,10)

Model 2
In model 2, I tried to estimate \beta with the help of regularized horseshoe priors as in J Piironen et al..

Simulations
With the help of simulated data (see R code below), I first try to see whether both models can retrieve true values of \beta, using Rstan.
When N<5 (number of studies), both models work, but when N=5 or more with Model 2 (with horseshoe prior) RStudio crashes, while there is no issue with Model 1.

image

Same problem happens when assigning inference=0 (prior predictive check). Any idea why ?
R version: 3.5.3.
Rstan version: 2.19.2

N=4 #work
N=5 #does not work
x1=rnorm(N,50,20)/100
x2=rbeta(N,1.5,1.5)

x=cbind(x1,x2)
x=as.matrix(x,nrow=N)
K=dim(x)[2]
apply(x,2,mean)
theta=logit(0.1)
sigma=0.8
b=2
theta_i=rnorm(N,0,sigma)
n=round(seq(100,200,length.out = N))
p=inv.logit(theta+theta_i+b*x2)
k<-rbinom(N,n,p)

stan_data=list(N=N,
K=K,
k=k,
n=n,
x=matrix(as.matrix(x),ncol=K),
beta_1=structure(rep(0,K),dim=K),
beta_2=structure(rep(10,K),dim=K),
sigma_1=1,
scale_global=1/((K-1)*sqrt(N)),
nu_global=1,
nu_local=1,
slab_scale=10,
slab_df=1,
inference=1)

mod<-stan_model(file = “model1.stan”)
mod<-stan_model(file = “model2.stan”)
fit<-sampling(object=mod7,
data=stan_data,
warmup = 1000,
iter = 2500,
chains = 1,
cores = 4,
thin = 1,
control=list(adapt_delta=0.8))

Stan code Model 1 (model1.stan)

// Stan model for simple linear regression
data {
  int<lower=0> N;   // number of data items
  int<lower=0> K; // number of predictors
  int k[N];
  int n[N];
  matrix[N, K] x;   // predictor matrix
  
  int inference;
  
  vector[K] beta_1;
  vector[K] beta_2;
  real sigma_1;
}
parameters {
  vector [K] beta;
  real<lower=0, upper=1> p_init;
  vector[N] theta_i;
  real<lower=0> sigma;
}
transformed parameters {
  real theta = logit(p_init);
  vector[N] xb;
  xb = x * beta;
}
model {
  //Priors
  p_init ~ beta(1,10);
  beta ~ normal(beta_1,beta_2);
  theta_i ~ normal(0,1);
  sigma ~ exponential(sigma_1);
  
  if(inference==1){
    for(i in 1:N){
      target += binomial_lpmf(k[i] | n[i], inv_logit(theta + theta_i[i] * sigma + xb[i])); // likelihood
    }
  }
}

Stan code Model 2 (model2.stan)

// Stan model for simple linear regression
data {
  int<lower=0> N;   // number of data items
  int<lower=0> K; // number of predictors
  int k[N];
  int n[N];
  matrix[N, K] x;   // predictor matrix

  int inference;
  
  real < lower =0 > scale_global; // scale for the half -t prior for tau
  real < lower =1 > nu_global; // degrees of freedom for the half -t priors for tau
  real < lower =1 > nu_local; // degrees of freedom for the half - t priors for lambdas
  real < lower =0 > slab_scale; // slab scale for the regularized horseshoe
  real < lower =0 > slab_df; // slab degrees of freedom for the regularized horseshoe
}
parameters {
  real<lower=0, upper=1> p_init;
  vector[N] theta_i; 
  
  vector [K] z;
  real < lower =0 > tau; // global shrinkage parameter
  vector < lower =0 >[K] lambda; // local shrinkage parameter
  real < lower =0 > caux;
}
transformed parameters {
  real theta = logit(p_init);
  vector[K] beta; // coefficients for predictors
  vector[N] xb;
  
  real < lower =0 > c; // slab scale
  vector < lower =0 >[K] lambda_tilde ; // ’ truncated ’ local shrinkage parameter
   c = slab_scale * sqrt (caux);
  lambda_tilde = sqrt ( c ^2 * square ( lambda ) ./ (c ^2 + tau ^2* square ( lambda )) );

  beta = z .* lambda_tilde * tau;
  xb = x * beta;
}
model {
  //Priors
  z ~ normal(0, 1);
  lambda ~ student_t( nu_local, 0, 1);
  tau ~ student_t( nu_global, 0 , scale_global);
  caux ~ inv_gamma (0.5* slab_df , 0.5* slab_df );
 
  p_init ~ beta(1,10);
  
  theta_i ~ normal(0,1);
  
  if(inference==1){
    for(i in 1:N){
      target += binomial_lpmf(k[i] | n[i], inv_logit(theta + theta_i[i] + xb[i])); // likelihood
    }
  }
}

Couple points. First, I’ve taken the liberty of changing the tag of your post because I think this might be a RStan-specific issue rather than a modelling issue. Secondly, can you clarify what you mean by “crash” so folk like @bgoodri can be able to help?

Thanks, I just updated the question. Rstudio crashes.

Using -mtune = native or not?

Yes. Here is my Makevars.win file:

CXX14FLAGS=-O3 -march=native -mtune=native
CXX11FLAGS=-O3 -march=native -mtune=native

Should I change that?

Try deleting both -march=native -mtune=native.

1 Like

It works when deleting -mtune=native, thanks! Can you explain me what does it change exactly and why it did not work before?

-mtune=native enables options in the compiled code that are specific to the processor in your computer and it messes that up on Windows for some processors.

3 Likes