R Studio Abort with Finnish Horseshoe specification

dkaplan · October 28, 2021, 12:45pm

Hi all,
I’m fitting the Finnish horseshoe using a specification I found in a webpage by Michael Betancourt.

This specification runs without the control=list(adapt_delta=0.99, max_treedepth=15) but induces about 400 divergences. When add the control=list(adapt_delta=0.99, max_treedepth=15), R Studio aborts. I’m running R Studio Version 2021.09.0 Build 351, on a Mac with Big Sur. Here is the code. As always, thank you.

PISA18sampleScale <- read.csv(file.choose(),header=T)

n <- nrow(PISA18sampleScale)
X <- PISA18sampleScale[,2:11]
readscore <- PISA18sampleScale[,1]
p <- ncol(X)


data.list <- list(n=n, p=p, X=X, readscore=readscore)

# Stan code adapated from Vehtari, Appendix C. #
modelString = "
data {
  int <lower=1> n;          // number of observations
  int <lower=1> p;          // number of predictors
  real readscore[n];        // outcome
  matrix[n,p] X;          //inputs

}

// slab_scale = 5, slab_df = 25


transformed data {
  real p0 = 5;
  real slab_scale = 3;
  real slab_scale2 = square(slab_scale);
  real slab_df = 25;
  real half_slab_df = 0.5 * slab_df;
  
}

parameters {
  vector[p] beta_tilde;
  vector<lower=0>[p] lambda;
  real<lower=0> c2_tilde;
  real<lower=0> tau_tilde;
  real alpha;
  real<lower=0> sigma;

}


transformed parameters {
  vector [p] beta ; // regression coefficients
{
 real tau0 = (p0 / (p - p0)) * (sigma / sqrt(1.0 * n));
    real tau = tau0 * tau_tilde; // tau ~ cauchy(0, tau0)
    
    // c2 ~ inv_gamma(half_slab_df, half_slab_df * slab_scale2)
    // Implies that marginally beta ~ student_t(slab_df, 0, slab_scale)
    
    real c2 = slab_scale2 * c2_tilde;
    
    vector[p] lambda_tilde =
      sqrt( c2 * square(lambda) ./ (c2 + square(tau) * square(lambda)) );
    
    // beta ~ normal(0, tau * lambda_tilde)
    beta = tau * lambda_tilde .* beta_tilde;
    
    }
}
 
model {
  beta_tilde ~ normal(0, 1);
  lambda ~ cauchy(0, 1);
  tau_tilde ~cauchy(0,1);
  c2_tilde ~ inv_gamma(half_slab_df, half_slab_df);
  
  alpha ~ normal(0, 2);
  sigma ~ normal(0, 2);
  
  readscore ~ normal(X * beta + alpha, sigma);

}

"

# Start estimation

nChains = 4
nIter= 50000
thinSteps = 10
warmupSteps = floor(nIter/2)

readscore = data.list$readscore

myfitFinnHorseshoe = stan(data=data.list,model_code=modelString,chains=nChains,
                          iter=nIter,warmup=warmupSteps,thin=thinSteps,
                          control=list(adapt_delta=0.99, max_treedepth=15))

andrjohns · October 28, 2021, 1:42pm

This is something that we’ve been seeing lately with the CRAN rstan, you can either try cmdstanr: R Interface to CmdStan • cmdstanr

Or the preview of the next rstan version:

remove.packages(c("StanHeaders", "rstan"))
install.packages("StanHeaders", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

dkaplan · October 28, 2021, 3:29pm

The next version of rstan works very well.

On a somewhat related topic, perhaps relevant for @avehtari, is there a resource providing guidance on how to choose the slab scale and slab df? I am getting beautiful trace plots, perfect rhats, and very good n_eff. However the density plots really don’t look very good.

Rplot.pdf (55.2 KB)

Thanks.

avehtari · October 29, 2021, 8:17am

This is the best resource Sparsity information and regularization in the horseshoe and other shrinkage priors. The slab is more important, e.g., in logistic regression where with many covariates complete separation is more likely. You can set the slab parameters as how you would set your prior for coefficients if oracle would tell you which covariates are irrelevant.

Looking at the plots, I see that you don’t have many covariates, and I guess that you have many more observations than covariates. In such case there is a smaller benefit from horseshoe. Also, I guess that you have correlating covariates. You could look at the pairs scatter plots for that. You could continue with projection predictive variable selection (see examples at Model selection tutorials and talks). If your model is not much more complex than what you now have, you could use brms or rstanarm to make your model and use CRAN - Package projpred to do the rest of the work. See projpred examples at Model selection tutorials and talks.

dkaplan · October 29, 2021, 9:46am

I should have mentioned that this analysis was based on a small sample from PISA. I have n=100, and p=10.

Playing with the slab seemed to help a bit, but there are still some density plots that don’t quite look as good as one would hope. However, the results seem roughly consistent with theory insofar as the small coefficients from the traditional horseshoe are even a bit smaller with the Finnish horseshoe. The larger ones (and there weren’t many) shrunk a bit too. I’m not sure how to get the density plots to look better, apart from your suggestions, but I want to stick to demonstrating the Finnish horseshoe. Again, all the other diagnostics are spot on.

avehtari · November 2, 2021, 10:34am

What would you hope to see?

The demonstration for the posterior marginals would be more clear with a larger p.

Topic		Replies	Views
Finnish horseshoe error message Modeling	8	671	October 25, 2021
Error message with Betancourt specification of horseshoe prior Modeling	2	556	January 7, 2022
Divergent transitions with the regularized horseshoe Modeling	1	429	December 21, 2022
Horseshoe prior error in rstan Modeling	7	379	April 24, 2020
"Fused" Finnish horseshoe (or any sparsity inducing prior for that matter) Modeling	2	930	August 9, 2019

R Studio Abort with Finnish Horseshoe specification

Related topics