Horseshoe prior in rstan

Se_Yoon_Lee · April 23, 2020, 6:04pm

Hi. I am using the horseshoe prior, coded in Stan as follows:

data {
  int<lower=1> n; // Number of data
  int<lower=1> p; // Number of covariates
  matrix[n,p] X;  // n-by-p design matrix
  real y[n];      // n-dimensional response vector
}

parameters {
  vector[p] beta;
  vector<lower=0>[p] lambda;
  real<lower=0> tau;
  real<lower=0> sigma;
}

transformed parameters {
  vector[n] theta ;
  theta = X * beta;
}

model {
  lambda ~ cauchy(0, 1);
  tau ~ cauchy(0, 1);
  beta ~ normal(0, sigma * tau * lambda); 
  y ~ normal(theta, sigma);
}

I don’t know why, but it is extremely slow.
Could you give me any recommendations to speed it up?

EDIT: @maxbiostat edited this post for syntax highlighting.

avehtari · April 23, 2020, 6:57pm

What do you mean by extremely slow? Days?
How large are n and p?

To speed up:

Use the regularized horseshoe for better posterior geometry; see the paper "Sparsity information and regularization in the horseshoe and other shrinkage priors".
Use the code shown in Appendix C.1 of "Sparsity information and regularization in the horseshoe and other shrinkage priors".
Instead of y ~ normal(theta, sigma), use y ~ normal_id_glm(x, alpha, beta, sigma). See https://mc-stan.org/docs/2_23/functions-reference/normal-id-glm.html

Or you could use brms or rstanarm packages which have these speed ups already implemented.
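As a rough illustration of how the first and third suggestions might fit together, here is a minimal sketch of a regularized horseshoe with a non-centered parameterization and normal_id_glm. It is not the exact code from Appendix C.1. The data inputs scale_global, slab_scale, and slab_df, the intercept alpha and its prior, and the auxiliary names z, caux, c, and lambda_tilde are assumptions made for illustration.

```stan
data {
  int<lower=1> n;               // number of observations
  int<lower=1> p;               // number of covariates
  matrix[n, p] X;               // n-by-p design matrix
  vector[n] y;                  // response as a vector, as normal_id_glm expects
  real<lower=0> scale_global;   // assumed input: prior scale for tau (relative to sigma)
  real<lower=0> slab_scale;     // assumed input: scale of the slab for large coefficients
  real<lower=0> slab_df;        // assumed input: degrees of freedom of the slab
}

parameters {
  real alpha;                   // intercept (assumed; the original model has none)
  vector[p] z;                  // standardized coefficients (non-centered)
  vector<lower=0>[p] lambda;    // local shrinkage scales
  real<lower=0> tau;            // global shrinkage scale
  real<lower=0> caux;           // auxiliary variable for the slab width
  real<lower=0> sigma;          // noise standard deviation
}

transformed parameters {
  real c = slab_scale * sqrt(caux);
  // Regularized local scales: close to lambda for small values, capped near c / tau
  vector[p] lambda_tilde
    = sqrt(c^2 * square(lambda) ./ (c^2 + tau^2 * square(lambda)));
  vector[p] beta = z .* lambda_tilde * tau;
}

model {
  z ~ std_normal();
  lambda ~ cauchy(0, 1);                     // half-Cauchy via the lower bound
  tau ~ cauchy(0, scale_global * sigma);     // half-Cauchy via the lower bound
  caux ~ inv_gamma(0.5 * slab_df, 0.5 * slab_df);
  alpha ~ normal(0, 10);                     // assumed weakly informative prior
  // sigma keeps the implicit flat prior on (0, inf), as in the original model
  y ~ normal_id_glm(X, alpha, beta, sigma);
}
```

Following the paper, scale_global could be set to p0 / ((p - p0) * sqrt(n)), where p0 is a prior guess for the number of relevant covariates; the factor sigma is applied inside the model block. With p larger than n, a higher adapt_delta may still be needed.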

Se_Yoon_Lee · April 23, 2020, 7:06pm

Thank you very much for your reply.

20,000 iterations take 30 minutes or more, whereas an ordinary MCMC sampler might take only about 1 minute. n and p are quite small: n = 100, p = 500.

maxbiostat · April 23, 2020, 7:07pm

Why are you running 20K iterations, though? What is the lowest ESS you get?

Se_Yoon_Lee · April 23, 2020, 7:09pm

I am running 20K iterations because that is how many I usually need in sampling to see stationarity.

maxbiostat · April 23, 2020, 7:16pm

I’ll let @avehtari be the final judge of that, but I find it weird that it would take Stan 20K iterations to warm up properly for a problem of this size.

Se_Yoon_Lee · April 23, 2020, 7:42pm

To speed up I used the parameter expansion for the half-Cauchy as follows:
data {
  int<lower=1> n; // Number of data
  int<lower=1> p; // Number of covariates
  matrix[n,p] X;  // n-by-p design matrix
  real y[n];      // n-dimensional response vector
}

parameters {
  vector[p] beta;
  vector<lower=0>[p] lambda_sq;
  vector<lower=0>[p] omega;
  real<lower=0> tau_sq;
  real<lower=0> eta;
  real<lower=0> sigma;
}

transformed parameters {
  vector[n] theta;
  theta = X * beta;
}

model {
  // Original horseshoe by Carvalho et al. + parameter expansion on the half-Cauchy
  // lambda ~ cauchy(0, 1) is equivalent to:
  omega ~ inv_gamma(0.5, 1);
  lambda_sq ~ inv_gamma(0.5, 1 ./ omega);
  // tau ~ cauchy(0, 1) is equivalent to:
  eta ~ inv_gamma(0.5, 1);
  tau_sq ~ inv_gamma(0.5, 1 / eta);
  beta ~ normal(0, sigma * sqrt(tau_sq) * sqrt(lambda_sq));
  y ~ normal(theta, sigma);
}

but I don’t know why this does not work…
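For reference, the scale-mixture identity that this parameter expansion relies on can be written as below, assuming the shape and scale parameterization that Stan's inv_gamma uses. Marginalizing over the auxiliary variable \omega recovers the half-Cauchy prior on \lambda:

```latex
\lambda^{2} \mid \omega \sim \mathrm{InvGamma}\!\left(\tfrac{1}{2}, \tfrac{1}{\omega}\right),
\quad
\omega \sim \mathrm{InvGamma}\!\left(\tfrac{1}{2}, 1\right)
\;\Longrightarrow\;
\lambda \sim \mathrm{C}^{+}(0, 1)
```

The same identity applied to (\tau^{2}, \eta) gives the half-Cauchy prior on \tau. In Stan code the shape 1/2 has to be written as 0.5, because the expression 1 / 2 is integer division and evaluates to 0.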

avehtari · April 24, 2020, 3:51pm

How do you diagnose stationarity? How big is your data, that is, n and p?

How about the speed ups I recommended?

Se_Yoon_Lee · April 24, 2020, 4:30pm

I check stationarity by looking at the traceplots with the naked eye.
For simulation studies I typically use n = 100 and p = 500, but in gene expression data studies p is obviously much larger than 500, say 5000.

avehtari · April 28, 2020, 6:45pm

That is not sufficient. If you are using RStan, you can use the monitor function.

See here for an example with p>5000

and the comments about the speed in "What's the highest dimensional model Stan can fit using NUTS?" (#5 by avehtari):
With my 3+ year old laptop it’s 2 hours; with GPUs it would be less than 40 minutes.
