Finnish horseshoe error message

Greetings all. I have implemented the Finnish horseshoe prior as shown in Appendix C of Piironen & Vehtari (2017). Here is the code including the data setup in R

PISA18sampleScale <- read.csv(file.choose(),header=T)

PISA18sampleScale <- as.matrix(PISA18sampleScale)
n <- nrow(PISA18sampleScale)
p <- ncol(X)
X <- PISA18sampleScale[,2:11]
readscore <- as.matrix(PISA18sampleScale[,1])
data.list <- list(n=n, p=p, X=X, readscore=readscore)

# Stan code adapated from Vehtari, Appendix C. #
modelString = "
data {
  int <lower=0> n;   // number of observations
  int <lower=0> p;   // number of predictors
  vector [n] readscore; // outcome
  matrix [n,p] X;       //inputs
  real <lower=0> scale_icept ; //prior std for the intercept
  real <lower=0> scale_global ; // scale for the half -t prior for tau
  real <lower=1> nu_global ; // degrees of freedom for the half-t prior for tau
  real <lower=1> nu_local ;  //degrees of freedom for the half-t priors
                               //for lambdas
  real <lower=0> slab_scale ; //  slab scale for the regularized horseshoe
  real <lower=0> slab_df ; //  slab degrees of freedom for the regularized horseshoe
}

parameters {
  real logsigma ;
  real alpha ;
  vector[p] z;
  real <lower=0> tau; // global shrinkage parameter
  vector <lower=0>[p] lambda ; // local shrinkage parameter
  real <lower=0> caux ;
}

transformed parameters {
  real <lower=0> sigma ;  // noise std
  vector <lower=0>[p] lambda_tilde ; // 'truncated ' local shrinkage parameter
  real <lower =0> c; // slab scale
  vector [p] beta ; // regression coefficients
  vector [n] f;     // latent function values
  sigma = exp(logsigma);
  c = slab_scale * sqrt (caux);
  lambda_tilde = sqrt(c^2 * square(lambda) ./ (c^2 + tau^2 * square (lambda)) );
  beta = z .* lambda_tilde*tau;
  f = alpha + X* beta ;
}

model {
// half -t priors for lambdas and tau , and inverse - gamma for c^2
z ~ normal (0, 1);
lambda ~ student_t (nu_local, 0, 1);
tau ~ student_t (nu_global, 0, scale_global * sigma );
caux ~ inv_gamma (0.5* slab_df , 0.5* slab_df );
alpha ~ normal (0, scale_icept);
readscore ~ normal (f, sigma);

}

"

# Start estimation

nChains = 4
nIter= 5000
thinSteps = 10
warmupSteps = floor(nIter/2)

readscore = data.list$readscore

myfitFinnHorseshoe = stan(data=data.list,model_code=modelString,chains=nChains,
                  iter=nIter,warmup=warmupSteps,thin=thinSteps)

I am receiving the following error message which I take to be a mismatch of vectors, matrices, and reals, however I don't see the problem.  Advice always appreciated.

```stan
Real return type required for probability function.
 error in 'modelfd12100ec98a_f72b37525158f4bfd223f8ff542529cf' at line 45, column 31
  -------------------------------------------------
    43: caux ~ inv_gamma (0.5* slab_df , 0.5* slab_df );
    44: alpha ~ normal (0, scale_icept );
    45: readscore ~ normal (f, sigma );
                                      ^
    46: 
  ------

I’m able to compile your model under both rstan and cmdstanr without issue. Can you try running it in a fresh R session and seeing if the problem remains?

1 Like

Hi Andrew. Thanks for your email. I opened a new R session and I’m now getting this error.

Error in mod$fit_ptr() :
Exception: mismatch in number dimensions declared and found in context; processing stage=data initialization; variable name=readscore; dims declared=(100); dims found=(100,1) (in ‘model44431ccf617_a6f8150b953ec2755d1eae81cfbbc115’ at line 5)

failed to create the sampler; sampling not done

That indicates that the data you’re passing from R is formatted as a one-column matrix rather than a vector

Thanks again. That error was solved but here is another one.

Error in mod$fit_ptr() :
Exception: variable does not exist; processing stage=data initialization; variable name=scale_icept; base type=double (in ‘model6e3603ba993_a6f8150b953ec2755d1eae81cfbbc115’ at line 7)

failed to create the sampler; sampling not done

This code is a direct copy and past from Piironen and Vehatari and modified to deal with my specific data. You’ve been very kind with answering my questions, but I don’t want to keep bothering you. If there is some generic issue with the code that is the problem, do let me know.

Here is the Stan code again for your convenience.

PISA18sampleScale <- read.csv(file.choose(),header=T)

PISA18sampleScale <- as.matrix(PISA18sampleScale)
n <- nrow(PISA18sampleScale)
X <- PISA18sampleScale[,2:11]
p <- ncol(X)
readscore <- as.vector(PISA18sampleScale[,1])
data.list <- list(n=n, p=p, X=X, readscore=readscore)

# Stan code adapated from Vehtari, Appendix C. #
modelString = "
data {
  int <lower=0> n;   // number of observations
  int <lower=0> p;   // number of predictors
  vector [n] readscore; // outcome
  matrix [n,p] X;       //inputs
  real <lower=0> scale_icept ; //prior std for the intercept
  real <lower=0> scale_global ; // scale for the half -t prior for tau
  real <lower=1> nu_global ; // degrees of freedom for the half-t prior for tau
  real <lower=1> nu_local ;  //degrees of freedom for the half-t priors
                               //for lambdas
  real <lower=0> slab_scale ; //  slab scale for the regularized horseshoe
  real <lower=0> slab_df ; //  slab degrees of freedom for the regularized horseshoe
}

parameters {
  real logsigma ;
  real alpha ;
  vector[p] z;
  real <lower=0> tau; // global shrinkage parameter
  vector <lower=0>[p] lambda ; // local shrinkage parameter
  real <lower=0> caux ;
}

transformed parameters {
  real <lower=0> sigma ;  // noise std
  vector <lower=0>[p] lambda_tilde ; // 'truncated ' local shrinkage parameter
  real <lower =0> c; // slab scale
  vector [p] beta ; // regression coefficients
  vector [n] f;     // latent function values
  sigma = exp(logsigma);
  c = slab_scale * sqrt (caux);
  lambda_tilde = sqrt(c^2 * square(lambda) ./ (c^2 + tau^2 * square (lambda)) );
  beta = z .* lambda_tilde*tau;
  f = alpha + X* beta ;
}

model {
// half -t priors for lambdas and tau , and inverse - gamma for c^2
z ~ normal (0, 1);
lambda ~ student_t (nu_local, 0, 1);
tau ~ student_t (nu_global, 0, scale_global * sigma );
caux ~ inv_gamma (0.5* slab_df , 0.5* slab_df );
alpha ~ normal (0, scale_icept);
readscore ~ normal (f, sigma);

}

"

# Start estimation

nChains = 4
nIter= 5000
thinSteps = 10
warmupSteps = floor(nIter/2)

readscore = data.list$readscore

myfitFinnHorseshoe = stan(data=data.list,model_code=modelString,chains=nChains,
                  iter=nIter,warmup=warmupSteps,thin=thinSteps)

Thanks,

You’re not passing (at least) the required scale_icept tuning parameter

Thanks, Jeffrey, but I don’t know what that means. As I mentioned above, this code is a cut-and-paste from Pirronen and Vehtari, only there may be other features of their input that I didn’t see in the paper.

It seems that the implementation you are using has certain tuning parameters which need to be passed as data into the Stan function. I can’t comment on the exact meaning of the specific tuning parameters unfortunately.

Best,
Jeffrey

@jeffreypullin is right. All the variables declared in the data block need to get data in the data.list. Currently you have only this

Just add to this list also scale_icept, scale_global, nu_global, nu_local, slabe_scale and slab_df with the values you want them to have, or move them to transformed data and give the values there.

2 Likes