Greetings all. I have implemented the Finnish horseshoe prior as shown in Appendix C of Piironen & Vehtari (2017). Here is the code including the data setup in R
```r
library(rstan)

PISA18sampleScale <- read.csv(file.choose(), header = TRUE)
PISA18sampleScale <- as.matrix(PISA18sampleScale)
n <- nrow(PISA18sampleScale)
X <- PISA18sampleScale[, 2:11]
p <- ncol(X)
readscore <- as.matrix(PISA18sampleScale[, 1])
data.list <- list(n = n, p = p, X = X, readscore = readscore)
# Stan code adapted from Piironen & Vehtari, Appendix C. #
modelString = "
data {
  int<lower=0> n;               // number of observations
  int<lower=0> p;               // number of predictors
  vector[n] readscore;          // outcome
  matrix[n, p] X;               // inputs
  real<lower=0> scale_icept;    // prior std for the intercept
  real<lower=0> scale_global;   // scale for the half-t prior for tau
  real<lower=1> nu_global;      // degrees of freedom for the half-t prior for tau
  real<lower=1> nu_local;       // degrees of freedom for the half-t priors for lambdas
  real<lower=0> slab_scale;     // slab scale for the regularized horseshoe
  real<lower=0> slab_df;        // slab degrees of freedom for the regularized horseshoe
}
parameters {
  real logsigma;
  real alpha;
  vector[p] z;
  real<lower=0> tau;            // global shrinkage parameter
  vector<lower=0>[p] lambda;    // local shrinkage parameters
  real<lower=0> caux;
}
transformed parameters {
  real<lower=0> sigma;                // noise std
  vector<lower=0>[p] lambda_tilde;    // 'truncated' local shrinkage parameters
  real<lower=0> c;                    // slab scale
  vector[p] beta;                     // regression coefficients
  vector[n] f;                        // latent function values
  sigma = exp(logsigma);
  c = slab_scale * sqrt(caux);
  lambda_tilde = sqrt(c^2 * square(lambda) ./ (c^2 + tau^2 * square(lambda)));
  beta = z .* lambda_tilde * tau;
  f = alpha + X * beta;
}
model {
  // half-t priors for lambdas and tau, and inverse-gamma for c^2
  z ~ normal(0, 1);
  lambda ~ student_t(nu_local, 0, 1);
  tau ~ student_t(nu_global, 0, scale_global * sigma);
  caux ~ inv_gamma(0.5 * slab_df, 0.5 * slab_df);
  alpha ~ normal(0, scale_icept);
  readscore ~ normal(f, sigma);
}
"
# Start estimation
nChains = 4
nIter = 5000
thinSteps = 10
warmupSteps = floor(nIter/2)
readscore = data.list$readscore
myfitFinnHorseshoe = stan(data = data.list, model_code = modelString, chains = nChains,
                          iter = nIter, warmup = warmupSteps, thin = thinSteps)
```
I am receiving the following error message, which I take to be a mismatch of vectors, matrices, and reals; however, I don't see the problem. Advice always appreciated.
```stan
Real return type required for probability function.
error in 'modelfd12100ec98a_f72b37525158f4bfd223f8ff542529cf' at line 45, column 31
-------------------------------------------------
43: caux ~ inv_gamma (0.5* slab_df , 0.5* slab_df );
44: alpha ~ normal (0, scale_icept );
45: readscore ~ normal (f, sigma );
^
46:
------
```
I’m able to compile your model under both rstan and cmdstanr without issue. Can you try running it in a fresh R session and seeing if the problem remains?
Hi Andrew. Thanks for your email. I opened a new R session and I’m now getting this error:
```
Error in mod$fit_ptr() :
  Exception: mismatch in number dimensions declared and found in context; processing stage=data initialization; variable name=readscore; dims declared=(100); dims found=(100,1) (in ‘model44431ccf617_a6f8150b953ec2755d1eae81cfbbc115’ at line 5)
failed to create the sampler; sampling not done
```
That indicates that the data you’re passing from R is formatted as a one-column matrix rather than a vector.
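For anyone hitting the same error, a minimal sketch of the fix, assuming the data setup posted above: keep readscore as a plain numeric vector rather than wrapping it in as.matrix(), which is what produces the 100 x 1 matrix Stan complains about.

```r
# Sketch of the fix (assumes n, p, X, and PISA18sampleScale from the setup above):
# pass readscore as a numeric vector, not a one-column matrix.
readscore <- as.vector(PISA18sampleScale[, 1])
data.list <- list(n = n, p = p, X = X, readscore = readscore)
str(data.list$readscore)  # should show "num [1:100] ...", i.e. no dim attribute
```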
Thanks again. That error was solved, but here is another one:
```
Error in mod$fit_ptr() :
  Exception: variable does not exist; processing stage=data initialization; variable name=scale_icept; base type=double (in ‘model6e3603ba993_a6f8150b953ec2755d1eae81cfbbc115’ at line 7)
failed to create the sampler; sampling not done
```
This code is a direct copy and paste from Piironen and Vehtari, modified to deal with my specific data. You’ve been very kind in answering my questions, but I don’t want to keep bothering you. If there is some generic issue with the code that is the problem, do let me know.
Here is the R and Stan code again for your convenience.
```r
library(rstan)

PISA18sampleScale <- read.csv(file.choose(), header = TRUE)
PISA18sampleScale <- as.matrix(PISA18sampleScale)
n <- nrow(PISA18sampleScale)
X <- PISA18sampleScale[, 2:11]
p <- ncol(X)
readscore <- as.vector(PISA18sampleScale[, 1])
data.list <- list(n = n, p = p, X = X, readscore = readscore)
# Stan code adapted from Piironen & Vehtari, Appendix C. #
modelString = "
data {
  int<lower=0> n;               // number of observations
  int<lower=0> p;               // number of predictors
  vector[n] readscore;          // outcome
  matrix[n, p] X;               // inputs
  real<lower=0> scale_icept;    // prior std for the intercept
  real<lower=0> scale_global;   // scale for the half-t prior for tau
  real<lower=1> nu_global;      // degrees of freedom for the half-t prior for tau
  real<lower=1> nu_local;       // degrees of freedom for the half-t priors for lambdas
  real<lower=0> slab_scale;     // slab scale for the regularized horseshoe
  real<lower=0> slab_df;        // slab degrees of freedom for the regularized horseshoe
}
parameters {
  real logsigma;
  real alpha;
  vector[p] z;
  real<lower=0> tau;            // global shrinkage parameter
  vector<lower=0>[p] lambda;    // local shrinkage parameters
  real<lower=0> caux;
}
transformed parameters {
  real<lower=0> sigma;                // noise std
  vector<lower=0>[p] lambda_tilde;    // 'truncated' local shrinkage parameters
  real<lower=0> c;                    // slab scale
  vector[p] beta;                     // regression coefficients
  vector[n] f;                        // latent function values
  sigma = exp(logsigma);
  c = slab_scale * sqrt(caux);
  lambda_tilde = sqrt(c^2 * square(lambda) ./ (c^2 + tau^2 * square(lambda)));
  beta = z .* lambda_tilde * tau;
  f = alpha + X * beta;
}
model {
  // half-t priors for lambdas and tau, and inverse-gamma for c^2
  z ~ normal(0, 1);
  lambda ~ student_t(nu_local, 0, 1);
  tau ~ student_t(nu_global, 0, scale_global * sigma);
  caux ~ inv_gamma(0.5 * slab_df, 0.5 * slab_df);
  alpha ~ normal(0, scale_icept);
  readscore ~ normal(f, sigma);
}
"
# Start estimation
nChains = 4
nIter = 5000
thinSteps = 10
warmupSteps = floor(nIter/2)
readscore = data.list$readscore
myfitFinnHorseshoe = stan(data = data.list, model_code = modelString, chains = nChains,
                          iter = nIter, warmup = warmupSteps, thin = thinSteps)
```
Thanks,
You’re not passing (at least) the required scale_icept tuning parameter.
Thanks, Jeffrey, but I don’t know what that means. As I mentioned above, this code is a cut-and-paste from Piironen and Vehtari, though there may be other features of their input that I didn’t see in the paper.
It seems that the implementation you are using has certain tuning parameters which need to be passed as data into the Stan function. I can’t comment on the exact meaning of the specific tuning parameters unfortunately.
Best,
Jeffrey
@jeffreypullin is right. All the variables declared in the data block need to get data in data.list. Currently you have only this:
```r
data.list <- list(n=n, p=p, X=X, readscore=readscore)
```
Just add scale_icept, scale_global, nu_global, nu_local, slab_scale, and slab_df to this list with the values you want them to have, or move them to transformed data and give the values there.
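In case a concrete example helps, here is a sketch of what a complete data.list could look like for this model. The p0 guess, the intercept scale, and the slab settings below are illustrative assumptions you would need to choose for your own data; scale_global follows the tau_0 recommendation in Piironen & Vehtari (2017).

```r
# Sketch only: hyperparameter values are placeholders, not recommendations.
p0 <- 3  # assumed prior guess for the number of relevant predictors (out of p = 10)

data.list <- list(
  n = n, p = p, X = X, readscore = readscore,
  scale_icept  = 10,                          # prior sd for the intercept (assumption)
  scale_global = p0 / ((p - p0) * sqrt(n)),   # so scale_global * sigma matches the paper's tau_0
  nu_global    = 1,                           # half-Cauchy prior on tau
  nu_local     = 1,                           # half-Cauchy priors on the lambdas
  slab_scale   = 2,                           # scale of the regularizing slab (assumption)
  slab_df      = 4                            # degrees of freedom of the slab (assumption)
)
```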