Student_t_cdf vectorization issue

donghyeon_ko · July 12, 2017, 2:23am

Hi, I am trying to write a Stan file for a robit model, which is a probit model that uses student_t_cdf instead of the normal CDF.

My code works fine, but it is very slow, so I tried to vectorize it.

It doesn’t work for some reason.

Can I not use vectorization for student_t_cdf?

Here is my code that works fine.

data {
  int<lower=0> N; // number of obs
  int<lower=0> K; // number of predictors
  int<lower=0,upper=1> u[N]; // outcomes
  matrix[N, K] v; // predictor variables
}
parameters {
  vector[K] beta; // beta coefficients
  real<lower=1> df;
}
model {
  vector[N] mu;
  beta ~ normal(0, 10); // normal priors for betas
  df ~ gamma(2, 0.1);
  mu = v*beta;
  for (n in 1:N) mu[n] = student_t_cdf(mu[n], df, 0, 1);
  u ~ bernoulli(mu);
}

And this is the vectorization code that I tried.

data {
  int<lower=0> N; // number of obs
  int<lower=0> K; // number of predictors
  int<lower=0,upper=1> u[N]; // outcomes
  matrix[N, K] v; // predictor variables
}
parameters {
  vector[K] beta; // beta coefficients
  real<lower=1> df;
}
model {
  vector[N] mu;
  beta ~ normal(0, 10); // normal priors for betas
  df ~ gamma(2, 0.1);
  mu = v*beta;
  mu = student_t_cdf(mu, df, 0, 1);
  u ~ bernoulli(mu);
}

And this is the error message I get.

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

base type mismatch in assignment; variable name = mu, type = vector; right-hand side type=real
error in ‘model276074a12540_robit’ at line 16, column 7

14:   df ~ gamma(2, 0.1);
15:   mu = v*beta; 
16:   mu = student_t_cdf(mu, df, 0, 1);
          ^
17:   u ~ bernoulli(mu);

PARSER EXPECTED:
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model ‘robit’ due to the above error.
In addition: Warning message:
In readLines(file, warn = TRUE) :
incomplete final line found on ‘C:\Users\g1310\Desktop\Stan\robit.stan’

Or is there any other way that I can speed up my model?

bgoodri · July 12, 2017, 2:32am

student_t_cdf returns a scalar (the product of the CDFs when the input is a vector), which is why assigning its result to the vector mu fails. Even if there were a version of the Student t CDF that took a vector and returned a vector, it would entail the same loop in C++ as the one you wrote, so there would be no difference in execution speed; just less typing.
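To make the return type concrete, here is a minimal sketch of the looped model with the linear predictor and the probabilities kept in separate vectors. The names eta and p are introduced here for illustration and are not in the original post. The sketch keeps the 2017-era syntax used in this thread; more recent Stan releases are understood to expect array[N] int declarations and a vertical bar after the first CDF argument, so treat the exact syntax as an assumption tied to the Stan version in use.

```stan
data {
  int<lower=0> N;              // number of obs
  int<lower=0> K;              // number of predictors
  int<lower=0,upper=1> u[N];   // outcomes
  matrix[N, K] v;              // predictor variables
}
parameters {
  vector[K] beta;              // beta coefficients
  real<lower=1> df;
}
model {
  vector[N] eta;               // hypothetical name: linear predictor
  vector[N] p;                 // hypothetical name: per-observation probabilities
  beta ~ normal(0, 10);
  df ~ gamma(2, 0.1);
  eta = v * beta;
  // student_t_cdf(eta, df, 0, 1) would yield ONE real (the product of the
  // element-wise CDFs), so each probability is computed individually.
  for (n in 1:N)
    p[n] = student_t_cdf(eta[n], df, 0, 1);
  u ~ bernoulli(p);
}
```

This is the same computation as the working model in the question; the only change is that mu is no longer overwritten in place, which makes the scalar-versus-vector distinction easier to see.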

Models like this are slow because estimating the degrees of freedom is hard. Unless you really think a Cauchy is possible, you often get better results by bumping the lower limit on df up a bit.
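As a sketch of that suggestion, the parameters block could declare a higher lower bound for df. The value 2 below is purely illustrative and is not taken from this thread; the right bound depends on how heavy-tailed the link plausibly needs to be.

```stan
parameters {
  vector[K] beta;      // beta coefficients
  // Assumed bound for illustration: df = 1 is the Cauchy case,
  // so any lower bound above 1 rules it out.
  real<lower=2> df;
}
```

The rest of the model, including the gamma(2, 0.1) prior on df, can stay as written; the prior is then implicitly restricted to the region above the new bound.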
