I am struggling with defining Dirichlet priors in my model. I am getting an error in parameter section in the code. The error is:
Ill-formed expression. Found identifier. There are many ways to complete this to a well-formed expression.
The following is my stan code.
// Create user define function
functions{
real model_log(matrix dat, matrix theta, matrix alpha){
vector[rows(dat)] prob;
real t;
real x;
real out;
real a;
real f;
real w;
real r;
real m;
for (i in 1:rows(dat)){
t <- dat[i, 1];
x <- dat[i, 2];
a <- theta[i, 1];
f <- theta[i, 2];
w <- alpha[i, 1];
r <- alpha[i, 2];
m <- alpha[i, 1];
if(t == 1){
prob[i] <- a^x * f^(1 - x);
}else if(fmod(t, 2) == 0){
prob[i] <- ((1- a - f) * (r)^(t/2.0 - 1.0) * (r)^(t/2.0 - 1.0) * (m)^x * (w)^(1-x));
}else {
prob[i] <- ((1- a - f) * (r)^(t/2.0 - 0.5) * (r)^(t/2.0 - 1.5) * (m)^(1 - x) * (w)^x);
}
}
out <- sum(log(prob));
return out;
}
}
// The input data is a vector 'y' of length 'N'.
data {
int N; // number of observations (rally lengths)
matrix [N, 2] dat; // touches and indicator
int <lower = 2> m;
}
parameters {
simplex [m] theta_1 [N];
simplex [m] theta_2 [N];
}
// The model to be estimated.
model {
// Define priors
//
for (n in 1:N){
theta_1[n] ~ dirichlet(concentration = 1);
}
for (n in 1:N){
theta_2[n] ~ dirichlet(concentration = 1);
}
dat ~ model(theta_1, theta_2);
}
I am new to stan and can I get a help on this to understand how stan works on simplex type data. I think my error comes in defining the parameter types.
Thanks for the response. I tried that. But still getting the same error. Also, the line number of the error comes from parameter block. That is why I thought the way I defined the simplex is wrong. I am not sure.
Stan does not have named function arguments, you canât write dirichlet(concentration = 1) (and in the documentation the argument to dirichlet is called alpha, not concentration, not that it mattersâŚ)
The correct thing to do is
for (n in 1:N){
theta_1[n] ~ dirichlet(rep_vector(1, m));
}
for (n in 1:N){
theta_2[n] ~ dirichlet(rep_vector(1, m));
}
No, tilde statements expect the function name without a suffix.
Omit the suffix unless youâre writing target += ....
Also, the _log suffix is deprecated, you should probably use _lpdf instead.
functions {
real model_lpdf(matrix dat, matrix theta, matrix alpha) {
...
}
model {
...
dat ~ model(theta_1, theta_2);
}
Oh duh. Donât mind me yâall! Pay attention to @nhuurre instead! I forgot that _log can replace _lpdf and was careless in my assessment of the problem!
Stan is very strict about argument types. You declare the function with matrix arguments but theta_1 and theta_2 are arrays of vectors. Arrays and matrixes are considered different types even though in practice they are very similar.
The model compiles for me after changing the argument types from matrix to vector[] (that [] means array).
functions{
real tenismodel_lpdf(matrix dat, vector[] theta, vector[] alpha){
...
Thanks for this explanation. This is something that I wanted to know. I think now my code is working.
How can we get to know what is the type of the variable?. As an example here how do we know theta_1 and theta_2 are arrays?. I searched in stan guide. I could not find the variable types that output from Dirichlet distribution. I thought is is a matrix.
Yes it is. And more specifically it is a logarithmic probability density function. Tilde ~ statements compute probabilities. As I said the distinction between _log and _lpdf doesnât really matter even though the latter is recommended.