What's wrong with my Stan code for estimating the parameters of a multivariate normal distribution?

Hello, everyone. I have data (X, Y) \sim N(\mu, \Sigma), where the true values are \mu = (10, 20) and \Sigma = \begin{pmatrix} 10 & 4 \\ 4 & 30 \end{pmatrix}.

I know how to estimate these parameters with Stan's built-in distributions, and now I want to practice with user-defined functions. Here is my Stan code:

code='
  functions{
    real fun_log(vector X,vector Y,real mu1,real mu2,real sigma1, real sigma2, real rho){
    vector [num_elements(X)] prob;
    real lprob;
    for (i in 1:num_elements(X)){
     ##Likelihood function
      prob[i] <- (2*pi()*sigma1*sigma2*sqrt(1-rho^2))^(-1)*exp(-1/2/(1-rho^2)*((X[i]-mu1)^2/sigma1^-2*rho*(X[i]-mu1)*(Y[i]-mu2)/sigma1/sigma2+(Y[i]-mu2)^2/sigma2^2));
    }
    lprob <- sum(log(prob));
    return lprob;
    }
  }
data{
  int N;
  vector[N] X;
  vector[N] Y;
}
parameters{
  real<lower=0> rho;
  real<lower=0> sigma1;
  real<lower=0> sigma2;
  real mu1;
  real mu2;
}
model{
append_row(X,Y)~fun(rho,sigma1,sigma2,mu1,mu2);
}
'

But it says that

Unknown variable: append_row
No matches for: 
  vector ~ fun(real, real, real, real, real)

Can someone help me debug the model? Thanks!

Judging from the error message, maybe you need to declare a local variable for the return value of append_row, and have this as the target of your sampling statement?

Also, your function returns a single real, so maybe it won’t accept a vector as the target?

I'm pretty sure that the problem is that fun_log is defined incorrectly. In a log-probability function, the first argument is the variate on the left-hand side of the "~" operator, and all remaining arguments are passed on the right-hand side. Here, Y is the second argument, so it is treated as a parameter and belongs on the right-hand side of "~", as part of the arguments to fun. In short, given how you defined fun_log, the corresponding sampling statement has to be

X ~ fun(Y, mu1, mu2, sigma1, sigma2, rho);

What you probably really want to do is define fun_log like this:

real fun_log(matrix XY, real mu1, real mu2, real sigma1, real sigma2, real rho){
    vector[cols(XY)] prob;
    real lprob;
    for (i in 1:cols(XY)){
      ## Bivariate normal density of column i (row 1 is X, row 2 is Y).
      ## 0.5 is used instead of 1/2, which is integer division and evaluates to 0.
      ## The middle term is -2*rho*(x-mu1)*(y-mu2)/(sigma1*sigma2).
      prob[i] <- (2*pi()*sigma1*sigma2*sqrt(1-rho^2))^(-1)*exp(-0.5/(1-rho^2)*((XY[1,i]-mu1)^2/sigma1^2-2*rho*(XY[1,i]-mu1)*(XY[2,i]-mu2)/sigma1/sigma2+(XY[2,i]-mu2)^2/sigma2^2));
    }
    lprob <- sum(log(prob));
    return lprob;
    }
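
For reference, this is the standard bivariate normal density that the loop body is meant to evaluate, together with its logarithm, written with the same symbols as the function arguments. It assumes \Sigma is parameterized through two standard deviations and a correlation, which is how the function signature reads:

```latex
\begin{align*}
\Sigma &= \begin{pmatrix}
  \sigma_1^2 & \rho\,\sigma_1\sigma_2 \\
  \rho\,\sigma_1\sigma_2 & \sigma_2^2
\end{pmatrix} \\[4pt]
f(x, y) &= \frac{1}{2\pi\,\sigma_1\sigma_2\sqrt{1-\rho^2}}
  \exp\!\left(-\frac{1}{2(1-\rho^2)}\left[
    \frac{(x-\mu_1)^2}{\sigma_1^2}
    - \frac{2\rho\,(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2}
    + \frac{(y-\mu_2)^2}{\sigma_2^2}
  \right]\right) \\[4pt]
\log f(x, y) &= -\log(2\pi) - \log\sigma_1 - \log\sigma_2
  - \tfrac{1}{2}\log(1-\rho^2)
  - \frac{1}{2(1-\rho^2)}\left[
    \frac{(x-\mu_1)^2}{\sigma_1^2}
    - \frac{2\rho\,(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2}
    + \frac{(y-\mu_2)^2}{\sigma_2^2}
  \right]
\end{align*}
```

With the true \Sigma from the question, this corresponds to \sigma_1 = \sqrt{10} \approx 3.16, \sigma_2 = \sqrt{30} \approx 5.48 and \rho = 4/\sqrt{300} \approx 0.23, so those are the values the fit should recover.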

Then define the data block as follows:

data{
  int N;
  matrix[2, N] XY;
}

The sampling statement in your model would then be

XY ~ fun(mu1, mu2, sigma1, sigma2, rho);
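
Putting the pieces of this answer together, here is a minimal sketch of the whole program in more recent Stan syntax. Several things in it are my assumptions rather than part of the original post: the function is named fun_lpdf (newer Stan releases use the _lpdf suffix in place of _log), assignment uses = in place of <-, the log density is accumulated directly instead of taking log of exp, and rho is bounded to (-1, 1) rather than only below by 0. No priors are given, so the parameters get Stan's default flat priors.

```stan
functions {
  // Log density of independent bivariate normal draws, one per column of XY.
  // Row 1 holds X, row 2 holds Y. The name fun_lpdf is illustrative.
  real fun_lpdf(matrix XY, real mu1, real mu2,
                real sigma1, real sigma2, real rho) {
    int n = cols(XY);
    real omr2 = 1 - square(rho);  // 1 - rho^2
    // Normalizing constant, summed over all n observations.
    real lp = -n * (log(2 * pi()) + log(sigma1) + log(sigma2)
                    + 0.5 * log(omr2));
    for (i in 1:n) {
      real zx = (XY[1, i] - mu1) / sigma1;  // standardized X
      real zy = (XY[2, i] - mu2) / sigma2;  // standardized Y
      lp -= (square(zx) - 2 * rho * zx * zy + square(zy)) / (2 * omr2);
    }
    return lp;
  }
}
data {
  int<lower=1> N;
  matrix[2, N] XY;
}
parameters {
  real mu1;
  real mu2;
  real<lower=0> sigma1;
  real<lower=0> sigma2;
  real<lower=-1, upper=1> rho;  // assumption: correlation may be negative
}
model {
  // Argument order must match the function signature after the first argument.
  XY ~ fun(mu1, mu2, sigma1, sigma2, rho);
}
```

From R, the data for this sketch could be passed as list(N = N, XY = rbind(X, Y)), which stacks the two vectors into a 2-by-N matrix.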