RStan error: Mismatched array dimensions. Base type mismatch. Returned expression does not match return type. LHS type = real; RHS type = real[ , ]. Improper return in body of function

Hi,

This is my first time using Stan in R for HMC. While writing my own model functions to fit my data, I got the error “Mismatched array dimensions. Base type mismatch. Returned expression does not match return type LHS type = real; RHS type = real[ , ] Improper return in body of function.” It was reported at the first line of my second function, MUPPpr. My Stan code for the model is shown below:

functions{
  real SSpr(vector a, vector b,vector t, int n_student, int n_item, int n_dimension){

    real num0[n_student, n_item];
    real num1[n_student, n_item];
    real denominator[n_student, n_item];
    real sspr[n_student, n_item];
    real thdim[n_student, n_item];
    real th[n_student, n_dimension];
    int d[n_item];

    return sspr;
  }

  

Can anyone help me with this error? Thanks!

sspr is a 2-D array of reals, but your function's return type is a scalar real.
The return type should be real[ , ].
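
To make the fix concrete, here is a minimal sketch of a function whose declared return type matches the 2-D array it returns. The name SSpr_sketch and the placeholder value 0.5 are made up for illustration and are not your model. It uses the array syntax from the Stan version discussed in this thread; newer Stan releases write the same types as array[,] real.

```stan
functions {
  // The return type real[ , ] matches the 2-D array returned by the body.
  // (In newer Stan releases this is written: array[,] real)
  real[ , ] SSpr_sketch(int n_student, int n_item) {
    real sspr[n_student, n_item];
    for (i in 1:n_student) {
      for (j in 1:n_item) {
        sspr[i, j] = 0.5;  // placeholder value, not the real model
      }
    }
    return sspr;
  }
}
```

Declaring the function as plain real while returning sspr is what triggers the "LHS type = real; RHS type = real[ , ]" message.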

I formatted your program for readability. I tried to get just the first function to compile, but there are mistakes at lines 20 and 21 in how you use the exp function.

Are you trying to adapt something written for BUGS or JAGS to Stan? There's a section in the Stan User's Guide which addresses the differences: https://mc-stan.org/docs/2_23/stan-users-guide/some-differences-in-the-modeling-languages.html
there’s a lot of other good stuff in the manual too.

Hi,

Thank you for your help. Two questions: (1) should I add a [,] to the return line after ‘return’, or to some other line? (2) does Stan have a built-in exp function or not?
I wrote the code specifically for Stan, and I thought this was the correct Stan format because I followed the blocks section in the user guide.
Thank you!

  1. The location in the error message, line 1, indicates that the problem is in the function signature. As written:
real SSpr(vector a, vector
^^^^

This says that function SSpr returns a scalar of type real, but the body returns a 2-D array of reals. Therefore the return type should be real[ , ].

  2. Yes, Stan has a built-in exp function.

I’m sorry to keep saying “read the manual”, but in this case, you should look at the User’s guide sections on programming for explanations and examples.

I’m far from an expert on what you're trying to do. A Google search on MUPP confirms that this is a kind of IRT model, in which case here are some resources that might help: https://github.com/paul-buerkner/thurstonianIRT, https://education-stan.github.io

Thank you for your help!

Hi, I was reading the user guide and I fixed the previous error, but I have run into a new one that I believe you can answer: “probability function must end in _lpdf or _lpmf. Found distribution family = MUPPpr() with no corresponding probability function…”. I did define a MUPPpr function and fit my data res[i,j] to it, but I noticed that all the examples in the user guide fit their data to distribution functions (e.g., Bernoulli). I don’t think there is a built-in function for my data, so I defined my own, which simulates the response data rather than acting as a distribution function. Is there a way to fit my data to my own non-distribution function, or does the function used to fit the data have to be a distribution function? Below is the code I used to fit the model; MUPPpr is the function I defined:

 for (i in 1:n_student) {
     for(j in 1:n_pair){ 
       res[i,j] ~ MUPPpr(a[j], b[j], t[j], theta[i,j], n_student, n_item, n_dimension, n_pair);
     }
   }

Thanks!

agreed, this feature of the language is confusing - here’s the relevant piece of the puzzle from the Stan Reference Manual section on the sampling statement: https://mc-stan.org/docs/2_23/reference-manual/sampling-statements-section.html

In general, a sampling statement of the form

y ~ dist(theta1, ..., thetaN);

involving subexpressions y and theta1 through thetaN (including the case where N is zero) will be well formed if and only if the corresponding assignment statement is well-formed. For densities allowing real y values, the log probability density function is used,

target += dist_lpdf(y | theta1, ..., thetaN);

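This means that you need to define a function MUPPpr_lpdf (or MUPPpr_lpmf if res is integer-valued) which takes the data variable as its first argument and returns the log probability of that value. You then call it in a sampling statement as MUPPpr, without the suffix.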
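
As a rough sketch of that pattern, assuming res is a binary response: the body below is only a placeholder, since the actual MUPP response probability is not shown in this thread. The argument p and the single observation y are made up for illustration; in your model the arguments would be your item and person parameters.

```stan
functions {
  // Placeholder body: p stands in for whatever response probability
  // your MUPP model implies. The function must return a log probability.
  real MUPPpr_lpmf(int y, real p) {
    return bernoulli_lpmf(y | p);
  }
}
data {
  int<lower=0, upper=1> y;  // a single binary response (assumed)
}
parameters {
  real<lower=0, upper=1> p;  // hypothetical parameter
}
model {
  // The suffix is dropped in the sampling statement. Up to constant terms
  // this is equivalent to: target += MUPPpr_lpmf(y | p);
  y ~ MUPPpr(p);
}
```

The key point is that the function does not simulate responses. It takes an observed response as its first argument and returns the log of its probability under the model.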

Thank you for the answer! I guess I need to read more to use Stan…