Multivariate nonlinear mixed model using rstan

Good moring everyone,

Please I need help. I am working with a multivariate dataset that has random effects and fixed effects. I am using rstan for my analysis and from the result I got, all my fixed effect values are close to nlme result but the random effect values are far away from my true value using the original data set. I generated another dataset with 0.9 correlation value for the random effects but my rstan result gave me values less than 0.1 as my correlation value. \beta_{0i} and logA are my random effect. Here is my model:
log y_{ijk}=\beta_{0i}+\beta_{1j}t^{\gamma_i}_{ijk}+ E_{ijk},
Where E_{ijk}~N(0,\sigma^2),
\beta_{0i}= \mu_0+e_{0i,},
logA_{i}= logA_{0}+e_{1i,}
\begin{bmatrix} e_{0i}\\ e_{1i}\\ \end{bmatrix}~N\bigg(\begin{bmatrix} 0\\ 0\\ \end{bmatrix}, \begin{bmatrix} \sigma^2_0& \sigma_{01}\\ \sigma{01}&\sigma^2_1\\ \end{bmatrix}\bigg)
This is my rstan code:

data {
  // Define variables in data
  // Number of level-1 observations (an integer)
  int<lower=1> N; //number of units
  // Number of level-2 clusters
  // RH from the data
  real RH[N];
  // Temp from the data
  real T[N];
  // Continuous outcome
  real y_ijk[N];//number of the outputs
  // Continuous predictor
  real t_ijk[N];

//Prepocessing of the data
transformed data{
  real x1[N];
  real x2[N];
  for(n in 1:N){

      x1[n] = log(RH[n]);
      x2[n] = 11605/(T[n]+273.15);


// The parameters accepted by the model.
parameters {
      real mu1;
  real<lower=3,upper=8> mu2;
  // Cholesky factor for the correlation matrix
 corr_matrix[2] L_Omega ;  //corr_matrix[3] L_Omega;
  vector<lower=0,upper=1>[2] L_std;
  real<lower=0,upper=1> sigma;
  vector[2] mat;
real gamma0;
  real delta_H;
  real B;



//Parameters processing before the posterior is computed
transformed parameters{

  matrix[2,2] L_Sigma;
    // Cholesky factor for covariance matrix
  L_Sigma =  quad_form_diag(L_Omega,L_std);


model {
  // Random effect

  real beta0;
  real mu[N];
   real logA;
  // Prior on Cholesk5 decomposition of correlation matrix
  L_Omega ~ lkj_corr(2);
  // Prior on standard deviations for each variate
  //L_std[2] ~cauchy(0,2.5);
  mat ~ multi_normal(rep_vector(0,2), L_Sigma);

  sigma ~ gamma(1e-2,1e-2);
  mu1 ~ normal(0, sqrt(100));
  mu2 ~ normal(0, sqrt(100));
  delta_H ~ normal(0, sqrt(100));
  B ~ normal(0, sqrt(100));
  gamma0~ normal(0, sqrt(100));

  for (n in 1:N){
      beta0 = mu1+mat[1];
     logA = mu2+mat[2]; 
      mu[n] = beta0 + (exp(logA + B*x1[n] + delta_H*x2[n]))* (t_ijk[n])^gamma0;

     y_ijk[n] ~ normal(mu[n], sigma);


When I used L_std~cauchy(0,2.5), my variance result was high around 25. That was why I changed it to Normal(0,0.6).
Here is my result from rstan:

                      mean      se_mean           sd          2.5%
delta_H      -0.5910981015 4.776087e-04 4.046028e-02 -6.747217e-01
B             2.4720683727 2.709032e-03 2.223244e-01  2.048168e+00
sigma         0.5314386063 2.864586e-04 2.507286e-02  4.849404e-01
mu1           2.9116533135 5.798710e-03 4.843099e-01  1.846969e+00
mu2           3.7214743667 7.113636e-03 6.419479e-01  3.024509e+00
L_Sigma[1,1]  0.2287997007 2.973888e-03 2.503332e-01  3.800002e-04
L_Sigma[1,2]  0.0002910327 5.256015e-04 4.497222e-02 -1.020105e-01
L_Sigma[2,1]  0.0002910327 5.256015e-04 4.497222e-02 -1.020105e-01
L_Sigma[2,2]  0.0424841585 7.025839e-04 6.029694e-02  5.602237e-05
L_Omega[1,1]  1.0000000000          NaN 0.000000e+00  1.000000e+00
L_Omega[1,2]  0.0047984285 5.287025e-03 4.498075e-01 -8.068562e-01
L_Omega[2,1]  0.0047984285 5.287025e-03 4.498075e-01 -8.068562e-01
L_Omega[2,2]  1.0000000000 1.002677e-18 8.317936e-17  1.000000e+00

From the result, I want my L_Sigma[2,1] to be within -0.03 and -0.04.

Thank you so much

1 Like

I think the issue here is that the model as you’ve constructed it expresses a latent multivariate normal for the variable mat and furthermore has only one “observation” to inform on the parameters of that multivariate normal, so it just doesn’t have enough contact with new data to cause it to move away from the zero-centered prior. I’m not sure why the posterior on the correlation nonetheless seems to narrow considerably compared to the prior.

Are you sure that mat shouldn’t be an N \times 2 matrix maybe?

(Also, your comments refer to Cholesky factors, but you’ve actually structured the model in the regular non-Cholesky form)

(oh, and your formulas express the model in terms of log(y_ijk), but the model has just y_ijk; did you log transform the data before passing it into the model?)

1 Like

Thank you for your reply. The “mat” is a vector of 2. Yes, I took the log of y_ijk before passing it into the model. Do you suggest I use the Cholesky factors?

But shouldn’t mat be a matrix with 2 columns and i rows?

Changing to the cholesky representation will make it sample faster, but you first need to get it working properly, so don’t bother changing yet.