Please I need help. I am working with a multivariate dataset that has random effects and fixed effects. I am using rstan for my analysis and from the result I got, all my fixed effect values are close to nlme result but the random effect values are far away from my true value using the original data set. I generated another dataset with 0.9 correlation value for the random effects but my rstan result gave me values less than 0.1 as my correlation value. \beta_{0i} and logA are my random effect. Here is my model:
log y_{ijk}=\beta_{0i}+\beta_{1j}t^{\gamma_i}_{ijk}+ E_{ijk},
Where E_{ijk}~N(0,\sigma^2),
\beta_{0i}= \mu_0+e_{0i,},
logA_{i}= logA_{0}+e_{1i,}
\beta_{1j}=exp(logA+B*log(x_j))+H*\frac{11605}{T_j+273.15},
\begin{bmatrix} e_{0i}\\ e_{1i}\\ \end{bmatrix}~N\bigg(\begin{bmatrix} 0\\ 0\\ \end{bmatrix}, \begin{bmatrix} \sigma^2_0& \sigma_{01}\\ \sigma{01}&\sigma^2_1\\ \end{bmatrix}\bigg)
.
This is my rstan code:
data {
// Define variables in data
// Number of level-1 observations (an integer)
int<lower=1> N; //number of units
// Number of level-2 clusters
// RH from the data
real RH[N];
// Temp from the data
real T[N];
// Continuous outcome
real y_ijk[N];//number of the outputs
// Continuous predictor
real t_ijk[N];
}
//Prepocessing of the data
transformed data{
real x1[N];
real x2[N];
for(n in 1:N){
x1[n] = log(RH[n]);
x2[n] = 11605/(T[n]+273.15);
}
}
// The parameters accepted by the model.
parameters {
real mu1;
real<lower=3,upper=8> mu2;
// Cholesky factor for the correlation matrix
corr_matrix[2] L_Omega ; //corr_matrix[3] L_Omega;
vector<lower=0,upper=1>[2] L_std;
real<lower=0,upper=1> sigma;
vector[2] mat;
real gamma0;
real delta_H;
real B;
}
//Parameters processing before the posterior is computed
transformed parameters{
matrix[2,2] L_Sigma;
// Cholesky factor for covariance matrix
L_Sigma = quad_form_diag(L_Omega,L_std);
}
model {
// Random effect
real beta0;
real mu[N];
real logA;
// Prior on Cholesk5 decomposition of correlation matrix
L_Omega ~ lkj_corr(2);
// Prior on standard deviations for each variate
L_std[1]~normal(0,0.6);
L_std[2]~normal(0,0.2);
//L_std[2] ~cauchy(0,2.5);
mat ~ multi_normal(rep_vector(0,2), L_Sigma);
sigma ~ gamma(1e-2,1e-2);
mu1 ~ normal(0, sqrt(100));
mu2 ~ normal(0, sqrt(100));
delta_H ~ normal(0, sqrt(100));
B ~ normal(0, sqrt(100));
gamma0~ normal(0, sqrt(100));
for (n in 1:N){
beta0 = mu1+mat[1];
logA = mu2+mat[2];
mu[n] = beta0 + (exp(logA + B*x1[n] + delta_H*x2[n]))* (t_ijk[n])^gamma0;
y_ijk[n] ~ normal(mu[n], sigma);
}
}
}
When I used L_std~cauchy(0,2.5), my variance result was high around 25. That was why I changed it to Normal(0,0.6).
Here is my result from rstan:
mean se_mean sd 2.5%
delta_H -0.5910981015 4.776087e-04 4.046028e-02 -6.747217e-01
B 2.4720683727 2.709032e-03 2.223244e-01 2.048168e+00
sigma 0.5314386063 2.864586e-04 2.507286e-02 4.849404e-01
mu1 2.9116533135 5.798710e-03 4.843099e-01 1.846969e+00
mu2 3.7214743667 7.113636e-03 6.419479e-01 3.024509e+00
L_Sigma[1,1] 0.2287997007 2.973888e-03 2.503332e-01 3.800002e-04
L_Sigma[1,2] 0.0002910327 5.256015e-04 4.497222e-02 -1.020105e-01
L_Sigma[2,1] 0.0002910327 5.256015e-04 4.497222e-02 -1.020105e-01
L_Sigma[2,2] 0.0424841585 7.025839e-04 6.029694e-02 5.602237e-05
L_Omega[1,1] 1.0000000000 NaN 0.000000e+00 1.000000e+00
L_Omega[1,2] 0.0047984285 5.287025e-03 4.498075e-01 -8.068562e-01
L_Omega[2,1] 0.0047984285 5.287025e-03 4.498075e-01 -8.068562e-01
L_Omega[2,2] 1.0000000000 1.002677e-18 8.317936e-17 1.000000e+00
From the result, I want my L_Sigma[2,1] to be within -0.03 and -0.04.
Thank you so much