I am writing a stan model and comparing the performance with stan_glmer. I used the same prior for both models. However, stan_glmer tends to have a smaller standard deviation. I am wondering if anyone could help to explain why this is the case? Thank you so much!

My stan model is as follows:

data {
int<lower=1> N; //number of data points
real y[N];
real x[N];
int<lower=1> J; // number of group
int<lower=1, upper=J> id[N]; // vector of group indices
}
parameters {
real<lower=0> sigma; // sigma of y
vector<lower=0>[2] sigma_u; // sigma of intercept and slope for random effects
cholesky_factor_corr[2] L_u; // Cholesky decomposition of the group correlation matrix
matrix[2,J] z_u; // intercept and slope for random effects
vector[2] beta; // fixed effects intercept and slope
}
transformed parameters{
matrix[2,J] u; // intercepts and slopes of random effects of J pairs
u = diag_pre_multiply(sigma_u, L_u) * z_u; // generates varying intercepts and slopes from joint probability distribution
}
model {
real mu;
//priors
L_u ~ lkj_corr_cholesky(2.0);
to_vector(z_u) ~ normal(0,1); // convert the matrix z_uu to a column vector in column major order.
sigma ~ exponential(2);
beta ~ multi_normal(rep_vector(0,K),diag_matrix(rep_vector(2,K))*square(25));
//likelihood
for (i in 1:N){
mu = beta[1] + u[1,id[i]]+ (beta[2] + u[2,id[i]]) * x[i];
y[i] ~ normal(mu, sigma);
}
}

Thank you so much for taking a look at it! I tried to rewrite my stan model and reparametrize the covariance matrix of the random effect. I am wondering if in this case both stan_glmer and my model have the same prior? Thank you!

“”"
data {
int<lower=1> N; //number of data points
real y[N]; // time points, TX or T0
real x[N]; // log count
int<lower=1> J; // number of group
int<lower=1> K; // number of regression coefficients (K = 2)
int<lower=1, upper=J> id[N]; // vector of group indices
}
parameters {
real<lower=0> sigma; // sigma of y
vector<lower=0>[2] sigma_u; // sigma of intercept and slope for random effects
cholesky_factor_corr[2] L_u; // Cholesky decomposition of the group correlation matrix
// the square of the correlation matrix
matrix[2,J] z_u; // intercept and slope for random effects
vector[2] beta; // fixed effects intercept and slope
}

transformed parameters{
matrix[2,J] u; // intercepts and slopes of random effects of J pairs
u = diag_pre_multiply(sigma_u, L_u) * z_u; // generates varying intercepts and slopes from joint probability distribution
}

model {
real mu;
//priors
L_u ~ lkj_corr_cholesky(1.0);
//Our choice of 2.0 implies that no prior info about the correlation btw intercepts and slopes
to_vector(z_u) ~ normal(0,1); // convert the matrix z_uu to a column vector in column major order.
sigma ~ exponential(2);
//beta ~ multi_normal(rep_vector(0,K),diag_matrix(rep_vector(2,K))*square(25));
beta[1] ~ normal(0,5.1);
beta[2] ~ normal (0,0.85);
//likelihood
for (i in 1:N){
mu = beta[1] + u[1,id[i]]+ (beta[2] + u[2,id[i]]) * x[i];
y[i] ~ normal(mu, sigma);
}
}