Prior selection for the covariance matrix

The posterior mean of \rho is always close to 0. Is there any way to improve the prior selection, for example by using an inverse-Wishart?

My simulation set-up, based on some exploration of the survey data:

x <- matrix(rnorm(2000, 7.5, 1.5), 20, 100)   # covariate for the beta part (20 areas x 100 units)
z <- matrix(rnorm(2000, 5, 2), 20, 100)       # covariate for the one-inflation part

sigmab <- 0.09164   # variance of the area effect b (beta part)
sigmam <- 0.077     # variance of the area effect m (one-inflation part)
rho    <- 0.6       # true correlation between b and m
Sigma  <- matrix(c(sigmab, sqrt(sigmam * sigmab) * rho,
                   sqrt(sigmam * sigmab) * rho, sigmam), 2, 2)
bm <- MASS::mvrnorm(n = 20, c(0, 0), Sigma)
b  <- bm[, 1]
m  <- bm[, 2]

# Mean of the beta part for each area
mu <- do.call(rbind, lapply(1:20, function(i) {
  exp(1.25 + x[i, ] * 0.19 + b[i]) / (1 + exp(1.25 + x[i, ] * 0.19 + b[i]))
}))

# Probability that an observation is degenerate at 1
pi <- do.call(rbind, lapply(1:20, function(i) {
  exp(2.16 + z[i, ] * -0.7162 + m[i]) / (1 + exp(2.16 + z[i, ] * -0.7162 + m[i]))
}))

# Indicator of a one-valued observation
w <- do.call(rbind, lapply(1:20, function(i) rbinom(100, 1, pi[i, ])))

# Beta-distributed part with precision 42
ystar <- do.call(rbind, lapply(1:20, function(i) rbeta(100, mu[i, ] * 42, 42 * (1 - mu[i, ]))))

y <- ifelse(w == 1, w, ystar)   # one-inflated beta response

##############################  Sample 1/10 of y (10 of the 100 units per area)  ##############################

samplex=c()
sampley=c()
samplez=c()
for (i in 1:20) {
  samplei <- sample(1:100,10)
  sampley = c(sampley, y[i,samplei])
  samplex = c(samplex, x[i,samplei])
  samplez = c(samplez, z[i,samplei])
}

areasamp  <- rep(1:20, each =10)
sampleratio <- data.frame(areasamp, samplex,sampley,samplez)
databeta <- subset(sampleratio, sampley <1)
  
sb_list_multi <- list(N = nrow(sampleratio),
                      N1 = sum(sampleratio$sampley < 1),
                      y = sampleratio$sampley[sampleratio$sampley < 1],
                      W = as.numeric(sampleratio$sampley == 1),
                      x1 = sampleratio$samplex[sampleratio$sampley < 1],
                      x = sampleratio$samplez,   # "x" in the Stan data block: predictor for the one-inflation part
                      County1 = as.numeric(as.factor(sampleratio$areasamp))[sampleratio$sampley < 1],
                      County = as.numeric(as.factor(sampleratio$areasamp)),
                      m = c(0, 0))

library(rstan)
sbfit12 <- stan("psudo2.stan", data = sb_list_multi,
                iter = 2000, warmup = 1000, chains = 3)
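
For reference, Omega[1,2] in the Stan model below plays the role of \rho; one way I look at it after fitting is, e.g.:

print(sbfit12, pars = c("Omega", "tau"))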

And my Stan model (psudo2.stan) is written as:

data{
  int<lower=0> N;
  int<lower = 0> N1;
  vector[N] x;
  vector[N1] y;
  vector[N1] x1;
  int W[N];
  int County[N];
  int County1[N1];
  vector[2] m;
}
parameters{
  real beta1;
  real beta0;
  real<lower = 0> phi;
  real alpha0;
  real alpha1;
  corr_matrix[2] Omega;        // prior correlation
  vector<lower=0>[2] tau;      // prior scale
  vector[2] bu[20];
}
transformed parameters{
  vector[N1] mu;
  for(i in 1:N1){
	  mu[i] = exp(beta0 + beta1*x1[i]+ bu[County1[i],1])/
	  (1 + exp(beta0 + beta1*x1[i]+ bu[County1[i],1]));
  }
}
model{
  beta0 ~ normal(0, 10000);
  beta1 ~ normal(0, 10000);
  alpha0 ~ normal(0, 10000);
  alpha1 ~ normal(0, 10000);
  phi ~ cauchy(0, 10000);
  tau ~ cauchy(0, 2.5);
  Omega ~ lkj_corr(1);
  for(i in 1:N){
	W[i] ~ bernoulli(exp(alpha0 + alpha1*x[i] + bu[County[i],2]) /
	  (1 + exp(alpha0 + alpha1*x[i] + bu[County[i],2])));
  }
  for(i in 1:N1){
	// mean-precision parameterisation, matching the simulation: shape2 = phi * (1 - mu)
	y[i] ~ beta(phi * mu[i], phi * (1 - mu[i]));
  }
  for(i in 1:20){
  bu[i] ~ multi_normal(m,quad_form_diag(Omega, tau)) ;
  }
}

Your priors are very flat. I'm not sure what your problem is exactly, but this is where I would start. I suspect that switching from the LKJ prior to an inverse-Wishart would exacerbate the problem; I think the inverse-Wishart approach can be tricky for HMC to navigate (see the discussion here).
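
As a rough illustration (my own sketch, not something taken from your model), you can simulate what an inverse-Wishart prior implies for the correlation and the scales; the degrees of freedom nu = 3 below is just a near-minimal choice for a 2x2 matrix:

# Draws from Inv-Wishart(nu, I), obtained by inverting Wishart(nu, I) draws
set.seed(123)
nu <- 3
Sigma_draws <- replicate(5000, solve(rWishart(1, nu, diag(2))[, , 1]),
                         simplify = FALSE)
rho_draws <- vapply(Sigma_draws, function(S) cov2cor(S)[1, 2], numeric(1))
sd1_draws <- vapply(Sigma_draws, function(S) sqrt(S[1, 1]), numeric(1))

hist(rho_draws, main = "Implied prior on rho, Inv-Wishart(3, I)", xlab = "rho")
plot(sd1_draws, rho_draws, log = "x",
     xlab = "implied SD of b", ylab = "implied rho",
     main = "Correlation and scale are drawn jointly")
# With the LKJ decomposition (Omega, tau) the correlation and the scales
# get separate priors, which is part of why it tends to behave better.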

Thank you for your feedback; I think you are correct here.

I also don't want information from the prior to affect my results; using a flat prior was meant to serve that purpose, since we are not sure what the true parameters are, or whether there are correlations in the real data.

Extremely weak priors generally aren't recommended (e.g., Prior Choice Recommendations · stan-dev/stan Wiki · GitHub). Weakly-informative priors would be a better approach, constraining the outcome to values that are feasible for your response variable. For example, consider what your prior on the slope \beta_1, normal(0, 10000), looks like.
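
Here is a quick sketch in R of that density, just to make the scale concrete:

# Density of the normal(0, 10000) prior currently on beta1
curve(dnorm(x, mean = 0, sd = 10000), from = -30000, to = 30000,
      xlab = expression(beta[1]), ylab = "prior density")
# About 95% of the prior mass sits between roughly -20000 and +20000,
# i.e. a shift of +/- 20000 on the logit scale per unit of x1 is treated
# as perfectly plausible.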

Your area might be different, but I can't think of anything I would measure in my discipline for which a unit change in the predictor would plausibly change the outcome by +/- 30000. You are right to be concerned about influencing the posterior with priors that are too strict, but just like an overly narrow prior, a super-weak prior like this will also affect your posterior (and can cause problems for the sampler). You end up spending information in your data to rule out parameter values that you already know aren't physically possible. I suggest doing a prior predictive check on your model and narrowing these distributions until the prior predictive density covers only outcome values and relationships that are possible.
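
For example, here is a minimal prior predictive sketch along those lines (the normal(0, 5) and normal(0, 1) priors below are just illustrative placeholders, not recommendations):

# Draw (beta0, beta1) from candidate weakly-informative priors and look at
# the implied mean of the beta part over the observed range of x
set.seed(456)
n_draws  <- 1000
beta0_pp <- rnorm(n_draws, 0, 5)   # candidate prior for the intercept
beta1_pp <- rnorm(n_draws, 0, 1)   # candidate prior for the slope
x_grid   <- seq(min(samplex), max(samplex), length.out = 50)
mu_pp    <- plogis(outer(beta0_pp, rep(1, length(x_grid))) +
                   outer(beta1_pp, x_grid))
matplot(x_grid, t(mu_pp[1:100, ]), type = "l", lty = 1,
        col = adjustcolor("grey30", 0.2),
        xlab = "x", ylab = "implied mu")
# If the implied curves pile up on 0 and 1 only, the priors are still too
# wide; if they sit in a narrow band, they are too tight.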

Good luck! And please open up another thread if you encounter additional problems.