Prior selection for the covariance matrix

The posterior mean of \rho is always close to 0. Is there any way to improve the prior selection, for example by using an inverse-Wishart?

My simulation set-up, based on some exploration of the survey data:

x <- matrix(rnorm(2000, 7.5, 1.5), 20, 100)   # covariate for the beta part (20 areas x 100 units)
z <- matrix(rnorm(2000, 5, 2), 20, 100)       # covariate for the one-inflation part

sigmab <- 0.09164   # variance of the area effect b (beta part)
sigmam <- 0.077     # variance of the area effect m (one-inflation part)
rho    <- 0.6       # true correlation between b and m
Sigma  <- matrix(c(sigmab, sqrt(sigmam * sigmab) * rho,
                   sqrt(sigmam * sigmab) * rho, sigmam), 2, 2)
bm <- MASS::mvrnorm(n = 20, c(0, 0), Sigma)
b  <- bm[, 1]
m  <- bm[, 2]

# Mean of the beta part for each area
mu <- do.call(rbind, lapply(1:20, function(i) {
  exp(1.25 + x[i, ] * 0.19 + b[i]) / (1 + exp(1.25 + x[i, ] * 0.19 + b[i]))
}))

# Probability that an observation is degenerate at 1
pi <- do.call(rbind, lapply(1:20, function(i) {
  exp(2.16 + z[i, ] * -0.7162 + m[i]) / (1 + exp(2.16 + z[i, ] * -0.7162 + m[i]))
}))

# Indicator of a one-valued observation
w <- do.call(rbind, lapply(1:20, function(i) rbinom(100, 1, pi[i, ])))

# Beta-distributed part with precision 42
ystar <- do.call(rbind, lapply(1:20, function(i) rbeta(100, mu[i, ] * 42, 42 * (1 - mu[i, ]))))

y <- ifelse(w == 1, w, ystar)   # one-inflated beta response

##############################  Sample 1/10 of y (10 of the 100 units per area)  ##############################

samplex=c()
sampley=c()
samplez=c()
for (i in 1:20) {
  samplei <- sample(1:100,10)
  sampley = c(sampley, y[i,samplei])
  samplex = c(samplex, x[i,samplei])
  samplez = c(samplez, z[i,samplei])
}

areasamp  <- rep(1:20, each =10)
sampleratio <- data.frame(areasamp, samplex,sampley,samplez)
databeta <- subset(sampleratio, sampley <1)
  
sb_list_multi <- list(N = nrow(sampleratio),
                      N1 = sum(sampleratio$sampley < 1),
                      y = sampleratio$sampley[sampleratio$sampley < 1],
                      W = as.numeric(sampleratio$sampley == 1),
                      x1 = sampleratio$samplex[sampleratio$sampley < 1],
                      x = sampleratio$samplez,   # "x" in the Stan data block: predictor for the one-inflation part
                      County1 = as.numeric(as.factor(sampleratio$areasamp))[sampleratio$sampley < 1],
                      County = as.numeric(as.factor(sampleratio$areasamp)),
                      m = c(0, 0))

library(rstan)
sbfit12 <- stan("psudo2.stan", data = sb_list_multi,
                iter = 2000, warmup = 1000, chains = 3)
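
For reference, Omega[1,2] in the Stan model below plays the role of \rho; one way I look at it after fitting is, e.g.:

print(sbfit12, pars = c("Omega", "tau"))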

And my Stan model (psudo2.stan) is written as:

data{
  int<lower=0> N;
  int<lower = 0> N1;
  vector[N] x;
  vector[N1] y;
  vector[N1] x1;
  int W[N];
  int County[N];
  int County1[N1];
  vector[2] m;
}
parameters{
  real beta1;
  real beta0;
  real<lower = 0> phi;
  real alpha0;
  real alpha1;
  corr_matrix[2] Omega;        // prior correlation
  vector<lower=0>[2] tau;      // prior scale
  vector[2] bu[20];
}
transformed parameters{
  vector[N1] mu;
  for(i in 1:N1){
	  mu[i] = exp(beta0 + beta1*x1[i]+ bu[County1[i],1])/
	  (1 + exp(beta0 + beta1*x1[i]+ bu[County1[i],1]));
  }
}
model{
  beta0 ~ normal(0, 10000);
  beta1 ~ normal(0, 10000);
  alpha0 ~ normal(0, 10000);
  alpha1 ~ normal(0, 10000);
  phi ~ cauchy(0, 10000);
  tau ~ cauchy(0, 2.5);
  Omega ~ lkj_corr(1);
  for(i in 1:N){
	W[i] ~ bernoulli(exp(alpha0 + alpha1*x[i] + bu[County[i],2]) /
	  (1 + exp(alpha0 + alpha1*x[i] + bu[County[i],2])));
  }
  for(i in 1:N1){
	// mean-precision parameterisation, matching the simulation: shape2 = phi * (1 - mu)
	y[i] ~ beta(phi * mu[i], phi * (1 - mu[i]));
  }
  for(i in 1:20){
  bu[i] ~ multi_normal(m,quad_form_diag(Omega, tau)) ;
  }
}

Your priors are very flat. I'm not sure what your problem is exactly, but this is where I would start. I suspect that switching from the LKJ prior to an inverse-Wishart would exacerbate the problem; I think the inverse-Wishart approach can be tricky for HMC to navigate (see the discussion here).
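
As a rough illustration (my own sketch, not something taken from your model), you can simulate what an inverse-Wishart prior implies for the correlation and the scales; the degrees of freedom nu = 3 below is just a near-minimal choice for a 2x2 matrix:

# Draws from Inv-Wishart(nu, I), obtained by inverting Wishart(nu, I) draws
set.seed(123)
nu <- 3
Sigma_draws <- replicate(5000, solve(rWishart(1, nu, diag(2))[, , 1]),
                         simplify = FALSE)
rho_draws <- vapply(Sigma_draws, function(S) cov2cor(S)[1, 2], numeric(1))
sd1_draws <- vapply(Sigma_draws, function(S) sqrt(S[1, 1]), numeric(1))

hist(rho_draws, main = "Implied prior on rho, Inv-Wishart(3, I)", xlab = "rho")
plot(sd1_draws, rho_draws, log = "x",
     xlab = "implied SD of b", ylab = "implied rho",
     main = "Correlation and scale are drawn jointly")
# With the LKJ decomposition (Omega, tau) the correlation and the scales
# get separate priors, which is part of why it tends to behave better.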

Thank you for your feedback; I think you are correct here.

I also don't want information from the prior to affect my results; using a flat prior was meant to serve that purpose, since we are not sure what the true parameters are, or whether there are correlations in the real data.

Extremely weak priors generally aren't recommended (e.g., Prior Choice Recommendations · stan-dev/stan Wiki · GitHub). Weakly-informative priors would be a better approach, constraining the outcome to values that are feasible for your response variable. For example, consider what your prior on the slope \beta_1, normal(0, 10000), looks like.
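
Here is a quick sketch in R of that density, just to make the scale concrete:

# Density of the normal(0, 10000) prior currently on beta1
curve(dnorm(x, mean = 0, sd = 10000), from = -30000, to = 30000,
      xlab = expression(beta[1]), ylab = "prior density")
# About 95% of the prior mass sits between roughly -20000 and +20000,
# i.e. a shift of +/- 20000 on the logit scale per unit of x1 is treated
# as perfectly plausible.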

Your area might be different, but I can't think of anything I would measure in my discipline for which a unit change in the predictor would plausibly change the outcome by +/- 30000. You are right to be concerned about influencing the posterior with priors that are too strict, but just like an overly narrow prior, a super-weak prior like this will also affect your posterior (and can cause problems for the sampler). You end up spending information in your data to rule out parameter values that you already know aren't physically possible. I suggest doing a prior predictive check on your model and narrowing these distributions until the prior predictive density covers only outcome values and relationships that are possible.
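
For example, here is a minimal prior predictive sketch along those lines (the normal(0, 5) and normal(0, 1) priors below are just illustrative placeholders, not recommendations):

# Draw (beta0, beta1) from candidate weakly-informative priors and look at
# the implied mean of the beta part over the observed range of x
set.seed(456)
n_draws  <- 1000
beta0_pp <- rnorm(n_draws, 0, 5)   # candidate prior for the intercept
beta1_pp <- rnorm(n_draws, 0, 1)   # candidate prior for the slope
x_grid   <- seq(min(samplex), max(samplex), length.out = 50)
mu_pp    <- plogis(outer(beta0_pp, rep(1, length(x_grid))) +
                   outer(beta1_pp, x_grid))
matplot(x_grid, t(mu_pp[1:100, ]), type = "l", lty = 1,
        col = adjustcolor("grey30", 0.2),
        xlab = "x", ylab = "implied mu")
# If the implied curves pile up on 0 and 1 only, the priors are still too
# wide; if they sit in a narrow band, they are too tight.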

Good luck! And please open up another thread if you encounter additional problems.