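You can do either, but they’re not quite the same. In option 1 both beta1 and beta2 are built from the same underlying variable, so you induce a dependence between them (in the model below they are in fact identical). This may or may not be appropriate.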
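To see the difference without fitting anything, here is a minimal sketch in base R that draws from the two priors directly. The variable names, the seed and the number of draws are my own choices for illustration, not something taken from the models.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
n <- 10000    # number of prior draws (an assumed, illustrative value)

# Option 1: a single uniform angle feeds both coefficients
u <- runif(n, -pi / 2, pi / 2)
beta1_shared <- tan(u)
beta2_shared <- tan(u)

# Option 2: each coefficient gets its own uniform angle
u2 <- matrix(runif(2 * n, -pi / 2, pi / 2), ncol = 2)
beta1_indep <- tan(u2[, 1])
beta2_indep <- tan(u2[, 2])

# Shared angle: the two coefficients are identical in every draw
all(beta1_shared == beta2_shared)

# Separate angles: the rank correlation should be close to zero
# (Spearman is used because Cauchy draws have no finite variance)
cor(beta1_indep, beta2_indep, method = "spearman")
```

Each coefficient is marginally cauchy(0, 1) in both cases; only the joint behaviour differs.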
Let’s say I simulate 1000 observations from $y \sim \mathcal{N}(3, 1)$ and put the Cauchy prior from option 1 on the mean. This is the Stan model:
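data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  // no sampling statement, so the bounds imply a uniform(-pi/2, pi/2) prior
  real<lower=-pi()/2, upper=pi()/2> beta_unif;
}
transformed parameters {
  // tan() of that uniform is cauchy(0, 1); both coefficients share one draw
  real beta1 = tan(beta_unif);
  real beta2 = tan(beta_unif);
}
model {
  y ~ normal(beta1, 1);
}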
And this is the R code that fits it:
library(rstan)
library(MASS)
options(mc.cores = parallel::detectCores())
set.seed(689934)
N <- 1000
y <- rnorm(N, 3, 1)
stan_rdump(c("N", "y"), file = "noniid_data.R")
input_data <- read_rdump("noniid_data.R")
# First model (noniid_prior, per the output header); use 'iid_prior.stan' for the second model
fit <- stan(file='noniid_prior.stan', data=input_data, seed=483892929)
print(fit)
You can see that although the prior was specified as a cauchy(0, 1), the posterior for beta_unif
concentrates around 1.25, which is atan(3), and beta1 and beta2 are identical.
Inference for Stan model: noniid_prior.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
             mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
beta_unif    1.25    0.00 0.00    1.24    1.25    1.25    1.25    1.25  1477    1
beta1        3.00    0.00 0.03    2.94    2.98    3.00    3.02    3.06  1477    1
beta2        3.00    0.00 0.03    2.94    2.98    3.00    3.02    3.06  1477    1
lp__      -522.40    0.02 0.67 -524.39 -522.52 -522.14 -521.98 -521.93  1744    1
Samples were drawn using NUTS(diag_e) at Thu Dec 31 07:09:32 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
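As a rough sanity check on those numbers, here is a back-of-the-envelope sketch in base R. It assumes the prior is nearly flat next to 1000 observations, so the posterior sd of beta1 is close to the standard error of the sample mean; se_beta1 and se_unif are names I made up for this check.

```r
# The angle whose tangent is the true mean 3; compare with the beta_unif mean
atan(3)

# Approximate posterior sd of beta1: sigma / sqrt(N) with sigma = 1, N = 1000
se_beta1 <- 1 / sqrt(1000)

# Delta method: d/dx atan(x) = 1 / (1 + x^2), so the angle's sd is about 10x smaller
se_unif <- se_beta1 / (1 + 3^2)

round(c(se_beta1 = se_beta1, se_unif = se_unif), 2)
```

This gives roughly 1.249 for the angle, 0.03 for beta1's sd, and an sd for beta_unif that rounds to 0.00 at two decimals, in line with the printed summary.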
If instead you keep separate priors on beta1 and beta2, by declaring beta_unif as a size-2 array of reals,
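data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  // one angle per coefficient
  // (Stan 2.33+ spells this: array[2] real<lower=-pi()/2, upper=pi()/2> beta_unif;)
  real<lower=-pi()/2, upper=pi()/2> beta_unif[2];
}
transformed parameters {
  real beta1 = tan(beta_unif[1]);
  real beta2 = tan(beta_unif[2]);
}
model {
  // only beta1 enters the likelihood; beta2 is informed by its prior alone
  y ~ normal(beta1, 1);
}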
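You can see that beta1 and beta2 are now independent parameters. beta_unif[1]
is the same as in the previous model, since it induces the prior on a parameter that is tied to the data y. However, beta_unif[2],
since it isn’t tied to the data, just reproduces its uniform prior with a mean around 0, so beta2 follows the cauchy(0, 1) distribution. (The mean and sd reported for beta2 are not meaningful, because a Cauchy distribution has no finite mean or variance.)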
Inference for Stan model: iid_prior.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
                mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
beta_unif[1]    1.25    0.00  0.00    1.24    1.25    1.25    1.25    1.25  3355    1
beta_unif[2]    0.03    0.02  0.91   -1.49   -0.75    0.06    0.80    1.50  3393    1
beta1           3.00    0.00  0.03    2.94    2.98    3.00    3.02    3.06  3353    1
beta2           1.63    0.85 44.71  -12.70   -0.93    0.06    1.04   15.16  2749    1
lp__         -523.29    0.03  1.10 -526.27 -523.72 -522.97 -522.50 -522.20  1396    1
Samples were drawn using NUTS(diag_e) at Thu Dec 31 07:11:45 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
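For comparison, the theoretical prior quantiles can be computed in base R and set beside the beta_unif[2] and beta2 rows. This is only a sketch for checking the table by eye; probs is a name I introduced here.

```r
# Same quantile levels as the columns of the rstan summary
probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)

# Theoretical quantiles of cauchy(0, 1), to compare with the beta2 row
round(qcauchy(probs), 2)

# Theoretical quantiles of uniform(-pi/2, pi/2), to compare with beta_unif[2]
round(qunif(probs, -pi / 2, pi / 2), 2)

# Theoretical sd of uniform(-pi/2, pi/2): width / sqrt(12)
pi / sqrt(12)
```

The Cauchy quantiles come out at about -12.71, -1, 0, 1 and 12.71, and the uniform sd at about 0.91, so both rows look like plain prior draws, allowing for Monte Carlo noise in the Cauchy tails.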