Difference in means / t-test and scaling of data


I am trying to estimate the difference between two groups. I’ve made the following Stan model,

data {
  int N1;
  real y1[N1];
  real y1_mean;

  int N2;
  real y2[N2];
  real y2_mean;
parameters {
  real<lower=0> mu_tau;
  real<lower=0> sigma_tau;

  real<lower=0> sigma1;
  real<lower=0> sigma2;

  real<lower=0> mu_scale1;
  real<lower=0> mu_scale2;

  real mu1;
  real mu2;
model {
  /* Prior specification */

  /* Priors related to mean */
  mu_tau ~ normal(0, 1);

  mu_scale1 ~ normal(0, mu_tau);
  mu_scale2 ~ normal(0, mu_tau);

  mu1 ~ normal(y1_mean, mu_scale1);
  mu2 ~ normal(y2_mean, mu_scale2);

  /* Priors related to scale */
  sigma_tau ~ normal(0, 10);

  sigma1 ~ normal(0, sigma_tau);
  sigma2 ~ normal(0, sigma_tau);

  /* Likelihood */
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
generated quantities {
    real diff_mu;
    diff_mu = mu1 - mu2;

It’s not the prettiest model, but its working. However, as I understand HMC works best with standardized data, i.e. mean=0 and sd=1. In the data I’m working with the mean is between 50-75, and the standard deviation is somewhere in the range 10-15. Thus, it’s a bit hard now to judge what is a informative/non-informative prior.

Are there any tricks I can use here? Standardizing the data and then de-standardizing it after model fit would not make much sense, I believe, since I actually wish to estimate the mean.




With regards to the following lines
mu1 ~ normal(y1_mean, mu_scale1); mu2 ~ normal(y2_mean, mu_scale2);

Where does y1_mean and y2_mean come from? If these are the means of y1 and y2, it would be odd if the priors for mu1,mu2 are centered around these values. Or do they come from another data set?

If you are asking about the so-called “Matt-Trick”, this pertains to reparameterising parameters - not data. See section 26.6 in the Stan user manual. I wouldn’t worry too much about it in your case, as long as the convergence diagnostics are looking good and the estimation is not too slow.


1 Like

It actually works best with standardized posteriors; standardizing the predictors can help with this. And it’s not just HMC, the same thing applies to Metropolis and Gibbs.

So you can non-center with:

mu_scale1_std ~ normal(0, 1);
mu_scale1 = mu_scale1_std * mu_tau;

You’ll probably have problems with

mu1 ~ normal(..., mu_scale1);

because you only have a single mu1, which won’t let you estimate a scale.