Beginner request for reparameterization help


I am hoping for your advice reparametrizing my first model.

I am trying to find means and variances of dozens of (presumed-)normal distributions that would best explain a large collection of survey-response data points. Omitting indices, names etc., the model looks as follows:

model {
  // main relationship; ABC are pieces of known data,
  // WXYZ are parameters I'd like to find

  A + C - B * ((X - Y) / Z) ~ normal(0, W)    // in - out = error

  // hyperpriors on WXYZ -- exact numbers don't matter

  W ~ inv_gamma(10, 50)
  X ~ normal(70, 30)
  Y ~ normal(70, 40)
  Z ~ inv_gamma(40, 1000)

Stan warns me that a Jacobian adjustment may be necessary. It will sample away
and finish within seconds but, perhaps unsurprisingly, outputs garbage; the values may as well have been picked at random (they do agree well with the priors but the data points don’t do at all what they’re supposed to).

This is my first time writing a Stan model, and I’m stuck trying to find a nice way to transform that first left-hand side into something which I can come up with a reasonable Jacobian adjustment for.

I would be grateful if some old hand could give me advice on how to best approach this. You don’t need to do my homework for me, but any pointers to where to start would be helpful.

The trick is to to get the parameters on the other side of the tilde.

A + C ~ normal(B * ((X - Y) /  Z), W) // in = out + error ?

Thank you @stijn, I appreciate your help. Moving the parameters over to the mean silences the error, but the results don’t change.

This is surprising because I’m working with simulated data right now – my X values are drawn from a normal with known parameters, and so are the Y, Z and W arrays which distort them according to the formula above. (That’s why I know my priors so well, haha!) – I therefore would expect Stan to get me back, roughly, my initial parameters. Am I wrong in assuming that should be possible?

Nope, not at all. Making sure that Stan can reproduce the parameters that generate your simulated data is exactly what you should be doing. If you can’t reproduce those parameters, then your model probably has problems.

If you are stuck, it’s probably better to post your full stan file. If Y, Z, and W are arrays, things might be more complicated.

Of course! Here you go:

data {
  real true_mu;
  real true_sigma;
  real bl_mu_sigma;
  real bl_sigma_alpha;
  real bl_sigma_beta;
  real errors_alpha;
  real errors_beta;

  int<lower=1> n_ca;
  int<lower=1> n_as;
  int<lower=1> n_assignments;
  int<lower=1, upper=n_assignments> cix[n_assignments];
  int<lower=1, upper=n_assignments> aix[n_assignments];
  real<lower=-1000, upper=1000> given_scores[n_assignments];
parameters {
  real<lower=-1000, upper=1000> true_scores[n_ca];
  real<lower=-1000, upper=1000> bl_mu[n_as];
  real<lower=0, upper=100> bl_sigma[n_as];
  real<lower=0, upper=100> error_sigma;
model {
  true_scores ~ normal(true_mu, true_sigma);
  bl_mu ~ normal(true_mu, bl_mu_sigma);
  bl_sigma ~ inv_gamma(bl_sigma_alpha, bl_sigma_beta);

  for (n in 1:n_assignments) (
    given_scores[n] ~ normal(true_mu + ((true_scores[cix[n]] - bl_mu[aix[n]]) / bl_sigma[aix[n]]) * true_sigma, error_sigma);

Problem solved! Turns out nothing was wrong with my Stan model at all, I just messed up the indexing order on the Python side. The results are fantastic, even with crappy priors, exactly what I was looking for. Thank you both for your help!

1 Like

This is because

y ~ normal(mu, sigma);

is equivalent to

mu ~ normal(y, sigma);

There are two problems.

First, you need a Jacobian since the transform is non-linear.

Second, you will find that your result is only identified through the prior because you can’t identifyX, Y, and Z from (X - Y) / Z.

I’d suggest trying to think about the problem you’re dealing with generatively. Where did the data come from? What do the parameters represent? What do you know about the parameters ahead of time?

You also need <lower = 0> constraints on W and Z if you want to give them those inverse gamma priors. I should also warn you that Stan uses the scale (standard deviation) parameterization of the normal, so inverse gamma isn’t conjugate (it would be if we used variance). Stan doesn’t care about conjugacy.

I somehow missed the later discussion. I don’t see how the model could be giving you reasonable results. The value of (X - Y) / Z can be identified, but you’ll find extreme posterior correlation between X and Y and Z.

Hi Bob – thanks for your response, and sorry I only saw it now.

Your claim that I’ll need a Jacobian to get good results match my initial intuition about the problem. I don’t know whether the results I get are as good as they can be; I do know the difference to a naive averaging procedure (which is the current standard) is already like night and day.

Given the model above, do you have any suggestions regarding an appropriate Jacobian adjustment, or how I should compute it?

Thanks again.

What does the posterior pairs plot look like for X, Y, and Z? Are you measuring performance predictively?

You need the log Jacobian determinant of

(X, Y, Z)  |->  ([X - Y] / Z, Y, Z).

But then you’d typically provide distributions for Y, Z, and (X - Y) / Z. But that’s not what you’re doing, which is why I was suggesting you try to formulate the model generatively if possible.

It is, as far as I understand the meaning of the term, a generative model. I’m measuring performance by generating data using made-up latent values X and roughly realistic distributions for Y, Z plus some random error W, to get my working data set. I then use the model above to try to get back good estimates for the underlying X, Y and Z. And indeed, if I compare my made-up values to Stan’s estimates, they line up almost perfectly (up to the random errors of course).

Is there a way to generate pair plots with pystan?

This is what makes me think it’s not generative because this doesn’t generate the complex left-hand side.