# Fitting issue with +/- mean value inside loop

Hi all, I have been having problems fitting a model for a meta-analysis of relative drug efficacy based on a literature review. I've tracked the issue down to a single sign change: my model contains a variant of the line

```stan
for (i in 1:n_i) {
  eta[i] = mu[study[i]] - mean(mu);
}
```


and fails to fit (the chains don't converge, and all \mu_i end up with the same broad posterior). However, if I change the above to

```stan
for (i in 1:n_i) {
  eta[i] = mu[study[i]] + mean(mu);
}
```


Everything works fine. I'm at a loss as to why: eta is subsequently constrained to lie in [0, 1] by inv_logit, so I don't see why shifting it in one direction rather than the other should fail. Below is the minimal model that exhibits the issue; I've reproduced it in both RStan and PyStan, so I don't think it's a problem with my setup. I've also included some simple data (in R) below so you can copy-paste and run.

Help appreciated!

```stan
data {
  // Constants
  int<lower = 2> n_i;

  // Data
  int<lower = 1> study[n_i];
  int<lower = 0> y[n_i];
  int<lower = 1> n[n_i];

  // Priors
  real<lower = 0> prior_sd_mu;
}

transformed data {
  int<lower = 1> n_s = max(study);
}

parameters {
  vector[n_s] mu;
}

transformed parameters {
  vector[n_i] eta;
  vector[n_i] theta;

  for (i in 1:n_i) {
    eta[i] = mu[study[i]] - mean(mu);
  }

  // Logit model
  theta = inv_logit(eta);
}

model {
  // Priors
  mu ~ normal(0, prior_sd_mu);

  // Likelihood
  y ~ binomial(n, theta);
}
```

```r
test_data <- list(
  n = c(108, 55, 177, 183, 96, 349, 218, 221, 26, 28, 380, 127, 40,
        33, 175, 176, 178, 173, 70),
  n_i = 19,
  prior_sd_mu = 10000,
  study = c(9, 9, 5, 5, 5, 4, 7, 7, 1, 1, 6, 6, 8, 8, 2, 2, 3, 3, 4),
  y = c(53, 10, 98, 59, 14, 202, 56, 21, 6, 4, 89, 16, 4,
        9, 77, 8, 97, 22, 3)
)
```


I simplify the problem so that study[i] = i for all i and set n_i = 2.

Then, if we use the minus sign in the definition of eta, the Jacobian of the transformation from (\mu_1, \mu_2) to (\eta_1, \eta_2) is zero. On the other hand, if we use the plus sign, the Jacobian is not zero. I guess this contributes to the phenomenon.

A vanishing Jacobian means the dimension of the parameter space collapses, which is not appropriate when there are individual differences that the parameters are meant to reflect.
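This is easy to check numerically. A minimal sketch in Python/NumPy (not part of the Stan model; it just builds the two Jacobian matrices for the simplified case n = 2):

```python
import numpy as np

n = 2  # simplified case: study[i] = i, so mu and eta both have length 2

# Jacobian of eta = mu - mean(mu): d eta_i / d mu_j = delta_ij - 1/n
J_minus = np.eye(n) - np.full((n, n), 1.0 / n)

# Jacobian of eta = mu + mean(mu): d eta_i / d mu_j = delta_ij + 1/n
J_plus = np.eye(n) + np.full((n, n), 1.0 / n)

print(np.linalg.det(J_minus))  # ~0: the centering map is singular
print(np.linalg.det(J_plus))   # ~2: the "+" map is invertible
```

The same pattern holds for any n: subtracting the mean projects out the direction (1, 1, ..., 1), so the "minus" map always has a zero determinant.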


Thanks for the reply. I’m not sure I quite follow. For the simple version you mean the Jacobian of the transformation

\eta_1 = \mu_1 - (\mu_1 + \mu_2)/2
\eta_2 = \mu_2 - (\mu_1 + \mu_2)/2

is zero? As \frac{\partial \eta_1}{\partial \mu_1} = \frac{\partial \eta_2}{\partial \mu_2} = 0.5, and \frac{\partial \eta_2}{\partial \mu_1} = \frac{\partial \eta_1}{\partial \mu_2} = -0.5, I’d have thought the Jacobian was

\left(\begin{array}{cc} 0.5 & -0.5\\ -0.5 & 0.5 \end{array}\right)

Which isn’t zero (and doesn’t have zero determinant). Can you clarify what I’ve misunderstood? Thank you!

The determinant is in fact zero.

\det\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)=ad-bc=\left(0.5\right)\times\left(0.5\right)-\left(-0.5\right)\times\left(-0.5\right)=0.25-0.25=0

Anyway, the underlying problem is nonidentifiability. The posterior of mu is necessarily broad because, for example, mu = [1, -1] and mu = [101, 99] give exactly the same eta, and hence the same likelihood and predictive distribution. If one is a reasonable fit, then so is the other.
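The invariance is easy to see in isolation. A quick NumPy sketch (not the Stan model itself) showing that shifting every component of mu by the same constant leaves eta, and hence theta, unchanged:

```python
import numpy as np

def eta(mu):
    # mirrors eta[i] = mu[study[i]] - mean(mu) with study[i] = i
    return mu - mu.mean()

def theta(mu):
    return 1.0 / (1.0 + np.exp(-eta(mu)))  # inv_logit

mu_a = np.array([1.0, -1.0])
mu_b = mu_a + 100.0  # shift every study effect by the same constant

print(np.allclose(eta(mu_a), eta(mu_b)))      # True
print(np.allclose(theta(mu_a), theta(mu_b)))  # True
```

Since the likelihood only sees theta, the data cannot distinguish mu_a from mu_b; only the huge prior_sd_mu prior weakly restrains the common shift.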


The determinant is in fact zero.

My bad. If only I learned to maths :)

Ok, I see. So there's no unique solution, and hence nothing for the chains to converge to.