Sum to zero for BYM2 unstructured effect

I was going through the great case study for how to use the sum-to-zero vector type for a BYM2 model by @mitzimorris (here: The Sum-to-Zero Constraint in Stan).

I noticed that in the BYM2 model as specified in the notes, the unstructured effect keeps the classic unconstrained, unscaled prior \theta \sim N(0,1).

Is there any reason to prefer that classic parametrisation over one that declares theta as an explicit sum_to_zero_vector and includes the corresponding scaling factor in the BYM2 convolution, as we do for the spatial component?

i.e. currently we have:

\eta = \frac{\phi}{\sqrt s} \sqrt\rho + \theta \sqrt{1-\rho}

where phi is the usual ICAR effect, and theta is declared as an unconstrained vector with standard normal prior:

 vector[n] theta;

What if we instead define the convolution as:
\eta = \frac{\phi}{\sqrt s} \sqrt\rho + \theta \sqrt{\frac{n} {n - 1}} \sqrt{1-\rho}

and declare \theta as a sum-to-zero vector:

 sum_to_zero_vector[n] theta;
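
For context, here is a rough derivation of where the \sqrt{n/(n-1)} factor comes from. It assumes the sum-to-zero vector with a standard normal prior behaves like a vector of iid standard normals with its mean subtracted; z_i and \bar z are notation introduced only for this sketch.

```latex
\begin{align*}
\theta_i &= z_i - \bar z, \qquad z_i \overset{iid}{\sim} N(0,1), \qquad \bar z = \frac{1}{n}\sum_{j=1}^{n} z_j \\
\operatorname{Var}(\theta_i) &= \operatorname{Var}(z_i) - 2\operatorname{Cov}(z_i, \bar z) + \operatorname{Var}(\bar z)
  = 1 - \frac{2}{n} + \frac{1}{n} = \frac{n-1}{n} \\
\operatorname{Var}\!\left(\sqrt{\tfrac{n}{n-1}}\,\theta_i\right) &= \frac{n}{n-1}\cdot\frac{n-1}{n} = 1
\end{align*}
```

So under that assumption the rescaled theta has unit marginal variance again, which is what the \sqrt{1-\rho} weight in the convolution expects.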

It would seem preferable to be consistent about whether we use hard or soft sum to zero throughout the model, rather than selectively for some effects and not others, unless there is a very good reason to be inconsistent.
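
A minimal sketch of what the consistent version might look like, with both phi and theta declared as sum_to_zero_vector. This is only an illustration under stated assumptions, not the case study's code: it needs a Stan version that has the sum_to_zero_vector type (2.36 or later), and everything other than n, s, phi, theta, rho and eta (the data names, beta0, sigma, and all the priors) is a placeholder chosen for the example.

```stan
data {
  int<lower=2> n;                               // number of areas
  int<lower=1> n_edges;                         // number of neighbour pairs (assumed name)
  array[n_edges] int<lower=1, upper=n> node1;   // node1[i] and node2[i] are neighbours
  array[n_edges] int<lower=1, upper=n> node2;
  array[n] int<lower=0> y;                      // observed counts (assumed outcome)
  vector<lower=0>[n] E;                         // expected counts, used as offset
  real<lower=0> s;                              // ICAR scaling factor, computed outside Stan
}
parameters {
  real beta0;                                   // intercept (assumed)
  real<lower=0> sigma;                          // overall scale of the combined effect (assumed)
  real<lower=0, upper=1> rho;                   // share of variance that is spatial
  sum_to_zero_vector[n] phi;                    // spatial (ICAR) effect
  sum_to_zero_vector[n] theta;                  // unstructured effect, hard sum-to-zero
}
transformed parameters {
  // n / (n - 1.0) is real division; the factor restores unit marginal
  // variance for theta, which the hard constraint shrinks to (n - 1) / n
  vector[n] eta = sqrt(rho / s) * phi
                  + sqrt((1 - rho) * n / (n - 1.0)) * theta;
}
model {
  y ~ poisson_log(log(E) + beta0 + sigma * eta);
  target += -0.5 * dot_self(phi[node1] - phi[node2]);  // pairwise-difference ICAR prior
  theta ~ std_normal();
  beta0 ~ std_normal();                         // placeholder priors
  sigma ~ std_normal();                         // half-normal via the lower bound
  rho ~ beta(0.5, 0.5);
}
```

Swapping the theta declaration back to vector[n] and dropping the n / (n - 1.0) factor gives the soft-centred form of the same sketch, so the two versions are easy to compare on the same data.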

theta is centered on zero by construction - it is a vector of standard normals. phi, OTOH, is the ICAR component, which needs to be constrained. The sum-to-zero constraint is used when the model is otherwise non-identifiable.

Thanks @mitzimorris. My understanding is that \theta \times \sigma is basically a random effect specified via a non-centred parametrisation, and as such it’s only “soft” centred (correct me if I’m wrong). Based on my reading of this thread, I had understood that, for posterior-sampling efficiency, it was preferable to specify traditional random intercepts with the sum_to_zero_vector formulation where possible. Did I get that wrong? Thanks again.

Yes, the convolution of phi and theta is a random effect, but it’s not a non-centered parameterization - cf. Stan User’s Guide. Unlike the discussion in the thread that you cite, this isn’t a nested regression - it’s a standard Poisson regression with an added spatial component - multi-level, non-nested.

The random effects vector theta will tend to sum to zero because each element of theta is given a normal(0, 1) prior. I think you’re correct that it doesn’t necessarily sum to zero, but a) the BYM2 model of Riebler et al. (2016) doesn’t include a sum-to-zero constraint on theta and b) in my fairly limited experience, this hasn’t been a problem.
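
To put a rough number on "tend to sum to zero": under the prior alone (ignoring the likelihood, and assuming the n elements are independent), the sum and the mean of theta are distributed as follows.

```latex
\begin{align*}
\theta_i &\overset{iid}{\sim} N(0,1), \quad i = 1, \dots, n \\
\sum_{i=1}^{n} \theta_i &\sim N(0,\, n) \\
\bar\theta = \frac{1}{n}\sum_{i=1}^{n} \theta_i &\sim N\!\left(0,\, \frac{1}{n}\right)
\end{align*}
```

So it is the mean of theta that concentrates near zero, with prior standard deviation 1/\sqrt{n}, while the sum itself has standard deviation \sqrt{n}. That is the sense in which the centring is soft rather than exact.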