Non-centered parameterization for likelihood

I am new to Stan, and I’ve been attempting to fit a model having a Beta-distributed parameter, which provides the mean of a truncated Gaussian likelihood. The model has random intercepts for “participants” for the mean of the Beta, and I’ve attempted to use non-centered parameterizations of priors wherever possible.

I’m wondering if it’s possible to do something similar for the likelihood, though. Including the Beta-distributed parameter theta_v_verb causes warnings, specifically about high Rhat and low ESS, so I’m wondering if there’s a way to get rid of it. (I determined that this parameter is the culprit by fitting a version with just a Beta likelihood instead, which works okay.) Thanks in advance to anyone who can help!

parameters {
  vector[N_verb] mu_v;  // per-verb mean, on the logit scale
  vector[N_verb] nu_v;  // per-verb concentration, on the log scale

  real<lower=0> tau_participant_mu;
  vector[N_participant] z_participant_mu;

  vector<lower=0, upper=1>[N_verb_data] theta_v_verb;

  real<lower=0> sigma_jitter;
}

transformed parameters {
  vector<lower=0, upper=1>[N_verb_data] mu_v_verb;
  vector<lower=0>[N_verb_data] nu_v_verb;
  vector<lower=0>[N_verb_data] alpha_v_verb;
  vector<lower=0>[N_verb_data] beta_v_verb;

  // non-centered participant intercepts
  vector[N_participant] epsilon_participant_mu;
  epsilon_participant_mu = tau_participant_mu * z_participant_mu;

  for (i in 1:N_verb_data) {
    mu_v_verb[i] = inv_logit(mu_v[verb_v[i]] + epsilon_participant_mu[participant_v[i]]);
    nu_v_verb[i] = exp(nu_v[verb_v[i]]);
  }

  // Beta shape parameters from mean and concentration
  alpha_v_verb = mu_v_verb .* nu_v_verb;
  beta_v_verb = (1 - mu_v_verb) .* nu_v_verb;
}

model {
  mu_v ~ normal(0, 1);
  nu_v ~ normal(0, 1);

  theta_v_verb ~ beta(alpha_v_verb, beta_v_verb);

  for (i in 1:N_verb_data) {
    y_v[i] ~ normal(theta_v_verb[i], sigma_jitter) T[0, 1]; // likelihood
  }

  sigma_jitter ~ exponential(20);
}

Hi, @julian.grove and sorry for the delay in responses.

It helps if you include the whole model. I don’t see y_v declared in the part you posted, so I assume it’s a data variable. I was a bit confused about why you are modeling it with a truncated normal and, even so, why you restrict the mean to (0, 1).

The beta distribution isn’t a location/scale family, so the notion of centering/non-centering doesn’t apply directly. You could alternatively model logit_theta_v_verb = logit(theta_v_verb) instead, and give that a hierarchical component. I consider the log odds with normal priors and probability with beta approaches to hierarchical modeling in this case study.

One thing I’d recommend if you really want to stick to the beta distribution is reparameterizing in terms of a mean (alpha / (alpha + beta)) and concentration (alpha + beta). When you model alpha and beta separately, it induces strong posterior correlation that is hard to sample, as they have to move in sync to keep the mean where it should be.
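
For reference, here is a minimal sketch of the mean/concentration form using Stan's built-in beta_proportion distribution. The names N, mu, kappa, and theta and the prior on kappa are placeholders for illustration, not part of the model posted above. It also assumes a Stan version recent enough to include beta_proportion.

```stan
data {
  int<lower=1> N;                       // hypothetical: number of latent proportions
}
parameters {
  real<lower=0, upper=1> mu;            // mean, alpha / (alpha + beta)
  real<lower=0> kappa;                  // concentration, alpha + beta
  vector<lower=0, upper=1>[N] theta;    // Beta-distributed parameters
}
model {
  kappa ~ exponential(0.1);             // assumed prior, for illustration only
  // equivalent to theta ~ beta(mu * kappa, (1 - mu) * kappa)
  theta ~ beta_proportion(mu, kappa);
}
```

The posted model already builds alpha_v_verb and beta_v_verb from a mean and a concentration, so there the equivalent statement would be theta_v_verb ~ beta_proportion(mu_v_verb, nu_v_verb).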

Thanks for your reply. That’s right, y_v is the data vector - I’m trying to use a Normal likelihood, which is meant to account for some accidental jittering around the intended response. The likelihood is truncated because the response scale is [0, 1].

I have actually ended up doing something along the lines you’ve suggested, if I understand correctly. That is, I’ve ended up using a latent Normal, rather than a Beta, and then have set the mean of the Normal likelihood to be inv_logit applied to the Normal-sampled parameter. This allowed for a non-centered parameterization, which helped with the warnings.
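
For anyone finding this later, a rough sketch of that kind of model might look like the following. This is a reconstruction under assumptions, not the exact final model: sigma_obs, z_obs, logit_theta, the data block, and the exponential(1) priors are made up for illustration, and the array syntax assumes Stan 2.26 or later.

```stan
data {
  int<lower=1> N_verb;
  int<lower=1> N_participant;
  int<lower=1> N_verb_data;
  array[N_verb_data] int<lower=1, upper=N_verb> verb_v;
  array[N_verb_data] int<lower=1, upper=N_participant> participant_v;
  vector<lower=0, upper=1>[N_verb_data] y_v;   // responses on [0, 1]
}
parameters {
  vector[N_verb] mu_v;
  real<lower=0> tau_participant_mu;
  vector[N_participant] z_participant_mu;
  real<lower=0> sigma_obs;                     // hypothetical: sd of the latent normal
  vector[N_verb_data] z_obs;                   // hypothetical: standardized latent deviations
  real<lower=0> sigma_jitter;
}
transformed parameters {
  // non-centered participant intercepts
  vector[N_participant] epsilon_participant_mu
      = tau_participant_mu * z_participant_mu;
  // non-centered latent normal on the logit scale (replaces the Beta)
  vector[N_verb_data] logit_theta
      = mu_v[verb_v] + epsilon_participant_mu[participant_v] + sigma_obs * z_obs;
  vector<lower=0, upper=1>[N_verb_data] theta_v_verb = inv_logit(logit_theta);
}
model {
  mu_v ~ normal(0, 1);
  tau_participant_mu ~ exponential(1);         // assumed prior
  z_participant_mu ~ std_normal();
  sigma_obs ~ exponential(1);                  // assumed prior
  z_obs ~ std_normal();
  sigma_jitter ~ exponential(20);

  for (i in 1:N_verb_data) {
    y_v[i] ~ normal(theta_v_verb[i], sigma_jitter) T[0, 1];  // likelihood
  }
}
```

Because the latent quantity is now a normal on the logit scale, it is a location/scale family, and the usual shift-and-scale trick (location + scale * z with z ~ std_normal()) applies to it directly.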