# Interpret parameters after inv_logit transform in hierarchical non-centered parameterization

Hi!

I’m fairly new to Bayesian modeling in general and Stan in particular, so I apologize if I use incorrect phrasing for what I’m trying to ask…!
I’m working on a hierarchical model of reinforcement learning in a repeated-measures design. I’ve posted the model code (close to my latest version) in this post.
In the given experiment a number of subjects `nS` performed a learning task under `nC` different (drug) conditions (i.e. each subject repeats the learning paradigm `nC` times). The reinforcement model then uses a set of parameters to model the subjects’ behavior in the learning task. Of interest for my current question is the parameter `Arew`, which models the learning rate. It needs to be constrained to the range `[0,1]`.
To account for the hierarchical structure (subject repeated in condition), each parameter has a non-centered parameterization. The baseline condition is parameterized as follows:

```
Arew_normal[s,] = Arew_m + Arew_cond_s * Arew_cond_raw[s,] + Arew_vars[s,1];
```

That is, there is a group mean `Arew_m`, a subject-specific offset `Arew_cond_raw[s,]` scaled by the sd `Arew_cond_s`, and a random part `Arew_vars[s,1]` for each subject that is correlated across conditions.
Each non-baseline condition (that is, each drug condition) has an additional offset `Arew_cond_grp_m` plus a subject-specific random part:

```
Arew_normal[s,v] += cond_vars[s,v,kk] * (Arew_cond_grp_m[kk] + Arew_vars[s,kk+1]);
```

where `cond_vars[s,v,kk]` is a dummy coding the condition.
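To make the structure clearer, the relevant part of my `transformed parameters` block looks roughly like this (simplified from my actual code, so loop bounds and declarations may differ slightly):

```
for (s in 1:nS) {
  // baseline: group mean + scaled subject offset + correlated random part
  Arew_normal[s] = Arew_m + Arew_cond_s * Arew_cond_raw[s] + Arew_vars[s,1];
  for (v in 1:nC) {
    for (kk in 1:(nC - 1)) {
      // add the condition offset (group-level + subject-level)
      // wherever the dummy cond_vars[s,v,kk] is 1
      Arew_normal[s,v] += cond_vars[s,v,kk] * (Arew_cond_grp_m[kk] + Arew_vars[s,kk+1]);
    }
  }
}
```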
Now for the reinforcement model the parameter `Arew` needs to be in the range `[0,1]`. This is achieved by an `inv_logit()` transform:

```
Arew[s,] = inv_logit(Arew_normal[s,]);
```

That means I can interpret the subject-wise parameters `Arew` on my `[0,1]` scale. However, the group-level estimates for the overall mean of the parameter (`Arew_m`) and for the condition effect (`Arew_cond_grp_m`) are only available on the unconstrained scale, before the transform is applied.

What would be the way to interpret these parameters?
That is, how can I tell (and report) what effect my drug manipulation has on my parameter `Arew`?

I’m extending a model that is used in the hBayesDM package. This simpler model, which does not implement repeated measures, uses a non-centered parameterization for the learning rate parameter as follows (using `Phi_approx()` instead of `inv_logit()` to transform the parameter to the range `[0,1]`):

```
A[s] = Phi_approx(A_m + sigma * A_raw[s]);
```

To transform the group-level parameter back to an interpretable scale, it then uses

```
mu_A = Phi_approx(A_m);
```

in the `generated quantities` block.
Is there a way to obtain interpretable values for `Arew_m` and `Arew_cond_grp_m` using this kind of back-transformation? I’d assume

```
mu_Arew = inv_logit(Arew_m);
mu_Arew_cond_grp = inv_logit(Arew_cond_grp_m);
```

to be misleading here, since the parameter `Arew` is built from a sum of the two parameters.
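Or should the back-transformation be applied to the sum of the group-level parameters instead? Something like this in `generated quantities` (just a guess on my part, with `mu_Arew_cond` as a hypothetical `vector[nC - 1]`):

```
// back-transform the sum rather than the individual components
// (my guess; not sure this is the right approach)
real mu_Arew_base = inv_logit(Arew_m);
vector[nC - 1] mu_Arew_cond;
for (kk in 1:(nC - 1)) {
  mu_Arew_cond[kk] = inv_logit(Arew_m + Arew_cond_grp_m[kk]);
}
```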

Any idea on this, or a suggestion on how to better deal with such a case, is highly appreciated!
Thanks a lot!


Hi,
sorry for not getting back to you earlier; the question is relevant and well written. I admit I didn’t try to understand the whole model, but I hope I can provide some hints nevertheless.

If `Arew` can be interpreted as a probability or something sufficiently similar, then the coefficients before the `inv_logit` transform are changes in log-odds (or you can exponentiate them and get odds ratios). This is very commonly done for logistic regression, so you should be able to find a lot of examples around this.
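For example (a sketch using the names from your model, assuming `Arew_cond_grp_m` is a `vector[nC - 1]`), you could compute the odds ratios for the drug conditions directly in `generated quantities`:

```
generated quantities {
  // exponentiating a log-odds difference gives an odds ratio:
  // the odds of Arew (treated as a probability) under each drug
  // condition relative to baseline
  vector[nC - 1] OR_cond = exp(Arew_cond_grp_m);
}
```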

If that’s not enough, or if the `Arew` coefficient further interacts with other parts of the model and you need to account for those interactions, then I think it is most useful to interpret the model via its predictions. Looking at individual coefficients can be seen as a special case of such a prediction: if I regress `outcome ~ treatment` (where treatment has levels `A` and `B`), the coefficient for `treatmentB` represents how big a difference between the averages of `outcome` for the two groups I would see in a hypothetical new set of measurements with no noise/unlimited data.

So for a complicated model I can make predictions for the two treatments and then subtract them to get posterior samples of the difference. This usually forces me to be a bit more precise about the question I am asking: do I want to predict using the non-treatment covariates of the participants I observed, or some special covariate values? Should I take the fitted random effects for the participants I observed, or do I want to predict for a hypothetical new participant (by drawing a new random effect from the fitted hyperprior)? And so on. I think this is actually an advantage.
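In Stan you can do this directly in `generated quantities`. A rough sketch with your parameter names follows; I am guessing at the structure of your random effects, so the standard-normal draws are placeholders for whatever distribution `Arew_cond_raw` and `Arew_vars` actually have in your model:

```
generated quantities {
  // predict Arew for a hypothetical new subject by drawing fresh
  // subject-level effects from the fitted hierarchy
  real new_cond_raw = normal_rng(0, 1); // placeholder for Arew_cond_raw[s,v]
  real new_var_base = normal_rng(0, 1); // placeholder for Arew_vars[s,1]
  real new_var_cond = normal_rng(0, 1); // placeholder for Arew_vars[s,2]

  real Arew_new_baseline =
      inv_logit(Arew_m + Arew_cond_s * new_cond_raw + new_var_base);
  real Arew_new_drug1 =
      inv_logit(Arew_m + Arew_cond_s * new_cond_raw + new_var_base
                + Arew_cond_grp_m[1] + new_var_cond);

  // posterior samples of the drug effect on the interpretable [0,1] scale
  real drug1_effect = Arew_new_drug1 - Arew_new_baseline;
}
```

Summarizing the posterior of `drug1_effect` (mean and credible interval) then gives a directly reportable statement about what the drug does to `Arew` on its natural scale.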

Best of luck with your model and feel free to ask for clarifications!