If I set the par_ratio option to a specific value X (the ratio of the expected number of non-zero coefficients to the expected number of zero coefficients), will this force the model to have exactly that number of non-zero coefficients? Or does the prior merely give roughly that many coefficients the chance to escape shrinkage?

Welcome! The model isn't forced to have exactly this number of non-zero coefficients; it's just a prior.

Typically the horseshoe prior is also not very sensitive to the expected number of non-zero coefficients, so it's usually enough to have a reasonable guess.

Some background info in case it's useful:

If your regression coefficients are called \beta_j, then the horseshoe prior is defined as

\beta_j \sim \text{Normal}(0, \tau^{2}\lambda_{j}^{2})

\lambda_{j} \sim \text{C}^{+}(0, 1)

\tau \sim \text{?}

where \tau is called the "global" shrinkage parameter and \lambda_{j} are called the "local" shrinkage parameters. A useful intuition from the regularized horseshoe paper by Piironen and Vehtari (https://arxiv.org/pdf/1707.01694.pdf) is that the global shrinkage parameter pulls all the estimates towards 0, while the local shrinkage parameters, with their half-Cauchy priors, allow some coefficients to escape that shrinkage.

The prior on \tau has historically just been \tau \sim \text{C}^{+}(0, 1), but in the regularized horseshoe paper, Piironen and Vehtari show that this prior is usually much too wide, and a better prior would be \tau \sim \text{C}^{+}(0, (\text{par_ratio} \frac{1}{\sqrt{N}})^{2}) (when the residual standard deviation is 1).
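As a rough illustration of that scale, here is a minimal Python sketch. It assumes par_ratio is read as p0 / (D - p0), where p0 is your guess for the number of non-zero coefficients and D is the total number of coefficients, following the paper. The function names are made up for this example and are not part of brms.

```python
import math

def par_ratio_from_guess(p0, D):
    """Ratio of expected non-zero to expected zero coefficients (hypothetical helper)."""
    if not 0 < p0 < D:
        raise ValueError("need 0 < p0 < D")
    return p0 / (D - p0)

def global_scale(par_ratio, n_obs, sigma=1.0):
    """Scale tau0 = par_ratio * sigma / sqrt(N) for the half-Cauchy prior on tau."""
    return par_ratio * sigma / math.sqrt(n_obs)

# Example: guess 5 relevant coefficients out of 100, with 400 observations.
ratio = par_ratio_from_guess(p0=5, D=100)   # 5 / 95
tau0 = global_scale(ratio, n_obs=400)       # (5 / 95) / 20
print(f"par_ratio = {ratio:.4f}, tau0 = {tau0:.5f}")
```

Even a fairly generous guess for p0 gives a scale far below 1, which is the sense in which the historical C^{+}(0, 1) prior is too wide.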

The reason for that is a bit lengthy to explain here, but the short version is that you can define something called the "shrinkage factor" \kappa_{j}, which describes how much each \beta_{j} is shrunk away from the maximum likelihood solution and towards zero.

This shrinkage factor has \tau in its definition, and if we want to put a prior on the effective number of non-zero coefficients (which is the sum over all 1 - \kappa_{j}), then the prior on \tau should have that special form.
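To make the link between \tau and the effective number of non-zero coefficients concrete, here is a small prior simulation in plain Python. It assumes the shrinkage factor from the regularized horseshoe paper, \kappa_{j} = 1 / (1 + N \sigma^{-2} \tau^{2} \lambda_{j}^{2}), with predictors scaled to unit variance. As a simplification, it fixes \tau at the scale par_ratio / \sqrt{N} instead of drawing it from its prior. All names are illustrative.

```python
import math
import random

def half_cauchy(rng):
    """Draw from C+(0, 1) via the inverse CDF: tan(pi * u / 2), u ~ Uniform(0, 1)."""
    return math.tan(math.pi * rng.random() / 2.0)

def effective_nonzero(tau, n_obs, D, rng, sigma=1.0):
    """One prior draw of m_eff = sum_j (1 - kappa_j), assuming unit-variance predictors."""
    m_eff = 0.0
    for _ in range(D):
        lam = half_cauchy(rng)  # local shrinkage parameter lambda_j
        kappa = 1.0 / (1.0 + n_obs * tau**2 * lam**2 / sigma**2)
        m_eff += 1.0 - kappa
    return m_eff

rng = random.Random(1)
D, p0, n_obs = 100, 5, 400
tau0 = (p0 / (D - p0)) / math.sqrt(n_obs)   # par_ratio / sqrt(N), with sigma = 1
draws = [effective_nonzero(tau0, n_obs, D, rng) for _ in range(2000)]
mean_m_eff = sum(draws) / len(draws)
print(f"prior mean of m_eff ~ {mean_m_eff:.2f} (guess p0 = {p0})")
```

The average should land close to p0, while individual draws scatter around it. That scatter is why the prior only encourages, and does not force, that many non-zero coefficients.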

The second question is with respect to counting the number of coefficients.

If I have a categorical model, as well as predictors with several different categorical levels, the model learns separate coefficients for every combination of predictor level and outcome category (minus the reference levels). Is this correct? That would mean that when I set the par_ratio option of the horseshoe prior, I have to keep in mind that it is not about how many predictors I expect to influence the outcome, but about all combinations of predictor levels and outcome categories.

Unfortunately I can't really help you with your second question. Your reasoning sounds correct to me (one regression coefficient per level minus 1), but I'm unsure how brms handles the pseudo variance for more than two possible outcomes, so someone else has to chime in.
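For what it's worth, the counting argument itself can be written out as a short sketch. It assumes dummy coding with one reference level per categorical predictor and one reference outcome category, and it ignores intercepts. The predictor names and level counts are invented. Whether brms treats all of these coefficients jointly or separately per outcome category is not settled here, so both counts are shown.

```python
def coefficient_counts(levels_per_predictor, n_outcome_categories):
    """Count dummy-coded coefficients (hypothetical helper, intercepts excluded)."""
    # Each predictor contributes (levels - 1) coefficients per linear predictor.
    per_category = sum(levels - 1 for levels in levels_per_predictor.values())
    # One linear predictor per non-reference outcome category.
    total = per_category * (n_outcome_categories - 1)
    return per_category, total

predictors = {"region": 4, "education": 3, "sex": 2}  # hypothetical level counts
per_category, total = coefficient_counts(predictors, n_outcome_categories=3)
print(f"{per_category} per outcome category, {total} in total")
```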