Dear community,

This is a partial crosspost from Crossvalidated, but I moved from `ordinal`

package to `brms`

. I’m not a statistician, nor a full-time-researcher, just a resident physician working on his doctor’s thesis and I’m trying to analyse data from a survey for my doctorate and I’m having troubles finding the right literature and methods for my problem.

I’m trying to specify a model for the following data:

```
participant_id character unique number
answer ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
attractivity ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
competence ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
honesty ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
intelligence ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
kindness ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
reliability ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
weight_attractivity ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
weight_competence ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
weight_honesty ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
weight_intelligence ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
weight_kindness ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
weight_reliability ordered factor 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
situation factor Internist, Surgeon
face factor Man_Average, Woman_Average, Man_Attractive, Woman_Attractive
disfigurement factor None, Strabismus, Acne, Piercing, Tattoo
```

I have 4009 unique participants with four answer blocks each, so 16036 rows in wide format in total.

Participants of the survey have been shown four pictures of people (average man, average woman, attractive man, attractive woman) with four disfigurements (strabism, acne, piercing, tattoo), crossed in a latin square OR the control group (every face without disfigurement), randomized equally (20% each group). Participants were told they are gonna have a surgical treatment or a medical checkup (randomized) by the physician shown and they had to answer on a Likert-Scale from 0-10 (0 very improbable, 10 very probable) how much they would like to get the medical treatment from this physician. They also had to rate the physicians regarding to attractivity, competence, honesty, intelligence, kindness and reliability again on a Likert-Scale from 0-10 and also how important these attributes for the participants are.

I specified the model as follows:

```
Model <- brm(answer ~ disfigurement + situation + situation:disfigurement +
(1 + disfigurement | face) + (1 | participant_id),
family = cumulative(link = "logit", threshold = "flexible"),
prior = c(set_prior(prior = "normal(0,10)", class = "Intercept"),
set_prior(prior = "normal(0,10)", class = "b"),
set_prior(prior = "cauchy(0,5)", class = "sd")),
control = list(max_treedepth = 15, adapt_delta = 0.99))
```

This is the `summary()`

output:

```
Family: cumulative
Links: mu = logit; disc = identity
Formula: answer ~ disfigurement + situation + situation:disfigurement + (1 + disfigurement | face) + (1 | participant_id)
Data: strabism_answers_situation (Number of observations: 16036)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Group-Level Effects:
~face (Number of levels: 4)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 1.51 0.98 0.57 4.19 1275 1.00
sd(disfigurementStrabism) 1.04 0.71 0.38 2.94 1589 1.00
sd(disfigurementAcne) 1.35 0.97 0.47 3.87 1519 1.00
sd(disfigurementPiercing) 0.18 0.33 0.00 0.87 502 1.00
sd(disfigurementTattoo) 0.54 0.51 0.12 1.92 1267 1.00
cor(Intercept,disfigurementStrabism) -0.28 0.37 -0.87 0.51 3038 1.00
cor(Intercept,disfigurementAcne) -0.26 0.37 -0.84 0.51 2853 1.00
cor(disfigurementStrabism,disfigurementAcne) 0.17 0.38 -0.60 0.81 2575 1.00
cor(Intercept,disfigurementPiercing) -0.08 0.41 -0.80 0.70 4138 1.00
cor(disfigurementStrabism,disfigurementPiercing) 0.01 0.41 -0.74 0.76 4332 1.00
cor(disfigurementAcne,disfigurementPiercing) 0.11 0.41 -0.69 0.82 3206 1.00
cor(Intercept,disfigurementTattoo) -0.01 0.38 -0.72 0.69 3712 1.00
cor(disfigurementStrabism,disfigurementTattoo) 0.21 0.39 -0.57 0.82 3144 1.00
cor(disfigurementAcne,disfigurementTattoo) -0.02 0.38 -0.73 0.70 2712 1.00
cor(disfigurementPiercing,disfigurementTattoo) -0.06 0.41 -0.78 0.72 2313 1.00
~participant_id (Number of levels: 4009)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 1.39 0.03 1.34 1.44 1378 1.00
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept[1] -3.51 0.81 -5.20 -1.77 819 1.00
Intercept[2] -2.71 0.81 -4.39 -0.96 818 1.00
Intercept[3] -1.95 0.81 -3.62 -0.19 818 1.00
Intercept[4] -1.22 0.81 -2.87 0.54 822 1.00
Intercept[5] -0.60 0.81 -2.26 1.16 822 1.00
Intercept[6] 0.12 0.81 -1.56 1.87 827 1.00
Intercept[7] 0.77 0.81 -0.91 2.53 827 1.00
Intercept[8] 1.53 0.81 -0.13 3.29 828 1.00
Intercept[9] 2.57 0.81 0.90 4.33 834 1.00
Intercept[10] 3.42 0.81 1.76 5.19 843 1.00
disfigurementStrabism -1.74 0.62 -3.04 -0.57 1182 1.00
disfigurementAcne -0.86 0.77 -2.37 0.76 1419 1.00
disfigurementPiercing -0.16 0.21 -0.51 0.14 354 1.01
disfigurementTattoo -0.69 0.35 -1.42 -0.01 899 1.00
situationInternist 0.46 0.12 0.24 0.69 686 1.00
disfigurementStrabism:situationInternist 0.75 0.14 0.47 1.03 741 1.00
disfigurementAcne:situationInternist -0.71 0.14 -0.99 -0.43 719 1.00
disfigurementPiercing:situationInternist -0.01 0.14 -0.29 0.27 698 1.00
disfigurementTattoo:situationInternist -0.02 0.14 -0.29 0.26 720 1.00
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```

LOO gives the following output:

```
Computed from 4000 by 16036 log-likelihood matrix
Estimate SE
elpd_loo -34808.9 78.9
p_loo 2951.7 23.1
looic 69617.9 157.8
------
Monte Carlo SE of elpd_loo is 1.1.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 16020 99.9% 942
(0.5, 0.7] (ok) 16 0.1% 2018
(0.7, 1] (bad) 0 0.0% <NA>
(1, Inf) (very bad) 0 0.0% <NA>
All Pareto k estimates are ok (k < 0.7).
```

`check_hmc_diagnostics()`

on the `stanfit`

object yields the following result:

```
Divergences:
0 of 4000 iterations ended with a divergence.
Tree depth:
0 of 4000 iterations saturated the maximum tree depth of 15.
Energy:
E-BFMI indicated no pathological behavior.
```

The wall-time on the HPC cluster I have access to is about 9h for this model with four cores and 16Gb RAM.

Now my questions:

- Do I need to do additional diagnostics? If yes: What do I have to do?
- How can I check the goodness of fit and prevent overfitting?
- Do you have ideas for other model specifications?
- My primary hypothesis is, that strabismus leads to a lower score than no disfigurement. How can I check this?
- Does the interaction between
`disfigurement`

and`situation`

presents difficulties in answering question three? - How can I plot and communicate the
`face`

effect?