(Apologies in advance if this has already been discussed and I missed it when searching.)

I have a neuroimaging model in which we fit sensor data, infer a latent source vector, and use the latter to predict an expert's label of each region as healthy or pathological:

```
data {
  int nr;                                   // number of brain regions
  int ns;                                   // number of sensors
  vector[ns] sensors;                       // electrodes
  matrix[ns, nr] gain;                      // biophysical source->sensor forward model
  int<lower=0, upper=1> expert_labels[nr];  // 0 -> healthy, 1 -> pathological
  real slope;
  real intercept;
}
parameters {
  vector[nr] regions;
}
model {
  regions ~ normal(0, 1);
  sensors ~ normal(gain * exp(regions), 1);
  expert_labels ~ bernoulli_logit(slope * regions + intercept);
}
```

This works OK, but in practice `nr = 164` and, on average, fewer than 10 of the `expert_labels` are `1`. This means true negatives dominate: a classifier that labels every region healthy already exceeds 90% accuracy. For this reason, we report (to a regulatory agency) precision and recall rather than sensitivity and specificity. (Recall is the same quantity as sensitivity, so in effect precision replaces specificity.)
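For concreteness, here is roughly how precision and recall could be computed per posterior draw. This is a minimal sketch, not our actual reporting code: it assumes predictions are labels sampled with `bernoulli_logit_rng` (rather than thresholded probabilities), and the names `tp`, `fp`, `fn`, `prec`, `rec` and `pred` are made up for illustration.

```stan
generated quantities {
  // Confusion counts for one posterior draw. True negatives are deliberately
  // not tracked, because neither precision nor recall uses them.
  int tp = 0;  // predicted pathological, expert says pathological
  int fp = 0;  // predicted pathological, expert says healthy
  int fn = 0;  // predicted healthy, expert says pathological
  real prec;
  real rec;
  for (i in 1:nr) {
    // Assumption: a prediction is a sampled label, not a thresholded probability.
    int pred = bernoulli_logit_rng(slope * regions[i] + intercept);
    if (pred == 1 && expert_labels[i] == 1) {
      tp += 1;
    } else if (pred == 1 && expert_labels[i] == 0) {
      fp += 1;
    } else if (pred == 0 && expert_labels[i] == 1) {
      fn += 1;
    }
  }
  // Guard against 0/0 when a draw has no predicted or no actual positives.
  prec = (tp + fp > 0) ? tp * 1.0 / (tp + fp) : not_a_number();
  rec = (tp + fn > 0) ? tp * 1.0 / (tp + fn) : not_a_number();
}
```

Neither quantity involves the true-negative count, which is why they are not inflated by the large number of healthy regions.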

My question is how one can modify the model to “focus” (condition?) on the true positives, false positives, and false negatives, which are far more important to the practitioner than true negatives. I looked at weighted regression, for instance, but the feedback on the forum is negative, since the result is not a well-formed Bayesian model. I considered excluding the true negatives from prediction, e.g.

```
model {
  ...
  for (i in 1:nr) {
    real y = slope * regions[i] + intercept;
    // skip the likelihood term for predicted true negatives
    if (!(inv_logit(y) < 0.5 && expert_labels[i] == 0))
      expert_labels[i] ~ bernoulli_logit(y);
  }
}
```

but this doesn’t seem right, not least because the model is for the probability of a given label, not the label itself. I have the intuition that this should be solvable with a concept similar to a sparsifying (e.g. horseshoe) prior, but I have no clue how to realize that idea.
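To spell out my worry about the loop above: the condition `inv_logit(y) < 0.5` is the same as $y_i < 0$, so as far as I can tell the loop adds the following to the log density. The symbols $y_i$, $\ell_i$ and $\sigma$ are just my shorthand for `slope * regions[i] + intercept`, `expert_labels[i]` and `inv_logit`:

$$
\sum_{i=1}^{\mathrm{nr}} \bigl[1 - \mathbb{1}(y_i < 0)\,\mathbb{1}(\ell_i = 0)\bigr]\,\bigl[\ell_i \log \sigma(y_i) + (1 - \ell_i)\log\bigl(1 - \sigma(y_i)\bigr)\bigr]
$$

For a region with $\ell_i = 0$, the contribution is $0$ when $y_i < 0$ and $\log(1 - \sigma(y_i)) \le -\log 2$ when $y_i \ge 0$, so the log density jumps by $\log 2$ at $y_i = 0$. The indicator depends on the parameters, which is presumably a problem for gradient-based sampling.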

I would be really keen on any suggestions, links, pointers to a BDA3 chapter, etc. Thanks in advance!