Changes in cumulative logit model intercepts after adding discrimination terms

When I add random discrimination coefficients to a cumulative logit model with crossed random effects, the absolute values of the thresholds increase markedly (they double or treble). Why is this?
Thanks for your help

Here is a toy example:

# 100 patients (pts) crossed with 10 doctors (drs); the residual SD grows with the
# doctor index, so doctors differ in how noisily the latent score maps to categories
set.seed(111); pts=rep(1:100, each=10); drs=rep(1:10, 100); table(pts,drs)
y=rnorm(1000, mean=pts/100+drs/100, sd=drs/2); quantile(y)
y=ordered(cut(y,quantile(y,c(0,.1,.3,.6,.9,1)),include.lowest=T,labels=F))
tapply(as.numeric(y),drs,var); tapply(as.numeric(y),pts,var)
data=as.data.frame(cbind(drs,pts,y))

library(brms)

mp1=c(set_prior('normal(-1,2)',class='Intercept',coef='1'),
      set_prior('normal(0,2)',class='Intercept',coef='2'),
      set_prior('normal(1,2)',class='Intercept',coef='3'),
      set_prior('normal(2,2)',class='Intercept',coef='4'),
      set_prior('normal(.5,.5)',class='sd'))

m1=brm(y~1+(1|pts)+(1|drs), data=data, chains=2, cores=2, family=cumulative, prior=mp1); summary(m1)

# same priors as mp1, plus a prior on the disc intercept
mp2=c(set_prior('normal(0,2)',class='Intercept',dpar='disc'),
      set_prior('normal(-1,2)',class='Intercept',coef='1'),
      set_prior('normal(0,2)',class='Intercept',coef='2'),
      set_prior('normal(1,2)',class='Intercept',coef='3'),
      set_prior('normal(2,2)',class='Intercept',coef='4'),
      set_prior('normal(.5,.5)',class='sd'))

mf2=bf(y~1+(1|pts)+(1|drs), disc~1+(1|drs), family=cumulative); get_prior(mf2, data=data)

m2=brm(mf2, prior=mp2, data=data, chains=2, cores=2, family=cumulative,
       control=list(adapt_delta=.99), iter=4e3, warmup=2e3); summary(m2)

 Family: cumulative
  Links: mu = logit; disc = log
Formula: y ~ 1 + (1 | pts) + (1 | drs)
         disc ~ 1 + (1 | drs)
   Data: data (Number of observations: 1000)
Samples: 2 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects:
~drs (Number of levels: 10)
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)          0.11      0.10     0.00     0.38 1.00     1977     2116
sd(disc_Intercept)     0.95      0.28     0.57     1.63 1.00     1034     2030

~pts (Number of levels: 100)
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.47      0.14     0.22     0.78 1.00     1024     1558

Population-Level Effects:
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]      -3.90      0.98    -5.89    -2.17 1.00     1130     1656
Intercept[2]      -1.10      0.29    -1.72    -0.58 1.00     1218     1990
Intercept[3]       0.53      0.17     0.25     0.90 1.00     1561     2195
Intercept[4]       4.16      1.06     2.27     6.38 1.00     1136     1573
disc_Intercept    -0.30      0.37    -1.01     0.48 1.00      916     1352

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).


> summary(m1)

 Family: cumulative
  Links: mu = logit; disc = identity
Formula: y ~ 1 + (1 | pts) + (1 | drs)
   Data: data (Number of observations: 1000)
Samples: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 2000

Group-Level Effects:
~drs (Number of levels: 10)
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.08      0.06     0.00     0.23 1.00     1319      942

~pts (Number of levels: 100)
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.39      0.10     0.18     0.57 1.01      585      465

Population-Level Effects:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -2.26      0.12    -2.49    -2.03 1.00     1487     1263
Intercept[2]    -0.88      0.09    -1.05    -0.71 1.00     1958     1799
Intercept[3]     0.42      0.08     0.26     0.59 1.00     2085     1862
Intercept[4]     2.27      0.12     2.03     2.50 1.00     2438     1851

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The brms parameterization of the cumulative model is disc * (thres - mu), so if disc != 1 it changes the scale of the thresholds as well.
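To make the scale argument concrete, here is a minimal sketch in plain R (numbers are illustrative only: the thresholds are the ones from the m1 summary above and the latent mean mu is made up). Dividing both the thresholds and the latent mean by disc leaves every cumulative probability unchanged, which is why a disc below 1 goes hand in hand with larger-looking thresholds:

thres <- c(-2.26, -0.88, 0.42, 2.27)       # thresholds on a scale where disc = 1 (from m1)
mu    <- 0.30                              # an arbitrary latent mean on that same scale
disc  <- 0.5                               # a smaller, hypothetical discrimination
plogis(1    * (thres        - mu))         # cumulative probabilities P(Y <= k) with disc = 1
plogis(disc * (thres / disc - mu / disc))  # identical probabilities once thres and mu are rescaled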

Thanks - but I still don’t fully understand.
In my toy example, the intercept for disc is -0.30. How does the parameterization disc * (thres - mu) cause disc to inflate the absolute values of thres?
What I would really like to know is how I can use the thresholds and the covariate coefficients to predict means in my model, given that the disc parameter changes (inflates) all of them to apparently unrealistic values.
Thanks again for your help.

Disc is modelled on the log scale, so the -0.30 is on the log scale (the implied discrimination is exp(-0.30) ≈ 0.74). The behavior you see is normal for any 2PL model: a small disc changes (here, stretches) the scale of the thresholds.
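If the goal is predicted means (or category probabilities) for your real model, one way around the scaling issue is not to read the thresholds off the summary at all, but to let brms apply disc for you. A hedged sketch, assuming the m2 fit from above: posterior_epred() (fitted(..., summary = FALSE) in older brms versions) returns an array of draws x observations x categories for ordinal families, with disc already applied internally, and the expected-category line below is just one possible definition of a "mean":

pp <- posterior_epred(m2)                                     # draws x 1000 observations x 5 categories
dim(pp)
Ey <- apply(pp, c(1, 2), function(p) sum(seq_along(p) * p))   # expected category (1..5) per draw and observation
head(colMeans(Ey))                                            # posterior mean of E[Y] for the first few rows
exp(-0.30)                                                    # disc_Intercept is on the log scale: implied disc ~ 0.74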