Mildly over-disperse logistic model

jamiebernardin · January 15, 2018, 12:01am

Hi - I have a basic logistic model that fits pretty well (without the noise term). It does show a bit more dispersion at very high and very low success rates (0-5% and 95-100%), so I’m adding some noise:

When I run it, I get pretty tight parameter samples for the linear feature weights (very close to those I’d get without the added noise term), but the variance of the noise term does not converge at all.

So two questions: first, what’s going wrong with the convergence of sigma (I’d guess it would want to settle close to zero)? Second, is there a better way to deal with a little bit of random noise on top of a linear logistic model?
Thank you!!

bgoodri · January 15, 2018, 12:33am

You need to non-center theta:

sigma ~ normal(0, 2);
theta ~ normal(0, 1);
y ~ bernoulli_logit(x * beta + sigma * theta);
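
As a rough illustration of what non-centering changes, here is a small Python sketch (not Stan). It assumes the original model drew the noise directly as normal(0, sigma), which is not shown in this thread; the function names are hypothetical. Both versions give the same distribution for the noise added to x * beta, but the non-centered one samples a unit-scale theta and multiplies by sigma afterwards, so the sampled variable no longer depends on sigma in the prior.

```python
import random
import statistics


def centered_noise(rng, sigma, n):
    """Centered form (assumed original): draw the noise directly from normal(0, sigma)."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def non_centered_noise(rng, sigma, n):
    """Non-centered form: draw theta ~ normal(0, 1), then scale by sigma."""
    theta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sigma * t for t in theta]


rng = random.Random(0)   # fixed seed so the run is repeatable
sigma = 0.3              # illustrative value, not taken from the thread
centered = centered_noise(rng, sigma, 20000)
non_centered = non_centered_noise(rng, sigma, 20000)

# Both spreads should be close to sigma: the two forms imply the same noise.
print(statistics.pstdev(centered), statistics.pstdev(non_centered))
```

The benefit shows up in the sampler rather than in the implied distribution: when sigma is small, the centered form couples the scale of every noise term to sigma, which tends to be hard to explore.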

avehtari · January 15, 2018, 4:01am

sigma is not well identifiable.
Adding sigma makes the model go towards probit, i.e., having shorter tails and being less dispersed than logit.

see, e.g., http://www.cs.toronto.edu/~radford/ftp/val6gp.pdf
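
One way to see the drift towards probit is through the latent-variable view, sketched here in Python under stated assumptions: y = 1 when x * beta + sigma * theta + e > 0, with e standard logistic and theta standard normal, independent of each other. The implied link is then the CDF of sigma * theta + e. Cumulants of independent variables add and a Gaussian has zero fourth cumulant, so the excess kurtosis of that combined noise can be written in closed form. It starts at 1.2 (pure logistic) and shrinks towards 0 (Gaussian, i.e. probit) as sigma grows. The function name is hypothetical.

```python
import math

LOGISTIC_VAR = math.pi ** 2 / 3        # variance of the standard logistic
LOGISTIC_K4 = 2 * math.pi ** 4 / 15    # fourth cumulant of the standard logistic


def latent_noise_excess_kurtosis(sigma):
    """Excess kurtosis of sigma * theta + e, theta ~ N(0, 1), e ~ logistic(0, 1)."""
    total_var = LOGISTIC_VAR + sigma ** 2   # variances add
    return LOGISTIC_K4 / total_var ** 2     # the Gaussian part adds no fourth cumulant


for sigma in (0.0, 1.0, 2.0, 4.0):
    print(sigma, round(latent_noise_excess_kurtosis(sigma), 4))
```

With one Bernoulli outcome per observation, the data only inform this marginal link, and much of the effect of sigma can be absorbed by rescaling beta. That is consistent with sigma being weakly identified.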

jamiebernardin · January 15, 2018, 11:58pm

Thanks for the input!

@bgoodri - I had taken out the intercept parameter, and you made me realize I need it for the variance (the problem is symmetric and I was not thinking fully Bayesian). Your suggestion didn’t quite converge, but a simple real parameter did quite well, resulting in a slight bias with spread.

@avehtari, it is true that I am struggling with the slope of the link function at high/low rates. I had tried probit and it was much worse. I want to smooth out the link function, but adding stricter priors spooks out the linear regression betas.

For now, I am good with this:

with this fit:

avehtari · January 16, 2018, 3:23am

If you want something with thicker tails than logit, use robit. Probit is the CDF of a Gaussian, robit is the CDF of a t distribution, and logit is close to robit with 7 degrees of freedom. Using robit with df<7 will make it more robust to outliers.
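
A small Python sketch of the t-CDF link described above (usually called robit), using only the standard library. The scale 1.5484 is an assumption taken from the usual statement that a t distribution with 7 degrees of freedom and that scale approximates the logistic. The t CDF is computed by plain Simpson integration rather than a library routine, and all function names are hypothetical.

```python
import math


def inv_logit(eta):
    """Logistic CDF, the usual logit inverse link."""
    return 1.0 / (1.0 + math.exp(-eta))


def student_t_pdf(t, df):
    """Density of a standard Student-t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def student_t_cdf(t, df, steps=2000):
    """Standard Student-t CDF via Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    if a == 0.0:
        return 0.5
    h = a / steps                      # steps must be even for Simpson's rule
    total = student_t_pdf(0.0, df) + student_t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


ROBIT_SCALE = 1.5484  # assumed scale that makes t(7) track the logistic


def robit(eta, df, scale=ROBIT_SCALE):
    """Robit inverse link: scaled Student-t CDF of the linear predictor."""
    return student_t_cdf(eta / scale, df)


# How far apart are logit and robit(7) on a grid from -10 to 10?
grid = [x / 4 for x in range(-40, 41)]
max_gap = max(abs(inv_logit(e) - robit(e, 7)) for e in grid)
print("max |logit - robit(7)|:", max_gap)

# Tail probability at a very negative linear predictor, by link.
print("logit:", inv_logit(-8.0))
for df in (7, 3):
    print("robit df =", df, ":", robit(-8.0, df))
```

Lower df leaves more probability for a success at a very negative linear predictor (and for a failure at a very positive one), which is what makes the fit less sensitive to surprising outcomes at the extremes.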

jamiebernardin · January 17, 2018, 2:27am

Thanks, that’s awesome and exactly what I was looking for. I think over-dispersion is not the problem; it’s more like poor fitting at the extremes.
