I am working on N measures of a binary response y (0s and 1s), measured with an environamental variable x (temperature). I suppose the existence of a \bar x that acts like a “switch temperature”.
But I do not know how to model this behavior to infer \bar x.
Obviously the prior for \bar{x} is something you might know better. At least that’s the simplest I could think of which might describe to a degree what you want.
Have you also tried using a \beta parameter which would quantify the strength of the switch? Could be interesting to compare these two models. But I reckon it depends on how much data during the switch you have.
I just thought about it a bit more and and what I wrote before would cancel itself out. So the model should actually just be (as so often) a linear regression:
\theta = (x-\bar{x}) \\
y \sim BernoulliLogit(\theta + \beta \theta)
Otherwise we kick out the intercept and a \beta=0 would explain best a more or less even distribution of 0s and 1s (p=0.5). I am not sure about your data input but if you really have \bar{x}=23.56 I am assuming that you didn’t standardise the predictor (temperature). Then it might be a good idea to either adjust the prior for \bar{x} or perform standardisation. Standardisation might make your choice of priors easier if you don’t have a concrete idea.
Thanks @danielparthier , I think that with with your help I’m getting closer. Defining x^*=\frac{x-M}{S} where M is the mean and S the standard deviation of the environmental measures, and following your advice, I defined the model using \hat x as the standardized value for the threshold:
\theta = x^*-\hat x\\
y \sim BernoulliLogit(\theta + \beta \theta);
and calculating in the generated quantities \bar x=\hat x * S + M. The output of this regression model is:
which, using the transformation written above, leads to a \bar x=24.1 \pm 0.20 °C, that sounds good with the idea we had before starting our experiment. So, so far, so good.
Now with such results I would expand the model to other environmental variables and think about variable importance and model selection. Do you know how to translate this survival model with a threshold to brms or stanarms?
You might be interested in the nlf functionality of brms. The parts which you have now can be set up as nlf() and the additional variables can be added by using a model matrix in the lf() for \hat{x} and \beta. I haven’t used brms for such things in a long time, but the example code in the docs looks similar to what you need.
How many variables do you have?
# generate some fake data for testing
x <- rlogis(1000, 0, 1) # temperature
theta <- 0.5 # threshold
y <- rbinom(1000, 1, plogis(x, theta)) # responses
data <- data.frame(x = x, y = y)
# logistic regression model
fit <- brm(bf(y ~ beta * (x - theta), beta ~ 1, theta ~ 1, nl = T),
data = data,
family = bernoulli(link = "logit"))
The point here is that the logistic regression in brms assumes that the threshold is 0. So you just need to shift your predictor variable in a non-linear formula, and then predict the coefficients.
(note that I used a slightly different notation in which \theta is the threshold, as this is a more typical usage, at least in my field. But you can name it whatever you want)
It’s been a while, and I would like to thank both @Ven_Popov and @danielparthier for all the help given to me.
With a complex model involving more variables, brms just crashed my R session. I tried multiple times but i did not got any result at all, so I gave up and I stick to cmdstanr again. Maybe I was defining the formula in bf(..., nl= TRUE) not correctly… but I was running out of time and so I quit.
\theta = \left( x-\bar{x}\right) \\
y \sim BernoullLogit(\theta + \beta\theta)
gave me credible thresholds for all the variables.
So: i can mark the question as solved . As always I post here the code for everyone interested or for future reference: