# Including a probability as predictor for logistic regression

I’m building a Stan model where each binary observation y_{i} is assumed to arise from some base probability q_{i} that is further modified by my other predictors. I have a noisy point estimate of each q_{i}, and my question is how best to include this information in the model.

Right now, my model looks like this (I use a regularized horseshoe prior for my betas and student t priors for k and the intercept b_0):

```stan
data {
  int<lower=1> N;
  int<lower=1> M;

  int<lower=0, upper=1> y[N];
  matrix[N, M] X;
  vector[N] q;
}

parameters {
  real b0;
  vector[M] beta;
  real k;
}

model {
  y ~ bernoulli_logit(b0 + k * logit(q) + X * beta);
}
```

This usually works fine, but sometimes my point estimates for q_{i} are exactly 0 or 1, and then my model blows up because logit(1) is infinite. I often don’t trust these point estimates of q_{i}, and I want the model to decide to what extent it makes use of this information (that is what the parameter k is meant to achieve).

I guess the quick and dirty way is to truncate each q_{i} to the range [0.01, 0.99] (or whatever), but surely there has to be a better way?
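For reference, that quick-and-dirty clamping could live in a `transformed data` block so the likelihood never sees an infinite logit (the 0.01/0.99 cutoffs are just the values mentioned above, and `logit_q` is a name made up for this sketch):

```stan
transformed data {
  vector[N] logit_q;
  for (i in 1:N)
    // clamp the point estimate away from 0 and 1 before taking the logit
    logit_q[i] = logit(fmin(fmax(q[i], 0.01), 0.99));
}
```

The model block would then use `k * logit_q` instead of `k * logit(q)`, so the clamping happens once at data preparation rather than on every leapfrog step.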

If you know that the observed `q` is noisy, you could treat `q` as a parameter and model `q_obs` (what is now `q`) as an observation with a distribution that does not vanish at the edges (or, depending on the problem, model `logit(q)` directly this way).
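To sketch what that could look like: if each point estimate happens to be an empirical frequency s_{i}/n_{i} (an assumption; the names `n`, `s`, and `logit_q` below are made up for illustration), a binomial observation model keeps the likelihood finite even when the estimate is exactly 0 or 1, because the latent `logit_q[i]` is merely pulled toward the boundary rather than pinned to it:

```stan
data {
  int<lower=1> N;
  int<lower=1> M;
  int<lower=0, upper=1> y[N];
  matrix[N, M] X;
  int<lower=1> n[N];                 // trials behind each point estimate (assumed available)
  int<lower=0> s[N];                 // successes, so the old q[i] = s[i] / n[i]
}

parameters {
  real b0;
  vector[M] beta;
  real k;
  vector[N] logit_q;                 // latent "true" base probability, logit scale
}

model {
  logit_q ~ normal(0, 2.5);          // weakly informative prior on the true q
  s ~ binomial_logit(n, logit_q);    // finite likelihood even at s = 0 or s = n
  y ~ bernoulli_logit(b0 + k * logit_q + X * beta);
}
```

Because `logit_q` is now a parameter, the `k * logit_q` term still lets the model discount the base probabilities, and the shrinkage toward the prior automatically grows as n_{i} shrinks; your horseshoe and Student-t priors would go in the model block as before.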