# Logistic Model

Hello. I have a model trying to predict whether a prospective customer will convert to an actual customer. Usually I’d use a decision tree for this, but I want to convert…myself into a Bayesian modeler. The problem is that there are only 327 conversions out of about 9,000 observations. This leaves me with the warning:

```
UserWarning: Your data appears to have a single value or no finite values
  warnings.warn("Your data appears to have a single value or no finite values")
```

Below is what the PPC looks like *(plot not shown)*. Is this a “lack of conversions” problem, or is my model the problem?

```
data {
  int<lower=0> N;                       // number of prospective customers
  int<lower=0, upper=1> issued_flag[N]; // conversion indicators
  vector[N] n_lmt1;   // normalized limit 1
  vector[N] n_pp_lmt; // normalized personal property limit
  vector[N] n_age;    // normalized age of insured
  vector[N] n_aoh;    // normalized age of home
}

parameters {
  real<lower=0> mu;
  real limit1_beta;
  real pplimit_beta;
  real age_beta;
  real aoh_beta;
}

model {
  mu ~ normal(0, 3);
  limit1_beta ~ normal(0, 5);
  pplimit_beta ~ normal(0, 5);
  age_beta ~ normal(0, 5);
  aoh_beta ~ normal(0, 5);
  issued_flag ~ bernoulli_logit(mu + n_lmt1*limit1_beta + n_pp_lmt*pplimit_beta + n_age*age_beta + n_aoh*aoh_beta);
}

generated quantities {
  vector[N] eta = mu + n_lmt1*limit1_beta + n_pp_lmt*pplimit_beta + n_age*age_beta + n_aoh*aoh_beta;
  int y_rep[N];
  if (max(eta) > 20) {
    // guard against overflow in the rng
    print("max eta too big: ", max(eta));
    for (n in 1:N)
      y_rep[n] = -1;
  } else {
    for (n in 1:N)
      y_rep[n] = bernoulli_rng(eta[n]);
  }
}
```

I would write the model differently: define a vector in the model block that holds

`mu + n_lmt1*limit1_beta + n_pp_lmt*pplimit_beta + n_age*age_beta + n_aoh*aoh_beta`

and pass that variable to the `bernoulli_logit` function.

```
model {
  vector[N] a;

  // equivalently, in one vectorized statement:
  // vector[N] a = mu + n_lmt1*limit1_beta + n_pp_lmt*pplimit_beta + n_age*age_beta + n_aoh*aoh_beta;
  for (n in 1:N) {
    a[n] = mu + n_lmt1[n]*limit1_beta + n_pp_lmt[n]*pplimit_beta + n_age[n]*age_beta + n_aoh[n]*aoh_beta;
  }

  mu ~ normal(0, 3);
  limit1_beta ~ normal(0, 5);
  pplimit_beta ~ normal(0, 5);
  age_beta ~ normal(0, 5);
  aoh_beta ~ normal(0, 5);

  issued_flag ~ bernoulli_logit(a);
}
```

If I were a better Stan user I could tell you what your code did, but whatever it did I think it was a little off. If you shared your model output it would be easier to see if your model ran correctly.

My guess here is that there is a problem with the way you are passing your data. In your Stan code, there’s a problem in your generated quantities (see below), but I don’t think it could cause the error you are seeing (what interface are you using?).

In generated quantities, you have `y_rep[n] = bernoulli_rng(eta[n]);` but `eta` is on the logit scale and you need to use `bernoulli_logit_rng`. This will produce warnings like

```
Chain 1 Exception: bernoulli_rng: Probability parameter is -0.0278299, but must be in the interval [0, 1] (in '/var/folders/j6/dg5l3gl11xb9v8w61w99ngh80000gn/T/RtmpjULs5c/model-27aa267750.stan', line 38, column 8 to column 41)
```

but it shouldn’t yield the behavior that you’re seeing.
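The distinction is that `bernoulli_rng` expects a probability in [0, 1], while `eta` is on the logit scale. Here is a small Python sketch (not Stan) of what `bernoulli_logit_rng` effectively does — apply the inverse logit before drawing:

```python
import math
import random

def inv_logit(x):
    # numerically stable inverse logit: maps any real logit to (0, 1)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))

def bernoulli_logit_rng(eta, rng=random.Random(0)):
    # what Stan's bernoulli_logit_rng does conceptually:
    # convert the logit to a probability first, then draw
    return int(rng.random() < inv_logit(eta))

eta = -0.0278299           # the offending value from the exception above
p = inv_logit(eta)         # ≈ 0.493: a valid probability, unlike eta itself
print(round(p, 3))         # 0.493
```

Passing `eta` straight to `bernoulli_rng` treats a logit like a probability, which is why you see values just below 0 rejected.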

Does your model output look like the summary below?

```
                    Mean      MCSE   StdDev           5%         50%
name
lp__         -6300.00000  0.049000  1.70000 -6300.000000 -6300.00000
mu               0.00023  0.000004  0.00023     0.000014     0.00017
limit1_beta     -0.04200  0.005200  0.20000    -0.370000    -0.03900
pplimit_beta     0.05100  0.005200  0.20000    -0.280000     0.04800
age_beta        -0.00053  0.000360  0.02100    -0.035000    -0.00062
...                  ...       ...      ...          ...         ...
y_rep            0.00000       NaN  0.00000     0.000000     0.00000
y_rep            0.00000       NaN  0.00000     0.000000     0.00000
y_rep            0.00000       NaN  0.00000     0.000000     0.00000
y_rep            0.00000       NaN  0.00000     0.000000     0.00000

                      95%   N_Eff  N_Eff/s  R_hat
name
lp__         -6300.00000  1200.0      4.9    1.0
mu               0.00068  3100.0     13.0    1.0
limit1_beta      0.29000  1500.0      6.2    1.0
pplimit_beta     0.38000  1500.0      6.2    1.0
age_beta         0.03400  3400.0     14.0    1.0
...                  ...     ...      ...    ...
y_rep            0.00000     NaN      NaN    NaN
```
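One plausible source of that `UserWarning` is a `y_rep` vector that collapses to a single value — either all `-1` from the overflow guard, or all `0` as in the summary above — which a KDE-based PPC plot cannot turn into a density. A minimal Python sketch of the guard's behavior (mirroring the generated-quantities logic, not your actual draws):

```python
import numpy as np

rng = np.random.default_rng(0)

def ppc_draw(eta):
    """Mirror the generated-quantities block: sentinel -1 if any logit is huge."""
    eta = np.asarray(eta, dtype=float)
    if eta.max() > 20:
        # the guard fires for the whole vector, so every entry is the same value
        return np.full(eta.shape, -1, dtype=int)
    p = 1.0 / (1.0 + np.exp(-eta))  # inverse logit
    return rng.binomial(1, p)

# A single extreme logit makes every replicated outcome the constant -1 --
# exactly the kind of "single value" a KDE-based plot warns about.
draw = ppc_draw([-3.3, 25.0, -3.5])
print(np.unique(draw))  # [-1]
```

If your plotting code hands such a constant vector to a density estimator, a warning like the one you quoted is the expected result.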

Thank you. Here is the updated model I used:

```
data {
  int<lower=0> N;                       // number of policies
  int<lower=0, upper=1> issued_flag[N]; // conversion indicators
  vector[N] n_lmt1;   // normalized limit 1
  vector[N] n_pp_lmt; // normalized personal property limit
  vector[N] n_age;    // normalized age of insured
  vector[N] n_aoh;    // normalized age of home
}

parameters {
  real<lower=0> mu;
  real limit1_beta;
  real pplimit_beta;
  real age_beta;
  real aoh_beta;
}

model {
  vector[N] a;

  limit1_beta ~ normal(0, 5);
  pplimit_beta ~ normal(0, 5);
  age_beta ~ normal(0, 5);
  aoh_beta ~ normal(0, 5);

  for (n in 1:N) {
    a[n] = mu + n_lmt1[n]*limit1_beta + n_pp_lmt[n]*pplimit_beta + n_age[n]*age_beta + n_aoh[n]*aoh_beta;
  }

  issued_flag ~ bernoulli_logit(a);
}

generated quantities {
  vector[N] eta = mu + n_lmt1*limit1_beta + n_pp_lmt*pplimit_beta + n_age*age_beta + n_aoh*aoh_beta;
  int y_rep[N];
  if (max(eta) > 20) {
    // guard against extreme logits before generating y_rep
    print("max eta too big: ", max(eta));
    for (n in 1:N)
      y_rep[n] = -1;
  } else {
    for (n in 1:N)
      y_rep[n] = bernoulli_logit_rng(eta[n]);
  }
}
```

Here is the cumulative PPC output and the regular PPC output *(plots not shown)*.

Do you think I should be using a different distribution for a convert/not convert target variable?

Since the outcome variable is binary, you are definitely using the right distribution (Bernoulli). It’s very hard to see what’s going on in your graphical posterior predictive plots because the only interesting outcomes are 0 and 1, and these don’t show up adequately on your continuous abscissa. Even from these graphs, however, it’s clear that something is going seriously wrong here.
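For a binary outcome, a scalar test statistic is usually easier to read than a density overlay. One option is to compare the observed conversion rate within covariate bins against the same rate in each replicated dataset. A sketch with simulated stand-ins (your real `y`, `y_rep`, and covariate draws aren't shown here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins: y is the observed 0/1 outcome, y_rep is
# (draws x N) posterior predictive replications, x is one covariate.
N, draws = 9000, 200
x = rng.normal(size=N)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(3.3 - 0.5 * x)))
y_rep = rng.binomial(1, 0.036, size=(draws, N))

def rate_by_bin(y, x, edges):
    """Conversion rate within each covariate bin: a PPC test statistic."""
    idx = np.digitize(x, edges)
    return np.array([y[idx == k].mean() for k in range(len(edges) + 1)])

edges = np.quantile(x, [0.25, 0.5, 0.75])   # quartile bins of the covariate
observed = rate_by_bin(y, x, edges)
replicated = np.array([rate_by_bin(r, x, edges) for r in y_rep])

# Compare the observed per-bin rate to the 95% predictive interval per bin:
lo, hi = np.percentile(replicated, [2.5, 97.5], axis=0)
print(observed.round(3), lo.round(3), hi.round(3))
```

An observed rate falling outside the predictive interval in some bin points at misspecification that a plot of raw 0/1 densities would hide.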

What’s going wrong is that you’ve declared `mu` with `<lower=0>` in the parameters block, but `mu` is a logit-scale intercept. Given that only a small fraction of customers convert, `mu` needs to take a clearly negative value. Note that once `mu` is given a prior consistent with the data, posterior predictive checks based on the total frequency of 1’s and 0’s in the response will no longer be sensitive to model misspecification; this is related to the advice (e.g., in @jonah et al.’s paper, https://arxiv.org/pdf/1709.01449.pdf) to use statistics of the posterior predictive distribution that are orthogonal to the model parameters.
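As a quick check on the scale of `mu`: with 327 conversions in 9,000 observations, the empirical logit of the base rate is around -3.3, which the `<lower=0>` constraint rules out entirely:

```python
import math

conversions, n = 327, 9000
p_hat = conversions / n                            # ≈ 0.0363 observed conversion rate
empirical_logit = math.log(p_hat / (1 - p_hat))    # ≈ -3.28, well below zero
print(round(p_hat, 4), round(empirical_logit, 2))  # 0.0363 -3.28
```

A sensible intercept prior here is an unconstrained `real mu;` with something like `mu ~ normal(0, 3)`, which comfortably covers -3.3.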