Is it ok to use the same independent outcome variable twice in a model?

Thanks for this @simonbrauer!

Your hunch re unbiased estimates makes sense.

I’ve started checking what you suggest (95% incl true value), but on my actual model instead of the simplified simulations and that is taking a while to fully run, but the first results indicate no issues.
The sample sizes for my actual model are around n = 50.

Interesting to hear about the results of your simulation though, I’ll keep that in mind when analyzing my results.

One reason I was coming back to the double likelihood approach is that @jsocolar’s suggestion started struggling when the coefficient is close to 0. It seems like that introduces unidentifiability between psi and the coefficient. I’ll try your integration approach as well though, maybe that’s more robust somehow.

1 Like

The integration approach seems to do the trick, it recovers all parameters and doesn’t struggle when the coefficient is (close to) 0.

Here is the code I used (with a rough integration approach that I’m sure can be improved, but it does the job).

functions { 

              real normal_pdf(real x, real mu, real sigma) {
                     real coef = 1 / sqrt(2 * pi() * sigma * sigma);
                     real exponent = -0.5 * square((x - mu) / sigma);
                     return coef * exp(exponent);
              }

              real mean_logit(real lower_bound, real upper_bound, real alpha, real beta, real X_mu, real X_sigma, int steps) {
                     real step_size = (upper_bound - lower_bound) / steps;
                     real integral;

                     integral = 0.5 * step_size * normal_pdf(X_mu , X_mu, X_sigma) * inv_logit(alpha + X_mu * beta);

                     for(i in 1:(steps-1)){
                            real x = lower_bound + i * step_size;
                            integral += step_size * normal_pdf(x , X_mu, X_sigma) * inv_logit(alpha + x * beta);
                     }

                     return integral;
              }
       }


       data {
              int N; 
              int y[N];
              vector[N] X;
              int N_u;              
              int u[N_u];
       }

       parameters {
              real alpha;  
              real beta;
              real<lower=0,upper=1> psi;
              real X_mu;
              real X_sigma;
       }
       
       transformed parameters {
              vector<lower=0,upper=1>[N] mu;
              real<lower=0,upper=1> mu_unknown;
              
              mu = inv_logit(alpha + X * beta);
              mu_unknown = mean_logit(-5, 5, alpha, beta, X_mu, X_sigma, 100);
       }

       model {
              alpha ~ normal(0,5);
              beta ~ normal(0,5); 
              psi ~ beta(1,1);
              X_mu ~ normal(0,1);
              X_sigma ~ exponential(0.5);
              X ~ normal(X_mu, X_sigma);

              y ~ bernoulli( (1-psi) * mu);
              u ~ bernoulli(mu_unknown);
       }

1 Like