Modeling Missing Data

nod11 · April 28, 2020, 5:03pm

I have a 2-step generative model.

vector i ~ normal(mu, std)
I algebraically transform each of the entries of vector i onto a scale of 0-1 using a sigmoid function.
Then I draw a new variable vector y ~ Bernoulli( sigmoid(i) )
The observed data, vector s, is equal to: { i if y=1 } and { 0 if y=0 }

I hope to draw samples for the parameters mu and std, given I observe the vector of data s. I have coded the model as follows, but it is giving me a lot of divergences. (Even though I am following non-centered (centered?) implementation.) I am also not sure if I have coded my generative process correctly.

"""
functions {
    real sigm(real x){
        return (1 / (1 + exp(-0.92*(x-8.24)));
    }
    
    real signal_lpdf(real s, real i, real mu, real std){ 
        if (s==0) {
            return bernoulli_lpmf(0 | sigm(i)) + normal_lpdf((i-mu)/std | 0, 1);
            } 
        else {
            return bernoulli_lpmf(1 | sigm(s))+ normal_lpdf((s-mu)/std | 0, 1);
            }
    }
}

data {
     int<lower=0> nPeps;       // number of theoretical peptides in protein
     real<lower=0> s[nPeps];   // array holding signal strength for that protein (0 is missing value)
}

parameters {
    real mu;
    real i;
    real<lower=0> std;
}

transformed parameters {
real mu2;
real<lower=0> std2;
mu2 = mu*10+7.15;
std2 = std*10+3;
}

model { 
    mu ~ std_normal(); 
    std ~ std_normal();
    for (j in 1:nPeps) {
        s[j] ~ signal(i, mu2, std2);
    }
}
"""

emiruz · April 28, 2020, 5:33pm

HMC doesn’t like discontinuity ; so I reckon your IF statement in the signal function is part of your problem.

You can use an inverse logit to make the transition between your two densities continuous. This thread uses the trick:

saudiwin · April 29, 2020, 3:37am

Oooh that looks fun. Thanks for sharing!

Topic		Replies	Views
Truncated model for neg_binomial_2 Modeling	20	1698	June 9, 2017
Missing parameters and priors Modeling	25	810	June 27, 2020
Missing data Modeling	8	553	May 8, 2020
Question about noncentered parameterization sampling structure Modeling	5	434	December 8, 2023
Posterior predictive check in within regression model with considerable missingness Modeling fitting-issues	12	1119	March 19, 2018

Modeling Missing Data

Related topics