# Integrating out censored data in negative binomial model

http://discourse.mc-stan.org/t/right-censored-data-in-a-negative-binomial-model/562?source_topic_id=566&source_topic_id=567

By accident, my initial post did not go to a new topic. Please click on above the link.

Thank you Bob for letting me know that the link is broken. I can still access it, so it may be permission issue.
I have reformatted my post below.

I suspect that I am missing something really basic. This is my first Stan model. I am trying to apply the two procedures for right censored data in section 11.3 in the manual. I adapted the example for continuous data to my count data and have run into trouble.

The first procedure imputes the censored data while

Contents of SimpleNBrightCensored.stan

``````data{
int<lower=0> N;
int<lower=0> N_obs;
int<lower=0> N_cens;
int y_obs[N_obs];
int y_cens[N_cens];
int<lower=max(y_obs)> U;
}
parameters{
real mu;
real<lower=0.001> scale;
}
model{
scale ~ exponential( 2 );
y_obs ~ neg_binomial_2( mu, scale );
y_cens ~ neg_binomial_2( mu, scale );
}
generated quantities{
real dev;
dev = 0;
dev = dev + (-2)*neg_binomial_2_lpmf( y_obs | mu , scale );
}
``````

Rcode for Model1

``````model_file = 'simpleNBrightCensored.stan'
iterations = 10000

# inits
N = 54
N_obs = 44
N_cens = 10
U = 601
mu = 256
scale = 1.7

# data
y_obs=c(7,23,54,59,61,76,79,91,99,127,129,133,141,147,147,150,
151,155,176,184,187,189,207,225,238,239,251,276,280,300,305,332,
369,391,425,432,462,481,483,506,518,544,569,581)
y_cens=c(601,601,601,601,601,601,601,601,601,601)

stan_data = list(N=N, N_obs=N_obs, y_obs=y_obs,  N_cens=N_cens, y_cens=y_cens) # data passed to stan
# set up the model
stan_model = stan(model_file, data = stan_data, chains = 1)

# fit model
stanfit2 = stan(fit = stan_model, data = stan_data,
iter=iterations) # run the model

print(stanfit2,digits=3)
``````

The second procedure involves integrating out the censored data.

The model simpleNBrightCensored2.stan:

``````data{
int<lower=0> N;
int<lower=0> N_obs;
int<lower=0> N_cens;
int y_obs[N_obs];
int y_cens[N_cens];
int<lower=max(y_obs)> U;
}
parameters{
real mu;
real<lower=0.1> scale;
}
model{
scale ~ exponential( 2 );
y_obs ~ neg_binomial_2( mu , scale );
target += N_cens * neg_binomial_2_lccdf(U | mu, scale);
}
generated quantities{
real dev;
dev = 0;
dev = dev + (-2)*neg_binomial_2_lpmf( y_obs | mu , scale );
}
``````

The Rcode to run this model:

``````model_file = 'simpleNBrightCensored2.stan'
iterations = 10000
# inits
N = 54
N_obs = 44
N_cens = 10
#U = 601
#mu = 256
#scale = 1.7
# data
y_obs=c(7,23,54,59,61,76,79,91,99,127,129,133,141,147,147,150,151,155,176,184,187,189,207,225,238,239,251,276,280,300,305,332,369,391,425,432,462,481,483,506,518,544,569,581)
y_cens=c(601,601,601,601,601,601,601,601,601,601)

stan_data = list(N=N, N_obs=N_obs, y_obs=y_obs,  N_cens=N_cens, y_cens=y_cens) # data passed to stan
# set up the model
stan_model2 = stan(model_file, data = stan_data, chains = 1)

# fit model
stanfit3 = stan(fit = stan_model2, data = stan_data,
iter=iterations) # run the model
print(stanfit3,digits=3)
``````

I get the following error message:

``````"Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can't start sampling from this initial value."``````

When I tried to run the second model I got an error about U missing. I added a value for U in the stan_data and got the error you’re describing.

I added bounds on mu and scale as specified in the back of the manual and things seem to work:

``````real<lower=0.0> mu;
real<lower=0.0> scale;
``````

Hope that helps!

Thank you very much bbbales2!

In addition to updating the model and uncommenting U, I had to uncomment mu and scale in the R code and it then ran to completion.

The corrected model code is here

``````data{
int<lower=0> N;
int<lower=0> N_obs;
int<lower=0> N_cens;
int y_obs[N_obs];
int y_cens[N_cens];
int<lower=max(y_obs)> U;
}
parameters{
real<lower=0.0> mu;
real<lower=0.0> scale;
}
model{
scale ~ exponential( 2 );
y_obs ~ neg_binomial_2( mu , scale );
target += N_cens * neg_binomial_2_lccdf(U | mu, scale);
}
generated quantities{
real dev;
dev = 0;
dev = dev + (-2)*neg_binomial_2_lpmf( y_obs | mu , scale );
}
``````

and the corrected R code is here:

``````rm(list=ls())
library(rstan)
model_file = 'simpleNBrightCensored2.stan'
iterations = 10000
# inits
N = 54
N_obs = 44
N_cens = 10
U = 601
mu = 256
scale = 1.7

# data
y_obs=c(7,23,54,59,61,76,79,91,99,127,129,133,141,147,147,150,151,155,176,184,187,189,207,225,238,239,251,276,280,300,305,332,369,391,425,432,462,481,483,506,518,544,569,581)
y_cens=c(601,601,601,601,601,601,601,601,601,601)

stan_data = list(N=N, N_obs=N_obs, y_obs=y_obs,  N_cens=N_cens, y_cens=y_cens) # data passed to stan
# set up the model
stan_model2 = stan(model_file, data = stan_data, chains = 1)

# fit the model
stanfit3 = stan(fit = stan_model2, data = stan_data,
iter=iterations)``````

But how can I get WAIC and AIC for censored data??