Specify rounding error and censoring for response in brms?

royworth · February 14, 2020, 4:13pm

Hi, I have a model in Stan for measurements of chemical concentrations in water samples across counties (actually several more hierarchical levels, but I’m looking to simplify for this question). Each measurement also has an indicator for whether it was gathered from a treated system (i.e., to reduce the concentration) or not. The measurements are both rounded to the nearest 1 ppb (integer) and are left censored at 2 ppb. A Stan model I’ve been working on, which seems to be working well, is shown at the bottom of the post. Hopefully it will help clarify the brms model I’m looking for.

The reason I’m posting is that I’d really like to use brms to model this data because, if possible, I think it would be much more efficient for checking and comparing various different parameterizations of covariates and levels and such. It’d also be a nice way to double-check my own code, which has a lot of moving parts once I start adding and removing additional hierarchical levels and predictors.

The obstacle I’ve run into, however, is that I’m not sure how to deal with some specific aspects of this data via brms, mostly related to rounding. I’ve seen some examples of measurement error for the response in brms, for example, but I’m unsure of how to deal with rounding, specifically. Or if it is even possible, actually. Second, because of the left censoring, I believe there is also a catch to the rounding at the censoring boundary ( i.e., measures reported at y = 2 ppb could only be y + 0.5 instead of y +/- 0.5) that I’m also unsure how to deal with in brms. For maybe more clarity, if the issue were only censoring, I think my brms model would be something like:

brm( conc | cens( det ) ~ obs_trt + county_trt + ( 1 | county ), data = dat, family = lognormal() )

And ‘dat’ would contain the indicator vector ‘det’ with -1 at every observation where ‘conc’ is left censored and 0 for non-censored observations.

Where I’m stuck, again, is (1) finding a brms specification for rounding of the response and (2) dealing with rounding at the left-bounded observations at 2 ppb.

Thanks a bunch for any insight, and sorry if the question is not yet clear.

The Stan model:

data{
 int prior_only; // should likelihood be ignored?
 int< lower = 1 > N;
 int< lower = 1 > N_obs;
 int< lower = 1 > N_cens;
 int< lower = 0 > N_counties;
 
 vector[ N ] county_trt;
 
 int< lower = -1 , upper = 1 > trt[ N ];
 int< lower = 0 > county[ N ];
 
 vector [ N ] y; //lower is 2 in Stan datalist

 int<lower = 0 , upper = 1 > Det[ N ]; // 1 = detected, 0 = not det, MDL = 2ppb
 int< lower = 0 , upper = 1 > bound[ N ]; // indicator (0 , 1) for obs at exactly 2ppb
 }
transformed data {
 //for observed measurements ( > MDL (2 ppb) )
 vector[ N_obs ] y_obs;
 vector[ N_obs ] county_trt_o; 
 int< lower = -1 , upper = 1 > trt_o[ N_obs ]; 
 int< lower = 0 > county_o[ N_obs ];
 int< lower = 0 , upper = 1 > bound_o[ N_obs ]; // indicator that obs reported at MDL;
 
 //for censored measurements ( < 2ppb )
 vector[ N_cens ] county_trt_c; 
 int< lower = -1 , upper = 1 > trt_c[ N_cens ]; 
 int< lower = 0 > county_c[ N_cens ];
 {
  int i = 1;
  int j = 1;
 
  for ( n in 1:N ){
   if( Det[ n ] == 0 ){ //group indices for censored obs
    trt_c[ i ] = trt[ n ];
    county_trt_c[ i ] = county_trt[ n ];
    county_c[ i ] = county[ n ];
    i = i + 1;
    
    } else { //group indices for obs > MDL
     bound_o[ j ] = bound[ n ];
     y_obs[ j ] = y[ n ];
     trt_o[ j ] = trt[ n ];
     county_trt_o[ j ] = county_trt[ n ];
     county_o[ j ] = county[ n ];
     j = j + 1;
     }
    }
   }
 }
parameters{
 real a0;
 real b_trt;
 real b_county_trt;
 
 vector[ N_counties ] z_county;

 real< lower = 0 > sd_county;
 real< lower = 0 > sigma;
 }
transformed parameters{
 vector[ N_counties ] a_county;
 
 a_county = sd_county * z_county;
 }
model{
 vector[ N_obs ] mu_obs;
 vector[ N_cens ] mu_cens;

 //priors
 target += normal_lpdf( sd_county | 0 , 0.7 ) - 1 * normal_lccdf( 0 | 0 , 0.7 );
 target += normal_lpdf( sigma | 0 , 0.7 ) - 1 * normal_lccdf( 0 | 0 , 0.7 );
 
 target += normal_lpdf( a0 | 0 , 3 );
 target += normal_lpdf( b_trt | 0 , 2 );
 target += normal_lpdf( b_county_trt | 0 , 2 );
 
 target += normal_lpdf( z_county | 0 , 1 );

// linear predictor 
 for ( o in 1:N_obs )
  mu_obs[ o ] = a0 + b_trt * trt_o[ o ] + b_county_trt * county_trt_o[ o ] + a_county[ county_o[ o ] ];

 for ( c in 1:N_cens )
  mu_cens[ c ] = a0 + b_trt * trt_c[ c ] +   b_county_trt * county_trt_c[ c ] +  a_county[ county_c[ c ] ];

 //likelihood
 if( prior_only == 0 ){ //sample from prior only?
  for ( o in 1:N_obs ){
   if( bound_o[ o ] == 1 ){  // measurement observed and exactly 2 ppb
    target += log( lognormal_cdf( y_obs[ o ] + 0.5 , mu_obs[ o ] , sigma ) 
              - lognormal_cdf( y_obs[ o ] , mu_obs[ o ], sigma ) );
    } else {
     target += log( lognormal_cdf( y_obs[ o ] + 0.5 , mu_obs[ o ] , sigma ) 
              - lognormal_cdf( y_obs[ o ] - 0.5 , mu_obs[ o ], sigma ) );
     }
   }
     
  target += lognormal_lcdf( 2 | mu_cens , sigma ); // censored observations
  }
 }

Operating System: Windows 10 x64
brms Version: 2.10.0

paul.buerkner · February 20, 2020, 9:52am

Unfortunately, I don’t know how to get this rounding of response values into brms. I think it is safer to stick to pure Stan for the time being.

royworth · February 20, 2020, 2:33pm

No worries. Thanks for responding, Paul. I’ll stick with the pure Stan code for now.

hrlai · June 14, 2023, 11:08pm

Hi there, since this isn’t closed I thought to check if anyone is still pursuing this?

Not sure if the following Stan manual on rounding error model was around when this discussion started, but here it is:

While one could run this in Stan, it’d be great to have this in brms if it fits the design philosophy.

bentrueman · August 14, 2024, 7:45pm

Perhaps I’m missing something, but isn’t it possible to model rounded response values in brms using interval censoring?

Topic		Replies	Views
Binomial measurement error model Modeling specification , brms	11	213	July 30, 2024
Parameterization of measurement error in brms brms	4	788	September 7, 2018
Error message when attempting Cox PH model brms fitting-issues	6	609	May 20, 2021
Censored data reasonable prediction Modeling brms	1	348	December 21, 2022
Mixed (right, left and interval) censored log-normal with brms brms fitting-issues , specification	1	2077	June 17, 2022

Specify rounding error and censoring for response in brms?

Related topics