Help to write my first stan model

I would need help in order to write a specific Stan model.

The biological question

The idea of the model is modeling the number of Bones (NbBones : discret continuous count) according to Orders (3 different discret orders) and Time (continuous). And then show that The Orders 2 has a number of bones greater than the other Orders.

The Baysian model

I’m trying to write a GLM negative binomial model with zero inflated data since the mean of NbBones is not equal to the variance because of a huge number of zeros, and since within these zeros, I have a proportion p that is coming from a specific category (Bones was destroyed and then is not observable, and bones was not find because it never existed (but observable).

So far I wrote this model such as :

model
{
  for(i in 1 : N) 
  {
    observable[i] ~ dbern(p)
    NbBones[i] ~ dpois(nu[i] * theta[i] * observable[i])  
    theta[i] <-  intercept + Coef_Orders[Orders[i]] + Coef_Time*Time[i] 
    nu[i] ~ dgamma(1/alpha, 1/alpha)
  }
  p ~ dunif(0, 1)
  alpha ~ dunif(0, 10)
  Coef_Orders[1] <- 0
  Coef_Orders[2] ~ dexp(1)
  Coef_Orders[3] ~ dexp(1) 
  Coef_Time~ dexp(1)
  intercept ~ dexp(1)
}

Where in the negative binomial model with excess zeros, it is considered that only a proportion p of individuals are likely to have bones (observables). For observable individuals (p %), the number of bones follows a negative binomial distribution (allowing 0’s, as an observable individual may not have any bones). Unobservable individuals (1 -p %) have a number of bones following a Poisson distribution with an expectation of 0 (so their number of bones is necessarily zero).

My question

But i cannot manage to write it on stan, can someone help me ? (I found thid code :

stan_glm1 <- stan_glm(NbBones ~ Orders * Time,
                      data = data, family = neg_binomial_2,
                      prior = normal(0, 2.5),
                      prior_intercept = normal(0, 5),
                      seed = 12345)

But it does not include the zero inflated model…

Thanks a lot for your help and time :-)

Here are the data :

  Orders    Time NbBones
1      1 0.21771       0
2      1 0.08723       0
3      1 0.03911       0
4      1 0.08236       2
5      1 0.08744       6
6      1 0.00417       1

In dput format if you want a toy example :

So far I tried ;


    //Declare the data 
    data{
      int<lower=0> N;//
      real Time[N]; //
      real NbBones[N]; //
    }
    //Declare the parameters 
    parameters{
      real  theta; //continuous unconstrained variable 
      real<lower=0>observable;//constrained to not be negative 
    }
    //Model block - declare the likelihood and priors 
    model{
      //likelihood
      for(i in 1 : N){
        NbBones[i] ~poisson(theta[i] * observable[i]);
        theta[i] <-  intercept + Coef_orders[Orders[i]] + Coef_Time*Time[i];
        observable[i] ~ bernoulli(p);
      }
      //Priors 
      p ~ uniform(0, 1);
      alpha ~ uniform(0, 10);
      Coef_orders[1] <- 0;
      Coef_orders[2] ~ exponential(1);
      Coef_orders[3] ~ exponential(1) ;
      Coef_Time  ~ exponential(1);
      intercept ~ exponential(1);
    }


> dput(data)
structure(list(Orders = c("1", "1", "1", "1", "1", "1", "1", 
"1", "1", "3", "1", "2", "1", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "3", "2", "2", "3", "2", "3", "3", "1", "3", "1", "1", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "1", "2", "1", "1", "1", "1", "2", 
"2", "3", "1", "3", "3", "1", "1", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "1", "2", "2", "2", "2", "2", "2", "2", "2", "1", 
"1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"3", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "3", "3", 
"2", "2", "3", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1"), Time = c(0.21771, 0.08723, 0.03911, 
0.08236, 0.08744, 0.00417, 0.0162, 0.11631, 0.01712, 0.18276, 
0.0169, 0.01049, 0.04138, 0.03169, 0.05169, 0.01718, 0.11169, 
0.04654, 0.09138, 0.23242, 0.08843, 0.0185, 0.27725, 0.01363, 
0.01198, 0.06394, 0.0218, 0.05122, 0.09829, 0.07324, 0.07186, 
0.05843, 0.06693, 0.18545, 0.18037, 0.00655, 0.0122, 0.00572, 
0.01076, 0.07204, 0.00957, 0.00579, 0.02072, 0.01819, 0.02319, 
0.00948, 0.00263, 0.00698, 0.00364, 0.00502, 0.00597, 0.07171, 
0.09189, 0.01643, 0.00803, 0.00688, 0.00864, 0.01067, 0.01123, 
0.0078, 0.01015, 0.00944, 0.00755, 0.0227, 0.02841, 0.1087, 0.07557, 
0.04564, 0.03514, 0.00946, 0.011, 0.01385, 0.01837, 0.09937, 
0.10749, 0.01513, 0.01839, 0.01219, 0.01021, 0.01, 0.00844, 0.18295, 
0.22844, 0.12524, 0.14649, 0.03092, 0.02957, 0.05148, 0.08305, 
0.02372, 0.04403, 0.06083, 0.05597, 0.04202, 0.01476, 0.03361, 
0.04169, 0.00828, 0.01348, 0.06318, 0.06703, 0.02774, 0.03293, 
0.27072, 0.11012, 0.05687, 0.01636, 0.03637, 0.04128, 0.01157, 
0.01237, 0.13546, 0.01796, 0.07682, 0.0674, 0.188, 0.13317, 0.01142, 
0.01522, 0, 0, 0.01998, 0.07889, 0.02845, 0.08173, 0.01455, 0.06392, 
0.03808, 0.08862, 0.05304, 0.04987, 0.02237, 0.02612, 0.06138, 
0.05949, 0.00959, 0.01014, 0.00687, 0.00701, 0.21064, 0.01838, 
0.14369, 0.0136, 0.01749, 0.02122, 0.06394, 0.07276, 0.02882, 
0.01453, 0.05646, 0.0174, 0.0714, 0.05069, 0.03303, 0.01158, 
0.04024, 0.03759, 0.06414, 0.00566, 0.06868, 0.01201, 0.03361, 
0.03775, 0.03854, 0.00506, 0.01818, 0.03194, 0.04363, 0.01267, 
0.01589, 0.00328, 0.00965, 0.00229, 0.01062, 0.00908, 0.02851, 
0.02654, 0.00732, 0.01082, 0.02281, 0.02332, 0.04561, 0.01981, 
0.0493, 0.04335, 0.07554, 0.0074, 0.00729, 0.00631, 0.0631, 0.10638, 
0.03161, 0.02191, 0.00658, 0.00831, 0.03528, 0.01206, 0.02015, 
0.01668, 0.00712, 0.00762, 0.01905, 0.0512, 0.05175, 0.00956, 
0.02592, 0.00313, 0.00415, 0.0024, 0.00201, 0.00077, 0.00415, 
6e-04, 0.00275, 0.00109, 0.0023, 0.00156, 0.00601, 0.00223, 0.03884, 
0.00295, 0.04161, 0.02727, 0.00582, 0.00725, 0.01396, 0.00538, 
0.03965, 0.01708, 0.01833, 0.02333, 0.03832, 0.02266, 0.02942, 
0.00625, 0.0192, 0.00577, 0.01567, 0.00193, 0.01579, 0.00174, 
0.01351, 0.00787, 0.00599, 0.00326, 0.00456, 0.00482), NbBones = c(0L, 
0L, 0L, 2L, 6L, 1L, 0L, 3L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 3L, 3L, 2L, 5L, 
4L, 4L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 2L, 1L, 0L, 1L, 0L, 
1L, 0L, 0L, 0L, 2L, 2L, 2L, 2L, 4L, 3L, 0L, 0L, 0L, 2L, 2L, 3L, 
0L, 1L, 0L, 0L, 1L, 0L, 2L, 3L, 1L, 0L, 1L, 3L, 2L, 3L, 2L, 0L, 
2L, 0L, 4L, 2L, 2L, 2L, 1L, 3L, 1L, 1L, 2L, 6L, 6L, 1L, 5L, 4L, 
2L, 1L, 0L, 0L, 0L, 1L, 4L, 0L, 2L, 0L, 3L, 1L, 0L, 2L, 2L, 0L, 
5L, 4L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 
3L, 5L, 3L, 5L, 3L, 1L, 0L, 1L, 4L, 3L, 3L, 0L, 0L, 0L, 0L, 2L, 
0L, 0L, 2L, 0L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 0L, 3L, 0L, 1L, 0L, 
2L, 2L, 1L, 0L, 0L, 0L, 1L, 0L, 3L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, 
2L, 1L, 0L, 1L, 1L, 0L, 2L, 2L, 0L, 0L, 0L, 0L, 3L, 4L, 0L, 0L, 
1L, 0L, 1L, 0L, 1L, 0L, 2L, 0L, 4L, 4L, 0L, 0L, 0L, 0L, 1L, 1L, 
0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 2L, 1L, 
0L, 0L, 0L, 1L, 2L, 1L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L)), row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
"17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", 
"28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", 
"39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", 
"50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", 
"61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", 
"72", "73", "74", "75", "76", "77", "78", "79", "80", "81", "82", 
"83", "84", "85", "86", "87", "88", "89", "90", "91", "92", "93", 
"94", "95", "96", "97", "98", "99", "100", "101", "102", "103", 
"104", "105", "106", "107", "108", "109", "110", "111", "112", 
"113", "114", "115", "116", "117", "118", "119", "120", "121", 
"122", "123", "124", "125", "126", "127", "128", "129", "130", 
"131", "132", "133", "134", "135", "136", "137", "138", "139", 
"140", "141", "142", "143", "144", "145", "146", "147", "148", 
"149", "150", "151", "152", "153", "154", "155", "156", "157", 
"158", "159", "160", "161", "162", "163", "164", "165", "166", 
"167", "168", "169", "170", "171", "172", "173", "174", "175", 
"176", "177", "178", "179", "180", "181", "182", "183", "184", 
"185", "186", "187", "188", "189", "190", "191", "192", "193", 
"194", "195", "196", "197", "198", "199", "200", "201", "202", 
"203", "204", "205", "206", "207", "208", "209", "210", "211", 
"212", "213", "214", "215", "216", "217", "218", "219", "220", 
"221", "222", "223", "224", "225", "226", "227", "228", "229", 
"230", "231", "232", "233", "234", "235", "236", "237", "238", 
"239", "240", "241", "242", "243", "244", "245", "246", "247"
), class = "data.frame")
2 Likes

I found a solution by using the package brms

and the following code :

posterior_glm1 = brm(NbBones~  Orders * Time  , data = data,
                     family = zero_inflated_negbinomial(),
                     iter  = 6000, chains = 2, 
                     seed  = 1234,
                     cores = 5,thin=5)

If I understand it well it allows taking into account two kind of zero, those not observable and those observable, and give you at the end a zi posteriori probability which is the expected % number of unobservable zeros.

1 Like

Hello,

I wanted to check in, did your brms solution work?