Hello, I am extremely new to Bayesian Modeling and am not sure if I’m on the right track with building this model. I have a dataset containing count data, this data has been generated from a particle analysis using ImageJ. Each row contains the count of particles found in a small image sampled from a larger image. My goal is to determine the most likely count of particles and quantify my uncertainty.
Now, I should mention upfront, that while Count
is the parameter I’m interested in, I really want to use count to compute particle density for the entire image. Therefore, Count
will eventually be transformed as
Where area is the size of the sample image. This isn’t super important at the moment, because my idea is to just model Count
and then transform my posterior to the proper scaling.
I have tried a few ideas so far, such as doing poisson regression and negative binomial regression. These methods produced seemingly reasonable estimates but running pp_check(dat.brm)
gives me a model nowhere near resembling my empirical data. I’m also not even sure if those methods are what I want to use given I have no predictor variables yet.
This has lead me to developing a gaussian mixture
prior, which fits my pp_check
wonderfully, but I’m not sure how to interpret the output or give a reasonable estimate for Count
. I have included my data and code, and am looking for a little guidance to determine if what I am doing is reasonable.
dat <- structure(list(Count = c(65, 37, 38, 34, 37, 38, 43, 29, 37,
39, 31, 37, 62, 30, 32, 34, 34, 32, 38, 42, 49, 47, 41, 51, 39,
41, 37, 30, 44, 32, 41, 39, 36, 46, 43, 43, 36, 37, 38, 31, 28,
33, 30, 30, 34, 45, 25, 37, 31, 33, 41, 27, 38, 34, 24, 37, 26,
39, 42, 37, 39, 48, 35, 38, 39, 42, 38, 47, 45, 39, 39, 30, 42,
42, 38, 39, 33, 36, 31, 38, 41, 42, 52, 41, 36, 49, 50, 43, 35,
35, 43, 53, 42, 49, 32, 41, 36, 44, 39, 49, 60, 31, 23, 25, 34,
36, 28, 29, 34, 35, 24, 29, 63, 32, 24, 23, 27, 31, 29, 32, 22,
33, 35, 39, 28, 22, 29, 23, 29, 27, 29, 28, 25, 28, 33, 42, 24,
24, 28, 17, 22, 24, 27, 26, 25, 37, 25, 25, 29, 27, 19, 25, 35,
33, 30, 31, 21, 39, 30, 34, 38, 38, 27, 32, 32, 32, 37, 46, 35,
35, 30, 29, 34, 34, 34, 30, 34, 32, 30, 32, 32, 38, 37, 40, 33,
37, 31, 29, 31, 35, 33, 26, 29, 31, 33, 28, 37, 33, 34, 39, 23,
17, 12, 11, 11, 14, 12, 20, 15, 13, 9, 14, 26, 14, 17, 8, 15,
11, 12, 18, 23, 16, 12, 18, 13, 17, 17, 15, 17, 14, 20, 17, 18,
22, 11, 19, 17, 19, 27, 19, 13, 19, 13, 19, 11, 20, 15, 21, 20,
17, 15, 9, 16, 15, 15, 21, 14, 20, 15, 19, 18, 16, 18, 19, 24,
19, 14, 23, 16, 22, 14, 12, 14, 6, 16, 19, 14, 16, 12, 14, 15,
19, 19, 20, 16, 23, 22, 16, 19, 17, 19, 13, 16, 19, 18, 20, 17,
16, 12, 18, 14, 18, 33, 11, 12, 15, 11, 13, 14, 18, 14, 23, 24,
16, 18, 14, 18, 16, 13, 15, 23, 14, 14, 17, 16, 9, 17, 16, 13,
8, 7, 12, 20, 19, 11, 11, 20, 17, 18, 22, 5, 13, 16, 11, 14,
10, 17, 20, 18, 16, 14, 16, 16, 15, 17, 12, 14, 17, 20, 12, 15,
14, 18, 13, 19, 19, 18, 5, 18, 16, 19, 21, 14, 13, 15, 16, 15,
14, 17, 13, 15, 16, 19, 14, 10, 23, 17, 23, 12, 16, 12, 18, 19,
17, 20, 29, 14, 22, 11, 20, 18, 29, 26, 30, 37, 31, 27, 32, 35,
32, 32, 29, 34, 23, 38, 39, 27, 29, 36, 33, 33, 37, 32, 25, 35,
26, 26, 25, 34, 25, 29, 21, 38, 29, 21, 23, 31, 29, 31, 33, 27,
20, 28, 29, 22, 21, 20, 25, 14, 23, 24, 24, 29, 17, 29, 30, 32,
34, 25, 24, 23, 28, 24, 22, 24, 28, 27, 24, 22, 20, 26, 23, 28,
23, 30, 19, 28, 29, 29, 31, 29, 24, 25, 35, 37, 27, 34, 30, 28,
42, 31, 32, 30, 29, 34, 25, 26, 29, 24, 29, 66, 38, 37, 33, 36,
42, 39, 29, 35, 36, 30, 36, 59, 29, 31, 34, 32, 32, 39, 43, 51,
51, 38, 50, 39, 43, 38, 29, 43, 32, 38, 38, 36, 48, 44, 42, 36,
38, 34, 29, 26, 33, 30, 31, 34, 42, 27, 36, 33, 31, 40, 28, 38,
35, 24, 37, 25, 37, 44, 36, 36, 46, 33, 35, 39, 40, 37, 48, 39,
38, 40, 33, 45, 41, 36, 38, 35, 39, 30, 36, 43, 40, 52, 41, 34,
47, 50, 42, 36, 37, 45, 56, 41, 46, 30, 45, 34, 44, 44, 55, 59,
31, 23, 26, 36, 36, 27, 26, 34, 33, 22, 27, 61, 33, 25, 24, 25,
31, 28, 31, 22, 36, 37, 40, 29, 23, 29, 25, 29, 28, 27, 28, 24,
28, 33, 41, 24, 24, 29, 16, 27, 24, 28, 26, 23, 37, 25, 25, 29,
27, 19, 26, 34, 36, 26, 30, 21, 37, 29, 36, 36, 37, 28, 34, 34,
30, 37, 48, 33, 35, 26, 27, 33, 34, 33, 31, 34, 32, 30, 31, 34,
37, 37, 37, 31, 35, 30, 29, 29, 37, 32, 24, 31, 30, 33, 29, 37,
35, 36, 39, 23, 18, 12, 12, 13, 15, 14, 20, 16, 13, 9, 14, 26,
14, 17, 8, 16, 11, 13, 18, 23, 16, 10, 16, 14, 17, 17, 16, 16,
13, 20, 17, 19, 22, 11, 19, 17, 20, 27, 19, 13, 18, 13, 18, 12,
21, 15, 22, 20, 17, 14, 9, 15, 14, 15, 23, 14, 20, 14, 18, 19,
17, 18, 19, 24, 21, 15, 22, 16, 25, 14, 12, 14, 7, 19, 19, 15,
16, 13, 14, 13, 17, 18, 21, 15, 24, 24, 16, 19, 18, 18, 14, 17,
21, 16, 20, 17, 17, 13, 19, 12, 19, 34, 12, 13, 15, 10, 13, 16,
19, 13, 22, 24, 15, 19, 14, 17, 17, 14, 16, 21, 13, 15, 17, 15,
10, 17, 15, 13, 9, 7, 13, 19, 19, 13, 14, 21, 18, 17, 21, 5,
13, 16, 13, 16, 10, 18, 18, 18, 14, 14, 15, 16, 15, 17, 12, 12,
18, 19, 12, 17, 13, 18, 13, 19, 22, 20, 4, 18, 16, 19, 22, 14,
13, 15, 16, 15, 14, 17, 13, 17, 16, 19, 13, 10, 22, 16, 23, 11,
17, 12, 18, 18, 18, 20, 29, 15, 22, 13, 21, 18, 27, 28, 30, 36,
34, 26, 29, 35, 30, 29, 29, 34, 23, 37, 40, 26, 29, 34, 37, 33,
37, 32, 23, 35, 25, 26, 25, 34, 26, 27, 23, 34, 28, 20, 22, 27,
29, 35, 34, 27, 20, 28, 28, 21, 20, 20, 25, 14, 22, 23, 22, 28,
17, 28, 28, 32, 34, 25, 21, 23, 26, 24, 23, 22, 28, 26, 23, 22,
20, 24, 23, 29, 26, 27, 20, 28, 29, 28, 30, 31, 25, 24, 35, 39,
29, 35, 30, 27, 45, 31, 31, 30, 29, 35, 25, 25, 28, 24, 30, 62,
38, 38, 34, 39, 38, 44, 32, 40, 37, 35, 40, 60, 31, 32, 35, 38,
37, 41, 42, 48, 46, 41, 48, 40, 40, 38, 33, 42, 32, 38, 39, 38,
45, 42, 46, 41, 39, 39, 32, 31, 33, 31, 33, 36, 47, 27, 36, 33,
38, 42, 28, 39, 34, 28, 40, 27, 41, 40, 40, 40, 50, 38, 40, 44,
44, 41, 44, 44, 39, 40, 34, 44, 45, 44, 40, 34, 39, 33, 38, 42,
47, 52, 40, 41, 50, 51, 45, 37, 40, 47, 55, 43, 49, 33, 41, 41,
47, 42, 50, 56, 32, 20, 26, 38, 39, 29, 30, 33, 38, 29, 29, 53,
34, 25, 23, 27, 33, 29, 35, 24, 35, 35, 42, 28, 21, 30, 22, 28,
27, 26, 28, 27, 28, 34, 40, 25, 26, 27, 16, 24, 23, 27, 29, 24,
37, 24, 25, 27, 26, 21, 25, 35, 36, 29, 29, 24, 37, 29, 37, 39,
38, 31, 37, 37, 34, 40, 47, 36, 38, 34, 31, 36, 37, 35, 34, 35,
36, 30, 30, 35, 42, 37, 40, 34, 39, 34, 32, 36, 31, 37, 27, 33,
36, 35, 28, 39, 34, 32, 34, 24, 17, 11, 12, 11, 13, 13, 19, 13,
13, 12, 14, 29, 15, 16, 8, 15, 11, 12, 19, 21, 19, 9, 19, 13,
14, 17, 16, 16, 13, 21, 19, 22, 23, 14, 21, 17, 19, 26, 21, 13,
18, 16, 20, 13, 25, 13, 22, 19, 16, 15, 9, 16, 14, 15, 19, 16,
21, 15, 19, 19, 17, 18, 18, 24, 17, 14, 23, 17, 21, 16, 12, 14,
7, 17, 19, 15, 17, 12, 14, 16, 20, 17, 18, 16, 22, 21, 17, 21,
18, 21, 15, 16, 19, 19, 23, 17, 17, 10, 17, 15, 18, 27, 12, 12,
18, 9, 15, 16, 18, 15, 25, 24, 17, 20, 13, 16, 19, 14, 14, 19,
13, 15, 15, 16, 9, 17, 14, 14, 9, 7, 10, 20, 20, 12, 11, 20,
16, 18, 22, 5, 14, 13, 12, 15, 11, 20, 20, 19, 16, 13, 17, 16,
15, 17, 13, 13, 17, 23, 12, 18, 13, 19, 14, 17, 19, 20, 7, 19,
17, 18, 19, 14, 13, 15, 13, 16, 13, 14, 14, 15, 16, 21, 13, 9,
22, 17, 27, 13, 14, 12, 17, 16, 18, 21, 27, 18, 23, 11, 21, 21,
31, 23, 32, 38, 32, 27, 36, 36, 32, 33, 28, 33, 25, 37, 38, 29,
31, 36, 33, 34, 36, 32, 22, 35, 28, 25, 24, 37, 27, 28, 23, 38,
31, 25, 22, 32, 29, 35, 35, 28, 23, 28, 33, 22, 23, 20, 25, 16,
26, 28, 23, 30, 19, 29, 30, 34, 32, 27, 24, 24, 29, 27, 22, 25,
29, 27, 31, 23, 20, 24, 25, 28, 25, 32, 22, 27, 27, 28, 31, 30,
25, 23, 36, 35, 30, 33, 33, 28, 46, 31, 31, 32, 30, 31, 27, 26,
28, 26, 29)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1500L))
library(brms)
mix <- mixture(gaussian, gaussian)
prior <- c(
prior(normal(18, 1), Intercept, dpar = mu1),
prior(normal(30, 15), Intercept, dpar = mu2)
)
dat.brm <- brm(Count ~ 1, data = dat,
family = mix,
prior = prior,
chains = 5, iter = 4000, warmup = 2000, thin = 5,
cores = 24)
summary(dat.brm)
pp_check(dat.brm, nsamples = 1e3)
posterior_summary(dat.brm)
Family: mixture(gaussian, gaussian)
Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
Formula: Count ~ 1
Data: sample_dat (Number of observations: 1500)
Samples: 5 chains, each with iter = 4000; warmup = 2000; thin = 5;
total post-warmup samples = 2000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
mu1_Intercept 15.67 0.27 15.16 16.19 1812 1.00
mu2_Intercept 29.98 0.51 29.06 31.04 2080 1.00
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma1 3.12 0.28 2.62 3.70 1975 1.00
sigma2 9.56 0.28 8.97 10.08 2063 1.00
theta1 0.26 0.03 0.21 0.31 2018 1.00
theta2 0.74 0.03 0.69 0.79 2018 1.00
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
While you can see the pp_check()
looks great, I’m not even sure how to interpret this to give me an estimate of Count
. Any guidance is surely appreciated! Also, am I correct in thinking that I should be able to transform my posterior to get a similar estimate for N_v?