The function bayes_factor in brms behaves oddly when the model includes a term with measurement error - me(F,sdF1)

I wish to estimate the effect of an environmental factor, F, on the numbers of people with low body weight. The design is a multi-level poisson regression that has repeated measures within counties, within states within the USA.

I constructed (1) a null model - m0 - that has no ‘F’ term; (2) a simple model - m1 - that has a term for ‘F’, but does not include its measurement error; (3) a model - m1e - that includes adjustment for measurement error in F (coded as me(F, sdF)). Here is the (abbreviated) summary for m1e (all Rhats=1.00, all sample sizes>4000):-

```
Family: poisson
Links: mu = log
Formula: LBW ~ 0 + intercept + year + (1 | SCode/County) + year .... + me(F, sdF) + offset(log(population))
Data: dat (Number of observations: 1211)
Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
total post-warmup samples = 20000
Group-Level Effects:
~SCode (Number of levels: 27)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.05 0.01 0.03 0.07 1.00 5348 9513
~SCode:County (Number of levels: 173)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.04 0.00 0.03 0.05 1.00 7493 11669
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
intercept -2.60 0.03 -2.66 -2.54 1.00 4686 6828
year 0.17 0.02 0.12 0.21 1.00 12543 14845
.
.
.
meFsdF 0.12 0.04 0.04 0.20 1.00 4029 5524
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```

The loo information criteria for the 3 models are similar:-

```
> loo(m0,m1,m1e)
Output of model 'm0':
Computed from 10000 by 1211 log-likelihood matrix
Estimate SE
elpd_loo -4679.5 32.2
p_loo 115.2 5.4
looic 9359.0 64.4
------
Monte Carlo SE of elpd_loo is 0.1.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 1207 99.7% 1246
(0.5, 0.7] (ok) 4 0.3% 290
(0.7, 1] (bad) 0 0.0% <NA>
(1, Inf) (very bad) 0 0.0% <NA>
Output of model 'm1':
Computed from 10000 by 1211 log-likelihood matrix
Estimate SE
elpd_loo -4680.1 32.4
p_loo 114.0 5.4
looic 9360.3 64.7
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 1209 99.8% 1195
(0.5, 0.7] (ok) 1 0.1% 1440
(0.7, 1] (bad) 1 0.1% 286
(1, Inf) (very bad) 0 0.0% <NA>
Output of model 'm1e':
Computed from 20000 by 1211 log-likelihood matrix
Estimate SE
elpd_loo -4677.9 31.7
p_loo 130.8 6.5
looic 9355.7 63.5
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 1202 99.3% 2533
(0.5, 0.7] (ok) 8 0.7% 768
(0.7, 1] (bad) 1 0.1% 222
(1, Inf) (very bad) 0 0.0% <NA>
Model comparisons:
elpd_diff se_diff
m1e 0.0 0.0
m0 -1.6 2.6
m1 -2.3 1.5
```

Now, if I compute the bayes factor for m1 over m0, I obtain a reasonable value:-

```
set.seed(12345); bayes_factor(m1,m0,log=T,silent=T, reloo=T)
Estimated log Bayes factor in favor of bridge1 over bridge2: -0.46107
```

but, if I compute the bayes factor for m1e over m0, the result is astronomically large:-

```
set.seed(12345); bayes_factor(m1e,m0,log=T,silent=T, reloo=T)
Estimated log Bayes factor in favor of bridge1 over bridge2: 334.56230
```

exp(334.6) is too large to write here.

Why would two models whose looic values are so similar show such a large Bayes factor? Is this a bug in bayes_factor?

Thanks, in anticipation of your help

The priors for m0 and m1 are:

```
prior class coef group resp dpar nlpar bound
1 normal(0,1) b
2 normal(-2.5,.5) b intercept
3 gamma(1,10) sd
The priors for m1e are:-
prior class coef group resp dpar nlpar bound
1 normal(0,1) b
2 normal(-2.5,.5) b intercept
3 gamma(1,10) sd
4 normal(0,1) meanme
5 gamma(1,5) sdme
```

```
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_NZ.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_NZ.UTF-8 LC_COLLATE=en_NZ.UTF-8
[5] LC_MONETARY=en_NZ.UTF-8 LC_MESSAGES=en_NZ.UTF-8
[7] LC_PAPER=en_NZ.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] brms_2.10.0 Rcpp_1.0.3
loaded via a namespace (and not attached):
[1] Brobdingnag_1.2-6 gtools_3.8.1 StanHeaders_2.19.0
[4] threejs_0.3.1 shiny_1.4.0 assertthat_0.2.1
[7] stats4_3.6.1 pillar_1.4.2 backports_1.1.5
[10] lattice_0.20-38 glue_1.3.1 digest_0.6.22
[13] promises_1.1.0 colorspace_1.4-1 htmltools_0.4.0
[16] httpuv_1.5.2 Matrix_1.2-17 plyr_1.8.4
[19] dygraphs_1.1.1.6 pkgconfig_2.0.3 rstan_2.19.2
[22] purrr_0.3.3 xtable_1.8-4 scales_1.1.0
[25] processx_3.4.1 later_1.0.0 tibble_2.1.3
[28] bayesplot_1.7.0 ggplot2_3.2.1 DT_0.10
[31] shinyjs_1.0 lazyeval_0.2.2 cli_1.1.0
[34] magrittr_1.5 crayon_1.3.4 mvnfast_0.2.5
[37] mime_0.7 ps_1.3.0 nlme_3.1-142
[40] xts_0.11-2 pkgbuild_1.0.6 colourpicker_1.0
[43] rsconnect_0.8.15 tools_3.6.1 loo_2.1.0
[46] prettyunits_1.0.2 lifecycle_0.1.0 matrixStats_0.55.0
[49] stringr_1.4.0 munsell_0.5.0 callr_3.3.2
[52] compiler_3.6.1 rlang_0.4.1 grid_3.6.1
[55] ggridges_0.5.1 rstudioapi_0.10 htmlwidgets_1.5.1
[58] crosstalk_1.0.0 igraph_1.2.4.1 miniUI_0.1.1.1
[61] base64enc_0.1-3 gtable_0.3.0 codetools_0.2-16
[64] inline_0.3.15 abind_1.4-5 markdown_1.1
[67] reshape2_1.4.3 R6_2.4.1 gridExtra_2.3
[70] rstantools_2.0.0 zoo_1.8-6 bridgesampling_0.7-2
[73] dplyr_0.8.3 fastmap_1.0.1 shinystan_2.5.0
[76] shinythemes_1.1.2 stringi_1.4.3 parallel_3.6.1
[79] tidyselect_0.2.5 coda_0.19-3
```