Difference for splines between brms and gamm4

Hey there!

I’m getting unexpected results using a 3x3 categorical interaction with cyclic cubic splines in brms.
I tried fitting a similar model in gamm4 and get different results (but had to remove some group-level varying intercepts to be able to fit it).

The brms model converged, as far as I can judge, but the difference to gamm4 and simple descriptive means (both ignoring some structure in the data, but that should not affect the shape, right?) just seems very stark. I’m now fitting a model in brms that is more comparable to the one in gamm4, but it will take a few days. In addition, I fit a spline-free model (using predicted conception risk which peaks in the middle of RCD_squished) and get the results I expect.

Questions

Anybody have an idea why the results may be so different / what I might be doing wrong / suggestions for steps to take to diagnose the problem?

Happy to explain the model/data more, if that helps.

Differences in specification

  • cumulative instead of Gaussian
  • allowing for disc to vary by contraception and woman
  • varying intercepts for item, day:woman (woman=“short” in the model summary model), cycle_nr:woman

Priors

In retrospect, I’m not sure if I should have set a prior on the intercept (or whether it even applied in a cumulative model).

c(set_prior("normal(0,1)", class = "b"),
                               set_prior("normal(0,1)", class = "sd"),
                               set_prior("normal(0,1)", class = "Intercept")),

Differences in results

In the coloured graph (from brms) you can see, for example, the peak in the first column is totally shifted to the left compared to the gamm 4 fit (dashed lines).

Model summary

 Family: cumulative 
  Links: mu = probit; disc = log 
Formula: value ~ estrogen_progestogen * who + s(RCD_squished, by = interaction(estrogen_progestogen, who), bs = "cc") + (1 | short:created_diary) + (1 | short:cycle_nr) + (1 | short) + (1 + who | variable) 
         disc ~ estrogen_progestogen + (1 | short)
   Data: diary_long (Number of observations: 128454) 
Samples: 3 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup samples = 6000

Smooth Terms: 
                                                                                          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sds(sRCD_squishedinteractionestrogen_progestogenwhonon_hormonal.extra_pair_1)                 0.10      0.04     0.05     0.20       1792 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_and_estrogen.extra_pair_1)     0.05      0.02     0.02     0.11       2556 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_only.extra_pair_1)             0.07      0.05     0.01     0.20       2478 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhonon_hormonal.in_pair_1)                    0.06      0.03     0.03     0.13       1922 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_and_estrogen.in_pair_1)        0.06      0.04     0.02     0.15       1432 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_only.in_pair_1)                0.05      0.05     0.00     0.17       1786 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhonon_hormonal.single_1)                     0.07      0.03     0.03     0.15       1680 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_and_estrogen.single_1)         0.13      0.05     0.05     0.26       1850 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_only.single_1)                 0.09      0.08     0.01     0.31       1699 1.00

Group-Level Effects: 
~short (Number of levels: 873) 
                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)          0.97      0.14     0.70     1.26        999 1.00
sd(disc_Intercept)     0.46      0.01     0.43     0.48       1105 1.00

~short:created_diary (Number of levels: 37895) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.90      0.13     0.65     1.17        958 1.00

~short:cycle_nr (Number of levels: 2107) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.40      0.06     0.29     0.52        877 1.00

~variable (Number of levels: 7) 
                          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)                 0.71      0.31     0.32     1.53       1665 1.00
sd(whoin_pair)                0.67      0.28     0.30     1.39       2747 1.00
sd(whosingle)                 0.53      0.21     0.26     1.05       2035 1.00
cor(Intercept,whoin_pair)     0.07      0.36    -0.64     0.71       3279 1.00
cor(Intercept,whosingle)     -0.45      0.31    -0.89     0.29       3927 1.00
cor(whoin_pair,whosingle)     0.09      0.38    -0.64     0.75       2939 1.00

Population-Level Effects: 
                                                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept[1]                                                0.74      0.32    -0.04     1.26       1891 1.00
Intercept[2]                                                1.61      0.35     0.87     2.25       2032 1.00
Intercept[3]                                                2.70      0.43     1.87     3.57       1748 1.00
Intercept[4]                                                4.05      0.58     2.94     5.22       1224 1.00
disc_Intercept                                             -0.47      0.15    -0.74    -0.16        966 1.00
estrogen_progestogenprogestogen_and_estrogen               -0.58      0.13    -0.85    -0.35       1057 1.00
estrogen_progestogenprogestogen_only                       -0.06      0.30    -0.64     0.55       2211 1.00
whoin_pair                                                  2.68      0.47     1.77     3.63       1480 1.00
whosingle                                                   1.65      0.31     1.05     2.28       1942 1.00
estrogen_progestogenprogestogen_and_estrogen:whoin_pair     0.77      0.12     0.55     1.01       1024 1.00
estrogen_progestogenprogestogen_only:whoin_pair             0.45      0.11     0.25     0.68       2191 1.00
estrogen_progestogenprogestogen_and_estrogen:whosingle      0.41      0.19     0.06     0.80       1406 1.00
estrogen_progestogenprogestogen_only:whosingle              0.54      0.53    -0.50     1.58       3256 1.00
disc_estrogen_progestogenprogestogen_and_estrogen           0.11      0.04     0.03     0.17        622 1.00
disc_estrogen_progestogenprogestogen_only                   0.14      0.12    -0.10     0.37       1370 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).
Warning message:
There were 3 divergent transitions after warmup. Increasing adapt_delta above 0.9 may help.
See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
  • Operating System: Ubuntu 16.04.5 LTS (R 3.5.2)
  • brms Version: 2.7.0

I fit the model with a gaussian distribution now and removed the distributional regression parts. The graph still looks very different than the GAM output. There is still the difference in group-level intercepts, but again, these should not affect the shape, right?
The oddest thing for me is that the middle column looks pretty similar—it’s the other two that deviate so much.

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: value ~ estrogen_progestogen * who + s(RCD_squished, by = interaction(estrogen_progestogen, who), bs = "cc") + (1 | short:created_diary) + (1 | short) + (1 + who || variable) 
   Data: diary_long (Number of observations: 128454) 
Samples: 6 chains, each with iter = 3000; warmup = 1000; thin = 1;
         total post-warmup samples = 12000

Smooth Terms: 
                                                                                          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sds(sRCD_squishedinteractionestrogen_progestogenwhonon_hormonal.extra_pair_1)                 0.06      0.02     0.03     0.11       2435 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_and_estrogen.extra_pair_1)     0.02      0.01     0.00     0.04       3462 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_only.extra_pair_1)             0.04      0.03     0.01     0.12       3960 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhonon_hormonal.in_pair_1)                    0.04      0.02     0.02     0.09       3101 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_and_estrogen.in_pair_1)        0.04      0.03     0.01     0.10       1727 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_only.in_pair_1)                0.04      0.03     0.00     0.13       3545 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhonon_hormonal.single_1)                     0.04      0.02     0.02     0.09       2862 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_and_estrogen.single_1)         0.08      0.03     0.04     0.17       4018 1.00
sds(sRCD_squishedinteractionestrogen_progestogenwhoprogestogen_only.single_1)                 0.10      0.09     0.01     0.35       2309 1.00

Group-Level Effects: 
~short (Number of levels: 873) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.53      0.01     0.50     0.55       1641 1.00

~short:created_diary (Number of levels: 37895) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.55      0.00     0.55     0.56       4529 1.00

~variable (Number of levels: 7) 
               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)      0.21      0.08     0.11     0.42       5237 1.00
sd(whoin_pair)     0.53      0.21     0.27     1.05       4313 1.00
sd(whosingle)      0.26      0.10     0.14     0.52       4532 1.00

Population-Level Effects: 
                                                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept                                                   1.73      0.09     1.54     1.90       2196 1.00
estrogen_progestogenprogestogen_and_estrogen               -0.30      0.05    -0.39    -0.21        828 1.01
estrogen_progestogenprogestogen_only                       -0.20      0.16    -0.52     0.13       2031 1.00
whoin_pair                                                  1.22      0.22     0.72     1.64       2567 1.00
whosingle                                                   0.78      0.11     0.55     1.00       2118 1.00
estrogen_progestogenprogestogen_and_estrogen:whoin_pair     0.44      0.01     0.42     0.47      18624 1.00
estrogen_progestogenprogestogen_only:whoin_pair             0.54      0.05     0.44     0.64      15142 1.00
estrogen_progestogenprogestogen_and_estrogen:whosingle      0.22      0.09     0.05     0.40       1187 1.01
estrogen_progestogenprogestogen_only:whosingle              0.43      0.31    -0.17     1.03       3307 1.00

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     1.04      0.00     1.03     1.04       8716 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

Hi Ruben, are you using marginal_effects or marginal_smooths? The latter would likely be the equivalent of the plot output of mgcv. In any case, I agree it’s very weird that the results look so different, but I cannot judge if not estimating group-level intercepts affects the results too much. It would be necessary to fit the exact same model with both packages and then compare the outputs again…
I am sorry that I currently cannot give you more specific advice.

I’m using marginal_smooths. Ok, if you think the differences in group-level structure can explain, I’ll rerun the exact same model in brms (because I cannot fit it in gamm4 or don’t know how at least) and I’ll make a smaller reproducible example to test that the ad-hoc syntax with by=interaction isn’t the culprit. Thanks!

At Paul’s suggestion, I reran the model in brms, trying to make it very similar to the one I could fit using gamm4. Because I thought my prior choice of normal(0,1) could have been more informative than I intended, I used default priors. It made no difference (and the estimated effects are well within the bulk of normal(0,1) too). Leaving out the more complex random effects structure, I still get a graph that looks very different to the one estimated by gamm4.

For contrast, here is a descriptive plot.

At Paul’s suggestion, I tried coming up with a simpler reproducible example. I could not reproduce the problem using this simpler example, so it’s probably not due to the slightly unusual specification by=interaction(who, contra).

One odd thing that I noticed is that the marginal_smooths estimated for my real model (not for my fake model) in brms leave the range of my data (values from 0 to 4). This is especially odd given that the first model (first posted) uses a cumulative outcome family which is “aware” that there is a boundary at zero.

The fixed effects and the SD for the intercept are estimated very similar by gamm4 and brms. The SDs for the smooths differ more (differences up to 0.10 when most SD estimates are in the range of 0.04), so it doesn’t seem like it’s only about the plotting function or something like that.

I’d greatly appreciate further advice on

  • how to debug my model
  • how to make a more realistic simulated example.

To make sure we are on the same page: Are we now fitting the same model in both packages (minus priors of course) or is there anything else that is different?

One odd thing that I noticed is that the marginal_smooths estimated for my real model (not for my fake model) in brms leave the range of my data (values from 0 to 4). This is especially odd given that the first model (first posted) uses a cumulative outcome family which is “aware” that there is a boundary at zero.

marginal_smooths displays the smooth on the linear scale that is before transformation by the link function. As such, the range of these plots doesn’t tell you much how the smooth term looks like after transforming to the ordinal scale the cumulative density and its link function (logit by default).

Yes, the exact same model formula and default priors in brms.

re marginal_smooths: that explanation makes sense for the first graph on this page (cumulative dist with probit link), but not for the Gaussian model (the last graph I posted), right?

re marginal_smooths : that explanation makes sense for the first graph on this page (cumulative dist with probit link), but not for the Gaussian model (the last graph I posted), right?

Indeed. It’s hard to say what is going on without an example model that I can run with reasonable time myself. In the simple toy model, where results looked similar, did you also use a cc spline? Perhaps there is something wrong in the processing of this spline type?

Argh—of course I meant to use cc splines in my example, but then forgot. This made no difference though.

However, when I added a group-level intercept, I get diverging graphs from gamm4 and brms:
http://rpubs.com/rubenarslan/486368

The dashed line is predicted from gamm4, the solid line with CI is from brms::marginal_smooths.

Does this help? Or is it not a real difference? I haven’t experimented with trying to make the difference bigger.

I thought I’d add a plot from my real comparison. I subtracted 3 from the gamm4 predictions, so that the shapes can be more easily compared (otherwise the brms and gamm4 estimates are so far apart in means). Some of the differences are quite stark; other less so.

I can really just speculate. Perhaps some default prior is accidentally more informative that it should be. Can you show me the generated Stan code?

Of course, it is also possible that this difference indicates a problem in the estimation of gamm4 not of brms, but I cannot judge without seeing how the packages recover parameters from simulated models.

At least for the mean levels, it’s pretty clear that the brms plot is off (marginal_effects seems fine though), given that it goes outside the range of the data and does not recover the mean levels.

I’m happy to send the whole model if that helps (140MB Gaussian, 2,4GB cumulative). I will a) try whether the difference also appears if I subset the data to just one of the panels where I see a big difference (that might make a manageable subset for quicker tests) and b) try a different spline.

What it looks like to me (speculating) is that the cyclic part goes wrong, because the discrepancies seem largest on either end of RCD_squished. Could it be that the level on either end is somehow encoded to be the same despite the interactions? I guess further minimal tests can help answer this.

Stancode

// generated with brms 2.7.0
functions { 
} 
data { 
  int<lower=1> N;  // total number of observations 
  vector[N] Y;  // response variable 
  int<lower=1> K;  // number of population-level effects 
  matrix[N, K] X;  // population-level design matrix 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")non_hormonal.extra_pair
  int nb_1;  // number of bases 
  int knots_1[nb_1]; 
  matrix[N, knots_1[1]] Zs_1_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_and_estrogen.extra_pair
  int nb_2;  // number of bases 
  int knots_2[nb_2]; 
  matrix[N, knots_2[1]] Zs_2_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_only.extra_pair
  int nb_3;  // number of bases 
  int knots_3[nb_3]; 
  matrix[N, knots_3[1]] Zs_3_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")non_hormonal.in_pair
  int nb_4;  // number of bases 
  int knots_4[nb_4]; 
  matrix[N, knots_4[1]] Zs_4_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_and_estrogen.in_pair
  int nb_5;  // number of bases 
  int knots_5[nb_5]; 
  matrix[N, knots_5[1]] Zs_5_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_only.in_pair
  int nb_6;  // number of bases 
  int knots_6[nb_6]; 
  matrix[N, knots_6[1]] Zs_6_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")non_hormonal.single
  int nb_7;  // number of bases 
  int knots_7[nb_7]; 
  matrix[N, knots_7[1]] Zs_7_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_and_estrogen.single
  int nb_8;  // number of bases 
  int knots_8[nb_8]; 
  matrix[N, knots_8[1]] Zs_8_1; 
  // data of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_only.single
  int nb_9;  // number of bases 
  int knots_9[nb_9]; 
  matrix[N, knots_9[1]] Zs_9_1; 
  // data for group-level effects of ID 1
  int<lower=1> J_1[N];
  int<lower=1> N_1;
  int<lower=1> M_1;
  vector[N] Z_1_1;
  int prior_only;  // should the likelihood be ignored? 
} 
transformed data { 
  int Kc = K - 1; 
  matrix[N, K - 1] Xc;  // centered version of X 
  vector[K - 1] means_X;  // column means of X before centering 
  for (i in 2:K) { 
    means_X[i - 1] = mean(X[, i]); 
    Xc[, i - 1] = X[, i] - means_X[i - 1]; 
  } 
} 
parameters { 
  vector[Kc] b;  // population-level effects 
  real temp_Intercept;  // temporary intercept 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")non_hormonal.extra_pair
  vector[knots_1[1]] zs_1_1; 
  real<lower=0> sds_1_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_and_estrogen.extra_pair
  vector[knots_2[1]] zs_2_1; 
  real<lower=0> sds_2_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_only.extra_pair
  vector[knots_3[1]] zs_3_1; 
  real<lower=0> sds_3_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")non_hormonal.in_pair
  vector[knots_4[1]] zs_4_1; 
  real<lower=0> sds_4_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_and_estrogen.in_pair
  vector[knots_5[1]] zs_5_1; 
  real<lower=0> sds_5_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_only.in_pair
  vector[knots_6[1]] zs_6_1; 
  real<lower=0> sds_6_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")non_hormonal.single
  vector[knots_7[1]] zs_7_1; 
  real<lower=0> sds_7_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_and_estrogen.single
  vector[knots_8[1]] zs_8_1; 
  real<lower=0> sds_8_1; 
  // parameters of smooth s(RCD_squished,by=interaction(estrogen_progestogen,who),bs="cc")progestogen_only.single
  vector[knots_9[1]] zs_9_1; 
  real<lower=0> sds_9_1; 
  real<lower=0> sigma;  // residual SD 
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  vector[N_1] z_1[M_1];  // unscaled group-level effects
} 
transformed parameters { 
  vector[knots_1[1]] s_1_1 = sds_1_1 * zs_1_1; 
  vector[knots_2[1]] s_2_1 = sds_2_1 * zs_2_1; 
  vector[knots_3[1]] s_3_1 = sds_3_1 * zs_3_1; 
  vector[knots_4[1]] s_4_1 = sds_4_1 * zs_4_1; 
  vector[knots_5[1]] s_5_1 = sds_5_1 * zs_5_1; 
  vector[knots_6[1]] s_6_1 = sds_6_1 * zs_6_1; 
  vector[knots_7[1]] s_7_1 = sds_7_1 * zs_7_1; 
  vector[knots_8[1]] s_8_1 = sds_8_1 * zs_8_1; 
  vector[knots_9[1]] s_9_1 = sds_9_1 * zs_9_1; 
  // group-level effects 
  vector[N_1] r_1_1 = sd_1[1] * (z_1[1]);
} 
model { 
  vector[N] mu = temp_Intercept + Xc * b + Zs_1_1 * s_1_1 + Zs_2_1 * s_2_1 + Zs_3_1 * s_3_1 + Zs_4_1 * s_4_1 + Zs_5_1 * s_5_1 + Zs_6_1 * s_6_1 + Zs_7_1 * s_7_1 + Zs_8_1 * s_8_1 + Zs_9_1 * s_9_1;
  for (n in 1:N) { 
    mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
  } 
  // priors including all constants 
  target += student_t_lpdf(temp_Intercept | 3, 2, 10); 
  target += normal_lpdf(zs_1_1 | 0, 1); 
  target += student_t_lpdf(sds_1_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_2_1 | 0, 1); 
  target += student_t_lpdf(sds_2_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_3_1 | 0, 1); 
  target += student_t_lpdf(sds_3_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_4_1 | 0, 1); 
  target += student_t_lpdf(sds_4_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_5_1 | 0, 1); 
  target += student_t_lpdf(sds_5_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_6_1 | 0, 1); 
  target += student_t_lpdf(sds_6_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_7_1 | 0, 1); 
  target += student_t_lpdf(sds_7_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_8_1 | 0, 1); 
  target += student_t_lpdf(sds_8_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(zs_9_1 | 0, 1); 
  target += student_t_lpdf(sds_9_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += student_t_lpdf(sigma | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(z_1[1] | 0, 1);
  // likelihood including all constants 
  if (!prior_only) { 
    target += normal_lpdf(Y | mu, sigma);
  } 
} 
generated quantities { 
  // actual population-level intercept 
  real b_Intercept = temp_Intercept - dot_product(means_X, b); 
} 

When I restrict the data to the vertically middle right panel, I get the same shape in brms and gamm4 (and not the one in the full model with brms). The mean level is still outside the data range though.

Doesn’t seem to be about “cc”. This is a single panel model with out specifying spline options.

could you send me the smallest of these models for which you see the difference?

To update this: with Paul’s help, I figured out that the inconsistent results only happened when I included the small subgroup with very uncertain values. It’s certainly plausible that priors had a bigger influence than anticipated there. What was surprising—because I didn’t know much about splines with interactions are parameterised—was the effect on the other two groups.

I might come back to this to reproduce it in mock data and understand it better at a later date.

I would guess it is the diagonal.penalty argument being FALSE when mgcv:::gam.setup gets called that is making the difference in low information settings.

@bgoodri, thanks for the suggestion!

@ruben let’s try this out! In the latest github version of brms, I have changed the penality for newly fitted spline models. Would you mind fitting the model again and see if that changes anything?

Do I mind? I live for this ;-) I’ll update when I have results.

Results are different, but worse I guess. Should I try to alter my small reproducible example to mirror the problem?