Improving PPC Fit in MELS model (brms) when using Multimodal Data

Hi everyone,

First time brms and stan user here! I’m having some problems with my mixed effects locations scale (MELS) model fit and would really appreciate any advice. More specifically, I’m trying to explore the relationship between various personality trait scores (continuous) and both the mean (location) and variance (scale) in “cents” given across recipients from 10 different countries. For this I’m trying to use a MELS model, using the brms package. I think my main issue is that my outcome “cents” has a lot of 0’s (i.e., many people decided to give no money away) and, additionally, appears to have a multimodal distribution:


.
To try to deal with this I attempted to specify the family as skew_normal() in my code:

# Main effects model with all traits 
fit2_cents_skew5 <- df_long %>%
  brm(bf(cents ~ Agreeableness + 
           D_Factor + 
           Fairness + 
           Honesty_Humility + 
           Openness + 
           RWA + 
           SDO + 
           (1|ppno|country), 
         sigma ~ Agreeableness + 
           D_Factor + 
           Fairness + 
           Honesty_Humility + 
           Openness + 
           RWA + 
           SDO + 
           (1|ppno|country)), 
      data = ., 
      family = skew_normal(),
      control = list(adapt_delta = 0.95),
      iter = 10000, 
      cores = 4, 
      file = "Final_Models/fit2_cents_skew5")

But when I do the posterior predictive checks it looks like this (i.e., the distributions don’t match at all):

Any advice on what to change in the model to adapt it to my data (e.g., which family to use) or whether I can actually use such a model for this kind of data at all?

Thanks a lot in advance! :)


Howdy!
I would step back and try to think about how this data might be generated. If “cents” is money, then it would be likely that they can only give away the amount of cents in the selection of coins available. For example, in US currency, unless in the unlikely event that someone was carrying around a pocket full of pennies, you would see your data as manifesting as multiples of 5, 10, and 25 cent amounts, corresponding to nickels, dimes, and quarters. Since people find it annoying to carry change these days, and even a few dollars doesn’t buy much of anything, you might see a lot of multiples of quarters (25 cent). So, a continuous distribution on positive reals doesn’t really make a lot of ‘cents’ (haha pun intended). Perhaps instead of the amount of cents, you might think about modeling the number of each type of coin… I am not sure right off the top of my head. ‘Cents’ is a very small amount of money… where did this data come? Is this some sort of psychology experiment where people were given small amounts of money?

1 Like

Hi! Thanks so much for the response and very good point! So, in more detail, this is indeed a psychological experiment: Participants took part in an online study where they received a bonus of 100 cents and asked how much of this 100 cent bonus they would like to share with another recipient (participants would answer this question a total of 10 times with 10 different recipients). Particularly, participants could give their answer in increments of 10 cents (e.g., 0 cents, 10 cents, 20 cents), but the total amount of money is not actually that large to convert it to types of coins (e.g., dimes) I think! Unless I missed something?

Thanks again for the help :)

So essentially they were given 10 virtual dimes to share with another recipient, since they could only answer in increments of 10 with a max of 100 cents, that is like giving them 10 dimes. So, you could model your data as the number of dimes out of 10 that were shared, which might suggest a binomial distribution, where your outcome variable was 0 - 10 and number of trials 10. However, your data generation process seems a bit more complex than this. As you noted, you appear to have excess responses at zero and at 50 cents. Why might this be? Was this slider scale data, and if so, was the slider already on 50 cents? How did people choose to share?

This data might suggest a mixture model where you model the probability of sharing at all, and if a person shares, the excess probability that they share 5 dimes over that of some distribution like Poisson or binomial. You might model this with a mixture model like a hurdle-Poisson where the Poisson is truncated at both 1 and 10 and with an additional mixture for the 5 dime response. You might could also use a mixture with a binomial distribution with 10 trials, as the max number of dimes participants get is 10. (Your design of 11 possible choices (0 + 1-10 donation of dimes) repeated 10 times per participant reminds me of multinomial as well, except your data has order to it.)

Using the hurdle-truncated-Poisson mixture as an example, you can simulate data that looks reasonably similar to your outcome with this R code:

rtpois <- function(N, lambda)
  qpois(runif(N, dpois(0, lambda), 1-dpois(10, lambda)), lambda)

theta <- 0.55
pi <- 0.3
lambda <- 2.5
y <- rbinom(10000, 1, theta)
for (n in 1:length(y)) {
       if(y[n] == 1) {
         y[n] <- rbinom(1, 1, pi)
         y[n] <- ifelse(y[n]==1, 5, rtpois(1, lambda))
       }
     }
hist(y)

I think you can fit hurdle-Poisson in brms but to add the mixture for the 5 dimes, you would need to use Stan.

1 Like

To follow up on my response, here is the simulation and the Stan model to recover the parameters and generate retrodictive checks (aka posterior predictive checks) for the data for the hurdle-Poisson data generation process that I described with a truncated Poisson and inflation at 5 dimes. (Note I am using an older version of rstan, so you would want to use the newer array synax).

library(rstan)

#function to simulate truncated Poisson data where data is between 1 and 10
rtpois <- function(N, lambda)
  qpois(runif(N, dpois(0, lambda), 1-dpois(10, lambda)), lambda)

#simulation of hurdle-Poisson process with inflation at 5 and truncation at 10
theta <- 0.55 #45% zeroes for below sim process
pi <- 0.3
lambda <- 2.5
y <- rbinom(500, 1, theta)
for (n in 1:length(y)) {
       if(y[n] == 1) {
         y[n] <- rbinom(1, 1, pi)
         y[n] <- ifelse(y[n]==1, 5, rtpois(1, lambda))
       }
     }
hist(y)

#Stan data
stan_data <- list(y = y, N = length(y))

#Stan model
stan_model <- "
data {
  int<lower=0> N;
  int<lower=0> y[N];
}
parameters {
  real<lower=0, upper=1> theta;
  simplex[2] pi;
  real<lower=0> lambda;
}
model {
  target += beta_lpdf(theta | 1, 1); 
  target += dirichlet_lpdf(pi | rep_vector(1, 2));
  target += normal_lpdf(lambda | 3, 1);
  for (n in 1:N) {
    real poisson_t_lp = log(pi[2]) + poisson_lpmf(y[n] | lambda)
                              - log_diff_exp(poisson_lcdf(10 | lambda), poisson_lcdf(0 | lambda));
    if (y[n] == 0)
      target += log(theta);
    else if (y[n] == 5)
      target += log_sum_exp(log(pi[1]), poisson_t_lp) + log1m(theta);
    else
      target += poisson_t_lp + log1m(theta);
  }

}
generated quantities {
  vector[N] y_pred;
  for (n in 1:N) {
    y_pred[n] = 11;
    if (bernoulli_rng(theta))
      y_pred[n] = 0;
    else if (bernoulli_rng(pi[1]))
        y_pred[n] = 5;
    else
      while (y_pred[n] < 1 || y_pred[n] > 10)
        y_pred[n] = poisson_rng(lambda);
  }
}
"

m1 <- stan(model_code = stan_model, data = stan_data,
             chains = 4, iter = 2000, warmup = 1000,
             thin = 1, cores = 4)

print(m1, pars=c("theta","pi","lambda"))

#extract predictions for retrodictive checks
y_preds <- extract(m1)$y_pred

#retrodictive check plot
library(scales)
c_light_m1 <- c("#edf8e9")
c_light_highlight_m1 <- c("#c7e9c0")
c_mid_m1 <- c("#a1d99b")
c_mid_highlight_m1 <- c("#74c476")
c_dark_m1 <- c("#31a354")
c_dark_highlight_m1 <- c("#006d2c")


B <- max(y_preds) + 1
idx <- rep(1:B, each=2)
x <- sapply(1:length(idx), function(b) if(b %% 2 == 0) idx[b] + 0.5 else idx[b] - 0.5)
x <- (x - 1)

obs_counts <- hist(y, breaks=(0:(B)) - 0.5, plot=FALSE)$counts
pad_obs_counts <- sapply(idx, function(n) obs_counts[n])

pred_counts <- sapply(1:1000, function(n) 
                      hist(y_preds[n,], breaks=(0:B) - 0.5, plot=FALSE)$counts)

probs = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
cred<- sapply(1:B, function(b) quantile(pred_counts[b,], probs=probs))
pad_cred <- do.call(cbind, lapply(idx, function(n) cred[1:9,n]))

plot(1, type="n", main="",
     xlim=c(-0.5, B + 0.5), xlab="number of dimes",
     ylim=c(0, 500), ylab="", cex.lab=2 , cex.axis=1.9)
title("Posterior predictions for number of dimes", line = 0.5, font.main=1, cex.main=2.2)
polygon(c(x, rev(x)), c(pad_cred[1,], rev(pad_cred[9,])),
        col = alpha(c_light_m1, 0.7), border = NA)
polygon(c(x, rev(x)), c(pad_cred[2,], rev(pad_cred[8,])),
        col = alpha(c_light_highlight_m1, 0.7), border = NA)
polygon(c(x, rev(x)), c(pad_cred[3,], rev(pad_cred[7,])),
        col = alpha(c_mid_m1, 0.7), border = NA)
polygon(c(x, rev(x)), c(pad_cred[4,], rev(pad_cred[6,])),
        col = alpha(c_mid_highlight_m1, 0.7), border = NA)
lines(x, pad_cred[5,], col=alpha(c_dark_m1, 0.7), lwd=2)

lines(x, pad_obs_counts, col="white", lty=1, lw=2.5)
lines(x, pad_obs_counts, col="black", lty=1, lw=2)

Which provides quite a bit better retrodictive checks for data like yours. You could write the model such that theta, pi, and lambda were predicted by your covariates of interest.

1 Like