Beta binomial mixture model

I am trying to fit a two-component beta-binomial mixture model in which the mixture weight is itself a random variable (a hyperparameter).

The resulting likelihood lies between the two priors; however, the posterior distributions do not move toward the likelihood.

What am I doing wrong in the Stan code? From looking at examples online I understood that the b parameter must be ordered, but this does not seem to help (and it adds warnings, since I can't constrain an ordered vector to the range [0, 1]).

I have also tried using log_mix, but the documentation and examples for this function are very limited.
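
For reference, a minimal sketch (untested) of how log_mix could express the two-component case, using the same parameter names as the model below:

model {
  w0 ~ uniform(0, 1);
  b[2] ~ beta(alpha[1], beta[1]);
  b[1] ~ beta(alpha[2], beta[2]);
  if (prior_only == 0) {
    for (n in 1:N) {
      // log_mix(theta, lp1, lp2) = log(theta * exp(lp1) + (1 - theta) * exp(lp2));
      // w0 is the weight of component 2 (w[2] = w0)
      target += log_mix(w0,
                        bernoulli_lpmf(y[n] | b[2]),
                        bernoulli_lpmf(y[n] | b[1]));
    }
  }
}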


Data

input_data_post <- list(
  N = 59,
  y = c(rep(1,45), rep(0,14)),
  K = 2,
  alpha = c(25.8, 7.5),
  beta = c(4.2, 7.5),
  prior_only = 0 #if 1 then only prior will be used (no data)
)
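
(For orientation: the two priors have means 25.8 / (25.8 + 4.2) = 0.86 and 7.5 / (7.5 + 7.5) = 0.5, while the observed proportion is 45 / 59 ≈ 0.76, which is the sense in which the likelihood sits between the two priors.)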

Stan code


data {
  int<lower=1> K; // number of mixture components
  int<lower=1> N; // number of data points
  int<lower=0,upper=1> y[N]; // observations
  real<lower = 0> alpha[K];
  real<lower = 0> beta[K];
  int<lower=0,upper=1> prior_only; // =1 to get data from prior predictive
}
parameters {
  //simplex[K] w;
  real<lower=0.59, upper=0.61> w0; // mixing proportion (weight of component 2)
  positive_ordered[K] b; // locations of mixture components
}
transformed parameters {
  simplex[K] w;
  vector[K] log_w;
  w[2] = w0;
  w[1] = 1 - w0;
  
  for(i in 1:K){
    log_w[i] = log(w[i]);
  }
  
}
model {
  // contributions from each component
  // these get overwritten for each observation
  vector[K] lps; 
  
  // priors
  w0 ~ uniform(0,1);
  // note the cross-indexing: the larger (ordered) component b[2] gets the higher-mean prior (alpha[1], beta[1])
  b[2] ~ beta(alpha[1], beta[1]);
  b[1] ~ beta(alpha[2], beta[2]);

  // individual likelihoods, as sum of component contributions
  for (n in 1:N) {
    for (k in 1:K) {
      lps[k] = log_w[k] +  bernoulli_lpmf(y[n] | b[k]);
    }
    // add the observation likelihood contribution if not prior-only
    if (prior_only == 0) {
      target += log_sum_exp(lps);  
    }
  }
}

I haven’t run the code, and I don’t have time to suggest an alternative, but as written the code restricts w0 to [0.59, 0.61], and alpha and beta are invariant, since they’re data. Unless I’m missing something, b won’t be influenced by the data, and w0 will be restricted.

Kent

Thank you, Kent, for looking at my question. The reason w0 is so tight is that I wanted to make sure I could reproduce a similar result from regular R code with a fixed w0. You are right that alpha and beta are in the data; they are fixed parameters that define the beta distribution of each mixture component, i.e., of b[1] and b[2]. However, each mixture component is still a random variable, so the data y should have an impact on the corresponding posteriors.

The posterior does move toward the likelihood, but it's hard to see because the likelihood is not very informative. In particular, the data say nothing about b[1] or b[2] separately; they only inform the combination (1-w0)*b[1] + w0*b[2].
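
Concretely, for a single observation the marginal success probability is

p(y[n] = 1 | w0, b) = (1 - w0)*b[1] + w0*b[2],

so any pair (b[1], b[2]) with the same weighted average has exactly the same likelihood; the two components are only pulled apart by their priors.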

[Attachment: Rplot]

There’s a workaround.

parameters {
  simplex[K+1] b_diff;  // K+1 increments that sum to 1
}
transformed parameters {
  // partial sums of a simplex are strictly increasing and bounded above by 1,
  // so b is ordered and each entry stays inside (0, 1)
  vector[K] b = cumulative_sum(b_diff[:K]);
}
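
A hedged sketch (untested) of how this slots into the program above, with the other blocks left unchanged:

parameters {
  real<lower=0.59, upper=0.61> w0;  // mixing proportion, as in the original
  simplex[K + 1] b_diff;
}
transformed parameters {
  vector[K] b = cumulative_sum(b_diff[:K]);  // replaces positive_ordered[K] b
  // w and log_w are defined from w0 exactly as before
}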

That’s the point I was trying to make, but you explained it more clearly.


Thank you both for your help.
Yulia.
