Comparing two models in one script to see which one provides the best fit

Hi all,

I have a model with several different groups, and I want to know whether these groups are equal or not.
My question is based on Section 12.2.2 and beyond in Doing Bayesian Data Analysis by J. K. Kruschke.

Initially, I estimated the parameters of each group separately and compared the posterior differences in the parameter in question.
However, I would now like to explore a model-comparison approach, comparing a model with a single shared parameter against a model in which each group has its own distinct parameter.

Kruschke says the following regarding the approach:

The model specification begins with saying that each subject has an individual ability theta[s] from
a condition-specific beta distribution:

model {

for ( s in 1:nSubj ) {
   nCorrOfSubj[s] ~ dbin( theta[s] , nTrlOfSubj[s] )
   theta[s] ~ dbeta( aBeta[CondOfSubj[s]] , bBeta[CondOfSubj[s]] )
}

The shape parameters of the beta distribution are then re-written in terms of the mode and concentration. Model 1 uses condition-specific omega[j], while Model 2 uses the same omega0 for all conditions. The JAGS function equals(mdlIdx,…) is used to select the appropriate model for index mdlIdx:

for ( j in 1:nCond ) {
   # Use omega[j] for model index 1, omega0 for model index 2:
   aBeta[j] <- ( equals(mdlIdx,1)*omega[j] + equals(mdlIdx,2)*omega0 ) * (kappa[j]-2)+1
   bBeta[j] <- ( 1 - ( equals(mdlIdx,1)*omega[j] + equals(mdlIdx,2)*omega0 ) ) * (kappa[j]-2)+1
   omega[j] ~ dbeta( a[j,mdlIdx] , b[j,mdlIdx] )
}
omega0 ~ dbeta( a0[mdlIdx] , b0[mdlIdx] )
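
For reference, this is the mode/concentration reparameterization: with aBeta = omega*(kappa-2)+1 and bBeta = (1-omega)*(kappa-2)+1, the two shape parameters sum to kappa, and the mode of the resulting beta distribution, (aBeta-1)/(aBeta+bBeta-2), simplifies to omega. So omega is the mode and kappa the concentration, and Model 1 (condition-specific omega[j]) and Model 2 (shared omega0) differ only in the mode.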

I would like to reproduce this in Stan, but I don’t know how to achieve it.
Inspired by the JAGS example, I wrote the following:

data {
   ....
}

parameters {
   vector<lower=0, upper=1>[n_trials] p;
   ....
}


model {
   for (trial in 1:n_trials) {
      int model_idx;
      model_idx ~ bernoulli(p[trial]);

      for (group in 1:n_groups) {
         if (model_idx == 0) {
            theta[group] ~ beta(phi[group] * shared_kappa, (1 - phi[group]) * shared_kappa);
            y[group] ~ binomial(trials[group], theta[group]);
         } else {
            theta[group] ~ beta(phi[group] * kappa[group], (1 - phi[group]) * kappa[group]);
            y[group] ~ binomial(trials[group], theta[group]);
         }
      }
   }
}

Is this the correct way to do it? My goal is to reproduce the chart below.
Thank you for your time.

Do you really think they might be equal or is it just that you might not have enough data to discriminate?

First, I’d urge you not to try to compare models this way (using a mixture model and attempting to estimate posterior model probabilities through the mixture bernoulli parameter). At least that’s what I think you’re doing. It’s making a strong closed-world assumption that you have a complete set of models along with reasonable prior estimates of their truth. As an alternative, I’d suggest using cross-validation (we have a very efficient approximate implementation in the loo package in R).
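
In practice that means fitting the "shared parameter" model and the "separate parameter per group" model as two separate Stan programs, adding a pointwise log-likelihood to each, and comparing the fits with loo() / loo_compare() from the loo package. A minimal sketch of the generated quantities block, reusing the (hypothetical) names y, trials, theta, and n_groups from your snippet:

generated quantities {
  // pointwise log-likelihood, one entry per observation, for use with loo
  vector[n_groups] log_lik;
  for (group in 1:n_groups) {
    log_lik[group] = binomial_lpmf(y[group] | trials[group], theta[group]);
  }
}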

Second, I’d suggest including the whole JAGS model if you want help translating it. I don’t see how the JAGS program models model_idx for example. You won’t be able to get the Stan model to compile because Stan doesn’t support discrete parameters. Instead, you have to marginalize them out. There are examples that are similar to yours in the mixture model chapter (and more examples in the latent discrete parameters chapter).
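
To give a flavor of the marginalization, here is a rough sketch in the spirit of the book's model (my own naming, following the JAGS variables; I've dropped the pseudoprior machinery, which is only needed for the Gibbs-style model switching, and simply put beta(1, 1) priors on the modes):

data {
  int<lower=1> nCond;
  int<lower=1> nSubj;
  array[nSubj] int<lower=1, upper=nCond> CondOfSubj;
  array[nSubj] int<lower=0> nTrlOfSubj;
  array[nSubj] int<lower=0> nCorrOfSubj;
}
parameters {
  vector<lower=0, upper=1>[nSubj] theta;
  vector<lower=0, upper=1>[nCond] omega;   // condition-specific modes (model 1)
  real<lower=0, upper=1> omega0;           // shared mode (model 2)
  vector<lower=0>[nCond] kappaMinusTwo;
}
transformed parameters {
  vector[nCond] kappa = kappaMinusTwo + 2;
  // log(prior model probability) + log p(theta | model), one entry per model
  vector[2] lp = rep_vector(log(0.5), 2);
  for (s in 1:nSubj) {
    int j = CondOfSubj[s];
    lp[1] += beta_lpdf(theta[s] | omega[j] * (kappa[j] - 2) + 1,
                                  (1 - omega[j]) * (kappa[j] - 2) + 1);
    lp[2] += beta_lpdf(theta[s] | omega0 * (kappa[j] - 2) + 1,
                                  (1 - omega0) * (kappa[j] - 2) + 1);
  }
}
model {
  omega ~ beta(1, 1);
  omega0 ~ beta(1, 1);
  kappaMinusTwo ~ gamma(2.618, 0.0809);   // mode 20, sd 20, as in the book
  target += log_sum_exp(lp);              // marginalize over the discrete model index
  nCorrOfSubj ~ binomial(nTrlOfSubj, theta);
}
generated quantities {
  // per-draw probability of model 1 (condition-specific omega[j]) given the
  // parameters; its posterior mean approximates the posterior model probability
  real p_model1 = softmax(lp)[1];
}

Treat this as a starting point rather than a finished translation; the mixture model chapter walks through the same marginalization pattern in more detail.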

Third, you will probably need priors on all your model parameters. But maybe those were just left out of the discussion.


Interesting. I’ve long wondered why there seems to be a preference among the experts for separate-models-plus-loo-comparison over the mixture approach. How is it that the former skirts the closed-world assumptions of the latter?

Do you really think they might be equal or is it just that you might not have enough data to discriminate?

Well, that’s the question, I guess! Overall, the kappa/concentration of each group is quite large, so there is enough data.

Duly noted. I will make sure to also use cross-validation. However, for my own education I still want to explore the mixture model methodology.

The full JAGS model:

model {
  for ( s in 1:nSubj ) {
    nCorrOfSubj[s] ~ dbin( theta[s] , nTrlOfSubj[s] )
    theta[s] ~ dbeta( aBeta[CondOfSubj[s]] , bBeta[CondOfSubj[s]] ) 
  }

  for ( j in 1:nCond ) {
    # Use omega[j] for model index 1, omega0 for model index 2:
    aBeta[j] <-       ( equals(mdlIdx,1)*omega[j] 
                      + equals(mdlIdx,2)*omega0  )   * (kappa[j]-2)+1
    bBeta[j] <- ( 1 - ( equals(mdlIdx,1)*omega[j] 
                      + equals(mdlIdx,2)*omega0  ) ) * (kappa[j]-2)+1
    omega[j] ~ dbeta( a[j,mdlIdx] , b[j,mdlIdx] )
  }

  omega0 ~ dbeta( a0[mdlIdx] , b0[mdlIdx] )
  for ( j in 1:nCond ) {
    kappa[j] <- kappaMinusTwo[j] + 2
    kappaMinusTwo[j] ~ dgamma( 2.618 , 0.0809 ) # mode 20 , sd 20
  }
  # Constants for prior and pseudoprior:
  aP <- 1
  bP <- 1
  # a0[model] and b0[model]
  a0[1] <- .48*500       # pseudo
  b0[1] <- (1-.48)*500   # pseudo 
  a0[2] <- aP            # true
  b0[2] <- bP            # true
  # a[condition,model] and b[condition,model]
  a[1,1] <- aP           # true
  a[2,1] <- aP           # true
  a[3,1] <- aP           # true
  a[4,1] <- aP           # true
  b[1,1] <- bP           # true
  b[2,1] <- bP           # true
  b[3,1] <- bP           # true
  b[4,1] <- bP           # true
  a[1,2] <- .40*125      # pseudo
  a[2,2] <- .50*125      # pseudo
  a[3,2] <- .51*125      # pseudo
  a[4,2] <- .52*125      # pseudo
  b[1,2] <- (1-.40)*125  # pseudo
  b[2,2] <- (1-.50)*125  # pseudo
  b[3,2] <- (1-.51)*125  # pseudo
  b[4,2] <- (1-.52)*125  # pseudo
  # Prior on model index:
  mdlIdx ~ dcat( modelProb[] )
  modelProb[1] <- .5
  modelProb[2] <- .5
}

I will look into the examples you mentioned. Thanks!