Modeling likelihood

Hi! I’m new here and new to Stan.

I would like to ask the following question.
How do we define a likelihood that is binomial(m1,N,theta)*binomial(m2,N,theta) in the model block?
I’m stuck, and maybe I’m missing something basic.

Thanks for your help.

Hi!

In general, you can always achieve this by doing something like

model{
  m1 ~ binomial(N, theta);
  m2 ~ binomial(N, theta);
}

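Each sampling statement adds its log density to the target, so writing two statements multiplies the two likelihoods. This is general in the sense that it works whether or not both distributions are binomial and whether or not they share the same parameters.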

In this particular case, though, you can be a bit more concise and efficient by exploiting the properties of the binomial distribution and recognizing that, as a function of $\theta$, $\text{binomial}(m_1 \mid N, \theta) \cdot \text{binomial}(m_2 \mid N, \theta) \propto \text{binomial}(m_1 + m_2 \mid 2N, \theta)$, where binomial refers to the binomial probability mass function. Since Stan only needs the likelihood up to a constant of proportionality, you can replace your two statements with a single statement:

transformed data{
  int m3 = m1 + m2;
  int N2 = 2*N;
}
model{
  m3 ~ binomial(N2, theta);
}
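
For anyone wondering where the proportionality comes from, here is a short sketch that writes out both probability mass functions. Nothing here is specific to Stan; it is just the standard binomial pmf.

```latex
\begin{align*}
\mathrm{binomial}(m_1 \mid N, \theta)\,\mathrm{binomial}(m_2 \mid N, \theta)
  &= \binom{N}{m_1}\binom{N}{m_2}\,
     \theta^{m_1 + m_2}\,(1 - \theta)^{2N - (m_1 + m_2)} \\
  &= \frac{\binom{N}{m_1}\binom{N}{m_2}}{\binom{2N}{m_1 + m_2}}\;
     \mathrm{binomial}(m_1 + m_2 \mid 2N, \theta)
\end{align*}
```

The leading ratio of binomial coefficients depends only on the data, not on $\theta$, so it is a constant as far as the posterior for $\theta$ is concerned.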

Thanks for your fast reply.
Just to confirm:
I suppose it’s equivalent to

data {
  int N;
  array[2] int x;  // x[1] is m1 and x[2] is m2; older Stan versions wrote this as int x[2];
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  x ~ binomial(N, theta);  // vectorized over both counts
}

Yes, that’s also the same model. There might still be a (very small, in this case) efficiency gain from using the single binomial for the sum. In general, it’s good practice to exploit such “sufficient statistics” when they are available.
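
If you want to convince yourself numerically, here is a minimal sketch in plain Python (standard library only, not Stan). The data values `N = 10`, `m1 = 3`, `m2 = 6` and the helper names are made up for illustration. It checks that the two-statement log likelihood and the summed version differ by the same offset at every value of theta, which is what "equal up to a constant" means.

```python
from math import comb, log


def binomial_lpmf(m, n, theta):
    """Log of the binomial pmf, including the binomial coefficient."""
    return log(comb(n, m)) + m * log(theta) + (n - m) * log(1 - theta)


def log_lik_two_statements(m1, m2, n, theta):
    """Log likelihood from two separate binomial statements."""
    return binomial_lpmf(m1, n, theta) + binomial_lpmf(m2, n, theta)


def log_lik_summed(m1, m2, n, theta):
    """Log likelihood from a single binomial on the summed counts."""
    return binomial_lpmf(m1 + m2, 2 * n, theta)


# Hypothetical data, chosen only for illustration.
N, m1, m2 = 10, 3, 6
thetas = [0.1, 0.25, 0.5, 0.75, 0.9]

# Difference between the two log likelihoods at each theta.
offsets = [
    log_lik_two_statements(m1, m2, N, t) - log_lik_summed(m1, m2, N, t)
    for t in thetas
]

# The offset predicted by the ratio of binomial coefficients.
expected_offset = log(comb(N, m1)) + log(comb(N, m2)) - log(comb(2 * N, m1 + m2))

for t, d in zip(thetas, offsets):
    print(f"theta={t}: offset={d:.12f}")
```

The printed offset should be the same for every theta, so both formulations give the same posterior for theta.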
