Mixture formula of discrete and continuous distributions

kimuchan · June 14, 2018, 12:04am

Hi, I’m a beginner of Stan, and facing the following problem.
I want to set alpha[i] to inv_gamma(a, b) or just 1 depending on the probability theta, i.e.,
alpha[i] ~ inv_gamma(a, b) (with the probability theta) or
= 1 (with the probability 1-theta),
but I don’t know how to implement.
According to Chap.12.7 in the manual, there seems to be no possible method, but do you know any suitable methods ?
I would be happy if you give me some advice.
Here, alpha is a latent parameter and I don’t have direct data on it, so it is difficult to implement … If you need additional information, please let me know.
Actually, at first, I tried introducing another latent integer parameter, but I noticed it is impossible in Stan, and my question arose on the course of marginalization.

bgoodri · June 14, 2018, 12:26am

You cannot assign the result of inv_gamma to anything. One of the many things you need to know about Stan before attempting to write anything in the Stan language is that the model block just builds up a log-kernel of the posterior PDF with target += statements or almost equivalently statements involving a ~. But I think what you are saying may be equivalent to a mixture model where alpha has an inverse gamma prior with probability theta and is fixed to 1 otherwise. In that case, you are going to have to use the log_mix function to handle the two possibilities.

kimuchan · June 14, 2018, 3:16pm

Thank you very much for your reply. I didn’t understand how to use log_mix function. Now I write the following, and my code starts sampling.
target += log_mix(theta, inv_gamma_lpdf(alpha[i] | a, b), alpha[i] == 1);
Would it mean that alpha[i] = 1 with probability 1 - theta ?

bgoodri · June 14, 2018, 3:55pm

Think of it as having two posterior distributions, one where theta is positive and another where theta is exactly 1. Then build the log-kernel separately in target_1 and target_2 respectively. Then target += log_mix(theta, target_1, target_2);.

kimuchan · June 14, 2018, 4:26pm

Thank you very much for your quick reply. At first I tried writing as follows:
target += log_mix(theta, inv_gamma_lpdf(alpha[i] | a, b), bernoulli_lpmf(alpha[i] | 1.0));.
But I already stated “vector[Tnum] alpha;” in parameters block, so cannot apply the Bernoulli distribution to alpha[i]. Do you mean that there are some other functions to produce 1 with the probability 1 ?

bgoodri · June 14, 2018, 5:43pm

Yes, just put 1 wherever you put alpha[i].

kimuchan · June 18, 2018, 12:55pm

Thank you so much for your comment. However, I don’t find a solution after considering your comment. I need your help again.
For reference, I provide my code that is simplified as follows:

data {
…
vector<lower=0>[Tnum] t;
…
}

parameters {
…
vector<lower=0>[Tnum] theta;
vector<lower=1>[Tnum] alpha;
}

transformed parameters {

vector[Tnum] a;
vector[Tnum] omega;
vector[Tnum] nu;

for(i in 1:Tnum){
a[i] = function(t);
omega[i] = function(a[i], alpha[i-1], omega[i-1], …);
nu[i] = function(a[i], alpha[i-1], nu[i-1], …);
}

}

model {

…

target += beta_lpdf(theta | beta1, beta2);
for(i in 1:Tnum) {
target += log_mix(theta[i], inv_gamma_lpdf(alpha[i] | a[i], b[i]), 1); //(**)
}

…

}

I think the prior distribution (**) represents what I want to implement, but there is still the lack of information that alpha[i] = 1 with the probability 1 - theta[i].
Do you mean that I may put 1 in alpha[i] in the transformed parameters block, and remove alpha[i] in parameters block ?

bgoodri · June 18, 2018, 3:07pm

No. I mean your model block should be something like

model {
  for (i in 1:Tnum) {
    real target_1;  // log-likelihood assuming alpha[i] != 1
    real target_2;  // log-likelihood assuming alpha[i] == 1
    // define target_1 and target_2
    target += log_mix(theta[i], target_1, target_2);
  }
  // priors
}

kimuchan · June 19, 2018, 5:16pm

Thank you very much. I’m convinced by your explanation, and tried classifying the likelihood, but found the number of the classification is 2^{Tnum-1} … This is because the classification of the likelihood on time “t” depends on the past observational data (I use time-series data). In my case, it may not be straightforward to write down the marginalization. There are some cases where mathematical formula are valid, but now I don’t find any useful methods on my model. Anyway, I’m studying on Stan, and looking for some other methods.

Bob_Carpenter · July 20, 2018, 5:26pm

The usual way to do that is dynamic programming. There’s an example for HMMs in the mixtures or time series chapter of the manual (can’t remember which as it’s both) and also for the CJS population models in the marginalizing discrete parameters chapter.

Topic		Replies	Views
Mixed discrete-continuous parametre Modeling	4	720	December 27, 2018
Mixture model in stan Modeling	5	1316	September 3, 2018
Summing out discrete paramters in a mixture. How can I derive that it works for individual observations? Modeling specification	9	770	June 2, 2020
Specifying a custom prior distribution Modeling	5	675	July 13, 2023
Fitting a discrete integer mixture model within the Stan paradigm Modeling	10	2497	May 11, 2017

Mixture formula of discrete and continuous distributions

Related topics