Mixed discrete-continuous parametre

I’m trying to modify the Stan code used by Facebook’s Prophet. In particular, I want to replace the parametre delta with a new parametre alpha with:

\begin{aligned} \alpha_1 &\sim \mathcal L (0, \tau) \\ X_{i>1} &\sim Ber (S / T) \\ Y_{i>1} &\sim \mathcal L (0, \tau) \\ \alpha_{i>1} &= X_i * Y_i + (1 - X_i) * \alpha_{i -1} \end{aligned}

That is, the first value of alpha is distributed according to a Laplace (double exponential) distribution and all further values of alpha have a mixed distribution where there’s a S/T probability they came from another Laplace distribution and a (T - S) / T probability that they are the same value as the previous alpha.

I looked around for something like this and couldn’t find it, but I found something about modelling a zero-inflated Poisson. Based on that, I wrote the following in my model:

model {
    alpha [1] ~ double_exponential (0, tau);
    theta ~ beta (S, T - S);
    for (i in 2:T) {
        if (alpha [i] == alpha [i - 1]) {
            target += log_sum_exp (bernoulli_lpmf (1 | theta) + double_exponential_lpdf  (alpha [i] | 0, tau), bernoulli_lpmf (0 | theta));
        } else {
            target += bernoulli_lpmf (1 | theta) + double_exponential_lpdf  (alpha [i] | 0, tau);

Is that correct? Will that work? If not, what do I do instead?

(I realised while writing this example that I could just model X_i and Y_i explicitly and then use alpha as a transformed parametre, but if there’s a better way to do it I’d still like to.)


Okay so I just found out that Stan doesn’t support integer parametres so I can’t model the X_i variable, apparently. How do I work around that?

You marginalise X_i out by summing over its possible values. The manual has a few examples in Chapters 13 and 15.

I’m still not sure how to implement that. We have:

\alpha_i | (X_i = 1, \alpha_{i-1}) \sim \mathcal L(0, \tau) \\ \alpha_i | (X_i = 0, \alpha_{i-1}) = \alpha_{i-1}

Which means that:

\begin{aligned} p(\alpha_i | X_i = 1, \alpha_{i-1}) &= \mathcal L(\alpha_i|0, \tau) \\ p(\alpha_i | X_i = 0, \alpha_{i-1}) &= \delta(\alpha_i - \alpha_{i-1})\\ p(\alpha_i | \alpha_{i-1}) &= \frac S T \mathcal L(\alpha_i|0, \tau) + \frac {T - S} {T} \delta(\alpha_i - \alpha_{i-1}) \\ p(\alpha_i) &= \frac S T \mathcal L(\alpha_i|0, \tau) + \frac {T - S} {T} \int \delta(\alpha_i - \alpha_{i-1}) p(\alpha_{i - 1}) \text d\alpha_{i - 1} \\ &= \frac S T \mathcal L(\alpha_i|0, \tau) + \frac {T - S} {T} p_{\alpha_{i - 1}}(\alpha_i) \end{aligned}

Where the third line was marginalising X_i out. But I’m really not sure how to implement that, at all.

(I also just checked chapters 13 and 15 and didn’t find the relevant parts.)

@Bob_Carpenter: would implementing the third line of @pedromvilar’s derivation above break differentiability? If not it seems to be a case of implementing it as a log_sum_exp or a log_mix if you’re feeling fancy.

So I’m not sure whether a bump counts as a “no-content reply” and is thus prohibited, but I still need help with this. I tested the code in the top post and it definitively does not work, so I would like to know what does.