Estimating expectations of marginalised discrete variable

Hi,

I’m following Stan’s manual (Latent Discrete Parameters) to estimate a discrete parameter. All chains converge and the method seems to recover the ground truth. At least visually, but I don’t know how to summarise it with a probability statement. Let’s assume that the parameter of interest is \tau, which can take values between 360 and 4920. I have evaluated the likelihood at discrete points of this interval in increments of 120.

This is the Stan code that generates the probabilities and draws. According to Stan’s manual, sim_cp should only be used for visual inspection, but for precise calculations (expectations), I should use prob_cp. I would like a statement that says “There is a 95% chance that the change point (cp) is between a & b”. Any pointers as to how to do that will be highly appreciated. I got lost with the notation in the section “estimating expectations”.

generated quantities {
  array[n_i_y] int sim_cp;
  array[n_i_y] vector[n_cp] prob_cp;

  for(i in 1:n_i_y) {
    sim_cp[i] = cp[categorical_logit_rng(lp[i, ])];
    prob_cp[i] = softmax(lp[i]);
  }
}

In R, for a given individual i, I have a data frame with three columns (sample.csv (1.3 KB)), where j is the index of the change point, prob_cp the probability, and .iter the iteration.

Thanks.

You also want to add randomness here if you are doing posterior predictive inference.

In general, suppose you have an unknown theta in Stan. You can define an indicator as follows that assigns a value 1 if \theta \in (a, b) and value 0 otherwise (Stan’s comparison functions return either 0 for false or 1 for true).

generated quantities {
  int<lower=0, upper=1> Pr_btw = a < theta && theta < b;
}

The posterior mean of this quantity is the probability:

\Pr[\theta \in (a, b) \mid y] = \int_\Theta \textrm{I}_{(a, b)}\!(\theta) \cdot p(\theta | y) \, \textrm{d} \theta \approx \frac{1}{M} \sum_{m=1}^M \textrm{I}_{(a, b)}(\theta^{(m)}),

where \theta^{(m)} \sim p(\theta \mid y) are the posterior draws and \textrm{I}_A(u) = 1 if u \in A and 0 otherwise (i.e., it’s an indicator function for the interval).

1 Like