Latent discrete parameter: can it be hierarchical?

I have been working on a model with two success probabilities and a change point. The model recovers well when I marginalize out the change point as a latent discrete parameter. However, in the final model I would like it if I don’t have to assume one population change point, but rather change points that can vary between subjects. I have read some examples of random change point models, but the one implementation I found in Stan does not model the change point as a discrete parameter. This brings me to my question: is it possible to have a hierarchical discrete latent parameter or can it only be hierarchical if the change point is modeled as a continuous parameter?

``````// data:
data{

int<lower=1> L; // number of participants
int<lower=1> N; // number of datapoints
int<lower=0,upper=1> Y[N]; // outcome (cancelled or not)
int<lower=1,upper=L> ll[N]; // cases going from int 1 to L
vector[N] X; // X-coordinates

}

// transformed data: (necessary for latent discrete parameter)
transformed data{
real boundary[7];
// possible boundary values are -0.75, -0.50, -0.25, 0, 0.25, 0.50, 0.75
for(i in 1:7){
boundary[i] = -0.75 + (0.25 * (i-1));
}
}

// parameters:
parameters{

// population-level theta's in log odds:
real theta_left_logit;
real theta_right_logit;

// participant-level theta's in log odds:
vector[L] theta_left_l;
vector[L] theta_right_l;

// population sd of theta by participant:
real<lower=0> sigma_l;
real<lower=0> sigma_r;
}

// transformed parameter:
// RECORD Log probability FOR EACH BOUNDARY LEVEL:
transformed parameters{

vector[7] lp; // lp = log probability
lp = rep_vector(-log(7), 7); // uniform prior on all boundary values

// for each possible boundary:
for(b in 1:7){
for(n in 1:N){
lp[b] = lp[b] + bernoulli_logit_lpmf(Y[n]|X[n]<boundary[b] ? theta_left_l[ll[n]] : theta_right_l[ll[n]]);
}
}
}

// model:
model{

// priors for population level: (hyperpriors)
theta_left_logit ~ normal(0, 2);
theta_right_logit ~ normal(0, 2);

// priors for population-level sigma
// (by lb they exclude values lower than zero):
sigma_l ~ normal(0.05, 0.025);
sigma_r ~ normal(0.05, 0.025);

// priors for participant level parameters:
// should be drawn with population parameter as the mean:
// and population sigma as sd:
theta_left_l ~ normal(theta_left_logit, sigma_l);
theta_right_l ~ normal(theta_right_logit, sigma_r);

target += log_sum_exp(lp);
}

// extra code that allows to infer the probability of the discrete parameter:
generated quantities{

vector[7] pState;
pState = exp(lp - log_sum_exp(lp));

}
``````
1 Like

Hi, sorry for not getting to you earlier.

I think defining some sort of hierarchy for a marginalized-out discrete parameter should be possible (although not sure it will work well in practice). A simple example would be to take a binomial or beta-binomial with an unknown success probability as the prior for the discrete parameter.

So in your code instaed of ` lp = rep_vector(-log(7), 7);` you would have something like:

``````for(b in 1:7) {
lp[b] = binomial_lpmf(b | theta);
}
``````

where `theta` is a new parameter representing the location (scaled to [0,1]) where you roughly expect the “average” changepoint.

To what extent is the `binomial` hyperprior sensible is not clear - you could add more flexibility by using a beta-binomial or maybe use a different distribution completely. Maybe running a prior predictive check (simulating from the distributions to see how big of a correlation in the change point locations you can get) could clarify stuff.

Hope that helps at least a bit!