Hi everyone,
I am fairly new to coding in STAN so appologies in advance if this is a simple question.
TLDR: How can I constrain specific probabilities in a simplex? E.g. set a min and max.
For some context regarding the data, I am analysing timeseries from different sources, that at each timepoint can have show either an inactive state (0) or 1 of 11 active states (1-11). Key aspects: No states can be active at the same time, the inactive state is unbalancedly present (like 50% of the timepoints will be in the inactive state).
I am developing a Hidden Markov Model to study these timeseries, with the assumption that these states come about with different underlying rules and that those rules might change depending on characteristics of the sources, i.e. covariate metrics.
Intially I coded a relatively simple HMM initially but the posteriors were too broad and I did not have convergency, so I started implementing some assumptions that are well supported by the underlying theory of my field.
One assumption is that the hidden states should have biased self-transitions. The way I did this was by implementing priors in the model:
parameters {
simplex[M] meta_base_trans[M]
}
model{
for(m in 1:M){
meta_base_trans[m] ~ dirichlet(append_row(append_row(rep_vector(1.0, m - 1), rep_vector(8.0, 1)), rep_vector(1.0, M - m)));
}
}
(which to guarantee it works I also implemented as)
real selfBias = 2.0;
if (m == 1){ meta_base_trans[m] ~ dirichlet(append_row(rep_vector(selfBias, 1), rep_vector(1.0, M - 1))); }
else if (m == M){ meta_base_trans[m] ~ dirichlet(append_row(rep_vector(1.0, M - 1), rep_vector(selfBias, 1))); }
else { meta_base_trans[m] ~ dirichlet(append_row(append_row(rep_vector(1.0, m-1), rep_vector(selfBias, 1)), rep_vector(1.0, M-m)));}
However, the problem is that very quickly all my models started having the probability of self-transition for the hidden states to be ~1. This meant that effectively the probability space was not being explored and for example I would get posterior for emissions that would only have sample for one metastate. So I came up with two ideas
(1) Not allow the probabilities of transitions between metastates to be under 0.05 nor above 0.95, which I implemented as:
transformed parameters {
simplex[M] meta_trans_perPat[P, M]; // Transition probabilities for metastates for each patient
vector[M] meta_trans_perPat_constrained[P, M]; // Transition probabilities for metastates for each patient
vector[M] meta_trans_perPat_constrained_logits[P, M]; // Logs of the transition probabilities for metastates for each patient
for (p in 1:P) {
for (m_from in 1:M) {
meta_trans_perPat[p, m_from] = softmax(log(meta_base_trans[m_from]) + to_vector(X[p] * beta_trans_meta));
// constrain
meta_trans_perPat_constrained[p, m_from] = (meta_trans_perPat[p, m_from] * 0.9) + 0.05;
// normalise to ensure row sum to 1
meta_trans_perPat_constrained[p, m_from] = meta_trans_perPat_constrained[p, m_from] / sum(meta_trans_perPat_constrained[p, m_from]);
meta_trans_perPat_constrained_logits[p, m_from] = log(meta_trans_perPat_constrained[p, m_from]);
}
}
}
(2) Concentrate emissions of the inactive state to an inactive metastate. This is something that is supported conceptually, and as you can see below I give minimal leeway for an inactive state to occur sporadically in an active metastate, or for an active state to occur sporadically during the inactive metastate. I implemented this as:
transformed parameters{
simplex[O] meta_emiss_contrained[M];
vector[O] meta_emiss_logits[M]; // Logs of emission probabilities for each metastate
for (m in 1:M) {
if(m == 1) {
// Metastate only emits inactive states
meta_emiss_contrained[m] = rep_vector(0.01/(O-1), O);
meta_emiss_contrained[m][1] = 0.99;
} else {
meta_emiss_contrained[m] = meta_base_emiss[m];
}
meta_emiss_logits[m] = log(meta_emiss_contrained[m]);
}
}
model {
for (m in 1:M) {
if(m==1) { meta_base_emiss[m] ~ dirichlet(append_row(rep_vector(75.0, 1), rep_vector(0.1, O - 1))); }
else { meta_base_emiss[m] ~ dirichlet(append_row(rep_vector(0.1, 1), rep_vector(100/(O-1), O - 1)));}
}
}
Now my question is basically if this is well implemented or not? Since I am new to STAN logic I am not sure this is the best way to guarentee constraints in individual values of probabilities. I know that in the parameters I can declare for example: int<lower=1, upper=5> X; but how would you do that for a single element of a simplex in a vector of simplexes without doing what I did?
Thank you very much in advance for the help!