Hello,
I would like to insert information on some of the proportion values.
For example
simplex[K] p;
y ~ normal(p[1] * a + p[2] * b + p[3] * c ... , sigma);
p ~ dirichlet();
p[1] + p[2] ~ normal(0.6, 0.0001);
What would be the Jacobian here?
The reason why I would like to give prior is that with the same code I should be able to to give arbitrary sum values to arbitrary groups of “p” (with input mapping array), and arbitrary K.
Thanks
Try this
parameters{
simplex[K] p;
}
transformed parameters{
real sum_p12;
sum_p12 = p[1] + p[2];
}
model {
y ~ normal(p[1] * a + p[2] * b + p[3] * c ... , sigma);
p ~ dirichlet();
sum_p ~ normal(0.6, 0.0001);
}
Mixing is going to be pretty poor though as you effectively have a redundant parameter.
Thanks Julian,
What would the redundant parameter?
I don’t understand what you’re trying to do here. It helps immensely if people tell us what’s data and what’s a parameter and what the shape of everything is (vector, matrix, etc.). We’re good guessers, but it’s a constant effort.
Why do you want two priors on p
here?
Also, you don’t really ever want to use normal(0.6, 0.0001)
in Stan as that constrains the value to a very small window around 0.6—better to take a raw variable with a unit scale and rescale it if you really need this prior.
Hello @Bob_Carpenter ,
yes I always try to keep at minimum for not overwhelm. Here a better part of the (bigger) original model.
basically sometime I have information about groups of pi, for an higher hierarchy. I known that the proportion of higher hierarchy sub cell types, meaning that the sum off immune cell types makes the 50% of the mixture.
Therefore I want to constrain their sum to be what I want.
Very important… I want to do it in a way that the same code can deal with any grouping so this grouping/constrain/prior cannot be hard coded. That’s why I though to put a harsh prior on their sum.
data{
int G; // Number of marker genes
int P; // Number of cell types
int S; // Number of mix samples
vector[G] m[S]; // Mix samples matrix
}
parameters{
simplex[P] pi[S]; // Matrix of cell type proportions
real<lower=0> sigma; // Standard deviation of predicted mixed expression
vector[P] expr[G]; // Predicted marker signature
}
model{
sigma ~ normal(0,0.5);
for(s in 1:S) alpha[group[s]] ~ gamma(1.1,0.1); // Prior to alpha
for(s in 1:S) pi[s] ~ dirichlet(alpha[group[s]]); // Prior to a
for(s in 1:S) for(g in 1:G) {
m[s,g] ~ student_t(2, log_sum_exp(expr[g] + log(pi[s])), sigma); // Calculating probability of inferred mixed gene expression
}
}
You don’t want to try to make things deterministic by imposing a very strong prior. It’s better to code the constraint up directly.
As I keep trying to tell people on the list, it’s hard for me to go back through chains of posts and put information together. I read a lot of these every day and they all run together after a while.
The manual explains how to impose summation constraints. If you want a parameter vector alpha
to sum to sum_alpha
, then you can do this:
parameters {
vector[K - 1] alpha_raw;
transformed parameters {
vector[K] alpha;
alpha[1:(K - 1)] = alpha_raw;
alpha[K] = alpha_sum - sum(alpha_raw);
And then you’re guaranted that sum(alpha) = alpha_sum
by construction.
Is there a reason to do that rather than using simplex? It seems that the alphas are constrained to be positive.
parameters{
simplex[K] alpha_raw;
}
transformed_parameters{
vector[K] alpha;
alpha = alpha_sum*alpha;
}
Also, is the block assignment new? I didn’t see it in the manual and I think I tried it at one point and got an error.
Thanks Bob and Aaron,
The @aaronjg solution seems quite elegant
The problem is that I cannot programmatically create groups of proportions of different length.
For example for the proportion vector
K=10;
alpha[K];
I cannot group alpha arbitrarily, for example
alpha = list(
simplex[2],
simplex[5],
simplex[3]
)
and then do
y ~ normal(alpha[1,1] * SUM_P_GROUP_1 * a + alpha[1,2] * SUM_P_GROUP_1 * b + alpha[2,1] * SUM_P_GROUP_2 * c ... , sigma);
As far as I know lists are not available in Stan.
In that case, I would look at the manual to see how the simplex transformation is implemented, and then implement it yourself. So you could have a new unconstrained vector of length (K-Groups) and then use the broken stick transformation to get the constrained scaled simplex.