 # How to give prior to ordered parameters, and also to simplex?

HI all,

Suppose I require two arrays, A and B as my parameters. Now I know that A would be simplex (sum(A)=1) and most importantly B has to be ordered. But I could not find easily how to put priors on them, especially on the ordered B.

 parameters {
simplex[N] A;
ordered[N] B ;
real <lower=0> C;
}


Please let me know how should I force that say B>0.5; and 0.1<=A;

Thank you.

Hi, I would guess that if you have ordered categorical data a Dirichlet prior could be used:
https://cran.r-project.org/web/packages/brms/vignettes/brms_monotonic.html

So there is no simple way of putting the lower/upper limit to sat the ordered parameter B?
In other words, how should implement that all elements of B has a uniform prior in the range 0.1<B_i<1.0 where I always want to secure that B_i<=B_{i+1} for all i?

A lower bound ordered can be enforced using positive_ordered.

parameters {
simplex[N] A_r;
positive_ordered[N] B_r ;
}
transformed parameters {
simplex[N] A = 0.1 + (1-N*0.1)*A_r;
ordered[N] B = 0.5 + B_r;
}


If you need both upper and lower bound at the same time, a cumulative sum of simplex works

parameters {
simplex[N+1] B_r ;
}
transformed parameters {
ordered[N] B = 0.1 + 0.9*cumulative_sum(B_r[:N]);
}

5 Likes

I got it. So there is no inbuilt way in STAN to put upper limit to orderedor positive_ordered parameters. Now focusing on the last code:

parameters {
ordered[N] B = 0.1 + 0.9*cumulative_sum(B_r[:N]);
}


the right hand side is already ordered. Would it be anu different if I write:

transformed parameters {
vector[N] B = 0.1 + 0.9*cumulative_sum(B_r[:N]);
}


I wanted my ordered parameter dt to have an upper limit of 10 (the lower limit should be 0). The code is given below:

parameters {
simplex[N] dt_r ;
}
transformed parameters {
ordered[N-1] dt = 10*cumulative_sum(dt_r[:N-1]);
}


with N=4.
I am getting many warnings:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:

Exception: validate transformed params: dt is not a valid ordered vector. The element at 2 is 10, but should be greater than the previous element, 10  (in 'unknown file name' at line 24)
Exception: validate transformed params: dt is not a valid ordered vector. The element at 2 is 10, but should be greater than the previous element, 10  (in 'unknown file name' at line 24)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.Exception: validate transformed params: dt is not a valid ordered vector. The element at 3 is 10, but should be greater than the previous element, 10  (in 'unknown file name' at line 24)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

............................................................................


There are many many such warning lines. Are these serious, should I worry? Does simplex vector have all values positive?

Is the above code really doing what I wanted it to do? I could not check it myself because I have no idea how STAN works.

I have already implemented such a model with an ordered parameter in my package BayesianFROC in which the lower limit was only considered. However, in your case, there is, in addition, an upper limit.

Consider the model f(y|\theta) where y is a dataset and \theta is a vector of model parameter, namely \theta = (\theta_1, \theta_2, \theta_3) \in \mathbb{R}^3. (In what follows, we only consider the parameter length 3 for simplicity.)

We assume that the restriction a<\theta_1< \theta_2 <\theta_3<b where a and b are fixed numbers, and each pamaeter \theta_i has flat a prior.

To accomplish this restriction , we introduce new parameters as follows.

d\theta_2:= \theta_2 -\theta_1,

d\theta_3 := \theta_3 -\theta_2.

Then the desired answer is the following.

\theta_1 \sim \text{Uniform}(a,b),

d\theta_2 \sim \text{Uniform}(0, b-\theta_1),

d\theta_3 \sim \text{Uniform}(0, b-\theta_2),

where \text{Uniform}(a,b) denotes a flat prior whose support is the interval \{x \in \mathbb{R}; a \leq x \leq b\}.

To implement this, I have declared the parameter \theta = (\theta_1, d\theta_2,d\theta_3) in the parameter block in my Stan file. And as generated parameters, I declared \theta_2,\theta_3 in the generated parameters block in my Stan file. However, when I submitted my paper with this methods, all reviewers cannot understand this method. But, actually it runs in my package, so I believe this method goes through.

The upper bound is quite new for me, and I did not confirm this methods by running Stan. So if there is a problem, then please let me know.

1 Like

I would actually like @paul.buerkner to weigh in on this matter due to his monotonic paper. I asked a similar question (I believe!) in another thread in this forum. In short, depending on what data (e.g., Likert scale 1-5) and if we have nobody answering 4, or 5, should we use, i.e., Dirichlet(c(1,1,1,1)) or Dirichlet(c(1,1,1))?

Question 2
a) If I have ordered categorical predictors from 1-5 and 4 has not been used should I still use ordered categorical in five levels, i.e., factor(x, c("1", "2", "3", "4", "5"), ordered=TRUE) ? My question is due to the rls variable above which is really 1-8, but the data has recorded only levels 1 and 2.
b) If yes, does the same apply when 5 has not been used above in a) , i.e., should I still use factor(x, c("1", "2", "3", "4", "5"), ordered=TRUE)?

As an example, if I model this ordered categorical variable:

table(bdf$f6f_rls) 1 2 3 4 5 6 7 8 27 0 0 0 0 0 0 0  as mo(f6f_rls) in brms it bails when starting to sample:  Exception: model159d46253acca_21c393ad993ab34f22af08399084b917_namespace::model159d46253acca_21c393ad993ab34f22af08399084b917: Jmo[i_0__] is 0, but must be greater than or equal to 1 (in 'model159d46253acca_21c393ad993ab34f22af08399084b917' at line 26)  Well, it’s not strange, it’s not monotonic in that sense, but it would be nice from a model specification point of view to be able to write mo() indicating that it should be a 1,...,x ordered categorical var. Okay this looks interesting. But why can’t I declare like 3 variables \theta_1, \theta_2 and \theta_3 in the parameter part and give the priors in the model part like: \theta_1 \sim \text{Uniform}(a,b), \theta_2 \sim \text{Uniform}(\theta_1,b), \theta_3 \sim \text{Uniform}(\theta_2,b) I think the above is same as what you have suggested. Please see my issue described above. How about modelling a positive ordered vector of length K-1, append value zero to get length K, transform the whole thing to the unit scale with the softmax function, and then multiply by ten? However, you would have to manually play around to get a uniform prior (or set the uniform prior on the transformed result and correct the posterior for the non-linear transformation). [Edit: softmax typo corrected] As you say, I think the difference variables such as t[i+1] - t[i] are redundant if consider it only in theory without programming. However, I think such redundant variable dt is required in order to let PC understand the model. To tell the truth, I do not try Stan programming without such redundant parameters, so … I think it is possible without it, but … now, I do not try. Consider the following model. X_1 \sim \text{Normal}(t_1,1), X_2 \sim \text{Normal}(t_2,1), X_3 \sim \text{Normal}(t_3,1), with the condition 0 < t_! <t_2<t_3 <1 and each t_i has the flat prior. Note that \text{Normal}(t_a,1), a=1,2,3 denotes the Gaussian distribution of mean t_a and variance 1. In the following code, I implemented it. The last part of the code, I have checked that the order constraints are satisfied for all MCMC iterations.  m <- rstan::stan_model(model_code = ' data{real x; } parameters { real<lower=0,upper=1> t1; real<lower=0> dt; } transformed parameters { real t; t=t1; t=t1+dt; t=t+dt; } model { x ~ normal(t,1); x ~ normal(t,1); x ~ normal(t,1); t1 ~ uniform(0,1); dt ~ uniform(0,1-t1); dt ~ uniform(0,1-t1-dt); }') f <- rstan::sampling(m, data=list(x=c(0.1,0.2,0.3)), iter = 11100,chains=1) extract(f)$t[,1] > 0
extract(f)$t[,2] > extract(f)$t[,1]
extract(f)$t[,3] > extract(f)$t[,2]
1  >  extract(f)\$t[,3]


I doubt the ordered prior in this manner is theoretically correct. but I think that it is practically enough.