How to give prior to ordered parameters, and also to simplex?

deltasata · November 22, 2019, 5:28am

HI all,

Suppose I require two arrays, A and B as my parameters. Now I know that A would be simplex (sum(A)=1) and most importantly B has to be ordered. But I could not find easily how to put priors on them, especially on the ordered B.

 parameters {
    simplex[N] A;
    ordered[N] B ;
    real <lower=0> C;
}

Please let me know how should I force that say B[1]>0.5; and 0.1<=A;

Thank you.

torkar · November 22, 2019, 6:00am

Hi, I would guess that if you have ordered categorical data a Dirichlet prior could be used:
https://cran.r-project.org/web/packages/brms/vignettes/brms_monotonic.html

deltasata · November 22, 2019, 6:11am

So there is no simple way of putting the lower/upper limit to sat the ordered parameter B?
In other words, how should implement that all elements of B has a uniform prior in the range 0.1<B_i<1.0 where I always want to secure that B_i<=B_{i+1} for all i?

nhuurre · November 22, 2019, 9:31am

A lower bound ordered can be enforced using positive_ordered.

parameters {
    simplex[N] A_r;
    positive_ordered[N] B_r ;
}
transformed parameters {
    simplex[N] A = 0.1 + (1-N*0.1)*A_r;
    ordered[N] B = 0.5 + B_r;
}

If you need both upper and lower bound at the same time, a cumulative sum of simplex works

parameters {
    simplex[N+1] B_r ;
}
transformed parameters {
    ordered[N] B = 0.1 + 0.9*cumulative_sum(B_r[:N]);
}

deltasata · November 24, 2019, 12:20pm

I got it. So there is no inbuilt way in STAN to put upper limit to orderedor positive_ordered parameters. Now focusing on the last code:

parameters {
    ordered[N] B = 0.1 + 0.9*cumulative_sum(B_r[:N]);
}

the right hand side is already ordered. Would it be anu different if I write:

transformed parameters {
    vector[N] B = 0.1 + 0.9*cumulative_sum(B_r[:N]);
}

deltasata · November 25, 2019, 5:51am

I wanted my ordered parameter dt to have an upper limit of 10 (the lower limit should be 0). The code is given below:

parameters {
    simplex[N] dt_r ;
}
transformed parameters {
    ordered[N-1] dt = 10*cumulative_sum(dt_r[:N-1]);
}

with N=4.
I am getting many warnings:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:


Exception: validate transformed params: dt is not a valid ordered vector. The element at 2 is 10, but should be greater than the previous element, 10  (in 'unknown file name' at line 24)
Exception: validate transformed params: dt is not a valid ordered vector. The element at 2 is 10, but should be greater than the previous element, 10  (in 'unknown file name' at line 24)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.Exception: validate transformed params: dt is not a valid ordered vector. The element at 3 is 10, but should be greater than the previous element, 10  (in 'unknown file name' at line 24)



If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

............................................................................

There are many many such warning lines. Are these serious, should I worry? Does simplex vector have all values positive?

Is the above code really doing what I wanted it to do? I could not check it myself because I have no idea how STAN works.

Jean_Billie · November 25, 2019, 10:20am

I have already implemented such a model with an ordered parameter in my package BayesianFROC in which the lower limit was only considered. However, in your case, there is, in addition, an upper limit.

Consider the model f(y|\theta) where y is a dataset and \theta is a vector of model parameter, namely \theta = (\theta_1, \theta_2, \theta_3) \in \mathbb{R}^3. (In what follows, we only consider the parameter length 3 for simplicity.)

We assume that the restriction a<\theta_1< \theta_2 <\theta_3<b where a and b are fixed numbers, and each pamaeter \theta_i has flat a prior.

To accomplish this restriction , we introduce new parameters as follows.

d\theta_2:= \theta_2 -\theta_1,

d\theta_3 := \theta_3 -\theta_2.

Then the desired answer is the following.

\theta_1 \sim \text{Uniform}(a,b),

d\theta_2 \sim \text{Uniform}(0, b-\theta_1),

d\theta_3 \sim \text{Uniform}(0, b-\theta_2),

where \text{Uniform}(a,b) denotes a flat prior whose support is the interval \{x \in \mathbb{R}; a \leq x \leq b\}.

To implement this, I have declared the parameter \theta = (\theta_1, d\theta_2,d\theta_3) in the parameter block in my Stan file. And as generated parameters, I declared \theta_2,\theta_3 in the generated parameters block in my Stan file. However, when I submitted my paper with this methods, all reviewers cannot understand this method. But, actually it runs in my package, so I believe this method goes through.

The upper bound is quite new for me, and I did not confirm this methods by running Stan. So if there is a problem, then please let me know.

torkar · November 25, 2019, 1:46pm

I would actually like @paul.buerkner to weigh in on this matter due to his monotonic paper. I asked a similar question (I believe!) in another thread in this forum. In short, depending on what data (e.g., Likert scale 1-5) and if we have nobody answering 4, or 5, should we use, i.e., Dirichlet(c(1,1,1,1)) or Dirichlet(c(1,1,1))?

Question 2
a) If I have ordered categorical predictors from 1-5 and 4 has not been used should I still use ordered categorical in five levels, i.e., factor(x, c("1", "2", "3", "4", "5"), ordered=TRUE) ? My question is due to the rls variable above which is really 1-8, but the data has recorded only levels 1 and 2.
b) If yes, does the same apply when 5 has not been used above in a) , i.e., should I still use factor(x, c("1", "2", "3", "4", "5"), ordered=TRUE)?

As an example, if I model this ordered categorical variable:

table(bdf$f6f_rls)
 1  2  3  4  5  6  7  8 
27  0  0  0  0  0  0  0

as mo(f6f_rls) in brms it bails when starting to sample:

  Exception: model159d46253acca_21c393ad993ab34f22af08399084b917_namespace::model159d46253acca_21c393ad993ab34f22af08399084b917: Jmo[i_0__] is 0, but must be greater than or equal to 1  (in 'model159d46253acca_21c393ad993ab34f22af08399084b917' at line 26)

Well, it’s not strange, it’s not monotonic in that sense, but it would be nice from a model specification point of view to be able to write mo() indicating that it should be a 1,...,x ordered categorical var.

deltasata · November 26, 2019, 2:44am

Okay this looks interesting. But why can’t I declare like 3 variables \theta_1, \theta_2 and \theta_3 in the parameter part and give the priors in the model part like:
\theta_1 \sim \text{Uniform}(a,b),
\theta_2 \sim \text{Uniform}(\theta_1,b),
\theta_3 \sim \text{Uniform}(\theta_2,b)

I think the above is same as what you have suggested.

deltasata · November 26, 2019, 2:53am

Please see my issue described above.

LucC · November 28, 2019, 8:55am

How about modelling a positive ordered vector of length K-1, append value zero to get length K, transform the whole thing to the unit scale with the softmax function, and then multiply by ten? However, you would have to manually play around to get a uniform prior (or set the uniform prior on the transformed result and correct the posterior for the non-linear transformation).

[Edit: softmax typo corrected]

Jean_Billie · November 28, 2019, 9:32am

As you say, I think the difference variables such as t[i+1] - t[i] are redundant if consider it only in theory without programming.
However, I think such redundant variable dt is required in order to let PC understand the model.
To tell the truth, I do not try Stan programming without such redundant parameters, so … I think it is possible without it, but … now, I do not try.

Consider the following model.

X_1 \sim \text{Normal}(t_1,1),

X_2 \sim \text{Normal}(t_2,1),

X_3 \sim \text{Normal}(t_3,1),

with the condition 0 < t_! <t_2<t_3 <1 and each t_i has the flat prior.

Note that \text{Normal}(t_a,1), a=1,2,3 denotes the Gaussian distribution of mean t_a and variance 1.

In the following code, I implemented it.
The last part of the code, I have checked that the order constraints are satisfied for all MCMC iterations.

   m <- rstan::stan_model(model_code = '
                                data{real x[3];
                                }
                                parameters {
                                            real<lower=0,upper=1>  t1;
                                            real<lower=0>  dt[2];
                                                        }
                              transformed parameters {
                                            real   t[3];
                                          t[1]=t1;
                                          t[2]=t1+dt[1];
                                          t[3]=t[2]+dt[2];


                              }

                                model {
                                x[1] ~ normal(t[1],1);
                                x[2] ~ normal(t[2],1);
                                x[3] ~ normal(t[3],1);
                                  t1 ~ uniform(0,1);
                                 dt[1] ~ uniform(0,1-t1);
                                 dt[2] ~ uniform(0,1-t1-dt[1]);


                             }')
      f <- rstan::sampling(m, data=list(x=c(0.1,0.2,0.3)), iter = 11100,chains=1)




  extract(f)$t[,1] > 0
  extract(f)$t[,2] > extract(f)$t[,1]
  extract(f)$t[,3] > extract(f)$t[,2]
                1  >  extract(f)$t[,3]

I doubt the ordered prior in this manner is theoretically correct. but I think that it is practically enough.

Topic		Replies	Views
Ordered Simplex Modeling	4	4243	September 10, 2017
Order the simplex Modeling	2	1313	September 17, 2018
Combining inequality, order, and no constraints on a Simplex Modeling specification	1	478	November 14, 2018
Ordered and constrained parameters Modeling	3	639	April 20, 2021
How can I enforce ordering on normally distributed parameters Modeling	8	529	March 29, 2021

How to give prior to ordered parameters, and also to simplex?

Related topics