How to realize the R code in Stan

I wanna realize the following R code in Stan, but it doesn’t work.

 for(k in 1:M)
#     mu[1,k] <- runif(1,2*theta1,1.2*theta1)
#     mu[2,k] <- runif(1,0.8*theta1,0)
#     mu[3,k] <- runif(1,0,0.8*theta2)
#     mu[4,k] <- runif(1,1.2*theta2,2*theta2)
     mu[1,k] <- rnorm(1,2*theta1,1)
     mu[2:3,k] <- rnorm(2,0,1)
     mu[4,k] <- rnorm(1,2*theta2,1)
    sig[,k] <- runif(4,mn.sig,mx.sig)
    alpha[,k] <- make.alpha(mu[,k],sig[,k],tau1,tau2,theta1,theta2)
    ind[k] <- ind[k]+1

The motivation of this code is to specify a solution of a random matrix s.t the root \alpha is able to be weight. But Stan does not allow rng function to occur in the model block.

Can any one help me?

you are correct, Stan needs the posterior density defined by the parameters, transformed parameters and model block to be completely deterministic, without any randomness. On the other hand I cannot think of many sensible use cases where randomness in the posterior density would be desirable…

So if you are just trying to “invert” the R code, i.e. build a model that takes the resulting data and can make inference about some unobserved parameters, than you would usually “replace RNG with sampling statements”. But maybe you need something else - could you be more specific about the broader context of the problem you are trying to solve? How do your data look like? What inferences do you hope to make based on the data?

Best of luck with your model!

Thank your suggestion, Martin! Let me add more background introduction.

The question arised from our recently completed research work implemented by R. When I tried to practise in Stan, it becomes problematic. For the heteroscedastic regression model with unspecified error distribution F_\epsilon that works on aberrant data,

y = \beta^T x + \exp(\gamma^T x_i)\epsilon,

we aim to trim both sides. Let 0 < \tau_1<1/2<\tau_2<1 and F_{\epsilon}^{-1}(\tau) be the lower \tau th quantile of F_{\epsilon}. Then the trimmed mean is written as:

T_{\tau_1, \tau_2}(F_{\epsilon}) = \frac{1}{\tau_2-\tau_1}\int_{F^{-1}(\tau_1)}^{F^{-1}(\tau_2)} sdF_{\epsilon}(s).

We propose the DPM as the prior for f_\epsilon s.t

f_\epsilon (s) =\sum_{k=1}^{\infty} p_k f_{\epsilon_k}(s),

where p_k is drawn from a stick-breaking process and the self-defined kernel is of the following form:

f_{\epsilon_k}(s) = \sum\limits_{l=1}^4 \alpha_{lk}\phi(\frac{s-\mu_{lk}}{\sigma_{lk}}).

Thus it is subject to the constraints and \Phi(\cdot) is the cdf of N(0,1).
Then \alpha is fully determined by (\tau_1, \tau_2) and (\mu_1, \dots, \mu_4, \sigma_1, \dots, \sigma_4).
We also have the prior setting like \mu_{jk} \sim N(0, \sigma_0^2), \sigma_{jk}^2 \sim \text{Inv_Gamma}(\alpha, \beta), j = 1,2,3, 4, k \ge 1.

Notice that \alpha_k should be a weight s.t \sum_{l=1}^{4} \alpha_{lk} =1 and 0 < \alpha_{\cdot k} <1 for any k. That comes the problem that an \alpha determined by the other parameters may not be a legal weight at all! In this case, a resampling scheme is needed to resample the other parameters until the determined \alpha is a legal weight and in R, we use the code we showed in previous question to overcome this problem and that’s why we have to make the posterior not completely deterministic with randomness. But random number generator is not allowed in Stan model block and the resampling scheme is illegal in Stan, which makes us fail to practice that methodology in Stan.

May you give any suggestions to us?

1 Like

That seems to be the core of the problem. Stan is indeed incompatible with such a resampling scheme. Usually what people do in such cases in Stan is to find a way to parametrize the model such that invalid combinations are never considered. So if \alpha were parameters directly, you would model them as a simplex.

If \alpha are supposed to be a function of other parameters, you IMHO have exactly the same problem as in categorical/multinomial/dirichlet regression (where a simplex is also parametrized). The most commonly used approach is to create a new set of parameters \bar\alpha where \alpha = \mathrm{softmax}(\bar\alpha). However, there is an identifiability issue here - there are infinitely many \bar\alpha that give the same \alpha after the softmax approach. The easiest solution is to fix \bar\alpha_1 = 0 and put predictors only on the other elements. The only disadvantage is that now the first element is “special”. Alternatively, one can constrain the sum of all individual predictors on \bar\alpha to 0 (either via a soft or hard constraint, see the discussion at Test: Soft vs Hard sum-to-zero constrain + choosing the right prior for soft constrain )

Would that make sense in your model?