Divergences with truncated normal

prophet_normal_truncated.stan (3.9 KB)

Hello, I am working on a project to estimate the effect of promotions. As a starting point, I use the Stan model written for the Prophet Python library. In the first experiments, some promotions had a negative estimated effect on sales, which is impossible a priori. The trend plots showed that the trend component had absorbed part of the promotion effect.

So I modified the Stan model to put a truncated normal prior on a subset of the beta coefficients:

data {
  int<lower=1> K;           // Number of regressors
  int n_constr;             // Number of regressors with constrained priors
  int constr_vec[n_constr]; // Indexes of the a priori constrained features in X
  int norm_vec[K-n_constr]; // Indexes of the a priori unconstrained features in X
  real L;                   // Lower bound constraint
  real U;                   // Upper bound constraint
  // ...
}
// ...
model {
  // ...
  // Priors for the unconstrained coefficients
  for (i in norm_vec) {
    beta[i] ~ normal(0, sigmas[i]);
  }
  // Truncated priors for the constrained coefficients
  for (j in constr_vec) {
    beta[j] ~ normal(0, sigmas[j]) T[L, U];
  }
  // ...
}

(The full model code is attached to this message.)
With 5000 samples, I got the expected results, consistent with reality.

But Stan warns me about divergent transitions even with adapt_delta=0.999. Is this critical for such truncated normal models? If so, could you please suggest how to modify the model?

The second warning concerns the R-hat statistic, which for some of the beta coefficients is slightly above the usual rule-of-thumb threshold of 1.1 (1.12 to 1.15). Is this critical too? (I suppose it is a result of truncating the normal distribution.)

I’d appreciate any advice. Thanks.



Welcome to Stan, Aleksandr!

If you use:
beta[j] ~ normal(0, sigmas[j]) T[L, U];
you also have to declare matching constraints on the parameter:
vector<lower=L, upper=U>[K] beta; // Regressor coefficients


Thank you.
Could you please advise how to declare a vector in which only some of the elements are constrained? I’m a little confused.

parameters {
  vector<lower=L, upper=U>[Nconstrained] beta_constrained; // Constrained regressor coefficients
  vector[Nunconstrained] beta_unconstrained;               // Unconstrained regressor coefficients
}
transformed parameters {
  // Reassemble the full coefficient vector in the original column order
  vector[Nconstrained+Nunconstrained] beta;
  beta[norm_vec] = beta_unconstrained;
  beta[constr_vec] = beta_constrained;
}
model {
  beta_unconstrained ~ normal(0, sigmas[norm_vec]);
  beta_constrained ~ normal(0, sigmas[constr_vec]);
}

You don’t need to truncate the prior, but you do need to constrain the parameter.
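
To spell out why, here is a sketch that assumes the scales sigmas[j] and the bounds L and U are all fixed data rather than parameters (an assumption; the data block excerpt above does not show how sigmas is declared). The truncated prior on a constrained coefficient is

$$
p(\beta_j \mid \sigma_j, L, U) = \frac{\mathcal{N}(\beta_j \mid 0, \sigma_j)}{\Phi(U/\sigma_j) - \Phi(L/\sigma_j)}, \qquad L \le \beta_j \le U,
$$

where $\Phi$ is the standard normal CDF. The denominator does not depend on $\beta_j$, so on $[L, U]$

$$
\log p(\beta_j \mid \sigma_j, L, U) = \log \mathcal{N}(\beta_j \mid 0, \sigma_j) + \text{const}.
$$

Under that assumption, T[L, U] only adds a constant to the log density, and a constant does not change the posterior. If sigmas were parameters, the denominator would vary with them and the truncation term would matter.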

Please also note that 0 might not lie in the interval [L, U]. In that case, you might consider centering the prior at L:

beta_constrained ~ normal(L, sigma[constr_vec]);


I see, thank you! I will modify the code and report the results.


Thank you for the advice!

I tested the modified model. Without truncation of the prior, the trace plots show small oscillations near zero interrupted by several large jumps. As a result, the Gelman-Rubin statistic indicates that the Markov chains do not converge. Perhaps, while a chain warms up and approximates the joint distribution, many of its proposed values are rejected because of the constraints (> 0 in my case).

So I also added truncation to the normal priors:

transformed parameters {
  vector[K] beta;
  vector[n_constr] sigmas_pos;
  beta[norm_vec] = beta_unconstrained;
  beta[constr_vec] = beta_constrained;
  sigmas_pos = sigmas[constr_vec]; // Prior scales of the constrained coefficients
}

model {
  k ~ normal(0, 5);
  m ~ normal(0, 5);
  delta ~ double_exponential(0, tau);
  sigma_obs ~ normal(0, 0.5);

  beta_unconstrained ~ normal(0, sigmas[norm_vec]);
  for (i in 1:n_constr) {
    beta_constrained[i] ~ normal(0, sigmas_pos[i]) T[L, U];
  }
  // ...
}

This may not be the best option, but it works. The Markov chains converge, the Gelman-Rubin statistic is between 1 and 1.01, all constraints are respected, and there are no divergences.
I see the same idea in the documentation.

If my reasoning is not quite correct, please tell me.
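
For reference, the pieces discussed in this thread can be put together into one self-contained program. This is only a minimal sketch, not the attached Prophet model: N, X, y and the plain linear likelihood are hypothetical stand-ins for Prophet's trend and seasonality terms, and sigmas is assumed to be passed as data. It uses the array[...] declaration syntax of recent Stan releases; older releases write int constr_vec[n_constr] instead.

```stan
data {
  int<lower=1> N;                                      // Number of observations (hypothetical)
  int<lower=1> K;                                      // Number of regressors
  int<lower=0, upper=K> n_constr;                      // Number of bounded coefficients
  array[n_constr] int<lower=1, upper=K> constr_vec;    // Columns of X with bounded coefficients
  array[K - n_constr] int<lower=1, upper=K> norm_vec;  // Columns of X with unbounded coefficients
  real L;                                              // Lower bound constraint
  real<lower=L> U;                                     // Upper bound constraint
  matrix[N, K] X;                                      // Regressors (hypothetical stand-in)
  vector[N] y;                                         // Response (hypothetical stand-in)
  vector<lower=0>[K] sigmas;                           // Prior scales, assumed to be data
}
parameters {
  vector<lower=L, upper=U>[n_constr] beta_constrained; // Bounds declared on the parameter itself
  vector[K - n_constr] beta_unconstrained;
  real<lower=0> sigma_obs;
}
transformed parameters {
  // Reassemble the coefficients in the column order of X
  vector[K] beta;
  beta[norm_vec] = beta_unconstrained;
  beta[constr_vec] = beta_constrained;
}
model {
  sigma_obs ~ normal(0, 0.5);
  beta_unconstrained ~ normal(0, sigmas[norm_vec]);
  // Scalar truncation in a loop, matching the declared bounds
  for (i in 1:n_constr) {
    beta_constrained[i] ~ normal(0, sigmas[constr_vec[i]]) T[L, U];
  }
  // Simplified likelihood; the real model adds trend and seasonality
  y ~ normal(X * beta, sigma_obs);
}
```

The key design point is that the bounds live in the parameters block, so the sampler never proposes values outside [L, U]; the T[L, U] statement then only needs to agree with those bounds.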
