Dear Stan Community,
I am very new to stan and trying to fit a simple normal linear regression with linear inequality constraints on the parameters.
y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon, ~~ \epsilon \sim N(0,\sigma^2)
The inequality constraints on the parameter are to be 0>\beta_0, 0<\beta_1<1, 0<\beta_2<1. I first fit the model with diffuse reference prior for the parameters: (\beta, \sigma) \propto \sigma^{1}.
data {
int<lower=0> N; // Number of observations
vector[N] X1;
vector[N] X2;
vector[N] y; // response
}
parameters {
real<lower = 0> sigma;
real<lower = 0> beta0;
real<lower = 0, upper = 1> beta1;
real<lower = 0, upper = 1> beta2;
}
model {
real mu[N];
// Likelihood
for (i in 1:N) {
mu[i] = beta0 + beta1*X1[i] + beta2*X2[i];
}
y ~ normal(mu, sigma);
}

In terms of constraints, Is this the right way to specify this model in Stan?

In general, define a set for inequality constraints, e.g. Q = \{a \le D\beta \le w \}, where D is a nonsingular matrix with real numbers and use this set as an indicator function of the constraints for \beta. In vector form,
data {
int<lower=0> N; // Number of observations
int<lower=0> K; // number of explanatory variables.
matrix[N,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] y; // Vector of dependent variables
vector[K] a; // Vector of lower inequality constraints
vector[K] w; // Vector of upper inequality constraints
}
parameters {
vector[K] b;
real<lower=0> sigma;
}
model {
y ~ normal(X*b, sigma);
}
How can I put the indicator function in the above code so that it is possible to employ prior distribution in the model?
Thank you very much.
If your constraint is:
a \le D\beta \le w
Then if you multiply all by the inverse of D:
D^{1}a \le D^{1}D\beta \le D^{1}w
This simplifies to:
D^{1}a \le \beta \le D^{1}w
Stan has overloaded the division operator /
for matrices/vectors to perform this multiplication by an inverse in more numerically stable and efficient way. So you would express your model in Stan as:
data {
int<lower=0> N; // Number of observations
int<lower=0> K; // number of explanatory variables.
matrix[N,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] y; // Vector of dependent variables
vector[K] a; // Vector of lower inequality constraints
vector[K] w; // Vector of upper inequality constraints
}
transformed data {
vector[K] D_div_a = D / a;
vector[K] D_div_w = D / w;
}
parameters {
vector<lower=D_div_a, upper=D_div_w>[K] b;
real<lower=0> sigma;
}
model {
// More efficient likelihood for linear regression
y ~ normal_id_glm(X, 0, b, sigma);
}
3 Likes
Thank you for the reply.
In the transformed data block, isnâ€™t the matrix operation should be D_div_a = D \ a , not D/a for D^{1}a?
Thanks.
2 Likes
Oh yes, good catch! It should be \
for left division
1 Like
Can you clarify what you mean by â€śtransforming the quantities to the original scaleâ€ť? Which quantities are you wanting to transform and what scale do you want them to be on?
1 Like
Nvm. My question was not clear. Thanks!
1 Like
Hi,
I am trying to add timevarying coefficients to the model.
Let \gamma = D\beta \in \{\gamma: a \le \gamma \le w\} and the model becomes
z_t = x_t D^{1}\gamma + \epsilon_t, ~~ t = 1, \ldots, T.
We assign appropriate priors to timevarying \gamma_t. Could you check if I coded \gamma and model z correctly?
Note that I changed the lower and upper bound notation to l and u.
data {
int<lower=0> T; // Time
int<lower=0> K; // number of explanatory variables.
matrix[T,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] z; // Vector of dependent variables
vector[K] l; // Vector of lower inequality constraints
vector[K] u; // Vector of upper inequality constraints
}
transformed data {
vector[K] D_div_l = D \ l; // D^{1}l
vector[K] D_div_u = D \ u; // D^{1}u
}
parameters {
real<lower=0> sigma;
real<lower=l, upper=u> phi_zero;
vector[K] psi;
real<lower=0> tau;
matrix[K, T] gamma;
}
transformed parameters {
vector<lower=1, upper=1>[K] phi =
vector<lower=D_div_l, upper=D_div_u>[K] D_div_gamma = D \ gamma
}
model {
z ~ normal(X*D_div_gamma, sigma);
// priors
psi ~
sigma ~
tau ~
gamma[, 1] ~ normal(phi_zero, tau);
gamma[, 2:T] ~ normal(phi * gamma[1:(T1)], tau);
}
Thank you.
This seems to be missing the definition for phi
, but the constraints are right to make the columns of gamma
stationary.
If K
is large, Iâ€™d suggest making the transformed parameters local variables in the model block so they donâ€™t get saved.
Also, the upper and lower bounds for transformed parameters are just for error checking.