# Simple normal regression model with inequality constraints on the parameters

Dear Stan Community,

I am very new to stan and trying to fit a simple normal linear regression with linear inequality constraints on the parameters.

y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon, ~~ \epsilon \sim N(0,\sigma^2)

The inequality constraints on the parameter are to be 0>\beta_0, 0<\beta_1<1, 0<\beta_2<1. I first fit the model with diffuse reference prior for the parameters: (\beta, \sigma) \propto \sigma^{-1}.

data {
int<lower=0> N; // Number of observations
vector[N] X1;
vector[N] X2;
vector[N] y; // response
}
parameters {
real<lower = 0> sigma;
real<lower = 0> beta0;
real<lower = 0, upper = 1> beta1;
real<lower = 0, upper = 1> beta2;
}
model {
real mu[N];

// Likelihood
for (i in 1:N) {
mu[i] = beta0 + beta1*X1[i] + beta2*X2[i];
}
y ~ normal(mu, sigma);
}

1. In terms of constraints, Is this the right way to specify this model in Stan?

2. In general, define a set for inequality constraints, e.g. Q = \{a \le D\beta \le w \}, where D is a nonsingular matrix with real numbers and use this set as an indicator function of the constraints for \beta. In vector form,

data {
int<lower=0> N; // Number of observations
int<lower=0> K; // number of explanatory variables.
matrix[N,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] y; // Vector of dependent variables
vector[K] a; // Vector of lower inequality constraints
vector[K] w; // Vector of upper inequality constraints
}
parameters {
vector[K] b;
real<lower=0> sigma;
}
model {
y ~ normal(X*b, sigma);
}


How can I put the indicator function in the above code so that it is possible to employ prior distribution in the model?

Thank you very much.

a \le D\beta \le w

Then if you multiply all by the inverse of D:

D^{-1}a \le D^{-1}D\beta \le D^{-1}w

This simplifies to:

D^{-1}a \le \beta \le D^{-1}w

Stan has overloaded the division operator / for matrices/vectors to perform this multiplication by an inverse in more numerically stable and efficient way. So you would express your model in Stan as:

data {
int<lower=0> N; // Number of observations
int<lower=0> K; // number of explanatory variables.
matrix[N,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] y; // Vector of dependent variables
vector[K] a; // Vector of lower inequality constraints
vector[K] w; // Vector of upper inequality constraints
}
transformed data {
vector[K] D_div_a = D / a;
vector[K] D_div_w = D / w;
}
parameters {
vector<lower=D_div_a, upper=D_div_w>[K] b;
real<lower=0> sigma;
}
model {
// More efficient likelihood for linear regression
y ~ normal_id_glm(X, 0, b, sigma);
}

3 Likes

In the transformed data block, isnâ€™t the matrix operation should be D_div_a = D \ a , not D/a for D^{-1}a?

Thanks.

2 Likes

Oh yes, good catch! It should be \ for left division

1 Like

Can you clarify what you mean by â€śtransforming the quantities to the original scaleâ€ť? Which quantities are you wanting to transform and what scale do you want them to be on?

1 Like

Nvm. My question was not clear. Thanks!

1 Like

Hi,

I am trying to add time-varying coefficients to the model.

Let \gamma = D\beta \in \{\gamma: a \le \gamma \le w\} and the model becomes

z_t = x_t D^{-1}\gamma + \epsilon_t, ~~ t = 1, \ldots, T.

We assign appropriate priors to time-varying \gamma_t. Could you check if I coded \gamma and model z correctly?

Note that I changed the lower and upper bound notation to l and u.

data {
int<lower=0> T; // Time
int<lower=0> K; // number of explanatory variables.
matrix[T,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] z; // Vector of dependent variables
vector[K] l; // Vector of lower inequality constraints
vector[K] u; // Vector of upper inequality constraints
}
transformed data {
vector[K] D_div_l = D \ l; // D^{-1}l
vector[K] D_div_u = D \ u; // D^{-1}u
}
parameters {
real<lower=0> sigma;
real<lower=l, upper=u> phi_zero;
vector[K] psi;
real<lower=0> tau;
matrix[K, T] gamma;
}
transformed parameters {
vector<lower=-1, upper=1>[K] phi =
vector<lower=D_div_l, upper=D_div_u>[K] D_div_gamma = D \ gamma
}
model {
z ~ normal(X*D_div_gamma, sigma);

// priors
psi ~
sigma ~
tau ~

gamma[, 1] ~ normal(phi_zero, tau);
gamma[, 2:T] ~ normal(phi * gamma[1:(T-1)], tau);
}


Thank you.

This seems to be missing the definition for phi, but the constraints are right to make the columns of gamma stationary.

If K is large, Iâ€™d suggest making the transformed parameters local variables in the model block so they donâ€™t get saved.

Also, the upper and lower bounds for transformed parameters are just for error checking.