Dear Stan Community,
I am very new to stan and trying to fit a simple normal linear regression with linear inequality constraints on the parameters.
y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon, ~~ \epsilon \sim N(0,\sigma^2)
The inequality constraints on the parameter are to be 0>\beta_0, 0<\beta_1<1, 0<\beta_2<1. I first fit the model with diffuse reference prior for the parameters: (\beta, \sigma) \propto \sigma^{-1}.
data {
int<lower=0> N; // Number of observations
vector[N] X1;
vector[N] X2;
vector[N] y; // response
}
parameters {
real<lower = 0> sigma;
real<lower = 0> beta0;
real<lower = 0, upper = 1> beta1;
real<lower = 0, upper = 1> beta2;
}
model {
real mu[N];
// Likelihood
for (i in 1:N) {
mu[i] = beta0 + beta1*X1[i] + beta2*X2[i];
}
y ~ normal(mu, sigma);
}
-
In terms of constraints, Is this the right way to specify this model in Stan?
-
In general, define a set for inequality constraints, e.g. Q = \{a \le D\beta \le w \}, where D is a nonsingular matrix with real numbers and use this set as an indicator function of the constraints for \beta. In vector form,
data {
int<lower=0> N; // Number of observations
int<lower=0> K; // number of explanatory variables.
matrix[N,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] y; // Vector of dependent variables
vector[K] a; // Vector of lower inequality constraints
vector[K] w; // Vector of upper inequality constraints
}
parameters {
vector[K] b;
real<lower=0> sigma;
}
model {
y ~ normal(X*b, sigma);
}
How can I put the indicator function in the above code so that it is possible to employ prior distribution in the model?
Thank you very much.
If your constraint is:
a \le D\beta \le w
Then if you multiply all by the inverse of D:
D^{-1}a \le D^{-1}D\beta \le D^{-1}w
This simplifies to:
D^{-1}a \le \beta \le D^{-1}w
Stan has overloaded the division operator /
for matrices/vectors to perform this multiplication by an inverse in more numerically stable and efficient way. So you would express your model in Stan as:
data {
int<lower=0> N; // Number of observations
int<lower=0> K; // number of explanatory variables.
matrix[N,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] y; // Vector of dependent variables
vector[K] a; // Vector of lower inequality constraints
vector[K] w; // Vector of upper inequality constraints
}
transformed data {
vector[K] D_div_a = D / a;
vector[K] D_div_w = D / w;
}
parameters {
vector<lower=D_div_a, upper=D_div_w>[K] b;
real<lower=0> sigma;
}
model {
// More efficient likelihood for linear regression
y ~ normal_id_glm(X, 0, b, sigma);
}
4 Likes
Thank you for the reply.
In the transformed data block, isn’t the matrix operation should be D_div_a = D \ a , not D/a for D^{-1}a?
Thanks.
2 Likes
Oh yes, good catch! It should be \
for left division
1 Like
Can you clarify what you mean by “transforming the quantities to the original scale”? Which quantities are you wanting to transform and what scale do you want them to be on?
1 Like
Nvm. My question was not clear. Thanks!
1 Like
Hi,
I am trying to add time-varying coefficients to the model.
Let \gamma = D\beta \in \{\gamma: a \le \gamma \le w\} and the model becomes
z_t = x_t D^{-1}\gamma + \epsilon_t, ~~ t = 1, \ldots, T.
We assign appropriate priors to time-varying \gamma_t. Could you check if I coded \gamma and model z correctly?
Note that I changed the lower and upper bound notation to l and u.
data {
int<lower=0> T; // Time
int<lower=0> K; // number of explanatory variables.
matrix[T,K] X; // Matrix of explanatory variables
matrix[K,K] D; // Matrix that represents inequality constraints
vector[N] z; // Vector of dependent variables
vector[K] l; // Vector of lower inequality constraints
vector[K] u; // Vector of upper inequality constraints
}
transformed data {
vector[K] D_div_l = D \ l; // D^{-1}l
vector[K] D_div_u = D \ u; // D^{-1}u
}
parameters {
real<lower=0> sigma;
real<lower=l, upper=u> phi_zero;
vector[K] psi;
real<lower=0> tau;
matrix[K, T] gamma;
}
transformed parameters {
vector<lower=-1, upper=1>[K] phi =
vector<lower=D_div_l, upper=D_div_u>[K] D_div_gamma = D \ gamma
}
model {
z ~ normal(X*D_div_gamma, sigma);
// priors
psi ~
sigma ~
tau ~
gamma[, 1] ~ normal(phi_zero, tau);
gamma[, 2:T] ~ normal(phi * gamma[1:(T-1)], tau);
}
Thank you.
This seems to be missing the definition for phi
, but the constraints are right to make the columns of gamma
stationary.
If K
is large, I’d suggest making the transformed parameters local variables in the model block so they don’t get saved.
Also, the upper and lower bounds for transformed parameters are just for error checking.
1 Like