Dear Community,
I hope you are all doing well.
I am joining this forum to reach out for help with the following Bayesian Analysis application. I am working on replicating the methodology presented in this paper by Feng et al. (2014). I posted the paper for reference, but I am breaking it out here so the post stands on its own. The subjacent model I am estimating is:
\begin{align} y &= \beta_0 + \beta_1 (b + y) + \sum_{n=1}^{3} \gamma_n x_n + \beta_{\tau} t + \frac{1}{2} \beta_{11} (b + y)^2 + \frac{1}{2} \sum_{n=1}^{3} \sum_{n'=1}^{3} \gamma_{nn'} x_n x_{n'} \nonumber\\ &+ \frac{1}{2} \beta_{\tau \tau} t^2 + \sum_{n=1}^{3} \phi_{n1} x_n (b + y) + \beta_{\tau 1} t (b + y) + \sum_{n=1}^{3} \gamma_{\tau n} t x_n - u + v \end{align}
Which after rearranging, renaming, and adding observation and time subscripts, can be written as:
\begin{equation} q_{it} = s'_{it} \theta - u_{it} + v_{it}, \quad i = 1, \dots, K, \quad t = 1, \dots, T \end{equation}
Where:
- q_{it} = y_{it}
- \theta is the vector of parameters to be estimated (all the “betas”, “gammas” and “Phis”)
- s'_{it} is a vector containing the data (all the x_{it,...}, t, b_{it}, y_{it}, all data is always positive)
- Assume v_{it} \backsim iid \quad N(0, \sigma^{2})
My first question is theoretical :
- I wonder if, when estimating this model using Bayesian methods, there is any potential challenge or consequence arising from the fact that the subjacent equation is defining y implicitly.
Further, I have some questions relating to the actual implementation of the previously defined model. Let’s assume we know the following priors:
-
p (u_{it} | \lambda^{-1}) = f_{\text{Gamma}} (u_{it} | 1, \lambda^{-1})
-
p (\lambda^{-1}) = f_{\text{Gamma}} (\lambda^{-1} | 1,\rho ) \quad (\rho is just a know constant, say \rho = 0.1)
-
p(h) = f_{\text{Gamma}} (h | 1, c) \quad where \quad h = \frac{1}{\sigma^2} \quad (c is just a know constant, say c = 0.0001)
-
Let’s also assume noninformative priors for the parameters \theta (e.g. each parameter is distributed N(0, 1)
Finally, and here is the more critical aspect, following the paper by Feng et al. (2014), I want to impose some restrictions on the estimated parameters. For example, I want parameters that meet the following inequality:
-1 + \beta_{1} + \beta_{11}(b+y) + \sum_{n=1}^{3} \phi_{n1} x_n + \beta_{\tau 1} t \leq 0
Ideally, the above inequality should be satisfied for all the x_{it,...}, t, b_{it}, y_{it}, but in a less restrictive application, I could also assume that the inequality is met only at the means.
This takes me to the next question:
- Is there any way to impose this type of condition in a Stan program? I have read that the restriction may be imposed during sampling by using a metropolis hastings algorithm (See, for example this paper on which the Feng et al. 2014 is based). Can this type of procedure be implemented in stan? is there any other way of doing this?
My final question is related to the implementation of a stan program for this application. For now, I am working on the non-restricted case, and I have made progress with the stan code pasted below.
-
Can this code be adjusted to include the restriction related to the second question? The model also seems to take a very long time to run if I use 4 chains, 4,000 samples, and 1,000 warmup iterations. Any advice for making it more efficient?
-
How can the prior for \sigma^2 be specified?
P.s. The below code was generated by the brms package in R. I posted the STAN code instead because, given my application, I think that I need a more flexible approach (e.g., STAN) than the available R packages.
# // generated with brms 2.22.0
functions {
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> K_beta0; // number of population-level effects
matrix[N, K_beta0] X_beta0; // population-level design matrix
int<lower=1> K_beta1; // number of population-level effects
matrix[N, K_beta1] X_beta1; // population-level design matrix
int<lower=1> K_gamma1; // number of population-level effects
matrix[N, K_gamma1] X_gamma1; // population-level design matrix
int<lower=1> K_gamma2; // number of population-level effects
matrix[N, K_gamma2] X_gamma2; // population-level design matrix
int<lower=1> K_gamma3; // number of population-level effects
matrix[N, K_gamma3] X_gamma3; // population-level design matrix
int<lower=1> K_gamma4; // number of population-level effects
matrix[N, K_gamma4] X_gamma4; // population-level design matrix
int<lower=1> K_gamma5; // number of population-level effects
matrix[N, K_gamma5] X_gamma5; // population-level design matrix
int<lower=1> K_betatau; // number of population-level effects
matrix[N, K_betatau] X_betatau; // population-level design matrix
int<lower=1> K_beta11; // number of population-level effects
matrix[N, K_beta11] X_beta11; // population-level design matrix
int<lower=1> K_gamma11; // number of population-level effects
matrix[N, K_gamma11] X_gamma11; // population-level design matrix
int<lower=1> K_gamma12; // number of population-level effects
matrix[N, K_gamma12] X_gamma12; // population-level design matrix
int<lower=1> K_gamma13; // number of population-level effects
matrix[N, K_gamma13] X_gamma13; // population-level design matrix
int<lower=1> K_gamma14; // number of population-level effects
matrix[N, K_gamma14] X_gamma14; // population-level design matrix
int<lower=1> K_gamma15; // number of population-level effects
matrix[N, K_gamma15] X_gamma15; // population-level design matrix
int<lower=1> K_gamma22; // number of population-level effects
matrix[N, K_gamma22] X_gamma22; // population-level design matrix
int<lower=1> K_gamma23; // number of population-level effects
matrix[N, K_gamma23] X_gamma23; // population-level design matrix
int<lower=1> K_gamma24; // number of population-level effects
matrix[N, K_gamma24] X_gamma24; // population-level design matrix
int<lower=1> K_gamma25; // number of population-level effects
matrix[N, K_gamma25] X_gamma25; // population-level design matrix
int<lower=1> K_gamma33; // number of population-level effects
matrix[N, K_gamma33] X_gamma33; // population-level design matrix
int<lower=1> K_gamma34; // number of population-level effects
matrix[N, K_gamma34] X_gamma34; // population-level design matrix
int<lower=1> K_gamma35; // number of population-level effects
matrix[N, K_gamma35] X_gamma35; // population-level design matrix
int<lower=1> K_gamma44; // number of population-level effects
matrix[N, K_gamma44] X_gamma44; // population-level design matrix
int<lower=1> K_gamma45; // number of population-level effects
matrix[N, K_gamma45] X_gamma45; // population-level design matrix
int<lower=1> K_gamma55; // number of population-level effects
matrix[N, K_gamma55] X_gamma55; // population-level design matrix
int<lower=1> K_betatautau; // number of population-level effects
matrix[N, K_betatautau] X_betatautau; // population-level design matrix
int<lower=1> K_phi1; // number of population-level effects
matrix[N, K_phi1] X_phi1; // population-level design matrix
int<lower=1> K_phi2; // number of population-level effects
matrix[N, K_phi2] X_phi2; // population-level design matrix
int<lower=1> K_phi3; // number of population-level effects
matrix[N, K_phi3] X_phi3; // population-level design matrix
int<lower=1> K_phi4; // number of population-level effects
matrix[N, K_phi4] X_phi4; // population-level design matrix
int<lower=1> K_phi5; // number of population-level effects
matrix[N, K_phi5] X_phi5; // population-level design matrix
int<lower=1> K_betatau1; // number of population-level effects
matrix[N, K_betatau1] X_betatau1; // population-level design matrix
int<lower=1> K_gammatau1; // number of population-level effects
matrix[N, K_gammatau1] X_gammatau1; // population-level design matrix
int<lower=1> K_gammatau2; // number of population-level effects
matrix[N, K_gammatau2] X_gammatau2; // population-level design matrix
int<lower=1> K_gammatau3; // number of population-level effects
matrix[N, K_gammatau3] X_gammatau3; // population-level design matrix
int<lower=1> K_gammatau4; // number of population-level effects
matrix[N, K_gammatau4] X_gammatau4; // population-level design matrix
int<lower=1> K_gammatau5; // number of population-level effects
matrix[N, K_gammatau5] X_gammatau5; // population-level design matrix
int<lower=1> K_u; // number of population-level effects
matrix[N, K_u] X_u; // population-level design matrix
// covariates for non-linear functions
vector[N] C_1;
vector[N] C_2;
vector[N] C_3;
vector[N] C_4;
vector[N] C_5;
vector[N] C_6;
vector[N] C_7;
vector[N] C_8;
int prior_only; // should the likelihood be ignored?
}
transformed data {
}
parameters {
vector[K_beta0] b_beta0; // regression coefficients
vector[K_beta1] b_beta1; // regression coefficients
vector[K_gamma1] b_gamma1; // regression coefficients
vector[K_gamma2] b_gamma2; // regression coefficients
vector[K_gamma3] b_gamma3; // regression coefficients
vector[K_gamma4] b_gamma4; // regression coefficients
vector[K_gamma5] b_gamma5; // regression coefficients
vector[K_betatau] b_betatau; // regression coefficients
vector[K_beta11] b_beta11; // regression coefficients
vector[K_gamma11] b_gamma11; // regression coefficients
vector[K_gamma12] b_gamma12; // regression coefficients
vector[K_gamma13] b_gamma13; // regression coefficients
vector[K_gamma14] b_gamma14; // regression coefficients
vector[K_gamma15] b_gamma15; // regression coefficients
vector[K_gamma22] b_gamma22; // regression coefficients
vector[K_gamma23] b_gamma23; // regression coefficients
vector[K_gamma24] b_gamma24; // regression coefficients
vector[K_gamma25] b_gamma25; // regression coefficients
vector[K_gamma33] b_gamma33; // regression coefficients
vector[K_gamma34] b_gamma34; // regression coefficients
vector[K_gamma35] b_gamma35; // regression coefficients
vector[K_gamma44] b_gamma44; // regression coefficients
vector[K_gamma45] b_gamma45; // regression coefficients
vector[K_gamma55] b_gamma55; // regression coefficients
vector[K_betatautau] b_betatautau; // regression coefficients
vector[K_phi1] b_phi1; // regression coefficients
vector[K_phi2] b_phi2; // regression coefficients
vector[K_phi3] b_phi3; // regression coefficients
vector[K_phi4] b_phi4; // regression coefficients
vector[K_phi5] b_phi5; // regression coefficients
vector[K_betatau1] b_betatau1; // regression coefficients
vector[K_gammatau1] b_gammatau1; // regression coefficients
vector[K_gammatau2] b_gammatau2; // regression coefficients
vector[K_gammatau3] b_gammatau3; // regression coefficients
vector[K_gammatau4] b_gammatau4; // regression coefficients
vector[K_gammatau5] b_gammatau5; // regression coefficients
vector[K_u] b_u; // regression coefficients
real<lower=0> sigma; // dispersion parameter
}
transformed parameters {
real lprior = 0; // prior contributions to the log posterior
lprior += normal_lpdf(b_beta0 | 0,1);
lprior += normal_lpdf(b_beta1 | 0,1);
lprior += normal_lpdf(b_gamma1 | 0,1);
lprior += normal_lpdf(b_gamma2 | 0,1);
lprior += normal_lpdf(b_gamma3 | 0,1);
lprior += normal_lpdf(b_gamma4 | 0,1);
lprior += normal_lpdf(b_gamma5 | 0,1);
lprior += normal_lpdf(b_betatau | 0,1);
lprior += normal_lpdf(b_beta11 | 0,1);
lprior += normal_lpdf(b_gamma11 | 0,1);
lprior += normal_lpdf(b_gamma12 | 0,1);
lprior += normal_lpdf(b_gamma13 | 0,1);
lprior += normal_lpdf(b_gamma14 | 0,1);
lprior += normal_lpdf(b_gamma15 | 0,1);
lprior += normal_lpdf(b_gamma22 | 0,1);
lprior += normal_lpdf(b_gamma23 | 0,1);
lprior += normal_lpdf(b_gamma24 | 0,1);
lprior += normal_lpdf(b_gamma25 | 0,1);
lprior += normal_lpdf(b_gamma33 | 0,1);
lprior += normal_lpdf(b_gamma34 | 0,1);
lprior += normal_lpdf(b_gamma35 | 0,1);
lprior += normal_lpdf(b_gamma44 | 0,1);
lprior += normal_lpdf(b_gamma45 | 0,1);
lprior += normal_lpdf(b_gamma55 | 0,1);
lprior += normal_lpdf(b_betatautau | 0,1);
lprior += normal_lpdf(b_phi1 | 0,1);
lprior += normal_lpdf(b_phi2 | 0,1);
lprior += normal_lpdf(b_phi3 | 0,1);
lprior += normal_lpdf(b_phi4 | 0,1);
lprior += normal_lpdf(b_phi5 | 0,1);
lprior += normal_lpdf(b_betatau1 | 0,1);
lprior += normal_lpdf(b_gammatau1 | 0,1);
lprior += normal_lpdf(b_gammatau2 | 0,1);
lprior += normal_lpdf(b_gammatau3 | 0,1);
lprior += normal_lpdf(b_gammatau4 | 0,1);
lprior += normal_lpdf(b_gammatau5 | 0,1);
lprior += exponential_lpdf(b_u | 0.1);
}
model {
// likelihood including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] nlp_beta0 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_beta1 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma1 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma2 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma3 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma4 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma5 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_betatau = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_beta11 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma11 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma12 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma13 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma14 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma15 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma22 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma23 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma24 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma25 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma33 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma34 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma35 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma44 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma45 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gamma55 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_betatautau = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_phi1 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_phi2 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_phi3 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_phi4 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_phi5 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_betatau1 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gammatau1 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gammatau2 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gammatau3 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gammatau4 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_gammatau5 = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] nlp_u = rep_vector(0.0, N);
// initialize non-linear predictor term
vector[N] mu;
nlp_beta0 += X_beta0 * b_beta0;
nlp_beta1 += X_beta1 * b_beta1;
nlp_gamma1 += X_gamma1 * b_gamma1;
nlp_gamma2 += X_gamma2 * b_gamma2;
nlp_gamma3 += X_gamma3 * b_gamma3;
nlp_gamma4 += X_gamma4 * b_gamma4;
nlp_gamma5 += X_gamma5 * b_gamma5;
nlp_betatau += X_betatau * b_betatau;
nlp_beta11 += X_beta11 * b_beta11;
nlp_gamma11 += X_gamma11 * b_gamma11;
nlp_gamma12 += X_gamma12 * b_gamma12;
nlp_gamma13 += X_gamma13 * b_gamma13;
nlp_gamma14 += X_gamma14 * b_gamma14;
nlp_gamma15 += X_gamma15 * b_gamma15;
nlp_gamma22 += X_gamma22 * b_gamma22;
nlp_gamma23 += X_gamma23 * b_gamma23;
nlp_gamma24 += X_gamma24 * b_gamma24;
nlp_gamma25 += X_gamma25 * b_gamma25;
nlp_gamma33 += X_gamma33 * b_gamma33;
nlp_gamma34 += X_gamma34 * b_gamma34;
nlp_gamma35 += X_gamma35 * b_gamma35;
nlp_gamma44 += X_gamma44 * b_gamma44;
nlp_gamma45 += X_gamma45 * b_gamma45;
nlp_gamma55 += X_gamma55 * b_gamma55;
nlp_betatautau += X_betatautau * b_betatautau;
nlp_phi1 += X_phi1 * b_phi1;
nlp_phi2 += X_phi2 * b_phi2;
nlp_phi3 += X_phi3 * b_phi3;
nlp_phi4 += X_phi4 * b_phi4;
nlp_phi5 += X_phi5 * b_phi5;
nlp_betatau1 += X_betatau1 * b_betatau1;
nlp_gammatau1 += X_gammatau1 * b_gammatau1;
nlp_gammatau2 += X_gammatau2 * b_gammatau2;
nlp_gammatau3 += X_gammatau3 * b_gammatau3;
nlp_gammatau4 += X_gammatau4 * b_gammatau4;
nlp_gammatau5 += X_gammatau5 * b_gammatau5;
nlp_u += X_u * b_u;
for (n in 1:N) {
// compute non-linear predictor values
mu[n] = (nlp_beta0[n] + nlp_beta1[n] * (C_1[n] + C_2[n]) + nlp_gamma1[n] * C_3[n] + nlp_gamma2[n] * C_4[n] + nlp_gamma3[n] * C_5[n] + nlp_gamma4[n] * C_6[n] + nlp_gamma5[n] * C_7[n] + nlp_betatau[n] * C_8[n] + nlp_beta11[n] * ((1 / 2) * (C_1[n] + C_2[n]) ^ 2) + nlp_gamma11[n] * C_3[n] ^ 2 + nlp_gamma12[n] * C_3[n] * C_4[n] + nlp_gamma13[n] * C_3[n] * C_5[n] + nlp_gamma14[n] * C_3[n] * C_6[n] + nlp_gamma15[n] * C_3[n] * C_7[n] + nlp_gamma22[n] * C_4[n] ^ 2 + nlp_gamma23[n] * C_4[n] * C_5[n] + nlp_gamma24[n] * C_4[n] * C_6[n] + nlp_gamma25[n] * C_4[n] * C_7[n] + nlp_gamma33[n] * C_5[n] ^ 2 + nlp_gamma34[n] * C_5[n] * C_6[n] + nlp_gamma35[n] * C_5[n] * C_7[n] + nlp_gamma44[n] * C_6[n] ^ 2 + nlp_gamma45[n] * C_6[n] * C_7[n] + nlp_gamma55[n] * C_7[n] ^ 2 + nlp_betatautau[n] * ((1 / 2) * (C_8[n]) ^ 2) + nlp_phi1[n] * C_3[n] * (C_1[n] + C_2[n]) + nlp_phi2[n] * C_4[n] * (C_1[n] + C_2[n]) + nlp_phi3[n] * C_5[n] * (C_1[n] + C_2[n]) + nlp_phi4[n] * C_6[n] * (C_1[n] + C_2[n]) + nlp_phi5[n] * C_7[n] * (C_1[n] + C_2[n]) + nlp_betatau1[n] * C_8[n] * (C_1[n] + C_2[n]) + nlp_gammatau1[n] * C_8[n] * C_3[n] + nlp_gammatau2[n] * C_8[n] * C_4[n] + nlp_gammatau3[n] * C_8[n] * C_5[n] + nlp_gammatau4[n] * C_8[n] * C_6[n] + nlp_gammatau5[n] * C_8[n] * C_7[n] - nlp_u[n]);
}
target += normal_lpdf(Y | mu, sigma);
}
// priors including constants
target += lprior;
}
generated quantities {
}
Many thanks in advance for any insight or support!
Best wishes,
Juan P. Henao
(Ph.D Student)