Jacobian for prior on P-spline transformation

#1

I’m working on a discrete choice model for schools where the probability of a child’s family choosing a school is a monotonic function of distance (i.e. you never like a school more just because it is further away). It’s a multinomial logit. My data has D demographic subgroups (a cross-classification of ethnicity, socio-economic status, etc.), and I want the model to include a flexible B-spline function of the distance d_{ij} that varies by these demographics, so a vector of spline coefficients \beta_d for each d \in 1,\dots,D.

Because of sparsity and the monotonicity condition, there are three types of structure on the spline coefficients: a hierarchical structure to account for demographics; a smoothness penalty on second differences; and a hard constraint on first differences (\nabla \beta_d \leq 0) to enforce monotonicity.
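(For context, one standard way to handle a hard first-difference constraint like this is to enforce monotonicity by construction rather than by truncation: parameterise strictly negative increments and cumulatively sum them. A minimal numpy sketch of the idea, with a hypothetical function name and made-up values, not my actual model code:)

```python
import numpy as np

# Hypothetical sketch: build a monotone-decreasing coefficient vector
# from an unconstrained intercept and unconstrained raw increments.
def monotone_decreasing(intercept, raw):
    # exp(raw) > 0, so the increments -exp(raw) are strictly negative
    increments = -np.exp(raw)
    return intercept + np.concatenate([[0.0], np.cumsum(increments)])

beta = monotone_decreasing(2.0, np.array([-1.0, 0.5, 0.0]))
# first differences are all negative, so beta is decreasing
assert np.all(np.diff(beta) < 0)
```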

So, at the moment it seems most natural to build each vector from main effects, \beta_d = \beta_0 + \beta_{eth(d)} + \beta_{SES(d)} + \dots + b_d, to provide hierarchical shrinkage. The components (\beta_0, \beta_{eth(d)}, b_d, etc.) are raw parameters, and the \beta_d are transformed parameters.

The problem is that the first and second differences of \beta_d are transformations of these already-transformed parameters, and it’s those I would like to put priors on, in addition to the priors on the components.

So, my question is: is there a way to put priors on the structure and then also put priors on the first and second differences of the \beta_d? If it entails putting priors on the transformed parameters, do I need a Jacobian adjustment? I’m not sure, but I hope maybe not, because the first and second differences are a linear transformation of the \beta_d? Any help would be appreciated.
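(To make the linearity concrete: the first and second differences are multiplication by constant matrices, so the Jacobian of the map is that matrix itself and does not involve the parameters. A small numpy check, illustrative only:)

```python
import numpy as np

n = 6
beta = np.random.default_rng(0).normal(size=n)

# First- and second-difference operators as constant matrices
D1 = np.diff(np.eye(n), axis=0)       # shape (n-1, n)
D2 = np.diff(np.eye(n), n=2, axis=0)  # shape (n-2, n)

# They reproduce np.diff, and the matrices themselves are the Jacobians
# of the maps beta -> D1 @ beta and beta -> D2 @ beta: constant in beta.
assert np.allclose(D1 @ beta, np.diff(beta))
assert np.allclose(D2 @ beta, np.diff(beta, n=2))
```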

Below you can see what I’m doing now. I’ve taken out irrelevant parts of the model:

data {
  int<lower=1> N;                     // Number of students
  int<lower=2> J;                     // Number of schools
  int<lower=1> D;                     // Number of demographic groups
  int<lower=1> n_eth;                 // Number of ethnic groups
  int<lower=1> n_coef;                // Number of spline coefficients
  int<lower=1,upper=J> y[N];          // Dependent variable in {1,...,J}, i.e. NOT binary
  int<lower=1,upper=D> demog[N];      // Which demographic group is pupil in?
  matrix[J, n_coef] X[N];             // Matrix of regressors (spline basis values)
  int<lower=1,upper=n_eth> eth[D];    // Which ethnicity does demographic represent?
  int<lower=1,upper=2> ses[D];        // Which SES group does demographic represent?
  matrix[n_coef,n_coef] DD;           // Structure (precision) matrix of spline coefficients
  vector[n_coef] zeroes;
}

transformed data {
  int<lower=1> n_sigmas;
  n_sigmas = 3;
}

parameters {
  vector[n_coef] betas;               // Mean spline coefficients
  vector[n_coef] beta_e[n_eth];       // Ethnic variations of spline coefficients
  vector[n_coef] beta_p[2];           // SES variations of spline coefficients
  vector[n_coef] b_d[D];              // Demographic (residual) variations of spline coefficients
  real<lower=0> sigmas[n_sigmas];     // Scales (standard deviations) of the hierarchical components
}
transformed parameters {
  vector[n_coef] beta_d[D];

  for (d in 1:D) {
    beta_d[d] = betas + sigmas[1]*beta_e[eth[d]] + sigmas[2]*beta_p[ses[d]] + sigmas[3]*b_d[d];
  }
}

model {
  betas ~ multi_normal_prec(zeroes, DD);

  for (i in 1:n_eth) {
    beta_e[i] ~ multi_normal_prec(zeroes, DD);
  }
  for (i in 1:2) {
    beta_p[i] ~ multi_normal_prec(zeroes, DD);
  }
  for (i in 1:D) {
    b_d[i] ~ multi_normal_prec(zeroes, DD);
  }
  for (i in 1:n_sigmas) {
    sigmas[i] ~ student_t(5, 0, 5);
  }

  for (n in 1:N) {
    y[n] ~ categorical_logit(X[n] * beta_d[demog[n]]);
  }
}


NB: the structure matrix is the P-spline formulation DD = tau*crossprod(diff(diag(n_coef), differences=k)), plus a small ridge diag(n_coef)*1e-4 because DD is not otherwise a full-rank precision matrix. This formulation lets me put smoothness priors on the original parameters, but I’m not sure there’d be a way to do the same with the monotonicity prior. For the moment I’m fixing tau (the smoothness parameter) rather than estimating it.
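(As a sanity check on that formulation: the quadratic form \beta' DD \beta with DD = tau*crossprod(diff(I, k)) is exactly tau times the sum of squared k-th differences, plus the small ridge term, which is what makes multi_normal_prec(zeroes, DD) a smoothness prior. A numpy version of the R construction, dimensions made up for the example:)

```python
import numpy as np

n_coef, k, tau, eps = 8, 2, 4.0, 1e-4
beta = np.random.default_rng(1).normal(size=n_coef)

Dk = np.diff(np.eye(n_coef), n=k, axis=0)    # k-th difference operator
DD = tau * Dk.T @ Dk + eps * np.eye(n_coef)  # full-rank precision matrix

# beta' DD beta = tau * sum of squared k-th differences (+ ridge term)
quad = beta @ DD @ beta
assert np.isclose(quad, tau * np.sum(np.diff(beta, n=k) ** 2) + eps * beta @ beta)
```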

#2

You can have multiple contributions to the posterior log-kernel for the “same” unknowns. And if the Jacobian determinant is constant, then whether or not you adjust the posterior log-kernel for it will not affect which parameters are proposed or accepted. I would check your logic several times, though.
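(A trivial numerical illustration of why a constant term cannot matter: a constant added to the log-kernel cancels in any Metropolis-style acceptance ratio, so a parameter-independent log-Jacobian term cannot change which proposals are accepted. Values are arbitrary:)

```python
import numpy as np

lp_current, lp_proposed = -12.3, -11.7  # arbitrary log-kernel values
const = 4.2                             # constant log-Jacobian term

ratio_without = np.exp(lp_proposed - lp_current)
ratio_with = np.exp((lp_proposed + const) - (lp_current + const))
assert np.isclose(ratio_without, ratio_with)
```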

#3

Thanks Ben. In that case, I guess that since the transformation can be represented by a constant matrix, the Jacobian won’t depend on the parameters? I’ll code something up and share.

#4

Is there any reason why cross-classification isn’t discussed and used more frequently?

#5

My “cell membership” indicators eth and ses are arrays of indices of length D, where the value of eth[d] is the ethnicity code of cross-classification cell d. For concreteness, ethnicity might be binary, SES might be binary, and then D=4. Then eth might be (1,2,1,2) and ses might be (1,1,2,2). If I replace

int<lower=1,upper=n_eth> eth[D];
int<lower=1,upper=2> ses[D];

with

vector[D] eth;
vector[D] ses;


could I then vectorise the statement

for(d in 1:D){
beta_d[d] = betas + sigmas[1]*beta_e[eth[d]] + sigmas[2]*beta_p[ses[d]] + sigmas[3]*b_d[d];
}


as

beta_d = betas + sigmas[1]*beta_e[eth] + sigmas[2]*beta_p[ses] + sigmas[3]*b_d ;


even though each element being indexed is not a scalar but a vector, and I would therefore be using one statement to construct an array of vectors?

#6

No, you cannot index with a vector, but

beta_d = betas + sigmas[1]*beta_e[eth] + sigmas[2]*beta_p[ses] + sigmas[3]*b_d ;


should work if eth and ses are integer arrays.
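(For intuition, the analogous operation in numpy: indexing a 2-D array by an integer array selects one whole row (vector) per entry, which is exactly what the vectorised statement relies on. Shapes and values below are invented for the example:)

```python
import numpy as np

n_eth, n_coef, D = 2, 3, 4
beta_e = np.arange(n_eth * n_coef, dtype=float).reshape(n_eth, n_coef)

# eth[d] gives the ethnicity code (0-based here) of demographic cell d
eth = np.array([0, 1, 0, 1])

# Integer-array indexing returns one row (a whole vector) per cell
selected = beta_e[eth]  # shape (D, n_coef)
assert selected.shape == (D, n_coef)
assert np.array_equal(selected[2], beta_e[0])
```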