 # Negative value in log when using randomly assign coefficient for B spline and Integration of exponential of cubic B spline question

Hi All,
I currently encounter a question about the negative value occurring in the log. I used the B-spline to estimate the baseline function in the survival model. I write the log-likelihood which is shown at bottom of the code. However, since the coefficient of the spline basis is randomly assigned, so it could be positive or negative and it made the value of “bsl” negative sometimes, however, the “bsl” is the value in the log, it cannot be negative, so I’m not sure how to make the likelihood part work.

Please see my code below. * means multiply.

``````// B spline used for baseline function
data {
int<lower=0> N; // number of subject
int<lower=0> N_row; // number of rows in the dataset
//int<lower=0> K; // number of observed events
//int<lower=0> Cn; // number of censored observations
int ncov; // number of covariates
int nknot; // number of basis functions
int<lower=1,upper=N_row> ID[N_row];
matrix[N_row,ncov] X; // covariate matrix (treatment, sex)
matrix[N_row, nknot] bv; // basis matrix of cubic B spline
matrix[N_row, nknot] ibv; // integrated basis matrix of cubic B spline
vector[N_row] cen;  // cen=1, observed; cen=0, censored
}

parameters {
vector[nknot] u;  // coefficients of spline basis
real beta1;      // coefficient of treatment
real beta2;      // coeffcient of sex
vector[N] w;     // frialty term
real<lower = 0> sigmaw; // variance parameter for frailty term
}

model {
vector[N_row] bsl;
vector[N_row] ibsl;
real x_beta;

// prior
for (i in 1:nknot){
u[i] ~ normal(0,10);
}
beta1 ~ normal(0,sqrt(1000));           // normal(mu,sigma)
beta2 ~ normal(0,sqrt(1000));
w ~ normal(0,sqrt(sigmaw));
sigmaw ~ inv_gamma(0.001, 0.001);
//sigmaw ~ normal(0,sqrt(1000));

// likelihood
bsl = bv*u;                       // B spline baseline in first part of likelihood
ibsl = ibv*u;                     // integrated B spline baseline in second part of likelihood

for (i in 1:N_row){
x_beta = beta1*X[i,1] + beta2*X[i,2]+w[ID[i]];
//target += cen[i]*(log(bsl[i]) + x_beta) - exp(x_beta)*ibsl[i];
target += cen[i]*log(bsl[i]*exp(x_beta)) - exp(x_beta)*ibsl[i];
}
}
``````

I’m also thinking about putting the baseline into the exponential function to avoid the log function. But I’m not quite familiar with how to integrate the exponential of the B spline function. I googled and found it would make it even more complicated. (I could be wrong.) Please see the code for how to get the “bv” and “ibv”.

``````# Prepare data for fitting the model
kl<-c(min(pwpgt.data\$t2)+1,c(1:5)*(max(pwpgt.data\$t2) min(pwpgt.data\$t2))/6,max(pwpgt.data\$t2)-1) # knots for B-splines
bv<- t(as.matrix(t(bs(pwpgt.data\$t2,knots=kl,degree=3, intercept = TRUE))))
ibv<- t(as.matrix(t(splines2::ibs(pwpgt.data\$t2,knots=kl,degree=3, intercept = TRUE))))
``````

It would be great if anyone could help with this question and much more highly appreciated!

Thank you very much!

EDIT: @maxbiostat edited this post for syntax highlighting.

@hhau , can you spot something off here?

If you are modelling the baseline hazard directly with a B-spline, then the coefficients of the spline must be positive. So `u` needs a `<lower = 0>` constraint.

1 Like

Thanks for helping edit the post and direct it to the expert on the topic.

1 Like

Cheers. That’s my job round here.

1 Like

Thanks, @hhau! So you mean in the STAN, I have to make the coefficient to be positive if directly using B-spline for constructing the baseline hazard. I understand the likelihood is written in the log scale, then I can use the target += for loglikelihood (which is adding process). However, I also see the WINBUGS could write the likelihood directly (I mean without using log scale, so the likelihood is written by the multiplication of each part) and assign either positive or negative value for the coefficient, and I think it is more reasonable. So my questions are, 1) is there any way to re-write my current STAN code so that I can also assign negative value in the STAN? 2) Since WINBUGS could assign negative values for the coefficient which I think is more reasonable, does it mean I’d better use WINBUGS than using STAN for B-spline baseline hazard?
Thank you very much if you may offer more help!