Hey guys,
First of all, i appreciate it very much being part of this group.
I’m currently working at a R/RStan project where i want to build a linear model with splines and random effects. So far, i tried my model on simulated data. The simulation was made in R.
So basically, the splines are working quite stable and good, i pretty much borrowed this part from Kharratzadeh case study (https://github.com/milkha/Splines_in_Stan , thank you at this point).
Unfortunately, the random effects i get from the stan-model do not match with my simulated random effects from R. I am getting results, but not the results i want to have… the random effects i get from my stan model should be as similar as possible as my random effects i generated in R.
I’m not really sure where my mistake is, so maybe someone of you could help me.
First:
How i simulated my data in R:
#Splines
X1 <- seq(from=-5, to=5, length=150) # generating inputs
B <- t(bs(X1, knots=seq(-5,5,1), degree=3, intercept = TRUE)) # creating the B-splines
num_data <- length(X1)
num_basis <- nrow(B)
a0 <- 0.2 # intercept
a <- rnorm(num_basis, 0, 1) # coefficients of B-splines
#Random Effects
set.seed(123)
sigma_u0 <- 1
u_0j <- rnorm(num_basis, mean = 0, sd = sigma_u0) #"Random intercepts"
beta_0j <- rep(u_0j, length=num_data) # Varying intercepts
cluster <- rep(1:num_basis, length=num_data) # Cluster indicator
Y_true <- beta_0j + as.vector(a0*X1 + a%*%B) # generating the output
So… i’m not 100% sure if this Y_true is well simulated for a mixed linear model with random effects and Splines, but it looks quite good to me. Any suggestions …?
#combine
dat <- list(Ni = length(Y_true2),
Nj = num_basis,
cluster = cluster,
Y = Y_true2,
X = X1,
B = B
)
sooooo and now to my Stan model. So far i just do normal b-splines without any smoothing splines or sth.
data {
// Define variables in data
int<lower=0> Ni; // Number of observations (an integer)
int<lower=0> Nj; // Number of clusters (an integer)
int<lower=1> cluster[Ni]; // Cluster IDs
vector[Ni] Y; // output plus noise, model vector?
vector[Ni] X; // X as in intervalls
matrix[Nj, Ni] B; // splines-matrix
}
parameters {
row_vector[Nj] a_raw; //rows of the b-splines
real a0; //intercept
real<lower=0> sigma; //std.dev.
real<lower=0> tau; //coeff of b-splines
real u_0j[Nj]; // Random effects
real<lower=0> sigma_u0; // Standard deviation of random effects
}
transformed parameters {
row_vector[Nj] a;
vector[Ni] Y_hat;
real mu[Ni]; // Individual mean
a = a_raw*tau;
for (i in 1:Ni) {
mu[i] = u_0j[cluster[i]]; // Individual mean
}
Y_hat = a0*X + to_vector(a*B); //Creating Splines
}
model {
a_raw ~ normal(0, 1);
tau ~ normal(0, 1);
sigma ~ normal(0, 1);
sigma_u0 ~ normal(0, 1); //try
u_0j ~ normal(0, sigma_u0); // Random effects distribution
for (i in 1:Ni)
Y ~ normal(mu[Ni] + Y_hat, sigma); //model
}
If someone has any ideas for getting a solution for my problem, also how to optimize my model or making it more efficient, i would be forever grateful.
Thanks in advance,
Annika