# Multilevel Gaussian mixture regression model

Help needed in modelling

Implementing a multilevel model as follows:

Level 1 :

y = intercept + x * slope + error

Level 2 :

slope represents Gaussian Mixture Model

=> GMM(slope | mean, S.D)

``````data {
int<lower=1> N; // number of subjeccts (assumed numbered 1 to N)
int<lower=1> D; // number of data points
real<lower=0,upper=100> y[D];  // Feature
real<lower=0> time[D];        // timing of visits
int<lower=1,upper=N> sub[D];  // ID of subjects
int K; // Number of gaussians
}

parameters {

// Level 1 Parameters for Linear Model.

real a[N];              // intercept, varies by subject
real b[N];              // slope over time, varies by subject
real<lower=0> err;      // error of the regression

// Level 2 Parameters for Gaussian Mixture model.

ordered[K] mu;
real<lower=0, upper=1> theta;
real<lower=0> sigma[K];

}

transformed parameters {
real y_hat[D];
for (i in 1:D) {
y_hat[i] = a[sub[i]] + b[sub[i]] * time[i];
}
}

model {
// the hyper-model
for (j in 1:N) {
a[j] ~ normal(0,1);
b[j] ~ normal(0,1);
}

// the base level of the model
for (i in 1:D) {
y[i] ~ normal(y_hat[i],err);
}

sigma ~ normal(0, 1);
mu ~ normal(0, 1);
theta ~ beta(5, 5);

for (n in 1:N){
target += log_mix(theta,
normal_lpdf(b[n] | mu[1], sigma[1]),
normal_lpdf(b[n] | mu[2], sigma[2]));

}

}
``````
1. Would like to know whether the model is correct.
2. Experiencing Low E-BFMI.

One thing I’ve noticed is that `b` influences the density twice:

``````b[j] ~ normal(0,1);
``````

and

``````           target += log_mix(theta,
normal_lpdf(b[n] | mu[1], sigma[1]),
normal_lpdf(b[n] | mu[2], sigma[2]));
``````

I.e. you are saying the `b` should belong to two distributions at the same time. This could easily lead to the problems you describe. You probably want to remove the first one.

Also not that the sampling statements are vectorized, so you can have:

``````a ~ normal(0,1);

y ~ normal(y_hat,err);
``````

avoiding the for loops and slightly increasing performance.