Hierarchical bayesian regression

I am trying to model the given equation using pystan

For a given cluster, there are n states.
Values of alpha and beta are different for each state within a cluster.

I have had success in modelling this differently, In my case,
I am assuming alpha and beta are same for each state within a cluster.
Model code that i came up with goes as,

"""
    data {
        int<lower=1>     N;                      // number of states
        int<lower=1>     M;                      // number of months
        row_vector[N]    X1[M];
        row_vector[N]    X2[M];
        row_vector[N]    X3[M];
        row_vector[N]    X4[M];
        row_vector[N]    X5[M];
        row_vector[N]    X6[M];
        row_vector[N]    D[M];
    }

    parameters {
        real                   alpha;
        vector[6]            beta;
        vector[6]            mu_beta;
        cov_matrix[6]    sigma_beta;
        cov_matrix[N]   sigma_err;
    }

    transformed parameters {
        row_vector[N]    D_hat[M];
        for(i in 1:M) {
            D_hat[i] = alpha + X1[i] * beta[1] + X2[i] * beta[2] + X3[i] * beta[3] + X4[i] * beta[4] + X5[i] * beta[5] + X6[i] * beta[6];
        }
    }

    model {
        alpha             ~    normal(0, 100);
        mu_beta        ~    multi_normal(rep_vector(0, 6), diag_matrix(rep_vector(100,6)));
        sigma_beta   ~    inv_wishart(7, diag_matrix(rep_vector(0.0001,6)));
        beta               ~    multi_normal(mu_beta, sigma_beta);
        sigma_err      ~    inv_wishart(N+1, diag_matrix(rep_vector(0.0001,N)));
        D                   ~    multi_normal(D_hat, sigma_err);
    }
    """

Now, I want to model it with states have different values for alpha and beta , but the representation for D_hat gets me all confused, for this case.

Can anyone assist me or provide some pointers that I can look into?

I understand that parameters block will be changed as follows, but representation for D_hat leaves me clueless.

parameters {
        vector[N]           alpha;
        vector[6]            beta[N];
        vector[6]            mu_beta;
        cov_matrix[6]    sigma_beta;
        cov_matrix[N]   sigma_err;
    }

And, Is there a way I can print values of D_hat[i] after each iteration???

It looks like the measurement process is something like, for every month (there are M of these) you make a measurement of every state (there are N of these).

D_hat looks like an array of row_vector s of means, one element in the row_vector per state, and one row_vector in the array for each month.

In the original model, alpha was the same for every state and every month. Now you have defined a different alpha for each state.

The clearer way to write this is maybe not do vectorization.

In that case the old code is:

row_vector[N]    D_hat[M];
for(i in 1:M) {
  for(j in 1:N) {
    D_hat[i, j] = alpha + ...;
  }
}

And with your new alpha you have:

row_vector[N]    D_hat[M];
for(i in 1:M) {
  for(j in 1:N) {
    D_hat[i, j] = alpha[j] + ...;
  }
}

Similarly with beta you have things like:

row_vector[N]    D_hat[M];
for(i in 1:M) {
  for(j in 1:N) {
    D_hat[i, j] = alpha[j] + X1[i] * beta[1, j] + ...;
  }
}

I think the code you wrote would have still worked (I could be missing something) if you defined alpha and beta like:

row_vector[N] alpha;
row_vector[N] beta[6];
1 Like

Yes.

Also yes.

Absolutely.

But, you see, X1[i] is row_vector of length N that means each element of X1[i] represents a value at month i for different states.
I need to do element wise multiplication of X1[i]with beta[1] and X2[i]with beta[2] and so forth.

Something like this,

row_vector[N]    D_hat[M];
for(i in 1:M) {
  D_hat[i] = alpha + X1[i] .* beta[1] + X2[i] .* beta[2] + ...;
}

Its still bit confusing but I think this should work.
Thanks for reaching out. :)

Hold on second.

There seems to arise a different problem now.

as you may see,

beta[i] for a given state is of size 6 and should be drawn from MVN distribution.

But doing so, will change the model, cause now beta[i] is length N

So i guess this works out, but need changes in the model itself.

I came up with different strategy

model_code = """
data {
    int<lower=1>     N;                      // number of districts
    int<lower=1>     M;                      // number of months
    matrix[M, 6]     X[N];
    vector[N]        cc[M];
}

parameters {
    vector[N]        alpha;
    vector[6]        beta[N];
    vector[6]        mu_beta;
    cov_matrix[6]    sigma_beta;
    cov_matrix[N]    sigma_err;
}

transformed parameters {
    vector[N]    cc_hat[M];
    for(j in 1:N) {
        for(t in 1:M) {
            cc_hat[t, j] = alpha[j] + dot_product(X[j, t], beta[j]);
        }
    }
}

model {
    alpha       ~    normal(0, 100);
    mu_beta     ~    multi_normal(rep_vector(0, 6), diag_matrix(rep_vector(10,6)));
    sigma_beta  ~    inv_wishart(6, diag_matrix(rep_vector(10,6)));
    beta        ~    multi_normal(mu_beta, sigma_beta);
    sigma_err   ~    inv_wishart(N, diag_matrix(rep_vector(0.01,N)));
    cc          ~    multi_normal(cc_hat, sigma_err);
}
"""

Good catch!

That looks cleaner.

1 Like