Mixture Model with joint probability

Jennifer · May 31, 2017, 1:49pm

Hi,

I am very new to stan and I am trying to program a mixture model with joint probability for a clustering/classification of data.
I have two data sets (descriptors), distances and angles, and I am trying read in both and consider both to find classes in my data points.

For now I have a model which uses one data set and calculates classes, but I have no idea how to manage it, to include both datasets into it.

My code:

    data {
int         K;  // number classes
int         N;  // number of data points
vector[N]   y;
}
parameters{
vector <lower=0, upper=16>  [K] means;
vector<lower=0>[K]  sigma;
simplex [K] weights;
}
model{
  for (n in 1:N) {
    vector [K] tmp;
    for (k in 1:K) {
        tmp[k] = log(weights[k]) + normal_lpdf(y[n] | means[k], sigma[k]);
    }
    target += log_sum_exp (tmp);
  }
}

I would appreciate any help!

sakrejda · May 31, 2017, 1:56pm

Looks like you’ll need to make y an array of N 2-vectors like this:

vector[2] y[N];

Your means will be a K-length array of 2-vectors:

vector[2] y[K];

Your sigma will be a K-length array of covariance matrices

matrix[2,2] sigma[K];

Then you can use a the multivariate normal in place of your normal_lpdf, and you will need to estimate a covariance matrix instead of just a s.d… Instead of just jumping into that you might want to code up fitting a simple multivariate normal to some data so you get all the syntax down.
There’s also a nice section in the manual on optimizing the MVN and what priors to use.

Hope that helps.

Krzysztof

Jennifer · June 1, 2017, 8:10am

thank you very much! I am going to try this, hopefully I get it right

Jennifer · June 1, 2017, 9:55am

So I don’t know if I understand this, just ‘replacing’ the vectors and the normal_lpdf is not sufficient?

   data {
int         K;  // number classes
int         N;  // number of all data points
vector[2] y[N];
}
parameters{
vector[2] means[K];
matrix[2,2] sigma[K];
simplex [K] weights;
}
model{
  for (n in 1:N) {
    vector [K] tmp;
    for (k in 1:K) {
        tmp[k] = log(weights[k]) + multi_normal_lpdf(y[n] | means[k], sigma[k]);
    }
    target += log_sum_exp (tmp);
  }
}

Also, where exactly do I find the section on optimizing the MVN? Is this p 147/148?

Bob_Carpenter · June 21, 2017, 8:19pm

You need to declare covariance matrixes as such to make sure they remain positive definite. It’s even better to use Cholesky factors of covariance matrices.

You may have problems here because you don’t have a prior on means or sigma (I’d suggest mu and Sigma here to follow conventional notation; means and covariance would also be consistent).

You also want to compute those log weights only once, so all together, that’d look like:

parameters {
  cholesky_factor_cov[2] L_Sigma[K];
  vector[2] mu[K];
  ...
model {
  vector[K] log_weights = log(weights);
  ...
    tmp[k] = log_weights[k] + multi_normal_lpdf(y[n] | mu[k], L_Sigma[k]);

We like using Cholesky factors of correlation matrices, providing some LKJ prior, then scaling, also with priors.

P.S. Sorry this didn’t get answered sooner—I was missing messages because you have to explicitly tell Discourse to let you know about new messages in topics in which you haven’t participated.

Topic		Replies	Views
Joint Probability with von Mises Modeling	1	756	June 8, 2017
Mixture Models with heterogeneous mixtures Modeling	2	314	November 17, 2023
Mixture model: Different perspective to the schools example Modeling	35	1267	April 2, 2020
Mixture of Multivariate Bernouilli Modeling	5	883	September 29, 2019
Modeling multi modal distribution Modeling fitting-issues	3	639	March 19, 2023

Mixture Model with joint probability

Related topics