Multivariate Skew Normal

Hi guys,

I am having to adapt a multivariate normal model into a skew multivariate normal model. I note that a SMN pdf is not yet part of Stan, however it does make an appearance in the manual on page 346:

This reparameterization of a multivariate normal distribution in terms of standard normal variates can be extended to other multivariate distributions… such as the skew multivariate normal distribution.

Are there any examples or similar I might have missed to that would show what form of reparameterisation would be correct for a multivariate skew normal, it is definitely not obvious to me!

Cheers

http://arxiv.org/abs/0911.2093

Yeah, I came across that article from a similar topic, however I found it rather opaque to understand what to extract from it unfortunately. Any pointers you can give me?

I think 5.1 is the relevant section, so would this be a correct way of reading their paper. I’ve mocked up a similar model on my workstation and been unable to convincing results, so I might be misunderstanding.

// Fitting a population of  K dimensional events to a multivariate skew normal
data {
    int<lower=2> K;
    int n_obs;
    vector[K] obs [n_obs];
}
parameters {
     vector[K] means;
     vector[K] sigmas;
     cholesky_factor_corr[K, K] correlation;
     vector[K] shapes;
}
transformed_parameters {
    matrix[K,K] cov;
    row_vector[K] offsets;

    cov = diag_pre_multiply(sigmas, correlation);           // From after Eq 10
    offsets = shape' *  diag_matrix(inv(sigmas));   // From Eq 10

}
model {
    for (i in 1:n_obs) {
        target += normal_lpdf(obs[i] | means, cov) + normal_lcdf(offsets * (obs[i] - means) | 0, 1);
    }
}

I think it is

data {
  int<lower=0> N;
  int<lower=2> K;
  vector[K] Y[N];
}
parameters {
  vector[K] xi;
  vector<lower=0>[K] omega;
  vector[K] alpha;
  cholesky_factor_corr[K] L;
}
model {
  target += multi_normal_cholesky_lpdf(Y | xi, diag_pre_multiply(omega, L));
  for (i in 1:N)
    target += normal_lcdf(dot_product(alpha, (Y[i] - xi) ./ omega) | 0, 1);
  // priors  
}
2 Likes

Ah yes, this is a much better implementation! Thank you for the help!

Hello! Sorry to dredge this up, but I’m dipping my toes into the waters of Stan, and am hoping to solve this exact issue. I can find blueprints for skewed normal, and multivariate normal distributions, but not the combination of the two. I’m working with the implementation posted by bgoodri above, and am curious how one would go about pulling simulations/samples with this sort of framework (specifically using the ‘target’/summed model structure)? Forgive me if this is something easily found in the documentation, if so, I’ve had trouble finding it and could use some direction. Thanks!

If you run that code above or something equivalent to it, then the resulting object (or csv file) contains thousands of draws of xi, omega, alpha, and L. You can get them individually with the rstan::extract function or collectively by calling as.matrix, as.data.frame, etc.

Thanks! This might not be the right way to ask the question, but I suppose what I’m getting at is generating posterior samples/datasets to see how well the resulting parameters reflect the original data. In functional examples I’ve worked with, the samples are being generated via the particular distribution’s rng function. Since there isn’t an existing one for this MV skew distribution, I haven’t been sure how to set up something equivalent.

Though, I suppose there are other non-Stan rng functions available the parameters could be fed in to.

Equation (1) in the linked paper gives you the definition how to sample a skew multivariate normal.
Otherwise look here: https://rdrr.io/cran/miscF/src/R/rmvst.R

Great, I’ll take a look at the paper, and that particular function was my next thought, in fact. Thanks for your help!

Hi there, I am trying to adapt this model to the factor analysis model developed by r.farouni:

where the likelihood samples are drawn from a multivariate normal. My question is that, since in this case the covariance matrix is generated through another process, I am not sure what the scale or omega vector would be. Should I just assume it to be an identity matrix?
I am sorry if my question is too trivial, but I am kind of new to the field.
Thanks in advance!