How to decide on whether to use a Multivariate Gaussian Mixture model?

xenia · March 10, 2021, 1:10pm

Hi,

I am new to Stan and Bayesian analysis and am looking at modelling some spectroscopic (UV-Vis) data. For this kind of ‘signal’ data, with discrete counts over a range of energies and a number of peaks (some of which can be indistinguishable), do you think a multivariate Gaussian mixture model is appropriate?

Are there any resources dedicated to getting started with MV-GMMs in Stan in any case?

Thanks for any assistance and resources.

torkar · March 10, 2021, 1:57pm

What are the outcome(s)? What are the predictors?

xenia · March 10, 2021, 2:26pm

I want to be able to get an estimate of parameters, and possibly be able to identify the number of peaks within the data (this is often indistinguishable), so that the absorbance can be identified. The parameters include the FWHM, amplitudes and any peak positions. Eventually this will also include a ‘background’ estimate for example.

torkar · March 10, 2021, 3:09pm

Let me rephrase this since I’m a bit slow :)

If you want to predict y, what will you use to predict y, i.e., what are x and z here,

y ~ x + z

Please explain what types the variables are, e.g., real positive number.

xenia · March 10, 2021, 3:18pm

It is my fault, sorry, I misunderstood your question. Y, i.e. the excitation here, can be modelled as a Gaussian function of the wavelength (x) to account for the peak shape, so e(v) ~ A*exp(-((x-xo)/fwhm)^2),
where xo = centre/peak position, x = wavelength values, fwhm = peak width and A = amplitude. All of these are real and positive.
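For readers who want to evaluate the peak shape described above, here is a minimal Python sketch. The function names (`gaussian_peak`, `spectrum`, `width_to_fwhm`) and the constant background term are illustrative assumptions, not something from the post. One point worth noting: in the form A*exp(-((x-xo)/w)^2), the width parameter w is not literally the full width at half maximum. The curve drops to half its height at |x-xo| = w*sqrt(ln 2), so the FWHM is 2*sqrt(ln 2)*w.

```python
import math

def gaussian_peak(x, A, xo, w):
    """Single peak A * exp(-((x - xo) / w)^2), the form written in the post."""
    return A * math.exp(-((x - xo) / w) ** 2)

def spectrum(x, peaks, background=0.0):
    """Sum of several peaks plus a constant background (an assumption).

    peaks is a list of (A, xo, w) tuples.
    """
    return background + sum(gaussian_peak(x, A, xo, w) for A, xo, w in peaks)

def width_to_fwhm(w):
    """Convert the width parameter w of the form above to a true FWHM."""
    return 2.0 * math.sqrt(math.log(2.0)) * w
```

If the parameter is meant to be the FWHM itself, the exponent would need the factor 4*ln(2), i.e. exp(-4*ln(2)*((x-xo)/fwhm)^2).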

torkar · March 10, 2021, 3:42pm

Aah, ok, so it’s a nonlinear formula you need? If you want to try it out a bit before coding all in Stan then perhaps run it with brms first?
https://cran.r-project.org/web/packages/brms/vignettes/brms_nonlinear.html

i.e.,

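library(brms)

# Nonlinear Gaussian peak: A, xo and fwhm are parameters, x is the predictor
m <- brm(
    bf(e ~ A * exp(-((x - xo) / fwhm)^2), A + xo + fwhm ~ 1, nl = TRUE),
    family = gaussian(),
    data = d,
    prior = yourPriors
)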

Make sure to do prior predictive checks and simulate some data first to see that you recover the parameters properly.
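The simulate-then-recover idea can be sketched without Stan or brms. The Python example below is only an illustration under stated assumptions: the true values, noise level, wavelength grid and search grids are all made up, and the fit is a plain least-squares grid search (with the amplitude solved in closed form for each candidate shape), not a Bayesian fit. The same check applies to a Stan model: simulate from known parameters, fit, and compare.

```python
import math
import random

def peak(x, A, xo, w):
    """Peak shape from the thread: A * exp(-((x - xo) / w)^2)."""
    return A * math.exp(-((x - xo) / w) ** 2)

def simulate(xs, A, xo, w, sigma, seed=1):
    """Noisy observations of one peak; Gaussian noise is an assumption."""
    rng = random.Random(seed)
    return [peak(x, A, xo, w) + rng.gauss(0.0, sigma) for x in xs]

def fit_grid(xs, ys, xo_grid, w_grid):
    """Least-squares grid search over (xo, w).

    For a fixed shape the model is linear in A, so the best A is
    sum(y * shape) / sum(shape^2).
    """
    best = None
    for xo in xo_grid:
        for w in w_grid:
            shape = [peak(x, 1.0, xo, w) for x in xs]
            A = sum(y * s for y, s in zip(ys, shape)) / sum(s * s for s in shape)
            sse = sum((y - A * s) ** 2 for y, s in zip(ys, shape))
            if best is None or sse < best[0]:
                best = (sse, A, xo, w)
    return best[1], best[2], best[3]

# Hypothetical "true" parameters and wavelength grid (nm), chosen for illustration.
xs = [float(x) for x in range(350, 551)]
truth = (1.0, 450.0, 20.0)
ys = simulate(xs, *truth, sigma=0.02)

A_hat, xo_hat, w_hat = fit_grid(xs, ys, range(400, 501), range(5, 41))
print("true:", truth, "recovered:", (round(A_hat, 3), xo_hat, w_hat))
```

If the recovered values drift far from the truth even on simulated data, the problem is more likely identifiability (for example, strongly overlapping peaks) than the sampler.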

torkar · March 10, 2021, 4:12pm

I would try a more straightforward approach first. Have you seen any GMM examples in Stan?

xenia · March 10, 2021, 4:17pm

I have tried fitting this in PyStan using a general nonlinear approach. Although the fit looks visually good and converges, the parameter values have high levels of deviation, so I was looking at whether another model could better describe the data. But I think I may be out of my depth here.
