Modeling electropherogram data using Stan

Hey, I am fairly new to Bayesian modeling and I am trying to use it to model electropherogram data. I apologize if the question seems a little vague, as I am not sure where to begin. I have traces that are generated when molecules pass a fluorescence detector. There are at most 10 types of signal, and each point in a trace comes from one of these 10. I was thinking of a hierarchical model in which each trace in the population of traces is drawn from some Dirichlet distribution that gives the probability of the presence or absence of each of these molecules in the trace.

After this I want to draw the intensity of each type of molecule from some other underlying distribution, but I am not sure how to do this. The intensity can range from 50 to 1e6: how do I draw values from such a huge range and still maintain the hierarchy, or connectedness, of my model? I also feel like I am missing some intermediate steps, because there is no spatial localization in my model. Even if I infer the intensities and the parameters of the Dirichlet, I still don't know which points in the trace correspond to which type of molecule.
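To make the Dirichlet step described above concrete, here is a minimal simulation sketch in plain Python (not Stan). It assumes a symmetric Dirichlet over the 10 signal types and labels each point in a trace independently. The function names, the concentration value, and the trace length are all hypothetical and chosen only for illustration.

```python
import random


def draw_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def simulate_trace_types(n_types=10, n_points=200, concentration=1.0, seed=1):
    """Simulate per-point molecule-type labels for a single trace.

    Assumption: points are labeled independently given the trace-level
    proportions, so there is no spatial structure along the trace.
    """
    rng = random.Random(seed)
    alpha = [concentration] * n_types  # assumed symmetric prior
    theta = draw_dirichlet(alpha, rng)  # trace-level type proportions
    labels = rng.choices(range(n_types), weights=theta, k=n_points)
    return theta, labels


theta, labels = simulate_trace_types()
```

Because each label is drawn independently, this sketch reproduces the limitation raised above: nothing in it ties a molecule type to a position in the trace.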

That sounds like something you’d model on a log scale.
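As a rough illustration of what the log scale buys you, here is a small simulation sketch in plain Python (not Stan). It assumes a two-level hierarchy on the natural-log scale: each molecule type gets a mean log-intensity drawn from a population-level normal, and each trace gets its own log-intensity around that mean. The hyperparameter values and all names are made up for the example.

```python
import math
import random


def simulate_log_scale_intensities(n_types=10, n_traces=5, seed=0):
    """Simulate hierarchical intensities by working on the log scale."""
    rng = random.Random(seed)

    # Assumed hyperparameters: log(50) is about 3.9 and log(1e6) is about
    # 13.8, so the population mean is centered between the two.
    mu_pop = 0.5 * (math.log(50.0) + math.log(1e6))
    sigma_pop = 1.5    # assumed spread of type-level means
    sigma_trace = 0.3  # assumed trace-to-trace variation within a type

    # Type-level mean log-intensities drawn from the population level.
    type_means = [rng.gauss(mu_pop, sigma_pop) for _ in range(n_types)]

    traces = []
    for _ in range(n_traces):
        log_intensity = [rng.gauss(m, sigma_trace) for m in type_means]
        # Exponentiating maps back to strictly positive intensities.
        traces.append([math.exp(v) for v in log_intensity])
    return type_means, traces


type_means, traces = simulate_log_scale_intensities()
```

On the log scale the range 50 to 1e6 spans only about 10 units, so ordinary normal priors cover it comfortably. In Stan this would roughly correspond to a `lognormal` distribution on the intensity, or a normal model on `log(intensity)`.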

Whatever you build, you’re probably going to have to do it in pieces, so I wouldn’t expect to know how everything fits together before you start.

I like the golf example as a generic starting point: Model building and expansion for golf putting

Without knowing more about what you’re doing, the place to start is probably plots of your data. (Edit: if you have a couple of plots of your data, post them here and it’s possible someone will recognize something.)
