Thanks for the discussion,
I was curious whether simulating data is practically mandatory, even if it costs some time.
So far I have tended to build a model, test it on some real-world data, and check whether the inferred parameters match previously published facts about the data set or the underlying science.
I mainly used simulated data when I had to solve some issue and needed to identify which part of the model was problematic, if any, or whether the real-world prior knowledge I was using was itself insufficient or non-representative of the observations whose quantities I wanted to infer.
@jonah an example would be my deconvolution problem (estimating the proportions of cell types within a tissue from the overall “gene production” observed for that tissue, across many replicates). The challenging part was not the principle itself, which is quite straightforward, but how to treat the available prior information, which is often non-representative of the observed data. The tricky part was modelling the noise introduced by misleading prior knowledge, rather than the mathematical model of the deconvolution itself.
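To make the setup concrete, here is a minimal sketch of the kind of simulation I mean; all the names, dimensions, and noise levels (`true_signatures`, `reference`, the lognormal distortion) are illustrative assumptions, not my actual pipeline. Each bulk sample is a mixture of per-cell-type signatures, and the reference signatures the model actually sees are a distorted copy of the true ones, mimicking non-representative prior knowledge:

```python
import numpy as np

rng = np.random.default_rng(42)

n_genes, n_cell_types, n_replicates = 200, 4, 30

# True per-cell-type expression signatures (genes x cell types).
true_signatures = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_cell_types))

# True mixing proportions for each replicate (each row sums to 1).
true_props = rng.dirichlet(np.ones(n_cell_types), size=n_replicates)

# Observed bulk "gene production": mixture of signatures plus measurement noise.
bulk = true_props @ true_signatures.T
bulk += rng.normal(scale=0.1 * bulk.std(), size=bulk.shape)

# Reference signatures available as prior knowledge: a systematically
# distorted copy of the truth, mimicking non-representative references.
reference = true_signatures * rng.lognormal(sigma=0.3, size=true_signatures.shape)
```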
I realised that I had to backtrack many times, and building a solid simulated data set first has always been worth the time; the kind of minimal recovery check I mean is sketched below. I just wanted to hear the general opinion on best practices, to definitively rule out any bad habits.
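The recovery check itself, continuing from the sketch above; `nnls` here is just a hypothetical stand-in for whatever estimator the real model uses:

```python
from scipy.optimize import nnls

# Recover proportions per replicate against the distorted reference.
est_props = np.empty_like(true_props)
for i in range(n_replicates):
    coefs, _ = nnls(reference, bulk[i])
    est_props[i] = coefs / coefs.sum()  # renormalise onto the simplex

# A small error here says the estimator copes with this level of distortion;
# a systematic gap points at the reference, not the deconvolution itself.
print("mean absolute error:", np.abs(est_props - true_props).mean())
```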