The Monster Model | Hierarchical ODE models

Funko_Unko · May 25, 2021, 11:18am

So I was actually looking around for something else (scalability/massive parallelism) when I stumbled across this thread: Scalable Bayesian multilevel modeling - #31 by andrewgelman

As I understood the model/paper it describes a combination of a parmacokinetic/toxicokinetic model, a population model and a measurement model, so nothing especially fancy.

I was wondering whether this (or something similar) has made it into a case study yet? Or did difficulties occur with this particular model (which I guess would improve a case study)? In sum, I’m looking for “the” ideal workflow and common pitfalls for hierarchical ODE models.

Searching the internet for “stan monster model” was only semi-successful, as was searching the forums.

wds15 · May 25, 2021, 2:13pm

@charlesm93 has worked on this model quite a bit as I recall. I would hope that the adjoint solver is helpful here to make progress more quickly, but the model is complex was my impression.

charlesm93 · May 25, 2021, 3:16pm

Hi,
This is indeed one of my pending projects with @andrewgelman . It has a lot of moving parts: a stiff ODE that is highly sensitive to parameter values, sparse data, a low number of patients (which makes it it difficult to estimate inter-individual variability), hierarchies across multiple parameters, etc. The workflow is complicated by the fact each model fit takes several hours, even after firing up the cluster and parallelizing within chains.

I fitted simplifications of the model and got reasonable results. Fixing one or two parameters, or the population variance greatly simplifies the problem. The full model returns divergent transitions and I haven’t found the right parameterization. I tried monolithic centering and non-centering, and things in between. I now suspect the problematic geometry may not be only due to the interaction between population and patient parameters. It’s indeed possible to have a funnel between population parameters, in which case no parameterization can help you. Divergences aside, I don’t trust the estimated parameters, because there are somewhat inconsistent with the priors which were very carefully constructed. Truncating priors, as was done in the original paper, limits our ability to reparameterize the model. When using truncation, I find a lot of the probability mass concentrates at the boundary (note that on the unconstrained scale, you actually have infinite volume near the boundary). All that said, you shouldn’t need to truncate the priors, because the truncation occurs at the tail of the prior distributions.

It’s also worth pointing the original paper didn’t use HMC, so they didn’t have access to diagnostics such as divergent transitions. You can also imagine how an accept-reject sampler will interact differently with a truncated distribution…

To sum up, I agree this would make a very good case study and writing it has been my ambition for some time. But it’s a challenging model to fit, with behaviors I don’t fully understand (yet). There exist other examples of hierarchical ODE models, built and used with Stan, which can be considered, if the goal is to write a case study. But yes, I think we have a lot to learn from the Monster model.

Funko_Unko · May 25, 2021, 4:45pm

Thank you @charlesm93 very much for the elaborate response, this is much more than I was hoping for :)

I was wondering whether we could actually expect the model to “just work”, there was this sentence in the paper that made me a bit suspicious:

after 15,000 iterations, most [\sqrt{\hat{R}}-values] were less than 1.1. The highest value is 1.36, […]

and as you say, they did not use HMC, but “a variant of the Gibbs sampler”, which I think should be much less efficient than HMC, so I don’t quite know what the 15k or 5x50k iterations correspond to, in addition to the missing built-in diagnostics.

The other thing that made me suspicious was that the first mention of someone (you) working on it was mid/end 2019, but I could not find a case study.

I do have several goals. My first and foremost is to learn how to fit models like the Monster model (efficiently). Then, I would like to try out some ideas I had that might help speed things up and that might help debug more quickly at the same time. A model where a single, even preliminary run takes several hours must pose a serious challenge to “the” iterative Bayesian workflow. Of course, all these things get simplified a lot if there already exists some sort of reference solution from which to learn and against which to compare.

I would be very interested in what you have tried so far, and what you have determined works well or not so well.

Thank you again!

Edit:

Ah, also, @wds15: I thought the per person ODEs were not that horrible? Though, on second thought, I think they do have 16 parameters each?

Funko_Unko · May 26, 2021, 8:00pm

So what is actually even going on per person? Meaning, what is the ODE that (supposedly) models the process?

I understood this as follows, but not everything adds up for me:

From exposure to measurement Cair (concentration in ambient air) switches (is piecewise constant). Calveolar ( concentration in alveolar air) is just Cair?
Between Calveolar and Cart (concentration in arterial blood) we have instantaneous equilibrium, i.e. dCart/dT=Cart-Cair/Pba=0?
Cvenous (concentration in venous blood) is just the weighted combination of F (blood flow fractions per compartment) with C (concentration per compartment), i.e. Cvenous=dot_product(F,C)? This gets measured?
Within the compartments we just have the linear ODE dC/dt=(Cart - C/P)*F/V, except for the liver-compartment where we have the additional term Vmax*C/[V*(K+C)]

With this you can almost set everything up, except:

What’s going on with the units? Where are they, what are they? t gets measured in what, hours? The figures make it look so, but I don’t think it’s stated explicitly.
What’s the function of VPR?
The sum of the given volumes is restricted to .873(?), but we don’t have a volume for the fat? Supposedly this gets measured?
The concentration in exhaled air is also supposed to be measured. But isn’t this just the venous blood concentration?

Ignoring VPR (and the summation constraints) I can easily fit this model (for simulated data) with (I think) the provided priors and truncated parameters within the order of minute(s), so I strongly suspect I’m missing something crucial:

What’s going on there?

charlesm93 · May 27, 2021, 1:32am

@Funko_Unko Give me your github credential and I’ll share a (private) repository with an attempt at the model. I had to dig through the old code and pick Andrew’s brain to get a grasp on what was going on.

Funko_Unko · May 27, 2021, 6:35am

Cool, that would be just @funko-unko, but I’ll also send you a PM.

This probably went as swimmingly as me trying to reconstruct the model from the paper. Makes you appreciate when people “just” provide the Stan file and what Stan must have done for reproducibility.

Funko_Unko · June 7, 2021, 12:45pm

Okay, so with the approval of @charlesm93 I’ll post the ODE describing the pharmacokinetic model. I’ll tack on a short derivation from first principles for good measure and I’ll tag people that probably could tell me at once if it is absurd: @wds15, @yizhang, @andrewgelman

It’s actually not that wild.

First: In addition to the parameters in Table 1 of the original paper, there are four additional parameters that are measured and assumed to be known “exactly”:

The lean body mass [kg],
The mass fraction of fat [N/A],
The pulmonary volume flow [l/min],
The concentration of PERC in the air during exposure [PPM] with 1PPM=409/144 mug/l.

From the lean body mass [kg] and the mass fraction of fat [N/A] we can compute the total volume of fat [l] from the density of fat (.92 g/ml). The body of each person gets modeled as four distinct compartments which are connected with each other via the circulatory system, mixing the blood “coming out” of each compartment, then passing through the lungs and mixing with in/exhaled air, and then again feeding the blood into each compartment. As described in the original paper, section 2.1, the concentration C_s [mg/l] in each compartment follows the ODE

dC_s/dt=(C_{art}-C_s/P_s)F_s/V_s,

except for the liver compartment where we have the additional mass sink

V_{max}C_s/(V_s(K_m+C_s)).

The latter term is just the Michaelis–Menten ansatz, while the first term derives from the conservation of mass for each compartment. Namely, for the mass in each compartment M_s=C_sV_s [mg] from mass conservation (rate of change in mass = mass flow in - mass flow out) we have

dM_s/dt=(C_{art}-C_{out,s})F_s

where F_s [l/min] is the blood volume flow per compartment, C_{art} [mg/l] is the PERC concentration in the artertial blood (coming from the lungs) and C_{out,s} [mg/l] is the PERC concentration in the blood exiting the compartment. Assuming equilibrium at the compartment exit we get

C_{out,s}=C_s/P_s,

which together with the mass balance yields the original ODE.

There are now still three parts missing. We still need

an expression for C_{art} [mg/l] (the PERC concentration in the blood exiting the lungs),
an expression for C_{venous} [mg/l] (the PERC concentration in the mixed blood leaving the compartments and entering the lungs) and
an expression for C_{exhaled} [mg/l] (the PERC concentration in the exhaled air).

The latter two quantities get measured (with lognormal noise).

The simplest part is the expression for C_{venous}. This is just the concentration of the mixed blood flows exiting the compartments, i.e.

C_{venous} = \sum_s C_{out,s}F_s/F_{venous}

with the total blood flow F_{venous} = \sum_s F_s.

The concentration in the exhaled air can be derived from the assumption of equilibrium between “alveolar air […] and arterial blood” (section 2.1). Therefore we just get

C_{exhaled} = C_{art} / P_{blood/air}.

Finally, C_{art} we get again from conservation of mass. The mass flow of PERC entering the lungs has to be equal to the mass flow exiting the lungs. Crucially, there are two ways that PERC can enter and exit the lungs, via the inhaled air+the incoming venous blood or via the exhaled air+the outgoing venous blood. The mass flow [mg/min] is just the concentration [mg/l] times the volume flow[l/min], hence we get

C_{inhaled}F_{alveolar}+C_{venous}F_{venous}=C_{exhaled}F_{alveolar}+C_{art}F_{art}.

With F_{venous}=F_{art} (volume flow of PERC is negligible compared to volume flow of blood) and plugging in C_{exhaled} = C_{art} / P_{blood/air} we get

C_{art} = \frac{C_{inhaled}F_{alveolar}+C_{venous}F_{venous}}{F_{alveolar}/P_{blood/air}+F_{venous}}.

There are actually still some things missing.

First, F_{alveolar} gets computed from the given pulmonary volume flow via the magical constant .7 as F_{alveolar}=.7F_{pulmonary}.

Furthermore, F_{venous} gets computed using the Ventilation/perfusion ratio (VPR) from F_{alveolar}
as F_{venous}=F_{alveolar}/(VPR).

Stay tuned or not while I update this post.

There are two more magical things we have to know before we can hopefully reproduce the results from the paper.

Apparently, the V_{max} [mg/min] from the equations is not the maximal metabolic rate from table 1. Instead, VMI from table 1 has units [mg/min / kg^(-.7)], such that
V_{max} = \mathrm{VMI}*\mathrm{bodyWeight}^{0.7}.
The relative volumes of the four compartments excepting the fat sum to a magically precise .837 (relative to the lean body volume).

The first point makes me wonder whether I’ve got the units right for VMI ([mg/min]?) and KMI([mg/l] to match the C_s?).

As for reproducing the results from the paper, we’ll not do this (yet). For now, consider that

and that

Before we try to fit the full model it might be helpful to look at the ODE a little bit more closely. The whole ODE reads
dC_s/dt=(C_{art}-C_s/P_s)F_s/V_s-\delta_{s,4} V_{max}C_s/(V_s(K_m+C_s))
where s runs from 1 to 4 and \delta_{i,j}=\mathrm{int}(i==j).

There are two things which are not immediately obvious from this formulation. Neglecting the Michaelis-Menten term

this is a linear ODE and
there should be a perfect anticorrelation between the volumes V_s and the partition coefficients P_s.

Claim 1:
The only term in the ODE without the Michaelis-Menten term that could potentially be nonlinear is C_{art}. However, it’s not:
C_{art}=\frac{C_{inhaled}F_{alveolar}+C_{venous}F_{venous}}{F_{alveolar}/P_{blood/air}+F_{venous}}
The only term that depends on our time-dependent state C_s is C_{venous}, which in turn is just
C_{venous}=\sum_s C_{out,s}F_s/F_{venous}=\sum_k C_{k}F_k/(F_{venous}P_k).
All in all we get
dC_s/dt = (\frac{C_{inhaled}F_{alveolar}+\sum_k C_{k}F_k/P_k}{F_{alveolar}/P_{blood/air}+F_{venous}}-C_s/P_s)F_s/V_s
which is linear in the state C_s. We may write the ODE as
dC_s/dt = \sum_k A_{sk}C_k+f_s
with constant source term (which is zero after exposure)
f_s = \frac{C_{inhaled}F_{alveolar}}{F_{alveolar}/P_{blood/air}+F_{venous}}F_s/V_s
and matrix
A_{sk}=\frac{F_s}{V_s}(\frac{F_k}{P_k(F_{alveolar}/P_{blood/air}+F_{venous})}-\frac{\delta_{s,k}}{P_s}).
Claim 2:
The second claim becomes clear once we reformulate the ODE in terms of C_{out,s}=C_s/P_s. We get
dC_{out,s}/dt=dC_s/dt/P_s = \sum_k B_{sk}C_{out,k}+g_s
with new source term
g_s = f_s/P_s = \frac{F_s}{V_s P_s}\frac{C_{inhaled}F_{alveolar}}{F_{alveolar}/P_{blood/air}+F_{venous}}
and new matrix
B_{sk}=A_{sk}\frac{P_k}{P_s}=\frac{F_s}{V_s P_s}(\frac{F_k}{F_{alveolar}/P_{blood/air}+F_{venous}}-\delta_{s,k}).

Stay tuned for the next part where we reveal

the exact solution of the linear ODE
an attempt at fitting the linear ODE with the original parametrization
a hopefully successful attempt at fitting a reduced linear ODE
a tool to quickly explore the full, nonlinear ODE and
a hopefully successful and quick fit of a reduced nonlinear ODE.

andrewgelman · June 8, 2021, 1:54am

I’d be soooo happy if someone could get the Monster model running in Stan. It’s a great example but it would be so much better if it were runnable. Also it’s buggin the hell out of me that we could it just fine using Metropolis 25 years ago but now we’re getting stuck when using a much better algorithm on computers that are a zillion times faster. So I’d really really really like to get this one straightened out. Anyone who can do this, I promise you a free homemade loaf of bread, an advance look at some blog posts 6 months in the future, and pretty much whatever else you want from me.

Funko_Unko · June 9, 2021, 9:23am

Hm, just a short, not-yet-bread-worthy update. I’ll have to work on something else for now, but will revisit this at some later point. Everything that follows has been tested with just a few simulated data sets with the true parameters drawn from the population posterior as specified in the original paper, table 1, so take it with a grain of salt:

The linear ODE has a closed form solution, see e.g. Wikipedia
The linear ODE with the original parametrization actually fits (almost) fine with and without truncation. For large measurement errors, things are significantly easier than if the measurement is precise. However, for “precise” (log-std < .1) measurement, there is some unexpected “true” multimodality in the population-std parameters. “True” in the sense that we observe different modes in the marginal distributions, chains get stuck in those modes, but Rhat(lp__) is fine (<1.01). There may be a bug in my code.
I’m actually not sure whether the reduced linear ODE fits better or worse than the original ODE. It fails or works for the same problems. But I think I remember that it worked better than the original ODE, maybe it was just lucky before?
Both linear ODEs fit in the order of seconds or minutes, depending on the magnitude of the measurement noise. The difference in runtime is due to much larger n_leapfrog__ for precise measurements, as then the posterior is “more constrained”.
For any of the ODEs, the marginal distributions of many of the parameters do not change much.
To cheaply fit the full nonlinear ODE we can in principle make use of a Strang splitting. In practice, there are some floating point overflow issues, so instead of a splitting of the flow into two exact flows, I currently use the exact flow of the linear part + an approximation of the flow of the Michaelis-Menten part obtained via a single step of the implicit Euler method, which can be written down explicitly
We can monitor the convergence of our splitting method by recomputing lp__ (e.g. using `log_prob_grad` method for CmdStan mimicking its `generate_quantities` method) for the whole fit with more sub-steps. With this we can estimate neff and decide whether we need more sub-steps. For the whole of the prior we need quite a few sub-steps to be sufficiently precise, but the prior is very wide. For the whole of the posterior we need significantly fewer steps. How many steps we will need depends on the parameter values (larger VMI and KMI require more sub-steps because a linear approximation gets worse) and on the magnitude of the measurement noise (larger measurement noises allow coarser approximations).

There are a few more ideas I’ll try when I come back to this problem.

Ah, I forgot this:

To cheaply explore the qualitative behavior of the posterior of the nonlinear model, we may just use a fixed number of sub-steps for the data-generation and fitting. Then fitting the approximation of the full nonlinear model which hopefully behaves qualitatively the same as the true nonlinear model takes again in the order of seconds to minutes.

Oh, I also forgot to attach the obligatory excessively large picture:

Blue/Red: Prior/Posterior
The population-std of Pba misbehaves.
There are two divergences that can be eliminated, but I broke the part of my code that does this.

Almost everything should be as in the paper, except for

the prior on the population-stds, which is lognormal with std 1/2. The picture looks slightly worse with the original priors.
All timepoints are equidistant, including the exposure start and stop time (this allows us to reuse the matrix exponential) and at all timepoints blood and air concentration get measured
The data are synthetic, obtained via bdf with abstol=reltol=1e-12.
There’s no truncation.

(The prior on the measurement noise appears different than in the paper because we sample from the prior, but this doesn’t work with the original prior. For fitting we use the original prior.)

We used 8 sub-steps, which appears to have been enough. It looks like the true parameters (black vertical lines) are recovered quite nicely.

Fitting took ~ 3 minutes. Rhat(lp__) < 1.01, E-BFMI ~= 1.

Funko_Unko · June 10, 2021, 6:45am

Turns out there was a bug in my code. Namely, I accidentally constrained the geometric standard deviation population_eS to be positive, instead of greater than 1. With the personwise parameters obtained from a unit normal via

person_params[person] = (
  population_eM .* pow(population_eS, unit_log_person_params[person])
);

this indeed led to the observed non-identifiability, as

pow(population_eS, unit_log_person_params[person]) == pow(1/population_eS, -unit_log_person_params[person])

Fixing this, the population-gstd priors and slightly changing the parametrization such that

vector<lower=1>[no_latent_params] population_eS = exp(
    exp(log_log_population_eS_mu + unit_log_population_eS ./ population_eS_nu)
  );

makes the model run quite nicely:

Observe that the average number of leap frog steps dropped “dramatically” from ~316 per transition to only 61. All Rhat values are fine (<1.01), there are 8 divergences but this (I believe) is only because my timestep adaptation for the sampling is overly optimistic, and the divergences should be able to be eliminated.

I fixed some other thing (forgot the scaling of VMI) and now we also observe the shift towards the true lower values for VMI. There is one more thing that confuses me, namely the units of KMI. Figure 4d in the paper makes it look like it has units mg, but to match the equation from the paper it should have the same units as a concentration (mg/l)?

Maybe @andrewgelman could clarify what’s going on with KMI? I think I’m missing something there, without which I can’t build the original model.

hyunji.moon · June 17, 2022, 1:21pm

I am working on a hierarchical differential equation model but the quality of sampling is uncertain. I wish to verify this using simulation-based calibration which may have its unique value on implicit models like ODE. Progress is being updated in this issue of SBC package starting with current literature on SBC + Hierarchical ODE.

Could anyone share the current process of the research in this thread, hierarchical ODE and monster models? @charlesm93 @Funko_Unko

@charlesm93, if you could invite me to the repository, it would be helpful to learn the status of the calibration object. I discovered using SBC is listed in further todos (preview of part 2) in your tutorial part 1 of Stan and Torsten, but couldn’t find part2. Thanks!

Niko · June 21, 2022, 6:15am

Fitting the Monster model once took me roughly one hour with a custom (“precision”-adaptive, incremental and parallel) warm-up for an estimated minimal ESS of roughly 700. Stan’s regular warm-up (with the “oracle” configuration determined from my warm-up) took roughly 4 hours for an estimated minimal ESS of 83. For this specific problem, there were 6 persons going through two rounds of measurements each, with 15 personwise parameters (+the corresponding population parameters).

Generally, precision requirements can (and probably will) vary across draws from the prior (and the subequent inference during SBC). If you want to do SBC, you should probably look into automatically tuning the precision as sketched e.g. here by @jtimonen or you might also directly try to use his odemodeling package.

For the Monster model, the posterior “happened” to have most of it’s mass in parts of the parameter space where the ODE was rather linear. But the prior also included quite non-linear configurations, which require more computational work to solve accurately. So, if you were to do SBC for this specific model, you could probably expect the average runtime per prior draw to be more than four hours.

For different hierarchical ODE models, the computational cost may be higher or lower. AFAIK, fitting the model described in section 4 of @charlesm93, @yizhang and @billg’s preprint takes roughly two hours (I don’t know the ESS), but that’s just for a single person.

If you have more questions or suggestions, don’t hestitate to ask.

hyunji.moon · November 5, 2022, 2:08pm

I am looking into your stanfile, and if learning stan syntax for hierarchical dynamic model such as ragged array is the goal, would you recommend starting from this ragged.stan file? I see you have created many stan files and your insight on which syntax would be most general (e.g. with more complex hierarchy structure) and efficient would be helpful.

Also, another question is dt in datablock from this line of yours. Some dynamic modelers call this Time_step and perceive it as a hidden variable that shouldn’t affect the model. They decrease by half until it doesn’t affect state time-series too much. However, from @andrewgelman’s The Pinocchio Principle: A model that is created solely for computational reasons can take on a life of its own, receiving time_step in data block (as here) was my first cut which couldn’t get much support. How did you set dt and is this relevant to precision?

tagging @OriolAbril and @ahartikainen who I’m collaborating with for transplanting the above output to InferenceData with (prior_draws, chain, draws, hierarchy_idx) coordinate.

Niko · November 7, 2022, 6:02am

I’d think that the ragged.stan file you linked shoud be the most general implementation of that model. If I remember correctly, at some point new data surfaced, which did not fit the previous format.

Yes, the two data variables dt and no_fit_sub_steps directly determine the precision of the solution of the ODE. They were tuned during warm-up until PSIS diagnostics indicated that we could reliably and efficiently sample from the true posterior. IMO, such discretization parameters should never be regarded as anything else but that, namely discretization parameters which need to be (automatically) tuned until one can faithfully sample from the true posterior.

Thanks for the interesting link!

What do you mean by that?

hyunji.moon · November 7, 2022, 12:51pm

I was trying to express the difficulty of communicating the importance of time_step, which some dynamic modelers around me regard as nuts and bolts and should not asymptotically impact the outcome. So I wanted to ask how you choose your time_step. From your answer, it seems to be adaptively chosen, but for now, the simulation in dynamic modeling software (called Vensim) does not support changing time_step. Experts have heuristic of halving the time_step until state time-series matrix doesn’t change much.

In Stan community, \frac{t_{est}}{t_{gen}} > 1, but in dynamic simulation community, \frac{t_{est}}{t_{gen}} << 1 where t_{gen}, t_{est} are time spent in designing generator (parameter scaler values to state time-series matrix) and estimator (reverse of generator). From trustable model perspective, \frac{t_{est}}{t_{gen}} \sim 1 and symmetric generator and estimator are ideal. However, I discovered some model components that were asymmetrically focused in two communities. Topic of adaptable time_step, identifiablity, diagnostics are much valued in estimator-focused community where as process noise, aggregation get greater attention in generator-focused community. The difficulty of SBC lies in matching these asymmetry which were not surfaced when only one side of the joint model (joint of generator and estimator which forms joint distribution of parameter and observation) was the focus.

Tagging @maxbiostat as we had similar conversation (difficulty of communicating the importance of nuts and bolts). For instance, S, M, N, P, Q, R from the mapping below affects SBC result greatly (what the new SBC paper partly addresses.

Reversible mapping between \tilde{\theta}_{P, S}, Y_{Q, N}, \theta_{P, SM}

	command in stanify	in Bayes-inference community (software: Stan)	in dynamic simulation community (software: Vensim)	example
stanify command
	`draws2data(model, S)`	generate: map from parameter \theta region to observation Y region	run
	`data2draws(model, M)`	estimate: map from observation Y region to parameter \theta region	MCMC
	`draws2data2draws(model, S, M, N)`	composition of `draws2data` and `data2draws`	x
	`model.set_prior()`	`estimated_parameter` and its prior distribution	`.voc` has names of `estimated_parameter` and range	model.set_prior(“prey_birth_frac”, “normal”, 0.8, 0.08, lower = 0), model.set_prior(“pred_birth_frac”, “normal”, 0.05, 0.005, lower = 0)
	`model.set_numeric()`	assign numeric vector to `driving data`
	`model.update_numeric()`		assign numeric scalar to `assumed_parameter`, assign numeric vector to `driving data`	express difference between generator and estimator
model assumption
	`vensim`	mdl filepath	.	`vensim_files/prey_predator.mdl`
	`setting`	names of `estimated_parameter`, `target_simulated_stock`, `driving_data`	binding parameter, target, driving	{ ‘estimated_parameter’:(‘prey_birth_frac’, ‘pred_birth_frac’,), ‘driving_vector_name’: ‘process_noise_uniform’, ‘target_simulated’: (‘prey’, ‘predator’)}
	`numeric`	prior information for estimated parameters		{‘prey_birth_frac’: (0.8, 0.08, ‘normal’),‘predator_birth’: (0.05, 0.005, ‘normal’)}
classification settings
	`estimated_parameter`		parameters in `.voc`	prey_birth_frac, pred_birth_frac
	`assumed_parameter`		all parameters in `.mdl` except `estimated_parameter`	prey_death_frac: .05, pred_death_frac: .8
numeric settings
	`S`	# of draws from prior	# of synthetic datasets (sensitivity check run)	1**
	`M`	# of draws from posterior (# of chains * # of draws from each chain)	# of effective MCMC samples	4 * 100
	`N`	length of `driving_data`	(final_time - initial_time)/time_step	200
	`P`	# of `estimated_parameter`	# of parameters in `.voc` file	2 (prey_birth_frac, pred_birth_frac)
	`Q`	# of `target_simulated_stock`	# of time series vectors to be matched	2 (prey, predator)
	`R`***	# of subgroups for hierarchical Bayes	elmcount(subscript)	2 region: R1, R2
noise settings
	`p_noise`	process noise
	`m_noise`	measurement noise, additive

hyunji.moon · November 8, 2022, 6:34pm

Sorry, @Niko, could you kindly elaborate the meaning of this data variable? Also, what values you used for ‘dt’ and ‘no_fit_sub_steps’?

Niko · November 9, 2022, 7:54am

To me, after inspecting main.py, first it looks as if the model file used for fitting is unconstrained_monster.stan, which doesn’t actually have the dt data variable. I believe the json file passed to Stan’s regular sampling method can be found at monster/serial_regular_data.json at master · nsiccha/monster · GitHub, where it says "no_sub_steps": 128. For the fitting, I believe the actual time step size is being set here, where it gets set to dt = times[1] / no_sub_steps, where times is the vector of observation times, with times[1] AFAIK being the time after exposure at which the first measurement happens.

Topic		Replies	Views
Code review of slow hierarchical (ode) model Modeling fitting-issues , performance , epidemiology	3	783	November 19, 2020
Model Calibration Modeling fitting-issues	4	914	May 15, 2017
Slow hierarchical ODE model Modeling rstan	3	532	January 25, 2023
Improving the performance of a model with lots of iteration Modeling specification , performance	14	1986	January 14, 2022
Running hierarchical model consisting of ODE Modeling	4	1272	September 27, 2017

The Monster Model | Hierarchical ODE models

Reversible mapping between \tilde{\theta}_{P, S}, Y_{Q, N}, \theta_{P, SM}

Related topics