Hi,
Thanks for developing this language and making it available!
I am rather new to Stan and am using it for my master's thesis.
My project aims to estimate the parameters of a 1D model for transient heat in fluid flow in pipes. To fit the model I have data from a number of experiments, where the inputs and states of the system (flow rates, fluid type, temperature and more) are varied between experiments.
My question is how I should go about inferring the parameters of the model from multiple experiments.
My initial thought has been to randomly choose an experiment at each iteration in Stan. I believe this might explain it:
p(\theta | y_{1:N}, u_{1:N}) \propto p(\theta) p(y_{1:N}| \theta, u_{1:N}).
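Written out a bit further, and assuming the experiments are conditionally independent given \theta (this is my assumption, I have not verified it for my setup), the likelihood term would factor over experiments. Here y_n and u_n denote the measurements and inputs of experiment n:

```latex
p(\theta \mid y_{1:N}, u_{1:N})
  \propto p(\theta) \prod_{n=1}^{N} p(y_n \mid \theta, u_n)
\quad\Longrightarrow\quad
\log p(\theta \mid y_{1:N}, u_{1:N})
  = \log p(\theta) + \sum_{n=1}^{N} \log p(y_n \mid \theta, u_n) + \text{const}
```

Under that assumption each experiment contributes its own additive term to the log posterior.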
Below I’ve added a non-working model with some pseudo-code to illustrate what I'm getting at.
The number of rows in each dataset varies quite a lot, producing a ragged data structure. To get it all into Stan I stack the data vertically and add a vector of indexes to access each experiment in Stan.
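To make the stacking concrete, here is a minimal sketch of the data preparation outside Stan. Python is only an assumption for illustration, and `stack_experiments` is a hypothetical helper name; the dictionary keys mirror the names in the data block of the model.

```python
from itertools import accumulate


def stack_experiments(times, observations):
    """Stack ragged per-experiment series into flat lists plus index vectors.

    times, observations: lists with one sequence per experiment.
    Returns a dict whose keys mirror the Stan data block (N, K, N_T, N_i, T, Z).
    """
    if len(times) != len(observations):
        raise ValueError("need one observation series per time series")
    n_t = [len(t) for t in times]  # number of timesteps per experiment
    if any(len(z) != k for z, k in zip(observations, n_t)):
        raise ValueError("time and observation lengths differ within an experiment")
    n_i = list(accumulate(n_t))  # 1-based index of the last row of each experiment
    return {
        "N": len(times),
        "K": sum(n_t),
        "N_T": n_t,
        "N_i": n_i,
        "T": [x for t in times for x in t],
        "Z": [x for z in observations for x in z],
    }


# Two made-up experiments of different lengths (values are illustrative only)
data = stack_experiments(
    times=[[0.0, 1.0, 2.0], [0.0, 0.5]],
    observations=[[20.0, 21.5, 22.0], [30.0, 29.0]],
)

# Recover experiment n (1-based), matching the slice Z[n_i - n_T + 1 : n_i] in Stan
n = 2
start = data["N_i"][n - 1] - data["N_T"][n - 1]  # 0-based start in Python
z_n = data["Z"][start:data["N_i"][n - 1]]
```

The last lines recover one experiment from the stacked vector using the same index arithmetic as the model, shifted to Python's 0-based indexing.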
functions {
  // Right-hand side of the ODE; f is a placeholder for the model equations
  vector ode_fun(real t, vector y, real theta, ... ) {
    vector[1] dydt = f(t, y, theta);
    return dydt;
  }
}
data {
  int N; // Number of experiments
  int K; // Sum of timesteps in all experiments
  int N_T[N]; // Number of timesteps for each experiment
  int N_i[N]; // Index of the last row of each experiment (cumsum(N_T))
  vector[K] T; // Timesteps for all experiments stacked vertically
  vector[K] Z; // All experimental data stacked vertically
}
parameters {
  real<lower = 0> theta;
}
model {
  theta ~ normal(1, 0.5);
  // Pseudo-code: choose n uniformly from <1 : N>
  // Extract time and experimental data
  int n_T = N_T[n];
  int n_i = N_i[n];
  vector[n_T] t = T[n_i - n_T + 1 : n_i];
  vector[n_T] z = Z[n_i - n_T + 1 : n_i];
  vector[1] y[n_T] = ode_bdf_tol(ode_fun, ... , theta, ... );
  z ~ normal(y, 1);
}
If this is a viable approach, how do I go about randomly selecting an experiment inside the Stan model?
My current and loosely defined plan can be summed up like this:
- Infer posterior for all experiments (bag optimize)
- Infer posterior for groups of experiments based on similar system states
- Infer posterior for each individual experiment
- Use cross validation to validate the MAP estimate on groups of experiments.
I’m happy for any input that can be offered. I struggled a bit to present this concisely while also explaining why I want to do what I do, so if anything is unclear, please point it out.