Ideas and doubts for a simple model

uh1 · January 31, 2024, 4:08pm

Hi!
I have recently started using Stan and have some questions.

I have 2 datasets that follow 2 different equations with a common parameter, m1.
Eq1: y1 = m1*x1 + q1
Eq2: y2 = m1*x2 + m2*x3 + q2

My parameters are: m1, m2, q1, q2
and my known data are: y1, y2, x1, x2, x3.

In eq1, m1 is more identifiable, so I want to solve eq1 first and use its posterior for m1 as the prior for m1 in eq2.
How can I proceed?
Do I need to write 2 Stan programs, one for each equation, reading the output of the first one and using it as input to the second?

And another question: my data have an unknown uncertainty. Should I account for it with a new parameter?

Thank you

jonah · January 31, 2024, 9:42pm

It sounds like the right approach might actually be to fit both in the same Stan program, which is pretty straightforward. For example,

y1 ~ normal(m1 * x1 + q1, sigma_1);
y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma_2);

I would give something like that a try before trying anything like fitting them sequentially. It’s not always straightforward to feed the posterior from one model in as a prior to another model because the posterior is represented as thousands of draws that don’t always correspond to any named distribution (except in special cases). That said, you could search on the forum and find different approaches for doing that. I don’t have time right now to do the search myself, but I know there have been discussions on the forum about multiple different ways of doing it.
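One approximation that comes up in such discussions, sketched here under the assumption that the posterior draws of m1 from the first fit look roughly normal: summarize them by their mean and standard deviation and pass those to the second model as data. The names m1_mean and m1_sd are hypothetical, and this is not an exact transfer of the posterior.

```stan
// Second-stage model only; eq1 is assumed to have been fitted separately.
data {
  int<lower=0> N2;
  vector[N2] x2;
  vector[N2] x3;
  vector[N2] y2;
  real m1_mean;          // hypothetical: mean of the m1 draws from the eq1 fit
  real<lower=0> m1_sd;   // hypothetical: sd of the m1 draws from the eq1 fit
}
parameters {
  real m1;
  real m2;
  real q2;
  real<lower=0> sigma_2;
}
model {
  // Normal approximation to the eq1 posterior, used as the prior for m1.
  m1 ~ normal(m1_mean, m1_sd);
  y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma_2);
}
```

Note that m1 is still a parameter here, so the second dataset will update it further. The approximation also discards any posterior correlation between m1 and q1.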

It sounds like maybe you want to estimate a residual standard deviation, like in a linear regression model? If so, then you can just use a parameter like you said. That would be like the sigmas in my example above (declare them with <lower=0> in the parameters block because they are constrained to be positive).

Or do you mean that each data point has nontrivial measurement error? That’s a little bit more complicated but definitely doable. The Stan User’s Guide has a chapter on measurement error that gives some examples.

uh1 · February 9, 2024, 3:16pm

Thank you @jonah for your kind reply!

In this way (by solving the 2 equations in the same code) I am considering a joint likelihood, right?
Is this affected if the number of data points in the 2 datasets is different (6 for eq1 and 54 for eq2)?

And as for the second question, yes I was asking about sigma, it is clear now, thank you.

jonah · February 9, 2024, 9:01pm

That shouldn’t be a problem. You can fit equation 1 using 6 observations and equation 2 using 54 in the same model. Roughly something like this:

data {
  int N1; // will be 6
  vector[N1] x1;
  vector[N1] y1;

  int N2; // will be 54
  vector[N2] x2;
  vector[N2] x3;
  vector[N2] y2;
} 
parameters {
  real m1;
  real m2;
  real q1;
  real q2;
  real<lower=0> sigma1;
  real<lower=0> sigma2; 
}
model {
  // sigma1 and sigma2 must match the names declared in the parameters block
  y1 ~ normal(m1 * x1 + q1, sigma1);
  y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma2);
  // also probably add priors for the parameters here
}
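A minimal sketch of what the model block could look like with priors added. The prior families and scales below are assumptions for illustration (they presume x and y are on roughly unit scale) and should be adapted to the actual data.

```stan
model {
  // Assumed weakly informative priors; the scales are placeholders.
  m1 ~ normal(0, 5);
  m2 ~ normal(0, 5);
  q1 ~ normal(0, 5);
  q2 ~ normal(0, 5);
  // With the <lower=0> constraint these are priors on positive values only.
  sigma1 ~ exponential(1);
  sigma2 ~ exponential(1);

  // Joint likelihood: both datasets inform the shared slope m1.
  y1 ~ normal(m1 * x1 + q1, sigma1);
  y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma2);
}
```

This replaces the model block above; the data and parameters blocks stay as they are.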

uh1 · February 12, 2024, 2:57pm

Thank you. However, the results are better if I separate the problems (2 different Stan programs).
Do you think it is feasible to accept the posterior distribution of m1 from the first model as final and use it in model 2 to solve the y2 equation?
Basically, I want to fix the m1 distribution in order to determine only the distributions of m2 and q2 from the second equation. In that case I can't define m1 in the parameters block.
Do I have to declare it in the data block? How is the likelihood defined then?

Thank you

Bob_Carpenter · February 15, 2024, 9:42pm

@Jonah is right that you should just fit a joint model. But I’d be careful about what it means to be the “same parameter”. They may have a similar role, but regression coefficients change meaning in the presence of other regression coefficients.

There’s no way to capture a posterior and use it as a prior in Stan unless everything is conjugate and you unfold it by hand.

On the other hand, as @Jonah suggested, fitting the joint model is equivalent to using whatever ad hoc posterior comes from the first model as the prior to the second model. If your first likelihood is p(y_1 \mid \theta) with a prior of p(\theta) and you want to use the posterior as the prior for a regression of p(y_2 \mid \theta, \phi), then that is

p(\theta \mid y_1) \cdot p(\phi) \cdot p(y_2 \mid \theta, \phi) \propto p(y_1 \mid \theta) \cdot p(\theta) \cdot p(\phi) \cdot p(y_2 \mid \theta, \phi).
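The step behind that proportionality, written out: by Bayes's rule, the posterior from the first model is the first likelihood times its prior, divided by a normalizing constant that depends on neither \theta nor \phi.

```latex
p(\theta \mid y_1)
  = \frac{p(y_1 \mid \theta) \cdot p(\theta)}{p(y_1)},
\qquad
p(y_1) = \int p(y_1 \mid \theta) \cdot p(\theta) \, d\theta .
```

Substituting this into the left-hand side gives the right-hand side divided by p(y_1), and dropping that constant gives the stated proportionality. The right-hand side is exactly the joint model's prior times both likelihoods.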
