Ideas and doubts for a simple model

I have recently started using Stan and have some questions.

I have 2 datasets, each described by a different equation, and the two equations share a common parameter, m1.
Eq1: y1 = m1*x1 + q1
Eq2: y2 = m1*x2 + m2*x3 + q2

my parameters are: m1,m2,q1,q2
and my known data: y1,y2,x1,x2,x3.

m1 is better identified in eq1, so I want to fit eq1 first and use the resulting posterior for m1 as the prior in eq2.
How can I proceed?
Do I need to write 2 Stan programs, one for each equation, reading the output of the first and using it as input to the second?

And another question: my data have an unknown uncertainty. Should I model it with an additional parameter?

Thank you

It sounds like the right approach might actually be to fit both in the same Stan program, which is pretty straightforward. For example,

y1 ~ normal(m1 * x1 + q1, sigma_1);
y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma_2);

I would give something like that a try before trying anything like fitting them sequentially. It’s not always straightforward to feed the posterior from one model in as a prior to another model because the posterior is represented as thousands of draws that don’t always correspond to any named distribution (except in special cases). That said, you could search on the forum and find different approaches for doing that. I don’t have time right now to do the search myself, but I know there have been discussions on the forum about multiple different ways of doing it.
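One approach that comes up in such discussions can be sketched as follows, under the assumption that the m1 draws from the first fit are close to normal: summarize them by their mean and standard deviation and pass those two numbers as data to a second program. The names m1_mean and m1_sd are hypothetical, and this is an approximation, not an exact reuse of the first posterior.

```stan
// Second-stage model (sketch). The prior on m1 is a normal approximation
// to the m1 posterior obtained from a separate fit of equation 1.
data {
  int<lower=0> N2;
  vector[N2] x2;
  vector[N2] x3;
  vector[N2] y2;
  real m1_mean;          // hypothetical: mean of the m1 draws from fit 1
  real<lower=0> m1_sd;   // hypothetical: sd of the m1 draws from fit 1
}
parameters {
  real m1;               // still a parameter, not data
  real m2;
  real q2;
  real<lower=0> sigma2;
}
model {
  m1 ~ normal(m1_mean, m1_sd);  // approximate first-stage posterior as prior
  y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma2);
  // priors for m2, q2, and sigma2 would go here
}
```

Note that m1 is still updated by y2 in this sketch, so it is not held fixed at its first-stage distribution, and any posterior correlation between m1 and q1 from the first fit is discarded.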

It sounds like maybe you want to estimate a residual standard deviation, like in a linear regression model? If so, you can just use a parameter, as you said. That would be like the sigmas in my example above (declare them with <lower=0> in the parameters block because they are constrained to be positive).

Or do you mean that each data point has nontrivial measurement error? That’s a little bit more complicated but definitely doable. The Stan User’s Guide has a chapter on measurement error that gives some examples.
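As a rough illustration of that second case, here is a minimal sketch for a single regression, assuming the noise is in the predictor and its standard deviation tau is known. It follows the general pattern of the measurement error chapter; the names x_meas, tau, mu_x, and sigma_x are assumptions for illustration and do not come from this thread.

```stan
// Measurement error in the predictor (sketch, assumes a known noise sd tau).
data {
  int<lower=0> N;
  vector[N] x_meas;       // predictor as measured
  real<lower=0> tau;      // assumed known measurement sd
  vector[N] y;
}
parameters {
  vector[N] x;            // latent true predictor values
  real mu_x;              // population mean of the true predictor
  real<lower=0> sigma_x;  // population sd of the true predictor
  real m;
  real q;
  real<lower=0> sigma;    // residual sd of the regression
}
model {
  x ~ normal(mu_x, sigma_x);     // prior on the true values
  x_meas ~ normal(x, tau);       // measurement model
  y ~ normal(m * x + q, sigma);  // regression on the true values
  // priors for mu_x, sigma_x, m, q, and sigma would go here
}
```

With only a handful of observations, the priors left out of this sketch would matter a lot, so treat it as a starting template only.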


Thank you @jonah for your kind reply!

By fitting the 2 equations in the same program I am using a joint likelihood, right?
Is that affected by the two datasets having different numbers of observations (6 for eq1 and 54 for eq2)?

And as for the second question, yes I was asking about sigma, it is clear now, thank you.

That shouldn’t be a problem. You can fit equation 1 using 6 observations and equation 2 using 54 in the same model. Roughly something like this:

data {
  int<lower=0> N1; // will be 6
  vector[N1] x1;
  vector[N1] y1;

  int<lower=0> N2; // will be 54
  vector[N2] x2;
  vector[N2] x3;
  vector[N2] y2;
}
parameters {
  real m1; // slope shared by both equations
  real m2;
  real q1;
  real q2;
  real<lower=0> sigma1;
  real<lower=0> sigma2;
}
model {
  y1 ~ normal(m1 * x1 + q1, sigma1);
  y2 ~ normal(m1 * x2 + m2 * x3 + q2, sigma2);
  // also probably add priors for the parameters here
}

Thank you. However, the results are better if I separate the problems (2 different Stan programs).
Do you think it is feasible to accept the posterior distribution of m1 from model 1 as final and use it in model 2 to solve the y2 equation?
Basically, I want to fix the m1 distribution so that only the distributions of m2 and q2 are determined from the second equation. That means I can’t declare m1 in the parameters block.
Do I have to declare it in the data block? How is the likelihood defined then?

Thank you

@Jonah is right that you should just fit a joint model. But I’d be careful about what it means to be the “same parameter”. They may have a similar role, but regression coefficients change meaning in the presence of other regression coefficients.

There’s no way to capture a posterior and use it as a prior in Stan unless everything is conjugate and you unfold it by hand.

On the other hand, as @Jonah suggested, fitting the joint model is equivalent to using whatever ad hoc posterior comes from the first model as the prior to the second model. If your first likelihood is p(y_1 \mid \theta) with a prior of p(\theta) and you want to use the posterior as the prior for a regression of p(y_2 \mid \theta, \phi), then that is

p(\theta \mid y_1) \cdot p(\phi) \cdot p(y_2 \mid \theta, \phi) \propto p(y_1 \mid \theta) \cdot p(\theta) \cdot p(\phi) \cdot p(y_2 \mid \theta, \phi).
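The intermediate step behind that proportionality is Bayes' rule applied to the first-stage posterior, written out below. Reading \theta as (m1, q1, sigma1) and \phi as (m2, q2, sigma2) is an assumption about how the notation maps onto the model in this thread.

```latex
\begin{align*}
p(\theta \mid y_1)
  &= \frac{p(y_1 \mid \theta)\, p(\theta)}{p(y_1)},
  \qquad p(y_1) = \int p(y_1 \mid \theta)\, p(\theta)\, d\theta, \\
p(\theta \mid y_1)\, p(\phi)\, p(y_2 \mid \theta, \phi)
  &= \frac{1}{p(y_1)}\, p(y_1 \mid \theta)\, p(\theta)\, p(\phi)\, p(y_2 \mid \theta, \phi) \\
  &\propto p(y_1 \mid \theta)\, p(\theta)\, p(\phi)\, p(y_2 \mid \theta, \phi).
\end{align*}
```

Because p(y_1) depends on neither \theta nor \phi, it drops out as a constant. The last line is the unnormalized posterior p(\theta, \phi \mid y_1, y_2) of the joint model, so the sequential and joint fits target the same distribution.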