"Interference" when simultaneously estimating regression model and co-variance between variables

I anticipate that my question exposes some basic ignorance about how MCMC works, but here we go:

To deal with missing data, I am trying to simultaneously estimate a regression model as well as the covariance of the variables in the regression model and some auxiliary variables.

To be specific: assume matrix X has the columns y, e, m, and el, the last being the product of the variables e and l. Column y has missing values.

Now I want to estimate the effect of e on y^i, and to deal with the missing data I also estimate the covariance matrix of X^i, where y^i and X^i are the vector/matrix with imputed values.

A simple Stan model looks like this:

data {
  int N;                  // number of observations
  int K;                  // number of variables in the covariance model (4 here: y, e, el, m)
  vector[N] y;
  vector[N] m;
  vector[N] e;
  vector[N] el;           // product of e and l
  int nmiss;              // number of missing values in y
  int missidx[nmiss];     // positions of the missing values in y
}
parameters {
  real intercept;
  real b;                 // regression weight of e
  real<lower=0> sigma;
  corr_matrix[K] Omega;
  vector[K] mu;
  vector<lower=0>[K] tau;
  vector[nmiss] yimp;     // imputed values for the missing entries of y
}
transformed parameters {
  cov_matrix[K] Sigma = quad_form_diag(Omega, tau);
}
model {
  // omitting priors to keep this shorter
  vector[N] yl = y;
  yl[missidx] = yimp;     // fill in the imputed values

  // regression of y on e
  target += normal_lpdf(yl | intercept + e * b, sigma);
  // joint (multivariate normal) model of all variables, including y
  for (n in 1:N) {
    vector[K] vec = to_vector({yl[n], e[n], el[n], m[n]});
    target += multi_normal_lpdf(vec | mu, Sigma);
  }
}

When I fit this model, I cannot simultaneously estimate the regression weight for e and the covariance matrix. There is some “interference” (for lack of a better term) that causes both the regression weight for e and the variance of e in the variance-covariance matrix of X to be underestimated.

I can “hack around” this problem by multiplying multi_normal_lpdf(vec | mu, Sigma) by a large number. Intuitively, this works because it gives precedence to the estimation of the covariation between all relevant variables and thus estimates the effect of e on y based on imputed data with the right covariance.
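For concreteness, a minimal sketch of the hack in the model block; the factor of 100 is arbitrary and only meant to illustrate the idea:

  for (n in 1:N) {
    vector[K] vec = to_vector({yl[n], e[n], el[n], m[n]});
    // upweight the joint likelihood relative to the regression likelihood;
    // the weight 100 is arbitrary, chosen here only for illustration
    target += 100 * multi_normal_lpdf(vec | mu, Sigma);
  }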

However, I assume others have thought more deeply about this, and I would be glad about any pointers or comments.

Related post with an implementation in brms here.

Have you looked at the posterior covariance between b and the estimated covariance between e and y? They should have a negative correlation as they are competing explanations for the same phenomenon. You could exclude e from the covariance model and get something closer to what you want, I think. I’m having trouble coming up with the relevant numerical intuition here so my answer might be off base.
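If it helps, a minimal sketch of what excluding e from the covariance model might look like (this assumes mu, tau, Omega, and Sigma are redeclared with dimension K - 1, since only yl, el, and m remain in the joint model):

  for (n in 1:N) {
    // joint model without e; Sigma here is (K - 1) x (K - 1)
    vector[K - 1] vec = to_vector({yl[n], el[n], m[n]});
    target += multi_normal_lpdf(vec | mu, Sigma);
  }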

Thanks for looking at this, Krzysztof.
cov(e,y) and b are indeed correlated.

I tried removing cov(e,y) from the covariance model, but this did not work.

I should also add that if I use FIML (full information maximum likelihood) in lavaan (an R package for fitting SEM models), I can recover the correct parameters (see the section "Missing data and non-linear effects" here).