Is the brms() measurement error model doing the 'right' thing?

Hi,

Just checking that the brms() measurement error (ME) model is properly specified. My concern is twofold:

  1. Most ME books seem to put a prior on \tau, the variance of the measurement error model, which relates the mismeasured surrogate x* to the correctly measured x. See, for example, the books by Gustafson* (2004) and Carroll et al. (2006, 2nd edition)**. Chapter 9 of Carroll et al. deals explicitly with Bayesian methods; see section 9.4, for example. Gustafson (chapter 4) is somewhat unclear (cf. p. 70 with section 4.3, for example), but overall seems to put priors on \tau. Is the Stan code by default putting a flat prior on \tau?

  2. Is the brms() code creating a fully elaborated conditional exposure model in its Stan code? Assume one has an additional covariate z which is correctly measured. The conditional exposure model f(x|z) would seem to follow from the chain rule of probability, in the factorisation of f(x*,y,x,z) into a measurement error model, an outcome (or response) model and a conditional exposure model [part of the joint distribution f(x,z)]:

f(x*,y, x, z) = f(x* | y,x,z) f(y| x,z) f(x|z) f(z).

I wasn’t sure just by looking at the Stan code for the brms() ME model if this was in fact being done. The Stan guide on measurement error models does not contain additional properly measured covariates z. Just making sure.

Thank you!


* Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments
** Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition

  1. I don’t have these books available, so you need to write down the model which you would consider as “right” and we can then compare it to what brms does.

  2. In brms, the distribution of x is not conditional on some other variable z by design. In other words, they are assumed conditionally independent. Can you write down a model that you have in mind, which follows your factorization structure? That would help me in understanding where you are coming from.


For the above mentioned books please see here: (I do not support violation of copyright and only endorse downloading a book if its copyright will not be violated): [edit: link removed]

As you can tell I am not an expert in this field, hence my posts. With that in mind:

  1. We have a model of the investment rate of firms, y, as a function of their book-to-market value (x*), also called Tobin’s Q, and their cashflow rate (z). The literature and our data show a definite correlation between x* and z. This seems important to note. Other measurement error estimation results on this exact same model found notable attenuation bias in x* and false apparent significance in z (see here).
  • In addition, we include some firm-level control variables and industry fixed effects. We run this as a log-log model (also to aid tractable sampling): log(Investment) = f(log(Book-to-Market) + Cashflow rate). Our model is somewhat more complex than this: we give the Book-to-Market and Cashflow variables a ‘random effects’ structure, using a non-nested membership of years and countries, fic:fyear. In addition, we look at the impact of cashflow across two different types of firms, and so interact it with a dummy variable. The brms() code is this:
fit_T_ME_Full <- brm(
  bf(log(inv_rf) ~ me(log(Q_bv), 0.5) +
       (1 + me(log(Q_bv), 0.5) + cashflow_rf:external_fincf | fyear:fic) +
       cashflow_rf:external_fincf + capital_output + sic_one + bin_capital_stock10,
     decomp = "QR"),
  data = reg_data, family = student(),
  control = list(max_treedepth = 10, adapt_delta = 0.8),
  prior = c(set_prior("normal(0,1)", class = "b"),
            set_prior("cauchy(0,2)", class = "sd"),
            set_prior("lkj(2)", class = "cor")),
  warmup = 1000, iter = 2000, chains = 3, cores = 3
)
  2. The measurement error model is generally factorised into a ‘measurement model’, an ‘outcome model’, and an ‘exposure model’:

[image not preserved: the factorisation into measurement, outcome and exposure models]

This approach follows from Clayton 1992; Richardson and Gilks 1993.

  • That being said, the above authors were working in the context of health and epidemiology so I do wonder if having such an ‘exposure’ model makes sense, say for the above economics model??

  • Is such a factorization strictly necessary? Stan guide seems to indicate not:

  • In addition, Gustafson (pp. 83-86) finds that correct specification of the exposure model does not seem that important for correct inferences about x*.

  • On the other hand, by reducing f(x|z), the so-called ‘exposure model’, to f(x), a simple prior on the unobserved but correctly measured x, is one not implicitly assuming independence between x and z? That works against the reason people often use ME models in the first place (ours included): the mismeasured regressor x* not only suffers attenuation bias, but this in turn creates apparent significance in the correctly measured regressor z, which might actually have little or no true impact on the response variable y.
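
The attenuation and spill-over effect described in the last bullet can be illustrated with a small simulation. This is only a sketch in base R under assumed values: the coefficients, sample size and error SD are invented for illustration and have nothing to do with the firm data. The true x depends on z, y depends on x only, and the naive regression uses the noisy surrogate x_star.

```r
# Illustrative simulation; all numbers below are assumptions, not estimates from real data.
set.seed(1)
n <- 5000
z <- rnorm(n)                        # correctly measured covariate
x <- 0.8 * z + rnorm(n, sd = 0.6)    # true (unobserved) regressor, depends on z
x_star <- x + rnorm(n, sd = 0.5)     # surrogate with classical measurement error
y <- 1.0 * x + rnorm(n, sd = 0.5)    # outcome depends on x only, not directly on z

naive <- lm(y ~ x_star + z)          # ignores the measurement error in x_star
oracle <- lm(y ~ x + z)              # uses the true x, which is normally unavailable

round(coef(naive), 2)
round(coef(oracle), 2)
```

In the naive fit the coefficient on x_star should come out well below the true value of 1, and the coefficient on z should be clearly away from 0 even though z has no direct effect on y. The oracle fit should recover roughly 1 for x and roughly 0 for z.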

Thanks again,

Taking a second, hopefully more succinct, pass at this after reading Grace Y. Yi’s ME textbook (2017): the question seems to be whether, after assuming nondifferential measurement error, one should factorize f(x,z) further or instead leave this term unmodeled as a nuisance function (as brms and Stan do by default). Put differently: when should the probability distribution of the true covariates be treated as ‘fixed’ (the so-called ‘functional’ approach) rather than elaborated further such that f(x,z) = f(x|z) f(z)? The latter (‘structural’) approach is by far the more commonly adopted one in the Bayesian ME literature, perhaps because it is easily incorporated into the Bayesian framework.
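
To make the distinction concrete, here is the factorisation from the first post written out under the nondifferential assumption, i.e. assuming that x* carries no further information about y or z once x is known. This is a restatement in the thread's own notation, not a formula taken from any of the cited books.

```latex
\begin{aligned}
f(x^*, y, x, z) &= f(x^* \mid y, x, z)\, f(y \mid x, z)\, f(x \mid z)\, f(z) \\
                &= \underbrace{f(x^* \mid x)}_{\text{measurement model}}\;
                   \underbrace{f(y \mid x, z)}_{\text{outcome model}}\;
                   \underbrace{f(x \mid z)}_{\text{exposure model}}\; f(z)
\end{aligned}
```

Replacing the third factor f(x|z) by f(x) amounts to assuming that x and z are independent.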

  • In our specific case, where x is correctly measured Tobin’s Q (the book-to-market value of the firm) and z is the firm’s cashflow rate, we know theoretically that these two variables are not independent. Moreover, we know that the direction of the dependency is more likely to be x = f(z). The dependence follows, I think, from the fact that x^* and z are not independent. We know that attenuation bias in x^* is likely to affect the estimate for z. If we ignore this dependence by not including a full exposure model of the sort f(x,z) = f(x|z) f(z), are we going to get less correct results?

  • A second, practical, issue for us is computational. Our model is already complex, with many group-level parameters (over 1,300) and 300,000 data points. Adding another regression, in the form of an ‘exposure model’ which is not essential, seems like a bad idea. But we would still need to justify leaving it out theoretically.

There is a lot of information to process, so I am not sure I am responding correctly to all of it, but I have at least a suggestion for building f(x | z) into the model while accounting for measurement error in x. The approach works by using mi() instead of me() terms. The former were originally designed for missing value estimation but can incorporate measurement error as well. Here is my simplistic proposal:

form <- bf(x | mi(se_x) ~ z) +
  bf(y ~ mi(x) + z) +
  set_rescor(FALSE)

The first line models x as having measurement error as well as depending on z. The second line models y as depending on x and z, and the third line models the two univariate models as conditionally independent. Does this go in the direction you had in mind?

Thanks for this. Yes, it looks good. But how much measurement error is being assumed here? I.e., what is the underlying measurement error model?

The Stan code of the model should describe exactly what is happening. How much error is assumed depends on the values you pass via the variable se_x in my example.
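
As a rough guide to what to look for in the generated Stan code: assuming the default gaussian family for both sub-models, the mi() formulation should correspond to something like the following. Here x_i^obs is the recorded value of x, se_{x,i} is the value passed in se_x, the second argument of each normal is a standard deviation (Stan's convention), and the Greek letters are placeholder names for this sketch rather than the parameter names brms uses. Treat it as something to check against the Stan code, not as a statement of the exact implementation.

```latex
\begin{aligned}
x^{\text{obs}}_i &\sim \mathcal{N}\!\left(x_i,\ \mathrm{se}_{x,i}\right) \\
x_i &\sim \mathcal{N}\!\left(\alpha_x + \gamma z_i,\ \sigma_x\right) \\
y_i &\sim \mathcal{N}\!\left(\alpha_y + \beta_x x_i + \beta_z z_i,\ \sigma_y\right)
\end{aligned}
```

Under this reading, the amount of measurement error is fixed by the data column se_x and is not estimated.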

OK, great. Thanks.

@Ilan_Strauss This post was flagged, so I removed the offending link. Please do not post instructions on how to pirate copyrighted materials (even with disclaimers). Thanks.


Onto the one unresolved issue, if you don’t mind: having a prior for \tau. Currently the default ME model in brms() does not have one. The textbooks seem to use one. What is the rationale for not including a prior on \tau? Should I post this on the Stan modelling forum instead? Thank you.

As I understand it, by default brms uses uniform priors over the support. That’s maybe not the “right” choice for any given particular model and user, but a defensible default position. The rationale is explained pretty clearly in the package documentation and vignettes. You can always pass in your own priors if you want something different.


I do not think that this is correct in this case. If you look at the back-end Stan code for the brms() ME model, \tau is treated as data, not as a random variable. As a result it has no prior on it by default, I think.

Yes, indeed, tau is treated as data: without additional information you will likely not be able to identify both x per observation and tau (that would make N + 1 unknowns for N observations).
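
One way to see the identification problem is to assume, purely for illustration, a normal measurement model with variance \tau and a normal distribution for the true x. The population parameters \mu and \sigma_x^2 are hypothetical names for this sketch. Integrating out x_i gives:

```latex
x^*_i \mid x_i \sim \mathcal{N}(x_i,\ \tau), \qquad
x_i \sim \mathcal{N}(\mu,\ \sigma_x^2)
\quad\Longrightarrow\quad
x^*_i \sim \mathcal{N}(\mu,\ \sigma_x^2 + \tau)
```

The surrogate on its own therefore only informs the sum \sigma_x^2 + \tau, so the split between the two has to come from elsewhere, for example a fixed \tau, an informative prior, or replicate measurements.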

OK, great. And I assume the number of ME parameters – N + the 2 parameters of the population distribution for true X – does not change if the measurement error is applied to say, a ‘random effect’ (i.e. a group-level parameter) estimated across time say?

If we have one random effect per observation, then we keep the same number of parameters, i.e. N + 1 or 2.

For what it is worth, I was reviewing this paper re measurement error in INLA this weekend and it seemed to be recommending a similar approach to what you are discussing here: https://arxiv.org/abs/1302.3065.

Is it then the case that the equations

form <- bf(x | mi(se_x) ~ z) +
  bf(y ~ mi(x) + z) +
  set_rescor(FALSE)

define a model for mediation of z’s effect on y via x - taking into account measurement error in x? That is, is mediation another way of describing a ‘conditional exposure model’?

If so, then would it be possible to generalise this approach to construct more complex path models (multiple mediation/moderation, taking measurement error into account)?

If so, then is this approaching a method for structural equation modeling, where latent factors - measured with error - have single indicators?

Thanks for your thoughts

I think you are right in all your thoughts. Indeed, this is kind of what one does in SEMs when having only a single indicator and manually specifying the measurement error of that single indicator.

Thanks,

The power and generality of brms is wonderful. Many thanks for creating it.
