Capturing difference in within-subject variance in multilevel model with map2stan

mjiwa · November 20, 2020, 10:42am

Hi,

I’ve been using map2stan to fit a multilevel model to a dataset. I was hoping to model the difference in variance between individuals, and am having difficulty achieving this.

Here’s a dummy example of the approach I have been taking:

nullModel <- ulam(
   alist(
      outcome ~ dnorm(mu, sigma),
      mu <- a[ID] + beta[ID] * predictor,

      # Individual-level priors
      a[ID] ~ dnorm(a_mu, a_sig),
      beta[ID] ~ dnorm(beta_mu, beta_sig),
      sigma ~ dcauchy(0, 2),

      # Hyperparameter priors
      a_mu ~ dnorm(0, 2),
      a_sig ~ dcauchy(0, 1) & T[0,],
      beta_mu ~ dnorm(0, 2),
      beta_sig ~ dcauchy(0, 1) & T[0,]

), data = d )

The issue arises from the fact that I expect some difference between individuals in the variance of their scores. My question is, how can I specify the model such that I end up with individual a, beta, and sigma parameters for each individual in the grouping structure?

My initial attempt involved applying the same approach to the sigma parameter, like so:

      sigma ~ dcauchy(sigma_mu, sigma_sig),

      sigma_mu ~ dnorm(0, 2),
      sigma_sig ~ dcauchy(0, 1)

However, this returned the error message:
invalid class “map2stan” object: invalid object for slot “coef” in class “map2stan”: got class “NULL”, should be or extend class “numeric”

Am I going about this the wrong way? Any advice is hugely appreciated.

Sorry for any naïvety in my question, too!

torkar · November 20, 2020, 10:59am

Hi and welcome to the community!

First,

Do you set a prior on intercept?

Second, I believe that the dcauchy() in rethinking is already truncated?

Third, if you use individuals’ within measurement then I believe you can use this notation in rethinking (if you have a column ID for each individual):

a[ID]

so one example could be,

m <- ulam(
  alist(
    outcome ~ dnorm(mu, sigma),
    mu <- a + b * x + a[ID],
    a ~ dnorm(0,10),
    b ~ dnorm(0,2),
    a[ID] ~ dnorm(0,5),
    sigma ~ dexp(1)
  ), data = d, cores = 4, chains = 4
)

where x is a predictor and a[ID] is a varying intercept for each individual (but this would require a few measurements per individual I guess).

mjiwa · November 20, 2020, 11:24am

Thanks for the welcome! And thanks for your help!

Whoops, that’s an error in the dummy code. intercept should just be replaced with a. I’ll update the original post to reflect that, for posterity.

This is, in part, what I the initial post should already do. The original code above provides an estimate of individual subjects’ intercept and slope, but doesn’t provide an estimate of the variance around those estimates. This is what I was hoping to achieve.

Hope this clarifies the question somewhat!

torkar · November 20, 2020, 1:48pm

I think you should check out the chapter on categorical variables in the book.

If you want to estimate sigma for each ID I believe you could do, for example, something like this,

mu <- ... + a[ID],
a[ID] ~ dnorm(0, sigma_id),
sigma_id ~ dexp(2)

mjiwa · November 20, 2020, 2:59pm

Unless I’m missing something, this would provide an estimate of sigma of the mean of the unique intercepts for each ID, which is subtly different from an estimate of the sigma of the observations themselves within that ID.

The a[ID] part of the definition of mu exists within the original code (missing the [ID] part in the definition, I’ll add that to the original code though it doesn’t affect it).

I have a feeling I’m misunderstanding something very straightforward here, sorry if that’s the case!

torkar · November 20, 2020, 3:05pm

You’re right - could it be that what you want is really,

mu <- a[id] + b[id]*x
c[a,b][id] ~ multi_normal(c(a,b), Rho, sigma_id)

i.e., a varying effects model?

mjiwa · November 21, 2020, 1:52am

I believe this would offer an improvement on the current model insofar as it would allow for a prior to be set over the correlation matrix (and make the model specification more succinct), but, to my understanding, wouldn’t provide an estimate of the within-individual variance of scores.

mjiwa · November 21, 2020, 6:33am

Adding here that there’s a snippet that solves a similar problem here. The goal within the present framework would be analagous to finding log_sigma_id in the linked code – essentially reverse-engineering the sigma_id values that were created to simulate the data. The main difference is that there would only be one observation per individual within each nation.

torkar · November 21, 2020, 10:29am

Well if you only have one observation per individual I have a hard time seeing how to estimate the within variance? Maybe I’m misunderstanding you now :)

mjiwa · November 22, 2020, 5:52am

Sorry, that’s a confusing statement on my part.
The “one observation per individual” was referring to the linked code solving a similar problem, but using a data structure involving a number of individuals nested within countries.
The data in the problem I’m trying to solve involves a number of observations nested within individuals.

torkar · November 22, 2020, 8:27am

But if you have a multilevel model you can get the individual’s variance from the posterior? Please see Chapter 13 in McElreath’s book, and in particular Figure 13.4 (Figure 12.4 in the first edition).

Topic		Replies	Views
Truncation of outcome variable in map2stan Modeling specification , map2stan	1	343	December 5, 2020
Convergence Error (Multilevel Model) Modeling	4	469	October 9, 2020
Posterior variance of hyperparameters in multilevel model with hard constraints Modeling	2	794	August 27, 2018
Interpretation of hierarchical model results - difference in group level parameter mean Modeling interpret-results	1	1144	March 19, 2018
How to model individual specific variance of error-term in multilevel model Modeling	2	329	April 30, 2019

Capturing difference in within-subject variance in multilevel model with map2stan

Related topics