How to synthesize results from high dimensional model?

Can someone offer some advice for summarizing a hierarchical model with thousands of parameters?

I fit a model with 3000+ parameters in RStan. I have multiple varying intercepts and slopes, but obviously using print() for a summary of the posterior is impractical.

Any resources on how to handle this will be much appreciated!

1 Like

In a hierarchical model, you can summarise the dispersion of the cluster-varying parameters by the standard deviation of their distribution. It’s also common to compute the total variance (i.e. the variance of the cluster-varying parameters + residual variance), and define ratios of variance components, e.g. the varying-intercept variance divided by the total variance provides the classic intraclass correlation coefficient


It is also often useful to summarize a model via it’s predictions (e.g. how would the predictions if all data were assigned to a single condition, keeping all other predictors the same)


The above answers are good suggestions for how to adequately summarize the 3000+ parameters (+1 to both).

You also might be asking a practical question about how to see summaries of the marginal posterior distributions for a subset of parameters of interest without dumping summaries of 3000+ parameters into your R console. In this case, try something like:

variables_of_interest <- c("variable_name_1", "variable_name_2")
draws_summary <- posterior::summarize_draws(your_draws_array)
print(draws_summary[draws_summary$variable %in% variables_of_interest,], n = 30) # n=30 prints (up to) 30 rows of the tibble

# or if you want to get every element of a particular random effect you can do
print(draws_summary[grep("var_name\\[", draws_summary$variable),], n = 30)

# or you can do things like
print(draws_summary[draws_summary$rhat > 1.01,], n = 30)

This is what I was looking for; thanks a lot!