How do I reproduce the fit from the conditional_effects function of the brms package when using the Stan fit myself?

When I use the brms function to fit a hurdle model with Stan and then plot the obtained fit with conditional_effects, the result is never the same as when I run the actual Stan code with the stan function myself. It seems as if brms applies some black-box transformation to the parameters that I cannot reproduce when using the stan function directly. Is this normal?

I am essentially using the following hurdle fit:

hurdlefit <- brm(brmsformula(Y ~ X, hu ~ X),
                 family="hurdle_lognormal", data = data_fitting)

And the following stan fit:

N <- length(Y)
K <- 2
X <- cbind(rep(c(1), each=N), X)
K_hu <- 2
X_hu <- X
prior_only <- 0

hurdlefit <- stan('hurdle_model.stan', data = list(N=N, Y=Y, K=K, X=X, K_hu=K_hu, X_hu=X_hu, prior_only=prior_only))

Where the file hurdle_model.stan was simply generated by doing this:

make_stancode(brmsformula(Y ~ X, hu ~ X),
              family = "hurdle_lognormal", data = data_fitting)

Now, afterwards, when I obtain the best-fit parameters from the Stan fit and try to plot the curve, it never follows the data and never looks like the curve from the brms fit. Why is this?

Can you show the code you’re running to go from the hurdlefit object to the plot that’s intended to mimic the output of conditional_effects?


Yes. I will first explain the reasoning so the code is easier to follow. I start by storing the hurdlefit object. Afterwards, I store the simulation attributes in the object summary_data. With this, I essentially take the median of every parameter’s draws across all 4 chains of simulations. I do this for all 4 parameters, which I call beta0, beta1, gamma0, and gamma1. Since the gamma parameters come from a lognormal regression, I take the exp() of them.

hurdlefit <- stan('backup.stan', data = list(N=N, Y_obs=Y_obs, K=K, X=X, K_hu=K_hu, X_hu=X_hu, prior_only=prior_only))

summary_data <- attr(x=hurdlefit, which='sim')

beta0 <- median((c(summary_data[[1]][[1]]$b_hu_Intercept, summary_data[[1]][[2]]$b_hu_Intercept, summary_data[[1]][[3]]$b_hu_Intercept, summary_data[[1]][[4]]$b_hu_Intercept)))
beta1 <- median((c(summary_data[[1]][[1]]$`b_hu[1]`, summary_data[[1]][[2]]$`b_hu[1]`, summary_data[[1]][[3]]$`b_hu[1]`, summary_data[[1]][[4]]$`b_hu[1]`)))
gamma0 <- median(exp(c(summary_data[[1]][[1]]$b_Intercept, summary_data[[1]][[2]]$b_Intercept, summary_data[[1]][[3]]$b_Intercept, summary_data[[1]][[4]]$b_Intercept)))
gamma1 <- median(exp(c(summary_data[[1]][[1]]$`b[1]`, summary_data[[1]][[2]]$`b[1]`, summary_data[[1]][[3]]$`b[1]`, summary_data[[1]][[4]]$`b[1]`)))

Afterwards (it is a little unusual, but I prefer plotting in Python), I simply save these values to a file and load them in Python, where I plot them using the theoretical equation for a lognormal hurdle model, which, if I’m not mistaken, is:

\frac{1}{1+e^{-(\beta_0+\beta_1 x)}} \cdot (\gamma_0 + \gamma_1 x)

Where the first part of the equation is the logistic portion and the second is the linear portion.

Here is the code for plotting:

# beta0, beta1, gamma0, gamma1 are loaded from the file saved out of R above
import numpy as np
import matplotlib.pyplot as plt

# logistic (inverse-logit) function
def logit(x, beta0, beta1):
    return 1/(1 + np.exp(-(beta0 + beta1*x)))

# linear function (gamma0 and gamma1 were already exponentiated in R)
def linear(x, gamma0, gamma1):
    return gamma0 + gamma1*x

# hurdle model functional form
def hurdle(x, beta0, beta1, gamma0, gamma1):
    return logit(x, beta0, beta1)*linear(x, gamma0, gamma1)

sigma_plot_values = np.linspace(1, 2.8, num=100)

fitted_line = hurdle(sigma_plot_values, beta0, beta1, gamma0, gamma1)

fig = plt.figure(figsize=(10, 8), layout='constrained')
plt.xlim(1.0, 2.8)
plt.ylim(-0.5, 11)

plt.plot(sigma_plot_values, fitted_line, color='black')
plt.show()

And for some reason, this is never the same curve as the one obtained from conditional_effects() on the brms object, even though I would expect them to match.

I suspect that the brms conditional_effects plots show the mean of the function of the parameters, whereas you are plotting a function of the mean of the parameters. You should first calculate the conditional effects for each sample of the posterior, and then take the mean of the result.

Thanks for the info, this could be the difference. I was wondering, how would you take the mean of the function? I’m not sure how you could find this computationally. Also, how would you extract the parameter values after? Because a brms fit outputs the parameter values. And my last question is: wouldn’t this change the intrinsic function?

It’s quite straightforward. Instead of using the mean of each parameter, use the full posterior (a vector of values) for each parameter to compute the (full posterior distribution of the) conditional effects. Then take the mean of that posterior distribution of conditional effects. This is what I mean by “take the mean of a function of the posterior distribution”. You can extract the posterior samples of each parameter directly from your stanfit object. See also the rstan documentation (assuming you’re using Stan, as you indicate above): Extract samples from a fitted Stan model — extract • rstan
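To make that concrete, here is a minimal sketch of the per-draw idea in R at a single predictor value. It assumes the parameter names from your brms-generated Stan code (b_hu_Intercept, b_hu, b_Intercept, b), a single predictor column, and the functional form you posted earlier (with exp() applied to the whole lognormal linear predictor rather than to each coefficient); it is only meant to illustrate the difference, not to exactly replicate conditional_effects:

library(rstan)

draws <- rstan::extract(hurdlefit)  # full posterior draws, all chains combined

x0 <- 2  # one illustrative value of the predictor

# conditional effect at x0, computed once per posterior draw
# (indexing [, 1] assumes a single predictor column, as in your model)
effect_draws <- plogis(draws$b_hu_Intercept + draws$b_hu[, 1] * x0) *
  exp(draws$b_Intercept + draws$b[, 1] * x0)

# mean of the function of the parameters ...
mean(effect_draws)

# ... versus the function of the mean of the parameters
plogis(mean(draws$b_hu_Intercept) + mean(draws$b_hu[, 1]) * x0) *
  exp(mean(draws$b_Intercept) + mean(draws$b[, 1]) * x0)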

In case it might be helpful, in this answer I showed how to reproduce what conditional_effects is doing using tidybayes::epred_draws to extract the expected values from the posterior predictive distribution and summarize them to get the median and credible intervals.

The other thing to be aware of is that for each predictor variable in the model, conditional_effects creates a data frame of values at which to calculate the conditional effect of that predictor variable. For a numeric variable, this data frame will have 100 rows of values spanning the range of that variable. For a categorical variable, the number of rows will equal the number of categories. If there are other predictor variables in the model, they will be set to their mean if numeric or the reference category if categorical.

For example, for the regression in your initial post, you can see the data frame conditional_effects creates for X by running:

ce = conditional_effects(hurdlefit)
ce[["X"]]

ce is a list of data frames, one for each of the predictor variables. Each data frame also includes the median and credible intervals for each row, since conditional_effects takes care of summarizing the posterior expectations. You would need to create a similar data frame (or data frames, for multiple predictor variables) to pass to epred_draws.
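For example, a rough sketch of that manual route (it assumes hurdlefit is the brmsfit from your first post and that data_fitting contains the predictor X; the 100-value grid mimics what conditional_effects builds):

library(brms)
library(tidybayes)
library(dplyr)

# a grid like the one conditional_effects creates: 100 values spanning the range of X
newdata <- data.frame(X = seq(min(data_fitting$X), max(data_fitting$X), length.out = 100))

# expected values of the posterior predictive for each grid row,
# summarized to a median and 95% credible interval
ce_manual <- epred_draws(hurdlefit, newdata = newdata) %>%
  group_by(X) %>%
  summarise(estimate = median(.epred),
            lower = quantile(.epred, 0.025),
            upper = quantile(.epred, 0.975))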

Thank you very much for your help, but if I understand correctly, this would translate to doing this:

testing <- extract(hurdlefit)

beta0 <- median(testing$b_hu_Intercept)
beta1 <- median(testing$b_hu)
gamma0 <- median(exp(testing$b_Intercept))
gamma1 <- median(exp(testing$b))

to obtain the best-fit values of the 4 parameters. But this yields the same result as what I was doing before:

hurdlefit <- stan('hurdle_model.stan', data = list(N=N, Y=Y, K=K, X=X, K_hu=K_hu, X_hu=X_hu, prior_only=prior_only))
    

summary_data <- attr(x=hurdlefit, which='sim')

beta0 <- median((c(summary_data[[1]][[1]]$b_hu_Intercept, summary_data[[1]][[2]]$b_hu_Intercept, summary_data[[1]][[3]]$b_hu_Intercept, summary_data[[1]][[4]]$b_hu_Intercept)))
beta1 <- median((c(summary_data[[1]][[1]]$`b_hu[1]`, summary_data[[1]][[2]]$`b_hu[1]`, summary_data[[1]][[3]]$`b_hu[1]`, summary_data[[1]][[4]]$`b_hu[1]`)))
gamma0 <- median(exp(c(summary_data[[1]][[1]]$b_Intercept, summary_data[[1]][[2]]$b_Intercept, summary_data[[1]][[3]]$b_Intercept, summary_data[[1]][[4]]$b_Intercept)))
gamma1 <- median(exp(c(summary_data[[1]][[1]]$`b[1]`, summary_data[[1]][[2]]$`b[1]`, summary_data[[1]][[3]]$`b[1]`, summary_data[[1]][[4]]$`b[1]`)))

But maybe I am misunderstanding what you meant.

Thank you very much for the detailed explanation. The only problem is that this method requires the model to be created with brm() in order to use the function:

tidybayes::epred_draws

But in my case, I cannot use brm, since I modified the Stan code directly and, as far as I know, there is no way to make brm use my exact Stan code. Is there a way around this?

If I understand your goal correctly, you’ll have to calculate the conditional effects (mu) for new data (population effects fixed at their means) in the generated quantities block, or calculate them outside of the Stan program using draws of the parameter estimates and multiplying by the means of your predictors.
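For the second option, a sketch in R could look like the following. It again assumes the parameter names from the brms-generated Stan code and the functional form posted earlier in the thread, evaluated over a grid of new X values (the grid bounds just mirror the plotting range used above):

draws <- rstan::extract(hurdlefit)   # posterior draws from the stanfit, all chains combined

x_new <- seq(1, 2.8, length.out = 100)   # grid of predictor values for the curve

# matrix of conditional effects: one row per posterior draw, one column per grid value
ce_draws <- sapply(x_new, function(x) {
  plogis(draws$b_hu_Intercept + draws$b_hu[, 1] * x) *
    exp(draws$b_Intercept + draws$b[, 1] * x)
})

# summarize over draws to get the fitted curve and a 95% interval
ce_summary <- data.frame(X = x_new,
                         estimate = apply(ce_draws, 2, mean),
                         lower = apply(ce_draws, 2, quantile, probs = 0.025),
                         upper = apply(ce_draws, 2, quantile, probs = 0.975))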

Ignore my last response; I think I understand better now.

If I am not mistaken, your method is essentially to take the vector of posterior samples for each parameter, generate a predicted Y for a range of X values (for each x entry in the X vector) using each entry of that vector of posterior samples, and then take the mean of the resulting Y values at each given x to get a smoothed best-fit curve.

The only issue I am having with understanding this method is: after obtaining this smoothed best-fit curve, how do I obtain the 4 parameter values that generate this exact curve? Would I have to refit this curve with another method (using Stan again, a Python curve fit, etc.)?

Glad it’s clearer now.

You don’t. The source of this curve is the full posterior. If you want readers of your paper to be able to reproduce it, you can simply include the full set of posterior samples of parameter values as a supplemental file.

Thanks, this makes total sense. It now seems as if the linear portion of my hurdle model underfits the data compared to the brms fit, which fits the data very well. Do you know if brms does a certain transformation behind the scenes to the posterior samples of the linear parameters?

I don’t know, but my first intuition would be to recheck whether I have calculated the linear portion correctly. For instance, assuming that you have a log link function, you should make sure that you exponentiate the linear predictor before taking the mean of posterior distribution of conditional effects. If you don’t, you are essentially plotting the geometric mean instead of the arithmetic mean of the posterior, which is closer to zero (explaining your reported pattern).
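As a quick check of that last point with your own draws (a sketch; x0 is just an illustrative predictor value and the parameter names again come from the generated Stan code):

draws <- rstan::extract(hurdlefit)
x0 <- 2

eta <- draws$b_Intercept + draws$b[, 1] * x0   # lognormal linear predictor, one value per draw

mean(exp(eta))   # arithmetic mean on the outcome scale: exponentiate each draw, then average
exp(mean(eta))   # geometric mean: average first, then exponentiate; never larger than the above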