Apologies if this is a daft question, still trying to learn! I am trying to fit a distributional model in brms. I wanted to start with a smaller/easier example where I have generated the data to that I want to model. I cannot seem to get the results I expect.
It has been easier working with simpler models where the variance varies as a function of X. But I am having difficulty as soon as I introduce “skew” parameters.
library(brms) library(ggplot2) library(gamlss.dist)
Data is generated as follows
set.seed(404) N <- 200 X <- cbind(1, runif(N, 0, 100)) # Generate N uniform random numbers between 0-100 # Setting Distributions Params for Coefficients B_mu <- c(10, 0.2) B_sigma <- c(0.1, 0.02) # Note it is nu for gamlss.dist, but alpha for brms B_nu <- c(2, 0.5) # Generating Coefficients for X values mu <- X %*% B_mu sigma <- exp(X %*% B_sigma) nu <- X %*% B_nu # Generating y-values y <- gamlss.dist::rSN1(n=N, mu = mu, sigma = sigma, nu = nu) sim_data <- data.frame(y, X)
lss_fm <- bf( y ~ X2, sigma ~ X2, alpha ~ X2 ) lss_brm <- brm(lss_fm, data = sim_data, family = skew_normal( link = "identity", link_sigma = "log", link_alpha = "identity") ) # Summarising fit summary(lss_brm) # Prediction pred_df <- data.frame(X2 = seq(0, 100)) lss_post_pred <- cbind(pred_df, predict(lss_brm, newdata = pred_df)) ggplot() + geom_point(data=sim_data, aes(y=y, x=X2, shape="Data Point"))+ scale_shape_manual("",values=c(16))+ geom_ribbon(data=lss_post_pred,aes(x=X2, ymin=Q2.5, ymax=Q97.5,fill = "Confidence Interval"), alpha=0.5)+ scale_fill_manual("",values="grey12") + geom_line(data=lss_post_pred,aes(x=X2, y=Estimate, color="Estimate"))+ scale_colour_manual(c("",""),values=c("black","black"))
Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors. Warning: There were 2127 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup Family: skew_normal Links: mu = identity; sigma = log; alpha = identity Formula: y ~ X2 sigma ~ X2 alpha ~ X2 Data: sim_data (Number of observations: 200) Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup draws = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS Intercept 14.04 9.24 3.14 28.65 7.23 4 NA sigma_Intercept -1.73 14.42 -23.33 16.12 10.30 4 NA alpha_Intercept 5.19 67.12 -92.74 82.94 7.82 4 4 X2 -0.28 0.21 -0.60 -0.04 7.93 4 4 sigma_X2 0.06 0.29 -0.29 0.48 10.25 4 NA alpha_X2 -0.10 1.31 -1.64 1.80 7.84 4 NA Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS and Tail_ESS are effective sample size measures, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
If anyone has time to give advice/guidance, that would be really helpful! Thanks for developing such a fantastic tool!