There is a great difference.
Well, I guess this is a simple question, but I actually have about zero experience. Here is the situation as I understand it, though:
- Selectively leaving out data points is the path to tricking ourselves about our model fit.
- If there are outliers, our first instinct should be to think about why that point might be so different from the rest and model it.
- If we do not have the data to model this (extra covariates or whatever could be driving it), then we'll just need to model outliers by changing our likelihood to something else. A Student-t is the usual choice. Search for "outliers" here: http://sumsar.net/files/posts/2017-01-15-bayesian-computation-with-stan-and-farmer-jons/stan_exercise.html . I think that explanation is fair.
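To see why the Student-t helps, here is a tiny sketch of my own (using scipy, not part of the linked exercise): it compares the log-density that a standard normal and a Student-t with 3 degrees of freedom assign to a point 10 standard deviations out.

```python
# Illustration: a Student-t likelihood penalizes outliers far less than a
# normal, because its log-density falls off polynomially, not quadratically.
from scipy.stats import norm, t

outlier = 10.0  # a point 10 standard deviations from the mean

log_norm = norm.logpdf(outlier, loc=0, scale=1)
log_t = t.logpdf(outlier, df=3, loc=0, scale=1)

print(f"normal    log-density at {outlier}: {log_norm:.1f}")
print(f"student-t log-density at {outlier}: {log_t:.1f}")
# The t log-density is far less negative, so a single extreme point
# drags the fit around much less than under a normal likelihood.
```

Because the normal's log-density drops like -x^2/2 while the t's drops only logarithmically in x, the outlier contributes a huge penalty under the normal but a modest one under the t.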
Regarding outliers, there's a post on the Gelblog if you're bored: https://statmodeling.stat.columbia.edu/2020/05/01/hey-you-yeah-you-stop-what-youre-doing-right-now-and-read-this-stigler-article-on-the-history-of-robust-statistics/
It's on robustness, but whenever outliers come up, so does robust regression. The Gelblog can be super confusing to read, but there are links to previous discussions on robustness and whatnot. It can be interesting to skim just to see what people are vaguely talking about.
Thank you very much for your kind guidance.
I am overwhelmed with appreciation and inspired to continue learning, although the road is long.
Just to add a small thing for perspective: there is a relatively large literature trying to find a good response distribution for modelling citation counts, and it mostly agrees that the negative binomial is too concentrated and doesn't do a good job of capturing the rare papers with very large citation counts. This is possibly what is happening with the year 2018 in your data: some paper got cited hugely.
The sad part is that Stan doesn't have any of those less well-known, so-called "overdispersed" distributions implemented (I am actually slowly working on implementing them, but it turns out it is not trivial). So you can't easily improve your model much. The easiest way to model this "overdispersion" would be to add a varying intercept for each measurement (e.g., + (1 || year)).
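To show what such a varying intercept buys you, here is a small simulation sketch of my own (in Python rather than Stan/R, with made-up numbers, and per-observation rather than per-year for simplicity): a Poisson rate with a normal intercept added on the log scale produces counts far more dispersed than a plain Poisson with the same baseline rate.

```python
# Illustration: mixing a Poisson over a lognormal intercept (what a varying
# intercept on the log scale does) creates overdispersion: variance >> mean.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
mu_log = 2.0   # baseline log-rate (made up)
sigma = 1.0    # sd of the varying intercept (made up)

plain = rng.poisson(np.exp(mu_log), size=n)       # plain Poisson: var ~= mean
eps = rng.normal(0.0, sigma, size=n)              # the varying-intercept term
mixed = rng.poisson(np.exp(mu_log + eps))         # Poisson-lognormal mixture

print("plain Poisson:          mean", plain.mean(), "var", plain.var())
print("with varying intercept: mean", mixed.mean(), "var", mixed.var())
# The mixture's variance greatly exceeds its mean, and its right tail
# reaches much larger counts - the "rare hugely cited paper" behaviour.
```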
If you want to go deeper, a good reference is IMHO the paper on the generalized inverse-Gaussian Poisson distribution (also called the Sichel distribution), which references some previous work and is referenced by others. But this might be just a distraction: your model seems to work well enough for many purposes.
The only package I am aware of that allows fitting models with the Sichel distribution is gamlss.
Best of luck with your Stan adventures! :-)
@martinmodrak, thank you for your advice. I recoded the multilevel model and it seems to work well:
data {
  int<lower=1> N;               // number of observations
  int Y[N];                     // response variable
  real X[N];                    // predictor variable
  int<lower=1> M;               // number of grouping levels
  int<lower=1, upper=M> J[N];   // grouping indicator per observation
}
transformed data {
  real Xc[N];                   // centered predictor
  real means_X = mean(X);
  for (i in 1:N) {
    Xc[i] = X[i] - means_X;
  }
}
parameters {
  real beta;                    // fixed slope
  real Intercept;               // fixed intercept
  vector[M] u;                  // varying intercept
  real<lower=0> phi;            // shape parameter
}
model {
  real mu;
  target += student_t_lpdf(Intercept | 3, 5, 10);
  target += gamma_lpdf(phi | 0.01, 0.01);
  target += normal_lpdf(u | 0, 1);
  target += normal_lpdf(beta | 0, 10);
  for (n in 1:N) {
    mu = Intercept + u[J[n]] + Xc[n] * beta;
    Y[n] ~ neg_binomial_2_log(mu, phi);
  }
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - means_X * beta;
}
and then
stan_dat <- list(
  N = nrow(d_eng),
  Y = d_eng$n_cited,
  X = d_eng$year,
  M = nlevels(as.factor(d_eng$year)),
  J = as.integer(as.factor(d_eng$year))
)
fit <- stan(
  file = "nb_model.stan",
  data = stan_dat,
  iter = 4000,
  chains = 4,
  warmup = 2000,
  cores = 4,
  control = list(
    adapt_delta = 0.99,
    max_treedepth = 15
  )
)
fit
stan_hist(fit, pars = c("b_Intercept", "beta", "phi"))
stan_hist(fit, pars = c("u"))
But now I have two new problems with this Stan model:
- I don't know how to do posterior predictive checks. Would something like this work?
generated quantities {
  int y_rep[N];
  for (n in 1:N) {
    // use the centered predictor Xc (not X) to match mu in the model block
    y_rep[n] = neg_binomial_2_rng(exp(Intercept + u[J[n]] + Xc[n] * beta), phi);
  }
}
- In a multilevel model, how do I predict for a new level, for example the year 2020?
Yes, that looks like a good basis for PP checks.
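Once you have y_rep, the check itself happens outside Stan: compare a test statistic between the observed data and each replicated dataset. Here is a sketch with simulated stand-in draws (since I am not running your model), using the maximum count as the statistic, which is a natural choice for catching a too-thin right tail:

```python
# Illustration: a test-statistic posterior predictive check from y_rep draws.
# y_obs and y_rep are simulated stand-ins for the real data / extracted draws.
import numpy as np

rng = np.random.default_rng(1)
y_obs = rng.negative_binomial(5, 0.1, size=50)          # stand-in for the data
y_rep = rng.negative_binomial(5, 0.1, size=(4000, 50))  # stand-in for y_rep

stat_obs = y_obs.max()
stat_rep = y_rep.max(axis=1)               # one statistic per posterior draw

p_value = (stat_rep >= stat_obs).mean()    # posterior predictive p-value
print(f"observed max: {stat_obs}, PPC p-value: {p_value:.2f}")
# Values near 0 or 1 suggest the model fails to reproduce that feature.
```

In R, `bayesplot::ppc_stat(y, y_rep, stat = "max")` does the same thing graphically from an extracted y_rep matrix.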
If year has a varying effect/intercept, the most common option is to sample from the hyperprior…
Nevertheless, your model doesn't really look like a multilevel model, since you don't have any "hyperprior". A classical multilevel model would have a parameter (say tau) that works like:
parameters {
  real<lower=0> tau;    // hyperprior scale; constrained positive
  vector[M] u_raw;
  // .... and the rest as you have it
}
transformed parameters {
  vector[M] u = u_raw * tau;    // non-centered parameterization
}
model {
  target += normal_lpdf(tau | 0, 1);    // half-normal, given the lower bound
  target += normal_lpdf(u_raw | 0, 1);
  // .... and the rest as you have it
}
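With such a tau in place, predicting for an unseen year amounts to drawing a fresh intercept from the hyperprior for each posterior draw, then simulating the outcome. Here is a Python sketch with made-up stand-ins for the posterior arrays (the names intercept, beta, tau, phi mirror the model but are simulated here, and the centering value is assumed; use your real draws and means_X):

```python
# Illustration: posterior predictive draws for a NEW group level.
# Per posterior draw: sample u_new ~ normal(0, tau), then simulate from the
# negative binomial with mean exp(Intercept + u_new + Xc_new * beta).
import numpy as np

rng = np.random.default_rng(7)
n_draws = 4000
# made-up stand-ins for posterior draws (replace with rstan::extract output):
intercept = rng.normal(2.0, 0.1, n_draws)
beta = rng.normal(0.05, 0.01, n_draws)
tau = np.abs(rng.normal(0.5, 0.05, n_draws))
phi = np.abs(rng.normal(10.0, 1.0, n_draws))

x_new = 2020 - 2010                # new year minus an assumed means_X
u_new = rng.normal(0.0, tau)       # one fresh group effect per posterior draw
mu = np.exp(intercept + u_new + x_new * beta)

# numpy parameterizes NB by (n, p); with mean mu and shape phi, p = phi/(phi+mu)
y_new = rng.negative_binomial(phi, phi / (phi + mu))
print("posterior predictive mean for the new year:", y_new.mean())
```

The key point is that u_new is redrawn per posterior draw, so the prediction carries both the posterior uncertainty in tau and the between-year variation.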
Anyway, I am slightly puzzled why you opted to move away from brms, as the model you are using is well within its capabilities. Obviously, if your goal is to get better at Stan, then this is a good approach :-)
I think Stan can help me understand Bayes better, although I only know a little.
I am curious how this works.
It might be more straightforward in the end, but Stan code is fun!
Not sure what you're trying to say. Is there a mistake in my code/explanation? That is well possible, but I can't see it, so I'd need more details :-)
I am just curious to see the difference. I agree with everything you said!