Diminishing returns on wall time vs sample size / inefficient way of writing mixture model?

bmfazio · November 3, 2018, 12:27pm

I was playing around with a simple binomial regression model, checking parameter recovery on simulated data at different sample sizes, and there’s a clear relationship between sample size and fit wall time. I haven’t examined it thoroughly, but I would assume it’s O(n)?

In that case, can it reasonably be said that using a very large sample is overkill, even when the data are available? Would taking a subsample be a recommended approach when it provides estimates that are accurate enough?
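
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a regular model with roughly exchangeable rows, a per-gradient cost proportional to N, and a number of gradient evaluations per iteration that does not change much with N. None of these are things I have verified for this model. Under those assumptions, keeping a fraction f of the rows gives:

$$
\mathrm{sd}(\hat{\beta} \mid fN) \approx \frac{\mathrm{sd}(\hat{\beta} \mid N)}{\sqrt{f}},
\qquad
\mathrm{time}(fN) \approx f \cdot \mathrm{time}(N)
$$

So with f = 1/4 the fit would take about a quarter of the time while the posterior standard deviations roughly double. Whether that is "accurate enough" depends on the precision the application needs.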

This is the model I’m using:

data {
  int<lower=1> N; // sample size (data.frame rows)
  int<lower=1> Kx; // number of covariates for mean
  
  int<lower=1> n[N]; // # of attempts (binomial parameter)
  int<lower=0> y[N]; // # of successes (outcome)
  
  matrix[N, Kx] x; // covariate matrix for mean
}

parameters {
  vector[Kx] bx; // coeffs for beta mean
}

model {
  real mu_beta;

  for (i in 1:N) {
    mu_beta = inv_logit(x[i]*bx);

    target +=
    log(
      exp( binomial_lpmf(y[i] | n[i], mu_beta) )
      );
  }
}
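
For comparison, the toy likelihood on its own can be written without the loop. This is a minimal sketch that assumes the data and parameters blocks above stay unchanged: since log(exp(a)) = a, the increment reduces to the plain lpmf, and the built-in binomial_logit_lpmf accepts the linear predictor directly, so inv_logit is not needed.

```stan
model {
  // Same likelihood as the loop version for the toy model:
  // log(exp(binomial_lpmf(...))) is just binomial_lpmf(...),
  // and binomial_logit works on the logit scale, so x * bx is passed as-is.
  target += binomial_logit_lpmf(y | n, x * bx);
}
```

This only applies to the plain binomial case; it does not carry over to the mixture as written.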

I realize the loop that increments the log-likelihood is very inefficient for this toy model (the log(exp(...)) just undoes itself), but in my real application I’m using a mixture to introduce 0 and n inflation on the binomial, which looks like this:

target +=
    log(
      ymin[i]*p[1] + ymax[i]*p[3] + p[2]*exp( beta_binomial_lpmf( y[i] | n[i], mu_beta/rho, (1-mu_beta)/rho ) )
      );

So it would remain an issue down the line, unless there’s a better way to specify the above.
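
One possible alternative is to keep the mixture on the log scale with log_sum_exp, which avoids exponentiating the lpmf and only evaluates the inflation terms where they can be nonzero. The sketch below rests on assumptions about parts of the model not shown above: that p is a simplex[3] ordered as (zero inflation, beta-binomial, n inflation), that rho is a positive overdispersion parameter, and that ymin[i] and ymax[i] are indicators for y[i] == 0 and y[i] == n[i]. It also uses the current array[N] int declaration syntax, where the model above uses the older int n[N] form.

```stan
data {
  int<lower=1> N;            // sample size
  int<lower=1> Kx;           // number of covariates for mean
  array[N] int<lower=1> n;   // # of attempts
  array[N] int<lower=0> y;   // # of successes
  matrix[N, Kx] x;           // covariate matrix for mean
}

parameters {
  vector[Kx] bx;             // coeffs for mean
  simplex[3] p;              // ASSUMED: mixture weights (zero, beta-binomial, n)
  real<lower=0> rho;         // ASSUMED: overdispersion, as in mu_beta/rho
}

model {
  // Compute the mean and log weights once, outside the loop.
  vector[N] mu = inv_logit(x * bx);
  vector[3] log_p = log(p);

  for (i in 1:N) {
    // Log of p[2] * beta-binomial pmf, kept on the log scale.
    real lp_bb = log_p[2]
      + beta_binomial_lpmf(y[i] | n[i], mu[i] / rho, (1 - mu[i]) / rho);

    if (y[i] == 0) {
      // Replaces ymin[i] * p[1] + p[2] * exp(...)
      target += log_sum_exp(log_p[1], lp_bb);
    } else if (y[i] == n[i]) {
      // Replaces ymax[i] * p[3] + p[2] * exp(...)
      target += log_sum_exp(log_p[3], lp_bb);
    } else {
      // Neither inflation term contributes.
      target += lp_bb;
    }
  }
}
```

Because n has a lower bound of 1, y[i] == 0 and y[i] == n[i] cannot both hold, so the three branches cover the same cases as the original expression. No priors are included here, matching the model above.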
