Case Study: Stochastic Production and Cost Frontiers

I’m using Stan for a few different projects and decided it might be useful to write out the basic stochastic frontier models as a case study. These are “composed error” models used for efficiency analysis in econometrics. The models are not that complicated, but I would appreciate any feedback you can provide. In particular, is there a more elegant way of combining a user-defined likelihood function with the loo package? I’ve found Stan to be a useful piece of software for my work, but I’m never quite sure that I’m using best practices.

Check the repository for the case study at
https://bitbucket.org/aframsey/stancase_frontier/src/master/
or view a PDF of the final output. The PDF is a little wonky.
spfstan.pdf (144.4 KB)

If there’s interest in this case study, I may update it with a few of the more advanced models for panel data, or with hierarchical models. But that might generate a whole new set of questions for the Stan forums!


Just for concreteness, I’m pasting in one of the models from the linked repo:

functions {
   
   // Half-normal log density for vector y with parameter sigma:
   //   f(y_i) = (2 * sigma / pi) * exp(-y_i^2 * sigma^2 / pi),  for y_i > 0
   real hnormal_lpdf(vector y, real sigma) {

      int N = rows(y);

      vector[N] prob;
      real lprob;
      for (i in 1:N) {
         prob[i] = log(2*sigma)-log(pi())-(y[i]^2*sigma^2)/pi();
      }
      lprob = sum(prob);
      return lprob;
   }

   // Log density of the normal/half-normal composed error (noise plus inefficiency)
   // for a single observation; used to fill log_lik for loo
   real nhnormal_pdf(real y, row_vector x, real alpha, vector beta, real sigma, real lambda) {
      real sigs = sqrt(sigma^2 + lambda);   // scale of the composed error
      real delt = sqrt(lambda) / sigma;     // ratio of inefficiency to noise scale
      real epsi = y - alpha - x * beta;     // composed residual
      return 0.5 * log(2 / pi()) - log(sigs)
             + normal_lcdf(epsi * delt / sigs | 0, 1)
             - epsi^2 / (2 * sigs^2);
   }
}

data {
   int<lower=0> N; // Number of observations
   int<lower=0> K; // Number of factors of cost function
   matrix[N, K] x; // Matrix of factors of cost function
   vector[N] y;    // Vector of costs
}
parameters {
   real alpha;            // Intercept
   vector[K] beta;        // Coefficients of the cost function
   real<lower=0> sigma;   // Error standard deviation
   real<lower=0> lambda;  // Parameter of half-normal inefficiency distribution
   vector<lower=0>[N] u;  // Inefficiency terms
}
model {
   alpha ~ normal(0, 100);
   beta ~ normal(0, 100);
   sigma ~ inv_gamma(0.001, 0.001);
   lambda ~ gamma(1, 1/37.5);
   u ~ hnormal(lambda);
   y ~ normal(alpha + x * beta + u, sigma);
}
generated quantities {
   vector[N] eff;
   vector[N] log_lik;
   eff = exp(-u);
   for (i in 1:N) {
      log_lik[i] = nhnormal_pdf(y[i], x[i], alpha, beta, sigma, lambda);
   }
}

First, we don’t recommend those diffuse gamma priors—there’s a discussion in the Stan manual in the chapter on regression. We generally recommend at least weakly informative priors that indicate the scale of the parameters.
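
For example, if the outcome and covariates are roughly unit scale, the prior lines in the model block could be replaced with something like the following (the specific scales are placeholders you would adjust to the scale of your data):

   alpha ~ normal(0, 5);
   beta ~ normal(0, 5);
   sigma ~ normal(0, 1);    // half-normal because of the <lower=0> constraint
   lambda ~ normal(0, 1);   // half-normal because of the <lower=0> constraint

These at least encode the expected order of magnitude of the parameters instead of being nearly flat over many orders of magnitude.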

You can vectorize for efficiency and you don’t need all those intermediates. This,

   real hnormal_lpdf(vector y, real sigma) {

      int N = rows(y);

      vector[N] prob;
      real lprob;
      for (i in 1:N) {
         prob[i] = log(2*sigma)-log(pi())-(y[i]^2*sigma^2)/pi();
      }
      lprob = sum(prob);
      return lprob;
   }

can be reduced to a much more efficient one-liner

real hnormal_lpdf(vector y, real sigma) {
  return rows(y) * (log(2 * sigma) - log(pi()))
     - dot_self(y) / pi() * square(sigma);
}

with only N + 3 multiplications, one division, and two subtractions. I also took the additional measure of guessing that y might be data, so the dot_self and the division by pi() are grouped together to minimize the operations involving parameters (which is where the real cost is, because of the need to store the expressions and compute derivatives).
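
For reference, the reduction is just pulling the terms that do not depend on $i$ out of the sum computed by the original loop:

$$\sum_{i=1}^{N}\left[\log(2\sigma) - \log\pi - \frac{y_i^2\,\sigma^2}{\pi}\right]
= N\bigl(\log(2\sigma) - \log\pi\bigr) - \frac{\sigma^2}{\pi}\sum_{i=1}^{N} y_i^2,$$

and $\sum_{i=1}^{N} y_i^2$ is exactly what dot_self(y) computes.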


Hi,

I know that your post about stochastic production frontiers is quite old, but I’m interested in this kind of model. Is your project still alive?

I am also planning to use Stan for estimating stochastic frontier models.

@xbarber and @aframsey: Did you proceed with your projects and do you want to share your experiences, challenges, solutions, lessons learned, successes, etc.?

Dear all,

I have been experimenting with stochastic frontier / efficiency models in Stan in the context of Data Envelopment Analysis (DEA), and thought it might be useful to briefly share my experience.

In my case the frontier is specified as a regression model with an additive noise term and a one–sided inefficiency term, and efficiency is recovered from the posterior of the inefficiency. Concretely, I use a model of the form

$y_i = \alpha + x_i^{\top}\beta - u_i + \varepsilon_i$,

with $\varepsilon_i \sim \text{Normal}(0, \sigma)$ and $u_i > 0$ following an exponential/gamma–type prior. Technical efficiency is then defined as $\exp(-u_i)$, which in Stan is implemented in the generated quantities block as something like:


generated quantities {
   vector[N] eff;
   for (n in 1:N)
      eff[n] = exp(-u[n]);
}

In an applied example with 3 inputs and 1 output, I fitted both:

  • an INLA implementation using iid / clinear latent effects for the inefficiency term, and

  • a Stan implementation of the stochastic frontier as above,

and compared the resulting efficiency scores. The Stan model behaved well (after some tuning of adapt_delta and warmup) and produced efficiency estimates that were reasonably consistent with the INLA version.

The main practical issues I encountered were:

  • choosing a parameterization and priors for the inefficiency distribution that avoid strong identifiability problems between $u_i$ and $\sigma$;

  • the need to standardize covariates and sometimes log–transform the output to improve geometry and reduce divergences;

  • deciding between different one–sided distributions for $u_i$ (half–normal vs exponential vs gamma) depending on the application (see the sketch just after this list).
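
With vector<lower=0>[N] u; declared in the parameters block, those alternatives differ only in the sampling statement for u. A rough sketch (sigma_u, a_u, and b_u are placeholder parameters that would need their own declarations and priors):

   // pick one of the following for the one-sided inefficiency term
   u ~ normal(0, sigma_u);     // half-normal, since u is constrained to be positive
   u ~ exponential(lambda);    // exponential with rate lambda
   u ~ gamma(a_u, b_u);        // gamma with shape a_u and rate b_u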

If there is interest, I can clean up my Stan code into a minimal working example (3 inputs, 1 output) and share it here, and I would also be very happy to discuss how to extend this to more standard stochastic cost/production frontiers or hierarchical settings.

Best regards,

Xavier

PS: actual Stan code

// Stochastic frontier / DEA-style model
data {
   int<lower=1> N;           // Sample size
   int<lower=1> K;           // Number of predictors
   matrix[N, K] X;           // Predictors (ideally centered / scaled)
   vector[N] y;              // Outcome (often log-transformed)
}

parameters {
   real alpha;               // Intercept
   vector[K] beta;           // Slopes

   real<lower=0> sigma;      // Noise SD

   real<lower=0> lambda;     // Global rate parameter for inefficiency
   vector<lower=0>[N] u;     // Inefficiency terms
}

transformed parameters {
   vector[N] mu;
   mu = alpha + X * beta - u;
}

model {
   // Weakly informative priors (adjust to your scale)
   alpha  ~ normal(0, 5);
   beta   ~ normal(0, 2);

   sigma  ~ exponential(1);          // Prior on noise
   lambda ~ exponential(1);          // Prior on mean inefficiency

   // Inefficiency: gamma(1, lambda) == exponential(lambda)
   u ~ gamma(1, lambda);

   // Likelihood
   y ~ normal(mu, sigma);
}

generated quantities {
   vector[N] eff;
   // Technical efficiency = exp(-u_i)
   eff = exp(-u);
}