I am trying to fit a stochastic frontier model in Stan. My purpose is to separate out the persistent cost inefficiency that lies hidden in the random effects. The separation of random effects and persistent cost inefficiency is based on differences in their distributions: the pure random effects are captured by a normal distribution, while the inefficiency is captured by an exponential or half-normal distribution. The literature usually uses a skew-normal distribution to capture both.

I recently came across the exponentially modified normal distribution in Stan. It works well, but I do not know how to extract the inefficiency part. I get the combination of random effects and inefficiency from ‘p’ (as given below). Could someone tell me how I can get the exponential part, please?
So here is my code:

data {
  int<lower=1> N;                      // number of observations
  int<lower=1> G;                      // number of groups
  int<lower=1> P;                      // number of predictors in the regression
  vector[N] Y;                         // dependent variable
  matrix[N,P] X;                       // matrix of independent variables
  int<lower=1,upper=G> dhb_id[N];      // group index of each observation
}

parameters {
  real alpha;                          // constant term
  vector[P] beta;                      // coefficients on the different inputs
  vector<lower = 0>[G] p;              // group effect: random effect plus inefficiency
  real<lower = 0> sigma;               // measurement error standard deviation
  vector<lower = 0>[G] lambdap;

I’m not sure how many people will be familiar with this model, so it may make sense to state the model explicitly first. What is “persistent cost inefficiency”? What is a “stochastic frontier”? Is it a linear model? Can you write it down as formulae?

Hi Emiruz,
Thanks for replying. Yes, it is a linear model of the following form:

Y ~ normal(alpha + X*beta + p[dhb_id] , sigma);

I am interested in the parameter ‘p’, which is distributed as exp_mod_normal with parameters mu, tau and lambda. My attempt was to find the mean of the exponential part. I just realised that if the exponential part of ‘p’ has rate lambda, then its mean is equal to 1/lambda. Do you agree? Thanks once again for your help. Looking forward to hearing back from you.
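
One possible way to get at the exponential part is to stop sampling ‘p’ directly and instead sample its two components, since a normal plus an independent exponential is marginally exp_mod_normal. The following is only a sketch under assumptions that are not in the code above: the names v and u are made up for illustration, a single rate lambda is shared by all groups (the original declares a vector lambdap), and mu is fixed at 0 because the model already has the intercept alpha. Priors are left out and would need to be chosen for the data.

```stan
data {
  int<lower=1> N;                          // number of observations
  int<lower=1> G;                          // number of groups
  int<lower=1> P;                          // number of predictors
  vector[N] Y;                             // dependent variable
  matrix[N, P] X;                          // independent variables
  array[N] int<lower=1, upper=G> dhb_id;   // array syntax for Stan 2.26+; older versions use int ... dhb_id[N]
}
parameters {
  real alpha;                // constant term
  vector[P] beta;            // coefficients on the inputs
  vector[G] v;               // hypothetical name: pure random effect
  vector<lower=0>[G] u;      // hypothetical name: persistent inefficiency
  real<lower=0> tau;         // sd of the random effect
  real<lower=0> lambda;      // assumed single rate of the exponential part
  real<lower=0> sigma;       // sd of the measurement error
}
transformed parameters {
  // Sum of the two components; marginally exp_mod_normal(0, tau, lambda)
  vector[G] p = v + u;
}
model {
  // Priors on alpha, beta, tau, lambda and sigma are omitted in this sketch
  v ~ normal(0, tau);
  u ~ exponential(lambda);
  Y ~ normal(alpha + X * beta + p[dhb_id], sigma);
}
generated quantities {
  real mean_u = 1.0 / lambda;   // mean of the exponential (inefficiency) part
}
```

With this layout the posterior draws of u[g] would be the group-level inefficiency and mean_u its population mean. Note that alpha and the mean of u are told apart only through the skewness of the group effects, so with few groups the split may be weakly identified.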

The definition of the exponentially modified normal says it is the distribution of a sum of two independent random variables (a Gaussian and an exponential), with mean mu + 1/lambda. So I think you’re right that the mean of the exponential component is the inverse of its rate (i.e. 1/lambda). But do test it: generate some data from a known distribution and see if you can recover the parameters as you’d expect.
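
A rough sketch of the simulation side of that check, written as a Stan program that only generates data. It assumes known values of mu, tau and lambda passed in as data, and the names v, u, sample_mean_u and sample_mean_p are made up for illustration. It has no parameters block, so it would be run with the fixed_param sampler.

```stan
data {
  int<lower=1> G;            // number of simulated group effects
  real mu;                   // known mean of the Gaussian part
  real<lower=0> tau;         // known sd of the Gaussian part (must be > 0)
  real<lower=0> lambda;      // known rate of the exponential part (must be > 0)
}
generated quantities {
  vector[G] v;               // Gaussian component
  vector[G] u;               // exponential component
  vector[G] p;               // their sum, exponentially modified normal
  real sample_mean_u;
  real sample_mean_p;
  for (g in 1:G) {
    v[g] = normal_rng(mu, tau);
    u[g] = exponential_rng(lambda);
    p[g] = v[g] + u[g];
  }
  sample_mean_u = mean(u);   // should be close to 1 / lambda for large G
  sample_mean_p = mean(p);   // should be close to mu + 1 / lambda for large G
}
```

The simulated p (or a Y built from it) can then be fed to the fitted model to see whether lambda, and hence 1/lambda, comes back near the value used to generate it.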