Is it possible to build a multi-level extreme value analysis?

blakeobeans · May 30, 2020, 4:54am

I’ve built an extreme value analysis model using the “point over threshold” method, which utilizes a generalized pareto distribution. It’s similar to @avehtari’s case study, except I am using the POT package in R, which uses maximum likelihood to estimate model parameters.

More specifically to my research, I’m studying 22 different buildings and looking at energy use in each of them under extreme weather conditions. So I’ve actually built 22 different models, estimated 22 different sets of parameters, etc. (The CDF for all 22 models is below, with the dark line representing the mean of the parameters. It is not multi-level.)

Rplot01

I’m wondering whether it’s possible to do something very similar to Aki’s model, but to do so in a multi-level framework, thereby estimating hyper-parameters and having a single “average” estimate of parameters.

Is this possible? Could I build off of Aki’s code? If someone says it’s possible, I’d like to take it on in this thread and work through it with y’all.

Many thanks.

avehtari · May 30, 2020, 8:09am

Yes, it’s possible. Couple student groups have done this as part of their project in my Bayesian data analysis course at DTU. Sorry, I don’t have their code and I can’t remember the names of the students, but making the model is feasible.

blakeobeans · June 14, 2020, 9:42pm

Hi, I’ve fit the generalized Pareto PDF with a multi-level structure. I’ve included the Stan code and data. Note the data has a ragged structure because there are varying observations per group. There are some divergences but everything converges nicely. Data and r script included.

functions {
  real gpareto_lpdf(vector y, real ymin, real k, real sigma, real s) {
    // generalised Pareto log pdf 
    real N = s;
    real inv_k = inv(k);
    if (k<0 && max(y-ymin)/sigma > -inv_k)
      reject("k<0 and max(y-ymin)/sigma > -1/k; found k, sigma =", k, sigma)
    if (sigma<=0)
      reject("sigma<=0; found sigma =", sigma)
    if (fabs(k) > 1e-15)
      return -(1+inv_k)*sum(log1p((y-ymin) * (k/sigma))) -N*log(sigma);
    else
      return -sum(y-ymin)/sigma -N*log(sigma); // limit k->0
      }
}
data {
	//threshold
  real<lower=0> ymin;
   // number of observations
  int<lower=1> N;
  // number of groups
  int<lower=-1> K;
  //observations
  vector<lower=ymin>[N] y;
  // group sizes 
  int<lower=0> s[K];
  // ymax
  vector<lower=ymin>[K] ymax;
}

parameters {
//overall mean
real k_mu;
//overall mean's sd
real<lower=0> k_sigma;
//sigma mu
real sigma_mu;
//overall sd
real<lower=0>sigma_sigma;
//y-specific
vector<lower=0>[K] sigma; 
vector<lower=-sigma[K]/(ymax[K]-ymin)>[K] k; 

}
model {
	int pos;
	pos = 1;
	
  //priors
  k_mu ~ normal(0, 3);
  k_sigma ~ normal(0, 3);
  sigma_mu ~ normal(0,3);
  sigma_sigma ~ normal(0, 3);
  k ~ normal(k_mu, k_sigma);
  sigma ~ normal(sigma_mu, sigma_sigma);
  
  for(i in 1:K){
  segment(y, pos, s[i]) ~ gpareto(ymin, k[i], sigma[i], s[i]);
  pos = pos + s[i];
  }
}

rstan script.R (1.3 KB)
gpareto.stan (1.4 KB)
data (ccl.csv) (740.9 KB)

Topic		Replies	Views
Add generalized pareto distribution Developers	6	843	April 19, 2022
Fitting a multilevel GPD to negated data Modeling rstan	2	439	August 4, 2023
Help with defining the generalized pareto function Modeling rstan , techniques	0	505	February 14, 2022
Case study: Extreme value analysis and user defined probability functions in Stan Modeling	4	2301	November 15, 2017
Is there an easy way to add censored data to the generalized pareto distribution? Modeling	0	358	October 15, 2021

Is it possible to build a multi-level extreme value analysis?

Related topics