Case study on Loss Curves (Actuarial Science)

We just added a new case study:

Modelling Loss Curves in Insurance with RStan

Mick Cooney

http://mc-stan.org/users/documentation/case-studies/losscurves_casestudy.html

Abstract
Loss curves are a standard actuarial technique for helping insurance companies assess the amount of reserve capital they need to keep on hand to cover claims from a line of business. Claims made and reported for a given accounting period are tracked separately over time. This enables the use of historical patterns of claim development to predict expected total claims for newer policies.

In insurance, depending on the types of risks, it can take many years for an insurer to learn the full amount of liability incurred on policies written in any particular year. At a given point in time after a policy is written, some claims may not yet have been reported or known about, and other claims may still be working through the legal system, so the final amount due has not been determined.

Total claim amounts from a single accounting period are laid out in a single row of a table, with each column showing the total claim amount after that period of time. Subsequent accounting periods have had less time to develop, so the data takes a triangular shape - hence the term 'loss triangles'. Using previous patterns, data in the upper part of the triangle is used to predict values in the unknown lower triangle, giving the insurer a probabilistic forecast of the ultimate claim amounts to be paid for all business written.

The ChainLadder package provides functionality to generate and use these loss triangles.

In this case study, we take a related but different approach: we model the growth of the losses in each accounting period as an increasing function of time, and use the model to estimate the parameters that determine the shape and form of this growth. We also use the sampler to estimate the values of the "ultimate loss ratio", i.e. the ratio of the total claims for an accounting period to the total premium received to write those policies. We treat each accounting period as a cohort.
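To give a feel for the structure of the model, here is a heavily simplified and untested Stan sketch of a growth-curve approach like the one described above; the names, the Weibull-style growth form, and the priors are illustrative choices rather than the case study's actual code:

```stan
// Simplified sketch of a growth-curve loss model: cumulative losses for each
// cohort grow towards premium * ultimate-loss-ratio as development time increases.
data {
  int<lower=1> n_obs;                          // observed (cohort, dev-time) cells
  int<lower=1> n_cohort;                       // number of accounting periods
  int<lower=1, upper=n_cohort> cohort[n_obs];  // accounting-period index per cell
  vector<lower=0>[n_obs] t;                    // development time since period start
  vector<lower=0>[n_obs] loss;                 // cumulative reported loss
  vector<lower=0>[n_cohort] premium;           // premium written in each period
}
parameters {
  real<lower=0> omega;                         // growth-curve shape
  real<lower=0> theta;                         // growth-curve scale
  vector<lower=0>[n_cohort] LR;                // ultimate loss ratio per cohort
  real<lower=0> sigma;                         // observation noise on the log scale
}
model {
  vector[n_obs] mu;
  for (i in 1:n_obs) {
    // Weibull-style growth factor in (0, 1): share of ultimate loss emerged by time t
    real gf = 1 - exp(-pow(t[i] / theta, omega));
    mu[i] = log(premium[cohort[i]] * LR[cohort[i]] * gf);
  }
  omega ~ lognormal(0, 0.5);
  theta ~ lognormal(0, 0.5);
  LR ~ lognormal(log(0.6), 0.3);
  sigma ~ normal(0, 0.5);
  loss ~ lognormal(mu, sigma);
}
```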


I left out a few references that I would like to add to the case study.

I will add them tonight. Once that is done, would it be possible to get the updates added to the version on the website?

Sure. Easiest for us if you can create a pull request on the web site's repo on stan-dev. But if you don't know how to use GitHub, it's probably not worth learning for this and I can do it for you.

Hi! I'm brand new to Stan, trying to explore MCMC for estimating reserve variability. My primary goal is to create a model that imitates the traditional chain ladder method of reserve estimation while allowing for some judgmental input via prior distributions. So it is slightly different from the probability-distribution-as-a-loss-curve approach taken in this case study, but definitely a related concept. The model I'm aiming for was first introduced in this paper, but the code there was written for WinBUGS, and I'm attempting to adapt the model for Stan.

My question reveals my naivety in Stan and in modeling in general, but I feel asking here is my best route to an answer. I'm learning a lot through the user manual and different forum threads, but one thing I haven't been able to work out yet is whether it's possible to use a traditional loss triangle (an N x N array with NAs in the lower triangle) as the data for the model. I know Stan treats variables declared in the data block as known, but I'm wondering whether the User's Guide sections 3.1 or 3.3 are hinting at a workaround for this. If I set the lower-triangle values equal to zero, can I index the upper and lower triangles separately, differentiate the two in the model block, and just ignore the "fit" for the lower triangle? Or am I really asking for (unsupported) ragged arrays here? In the case study it looks like a one-dimensional vector or array was used, so maybe I'm asking for too much. Hopefully this question makes sense. Thanks in advance!
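To make the zero-padding idea concrete, here is a rough, untested sketch of what I have in mind; the simple cross-classified log-normal structure and all the names are just placeholders:

```stan
// Rough sketch: pass the full N x N cumulative-loss triangle with zeros in the
// unobserved lower part, and only let the upper triangle contribute to the likelihood.
data {
  int<lower=1> N;                     // number of accident years = development years
  matrix<lower=0>[N, N] cum_loss;     // rows = accident year, cols = development year;
                                      // upper-triangle cells must be positive
}
parameters {
  vector[N] alpha;                    // accident-year level (log scale)
  vector[N] beta;                     // development-year effect (log scale)
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ normal(0, 1);
  for (ay in 1:N)
    for (dy in 1:(N - ay + 1))        // upper triangle only: ay + dy <= N + 1
      cum_loss[ay, dy] ~ lognormal(alpha[ay] + beta[dy], sigma);
}
```

The zero cells in the lower triangle are declared but never referenced, so they just sit unused in the data.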

I've found it best to use a matrix with the first column being the AY (accident year) index, the second the DY (development year) index, and the third the CY (calendar year) index (I do a lot of inflation modeling); the fourth and subsequent columns are then the paid, incurred, etc. values. Depending on the data/model these aren't strictly indices but rather times from the initial period, to handle fractional amounts.
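As a rough illustration (the names below are mine, not a prescribed layout), a data block for that long format could look something like this:

```stan
// Sketch of a long-format data block: one row per observed triangle cell,
// so the missing lower triangle never appears in the data at all.
data {
  int<lower=1> n_obs;                        // number of observed cells
  int<lower=1> n_ay;                         // number of accident years
  int<lower=1, upper=n_ay> ay[n_obs];        // accident-year index
  vector<lower=0>[n_obs] dev_time;           // development time (can be fractional)
  vector<lower=0>[n_obs] cal_time;           // calendar time, e.g. for inflation terms
  vector<lower=0>[n_obs] paid;               // paid losses for that cell
  vector<lower=0>[n_obs] incurred;           // incurred losses for that cell
}
```

Only the observed cells go in, so there are no NAs to work around, and predictions for the unobserved lower triangle can be produced in generated quantities from the fitted parameters.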


I agree with Stephenll that data in this "long" format is easier to model in Stan. I am also currently exploring these types of actuarial reserving models, although I am definitely still a relative beginner compared to many people here on this forum. I have found Markus Gesmann's blog (Correlated log-normal chain-ladder model), Glenn Meyers's CAS monographs, and Markus Gesmann's research paper Hierarchical Compartmental Reserving Models to be great beginner-friendly resources for getting started with various Bayesian model structures.

I have a few questions that have come up over the last few months of learning Bayesian modelling in this context:

  • Have people found more success in fitting to cumulative losses or to incremental losses? My issue with modelling incremental losses is that they are often negative, and I'm not aware of an appropriate or well-studied response distribution whose support includes negative values while still having a skewed right tail.

  • One of the most commonly used stochastic reserving models in the industry models incremental losses with a GLM using an over-dispersed Poisson distribution. Has anyone found a resource that describes how to replicate this model in Stan, or is there a Bayesian equivalent? I also have the issue that the Poisson distribution takes a count/integer response, and Stan won't allow you to use it for a continuous response.

  • If I have a vector of industry or benchmark CDFs for each line of business that I'm reserving for, is there a way to set up a function that gives credibility to the data such that earlier development periods rely more on the data and later development periods rely more on the industry/benchmark? Ideally I would like this distribution to converge to 1 fairly quickly for the triangles I work on.
    I tried thinking about this in two different ways. The first was an explicit weighting, beta_weight = w1 * beta_ind + (1 - w1) * beta_data, where w1 goes from 0 to 1 over time. The other was to manipulate the priors on beta_data so that they have mean beta_ind and decreasing variance, so that the posterior distributions are drawn more and more towards beta_ind as development increases; a rough sketch of that second idea is included just after this list. I'm leaning towards the second being the better approach, but I'm not sure whether it's an appropriate way to model this and I haven't had time to implement and test it yet.
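Here is a very rough, untested sketch of that second approach; beta_ind, the 0.5 starting scale, and the exponential decay rate are all placeholder choices:

```stan
// Sketch of the prior-based credibility idea: per-development-period parameters with
// priors centred on the industry benchmark, whose scale shrinks as development
// increases so that later periods are pulled harder towards the benchmark.
data {
  int<lower=1> n_dev;              // number of development periods
  vector[n_dev] beta_ind;          // industry / benchmark parameter by development period
}
parameters {
  vector[n_dev] beta_data;         // data-driven parameter by development period
}
model {
  for (d in 1:n_dev) {
    real tau = 0.5 * exp(-0.3 * (d - 1));   // prior scale decays with development period
    beta_data[d] ~ normal(beta_ind[d], tau);
  }
  // the likelihood for the observed losses, written in terms of beta_data, would go here
}
```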

I would also be very appreciative if anyone could point me to references for work that has used Bayesian models in a pricing context with more predictive variables (especially for the WC, GL, and AL lines of business).

Building out these types of models as someone with no formal higher level statistical training is quite challenging, but I really hope to be exploring this area more and more over the next few years.

I answered many of my own questions above by finding this excellent and well-written paper by David Clark. It looks tricky for me to implement in Stan, but as I get some free time I'll work my way through it.

Cheers to anyone who tries to implement Bayesian actuarial reserving models in their work - there are a lot of nuances to this type of data structure and time-dependent model that are really tricky for a beginner to get right without some provided Stan code to reference.
