Minimum Bias and Exponential Family Distributions

research

probability

minimum bias

glm

exponential family

Author

Stephen J. Mildenhall

Published

2020-10-21

Introduction

This post introduces four short expository notes on the exponential family distributions often used to model insurance losses. The exponential family underlies generalized linear models (GLMs), a “near-ubiquitous” [1] insurance modeling technique. The notes have an unashamedly theoretical focus and aim to educate an advanced modeler or mathematically curious actuary. You don’t need to know how a car works to commute in it, but it helps if you are a race car driver, and at the cutting edge of pricing in a competitive market, you are in a race.

The four parts are as follows.

Part I reexamines the Bailey and Simon minimum bias method. It presents an overview of how the actuarial approach to modeling has evolved since 1960 as it incorporated statistical models and GLMs. It explains why exponential family distributions are an ideal choice for modeling losses.
Part II describes nine different ways to define an exponential family distribution. Each approach reveals a different statistical property, and the fact that there are nine reflects the richness of the family. The variance function, relating variance to mean, emerges as a new way to define a distribution.
Part III analyzes probability models for insurance losses, with an emphasis on ways to extend a compound Poisson distribution, how to embed a static model into a dynamic one, and how to create new distributions from old ones.
Part IV demystifies the Tweedie-power variance function families of distributions. These distributions are particularly important because they appear as limiting types for small and large expected losses from most exponential family distributions.

Each part contains its own introduction.
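As a small preview of Parts III and IV, the sketch below converts a Tweedie mean, dispersion and power into compound Poisson and gamma parameters and checks the implied moments. It assumes the usual convention that a Tweedie variable with mean \(\mu\), dispersion \(\phi\) and power \(1<p<2\) has variance \(\phi\mu^p\); the function names and the shape/scale gamma parameterization are illustrative choices, not notation taken from the notes.

```python
import math


def tweedie_to_compound_poisson(mu: float, phi: float, p: float):
    """Map Tweedie (mu, phi, p), 1 < p < 2, to (claim rate, gamma shape, gamma scale).

    Assumes Var(Y) = phi * mu**p. Names are hypothetical helpers for illustration.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson representation needs 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # Poisson expected claim count
    alpha = (2.0 - p) / (p - 1.0)               # gamma severity shape
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma severity scale
    return lam, alpha, scale


def compound_poisson_moments(lam: float, alpha: float, scale: float):
    """Mean and variance of a Poisson(lam) sum of iid gamma(alpha, scale) severities."""
    severity_mean = alpha * scale
    severity_second_moment = alpha * (alpha + 1.0) * scale ** 2
    # For a compound Poisson: mean = lam * E[X], variance = lam * E[X^2].
    return lam * severity_mean, lam * severity_second_moment


mu, phi, p = 100.0, 2.0, 1.5
lam, alpha, scale = tweedie_to_compound_poisson(mu, phi, p)
mean, variance = compound_poisson_moments(lam, alpha, scale)
assert math.isclose(mean, mu)
assert math.isclose(variance, phi * mu ** p)
```

The check is purely algebraic, using moments rather than simulation, so it confirms only that the parameter mapping reproduces the mean and the power variance relationship.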

What will the actuary learn?

Why exponential family distributions are so useful and tractable.
How to identify and interpret the different components defining an exponential family distribution.
How to use the variance function to distinguish distributions that can be used to model losses.
How to discern the small and large expected loss behavior of a loss distribution from its variance function, and whether it is discrete, mixed or continuous.
How to incorporate prior knowledge by choosing an appropriate variance function. (Most GLM software allows for the use of custom error distributions defined by their variance functions.)
How to measure residual error at the observation level using the variance function.
How to use a size-of-loss frequency modeling paradigm, similar to that used by catastrophe models, and how it extends the frequency and severity paradigm.
How to embed a static model into a dynamic stochastic (Lévy) process.
How to build a general Lévy process from its size of loss frequency distribution.
Why the distributions in the power variance function family appear in the order they do.
Why the Tweedie is a compound Poisson distribution with a gamma severity.
Which distribution can be regarded as the universal severity.
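To make the variance function items above concrete, here is a minimal sketch of one standard observation-level error measure, the unit deviance \(d(y,\mu)=2\int_\mu^y (y-t)/V(t)\,dt\), computed directly from a variance function \(V\). This introduction does not spell out the formula, so treat it as an assumed textbook definition; the function name `unit_deviance` and the use of Simpson's rule are illustrative choices.

```python
import math
from typing import Callable


def unit_deviance(y: float, mu: float, variance: Callable[[float], float],
                  steps: int = 2000) -> float:
    """Unit deviance 2 * integral_mu^y (y - t) / V(t) dt via composite Simpson's rule."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of intervals
    h = (y - mu) / steps                # negative when y < mu, which is fine

    def integrand(t: float) -> float:
        return (y - t) / variance(t)

    total = integrand(mu) + integrand(y)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(mu + k * h)
    return 2.0 * total * h / 3.0


# Closed forms for three power variance functions V(t) = t**p.
closed_forms = {
    0: lambda y, mu: (y - mu) ** 2,                               # normal
    1: lambda y, mu: 2.0 * (y * math.log(y / mu) - (y - mu)),     # Poisson
    2: lambda y, mu: 2.0 * ((y - mu) / mu - math.log(y / mu)),    # gamma
}

for p, closed in closed_forms.items():
    for y, mu in ((3.0, 2.0), (1.0, 2.0)):
        numeric = unit_deviance(y, mu, lambda t, p=p: t ** p)
        assert abs(numeric - closed(y, mu)) < 1e-9
```

Only the variance function enters the calculation, which is why a custom error distribution can be specified through \(V\) alone.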

The four parts overlap, and there is some duplication between them to help make each stand-alone. The core material builds sequentially, and any undefined term will appear in an earlier part.

Enjoy!

Abbreviations

Abbreviation	Meaning
CP	Compound Poisson
EDM	Exponential Dispersion Model
EF	Exponential Family
GHS	Generalized Hyperbolic Secant distribution
GLM	Generalized Linear Model
IACP	Infinite Activity Compound Poisson
ID	Infinitely Divisible
iid	Independent and identically distributed
MGF	Moment Generating Function
MLE	Maximum Likelihood Estimator
MVB	Minimum Variance Bound
NEF	Natural Exponential Family
PVF	Power Variance Family

The end of an example or exercise is marked with a square.

The gamma function is defined for \(\alpha>0\) by \[\Gamma(\alpha):=\int_0^\infty x^{\alpha-1} e^{-x}\,dx.\] Integration by parts shows \(\Gamma(\alpha+1)=\alpha\Gamma(\alpha)\), and since \(\Gamma(1)=1\) it follows that \(\Gamma(n)=(n-1)!\) for positive integer \(n\).
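A quick numerical sanity check of the recursion and the factorial identity, using only Python's standard library. The helper name `gamma_recursion_gap` is hypothetical, and the check is a sketch rather than a proof.

```python
import math


def gamma_recursion_gap(alpha: float) -> float:
    """Relative gap between Gamma(alpha + 1) and alpha * Gamma(alpha)."""
    lhs = math.gamma(alpha + 1.0)
    rhs = alpha * math.gamma(alpha)
    return abs(lhs - rhs) / abs(lhs)


# The recursion holds for non-integer arguments as well as integers.
for alpha in (0.5, 2.3, 7.0):
    assert gamma_recursion_gap(alpha) < 1e-12

# Gamma(n) = (n - 1)! for positive integers n.
for n in range(1, 11):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))
```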

References

There are many good introductions to GLMs, including general treatments [2], [3], [4], [5], and [6], and specific actuarial applications [7], [8], [9], [10], [11], [12], [13], and [1].

1. Goldburd, M., Khare, A., Tevet, D.: Generalized Linear Models for Insurance Rating. CAS Monograph Series (2016)

2. McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapman & Hall, London; New York (1989)

3. Kaas, R., Goovaerts, M., Dhaene, J., Denuit, M.: Modern Actuarial Risk Theory. Springer (2008)

4. Dobson, A.J., Barnett, A.G.: An Introduction to Generalized Linear Models. Chapman & Hall/CRC (2008)

5. Heller, G.Z., De Jong, P.: Generalized Linear Models for Insurance Data. Cambridge University Press (2008)

6. Dunn, P.K., Smyth, G.K.: Generalized Linear Models With Examples in R. Springer, New York (2018)

7. Renshaw, A.E.: Modelling the Claims Process in the Presence of Covariates. ASTIN Bulletin. 24, 265–285 (1994). https://doi.org/10.2143/ast.24.2.2005070

8. Haberman, S., Renshaw, A.E.: Generalized linear models and actuarial science. Journal of the Royal Statistical Society Series D: The Statistician. 45, 407–436 (1996). https://doi.org/10.2307/2988543

9. Mildenhall, S.J.: A systematic relationship between minimum bias and generalized linear models. Proceedings of the Casualty Actuarial Society. LXXXVI, 393–487 (1999)

10. Smyth, G.K., Jørgensen, B.: Fitting Tweedie’s Compound Poisson Model to Insurance Claims Data: Dispersion Modelling. ASTIN Bulletin. 32, 143–157 (2002). https://doi.org/10.2143/ast.32.1.1020

11. Wüthrich, M.V.: Claims Reserving Using Tweedie’s Compound Poisson Model. ASTIN Bulletin. 33, 331–346 (2003). https://doi.org/10.1017/S0515036100013490

12. Taylor, G.: The Chain Ladder and Tweedie Distributed Claims Data. (2007)

13. Meyers, G.: Predictive Modeling with the Tweedie Distribution. (2009)