A new inference strategy for general population mortality tables

,


Introduction
General population mortality tables are crucial inputs for actuarial studies as they provide estimates of mortality rates for several age classes at several periods in 1 Milliman R&D, 14 Avenue de la Grande Armée, 75017 Paris, France.Email: alexandre.boumezoued@milliman.com 2 CEREMADE, CNRS-UMR 7534, Universite Paris Dauphine, Place du maréchal de Lattre de Tassigny 75016 Paris, France.Email: homann@ceremade.dauphine.fr 3CEREMADE, CNRS-UMR 7534, Universite Paris Dauphine, Place du maréchal de Lattre de Tassigny 75016 Paris, France.Email: jeunesse@ceremade.dauphine.frtime.Since the publication of the rst mortality tables (attributed to John Graunt in 1662), the mathematical problem of providing consistent statistical estimates of mortality has fascinated mathematicians -for a brief history the reader is referred to the well documented dedicated part of the introduction of Daley and Vere-Jones (2003).Two centuries later, there was a huge development of graphical formalizations of life trajectories within a population by Lexis (1875) and his contemporaries.These rst demographers showed that it is crucial to address simultaneously two components: (1) Consider the fact that the death rate depends on both age and time (non-homogeneous setting) and ( 2) Understand the mortality rate as an aggregate quantity which depends on an underlying population dynamics.
Recently, several papers and publications paid attention to data quality issues in the way we usually build mortality tables, especially in relation with the 'discrete time' nature of population estimates provided by national censuses.To our knowledge, the rst insights have been suggested by Richards (2008); his conjecture was focused on the 1919 birth cohort for England & Wales, for which he suggested that errors occurred in the computation of mortality rates due to shocks in the births series.The ONS methodology has then been studied by Cairns et al. (2016) in several directions, who conrmed the conjecture by Richards (2008) and proposed an approach to illustrate and correct mortality tables, applied to the data for England & Wales; the Convexity Adjustment Ratio introduced in their work has then been adapted by Boumezoued (2016) who focused on the Human Mortality Database HMD (2018) -which provides mortality tables for more than 30 countries and regions worldwide -and showed that these anomalies are universal while using the 'population dynamics' point of view to properly dene mortality estimates.To build new mortality tables for several countries, a link with the Human Fertility Database (HFD (2018), the HMD counterpart for fertility) has been made to correct such errors in a systematic way.
However, all precedent contributions did not succeed to introduce a proper mathematical setting for computing mortality rates based on information extracted from censuses.In this paper, we aim at performing a rst step in this direction by deriving an inference strategy from a deterministic population dynamics model.The derivation of a consistent theory in the stochastic setting is in parallel provided in a companion theoretical paper, see Boumezoued et al. (2018).
The main diculty in establishing a consistent theory to estimate mortality rates lies in points (1) and ( 2) mentioned above, which can be summarized as follows: inferring an age and time dependent mortality rate based on a population dynamics model.In the literature, we argue that each point is treated separately.
The inference of a time dependent death rate also depending on a time-dependent covariate (possibly age), which relates to point (1), has been addressed from a non-parametric perspective by Beran (1981), Dabrowska (1987), Keiding (1990), McKeague and Utikal (1990), Nielsen and Linton (1995), Brunel et al. (2008), Comte et al. (2011).From Keiding (1990), "One way of understanding the diculties in establishing an Aalen theory in the Lexis diagram is that although the diagram is two-dimensional, all movements are in the same direction (slope 1) and in the fully non-parametric model the diagram disintegrates into a continuum of life lines of slope 1 with freely varying intensities across lines.The cumulation trick from Aalen's estimator (generalizing ordinary empirical distribution functions and Kaplan & Meier's (1958) non-parametric empirical distribution function from censored data) does not help us here."This explains why data aggregation and smoothing is required to derive an estimate with two crossing dimensions, age and time.
On the other side, the inference of an age-dependent death rate in an homogeneous birth-death model (or similar) -point ( 2) -has been addressed by Clémençon et al. (2008), Doumic et al. (2015), Homann and Olivier (2016).To our knowledge, no statistical method deals with the usual problem faced by demographers related to the construction of a mortality table based on population estimates and death counts.
In this paper, we rely on a deterministic age-structured population model and derive exact formulas in the so-called Lexis diagram, allowing to build new and improved mortality estimates.The inference problem is summarized as follows: • The death rate depends on both age and time and is to be estimated, • The population evolves as an age-structured and time inhomogeneous birthdeath dynamics, • The following observables are available in the Lexis diagram: The number of individuals in each one-year age-class, assumed to be recorded at each beginning of year, The number of deaths in annual Lexis triangles, The number of births, available each month (or more generally at some intra-year frequency).
Note that the practical availability of annual population estimates as well as death counts in the Lexis triangle can be achieved according to the Human Mortality Database, whereas the Human Fertility Database is a public source providing in particular number of births by months for several countries.Such population, death and fertility data allows at this date the method proposed in this paper to be applied to around 10 countries.For other countries, the data (especially number of births by month) has to be reached by means of national institutes.
The paper is organized as follows.In Section 2, we present the non-homogeneous birth-death model and derive the inference strategy -the related interpretations and link with existing estimators is discussed in Subsection 2.6.In Section 3, we compute mortality tables according to our method and compare it to those obtained by the usual formulas.The paper ends with some concluding remarks in Section 4.
2 Model and inference strategy with notation ∂ a ≡ ∂/∂a.Clearly, at this stage, the population dynamics of g(a, t) is not fully specied as the future path of g(a, t) depends on the quantity g(0, t−a).The McKendrick-Von Foerster species how births are given in the (asexual) population, based on a birth rate b(a, t), as That is simply, the newborn at each time is given by the total number of birth from all parents alive at the same time.

Observables in the Lexis diagram
We work here in the Lexis diagram -that is we study lifelines in the time × age coordinates.In an ideal demographic world, two kinds of population estimates are recorded in the one-year age × time square: • Population at exact time t, with age x at its last birthday: g(a, t)da. (2) • Individuals who attained exact age x during the year [t, t + 1): An illustration of population estimates P (x, t) for the French population extracted form the Human Mortality Database is given in Figure 1.This can be analysed in the light of a Lexis diagram in several directions.First, the diagonal eects appear clearly showing that generations (or cohorts) are not equally represented: as an example, the generations born between around 1915 and 1920 are less represented (World War I), whereas the generations born after around 1946 are highly represented (Baby Boom).In this work, the impact of the discrepancy between birth patterns from one year to the next is of interest, as it introduces some bias in the classical formulas used in practice for death rate estimation.Denition 1.The upper (U) and lower (L) triangles for each age range x and observation year t are the age × times sets dened by and Observables in the Lexis diagram Based on this denition, the number of death in the Lexis triangles can be written µ(a, s)g(a, s)dads.
(5) An illustration of death counts in the Lexis triangles (x, t) for the French population extracted form the Human Mortality Database is represented in Figure 2. Variations in number of deaths are closely linked to those of the underlying exposure (Figure 1) but also to the death rate itself, to be estimated.Assuming that the population is closed, the following fundamental relations apply (which can be proved by integration by parts): The assumption of closed-population is further discussed in Subsection 2.6.
In addition to population estimates and death counts, as analyzed by Cairns et al. (2016) and Boumezoued (2016), we aim at including birth counts by month in the inference process -these can be extracted from the Human Fertility Database for a variety of countries.The dynamics of number of births by month in France is illustrated in Figure 3.The interpretation of this dynamics can be linked to that of Figures 1 (population estimates, see ( 2)) and 2 (death counts in Lexis triangles, as dened in (5)).Indeed, a similar information arises as the number of births are low in the period 1915-1920, which explains in particular the diagonal eect in Figure 1.Even more importantly, the dynamics at the monthly scale gives insight on what happens inside each year, then can be used to assess how the population 6/22 is distributed inside a given age band.This is of great interest as the population distribution appears classically in the form of an 'exposure-to-risk', and more precisely the formulas we exhibit in order to estimate the death rate rely explicitly on the births distribution -as such, number of births by month are the key inputs for the inference strategy proposed here as it renes standard annual estimates.This is developed in the following.When two time-dependent dimensions are involved (here age and calendar time), the natural generalization of classical non-parametric estimates of the death rate is not direct (see again the discussion in Keiding (1990)), therefore smoothing is required -see e.g.McKeague and Utikal (1990) and Nielsen and Linton (1995) for the analysis of such two dimensional kernel estimator based on continuous observation.Unfortunately, for building national mortality tables one does not observe continuously the living population (only possibly the date of death through death certicates), therefore standard kernel smoothing techniques are neither applicable here.This leads to dene some geometry on which the death rate is assumed to be piecewise constant, which allows to use aggregate information by year and age-class to derive (approximate) estimators.
In the classical demographic and actuarial practice, it is considered two versions of general population mortality tables: period and cohort.We propose here a brief discussion of these two versions and refer the reader to Boumezoued (2016) for more details (and a study dedicated to period mortality tables).The two versions are illustrated in Figure 4.
• The period table provides death rate estimates based on the assumption that it is piecewise constant on squares in the Lexis diagram; each square (x, t) is equal to the region T U (x, t)∪T L (x, t), where the Lexis triangles T U and T L have been dened in Equations ( 3) and ( 4).The key advantage of period tables is that they provide an estimate of death rate by using information of a single year; the related drawback is that two generations (cohorts) are merged for a given death rate at (x, t): the lifelines crossing the triangle T L (x, t) are born in year t − x, whereas those crossing T U (x, t) are born in year t − x − 1.This way, the period tables do not strictly reect the mortality of single cohorts.
• The cohort table is based on the assumption that the death rate is constant on parallelograms T L (x, t) ∪ T U (x, t + 1), with the advantage that a given death rate at (x, t) relates to lifelines arising from a single cohort: that of people born in year t − x.However, the information provided by this death rate reects conditions of the two consecutive years t and t + 1, as illustrated in Figure 4. Overall, period and cohort tables provide complementary information and their use is driven by the underlying objective.In this paper, we illustrate our method on the computation of triangle-based mortality tables, which generalize period and cohort mortality tables in a natural way as the death rate is assumed to be piecewise constant on Lexis triangles, instead of squares of parallelograms.This will allow us to draw analyses at a more granular scale compared to the two versions available in practice.

2.4
Main result In the derivation of the inference formulas, we assume the death rate to be piecewise constant on Lexis triangles: Assumption 1.The death rate is piecewise constant on Lexis triangles, that is for each integer x and t, From the transport component described in Equation ( 1), for any upper or lower triangle which we denote T , and on which the death rate is constant equal to µ T , it follows that: As the left hand side is the opposite of the number of deaths as introduced in Equation ( 5), it follows from the previous equation that the death rate can be written as the ratio where g(a, s)dads and E U (x, t) = g(a, s)dads, are the so-called 'exposures-to-risk' in the lower and upper triangle respectively.Now, the number of deaths in Lexis triangles being observed (as provided by the Human Mortality Database), it remains to appropriately compute the exposure-torisk.In the literature dedicated to longevity studies, this quantity is approximated by annual observables, see e.g.Pitacco et al. (2009) Section 2.3.4,as well as the Version 5 Methods Protocol of the Human Mortality Database, see Wilmoth et al. (2007).The recent update of the Human Mortality Database methodology allowing to include monthly birth data is further discussed in Subsection 2.6.The standard annual approximation can be illustrated for period tables (see Subsection 2.3) for which the exposure-to-risk writes A possible approximation is therefore given by the trapezoid rule as On the other hand, the exposure-to-risk (period table) can also be written as N (a, t)da and then approximated by which leads to another possible approximation.Note that the Version 5 estimates of the Human Mortality Database rely on a demographic reasoning leading to an approximation in between the two previous ones -see the analysis in Boumezoued (2016) for more details.
Overall, classical approximations have the advantage of being based on observables only, leading to a closed-form for the death rate estimate.The counterpart of this feature is that the validity of the underlying approximation can be put into question for years in which the population curve s → P (s, x) appears far from linear.
We now detail the recursive and implicit scheme for computing death rate estimates, based on equations linking the death rate with the observables in the Lexis diagram introduced in Subsection 2.2.Before stating the main result, we introduce two key quantities: rst, the Laplace transform of the random variable 'date of birth in year y', introduced as: and second, the cumulative gain in longevity at age x last birthday within the same cohort born in year t − x (a diagonal in the Lexis diagram), that is between those born at exact time t − x and those born at the end of the year [t − x, t − x + 1), dened by: The result at the core of the inference strategy is stated below: Proposition 1.Consider the transport Equation (1).Under Assumption 1, the following equalities hold: and The proof is detailed in the next part, along with a detailed discussion in Subsection 2.6.The resulting algorithm is described in Section 3. 10/22 2.5 Proof of Proposition 1

Proof of Proposition 1
To prove (8), let us rst focus on the exposure-to-risk in the lower triangle E L (x, t) = t+1 t x+s−t x g(a, s)dads.According to the transport equation ( 1), the population density in the lower triangle can be expressed as where the last equality comes from the assumption of a piecewise constant death rate on Lexis triangles.By the change of variable v ← s − a + x − t, the exposure-to-risk can then be rewritten as By straightforward computation, one nally gets the following expression for the exposure-to-risk in the lower triangle: (10) and N (x, t) = 1 0 g(x, t + v)dv so that Let us now derive the population density at exact age x, for any v ∈ [0, 1), where S(x, t) = exp − x−1 y=0 µ L (y, t − x + y) is the survival function at age x for individuals which attained (exact) age x at (exact) time t, and where the cumulative 11/22 2.6 Discussion death rate dierential within the cohort H(x, t) has been introduced in Equation ( 7).Let us now combine the previous results to get and nally, let us apply some renormalization of the right hand side, rst by N (x, t) and second by 1 0 g(0, t − x + v)dv to get the following formula, which reduces to Equation ( 8): The proof of ( 9) follows similarly.Since E U (x, t) = t+1 t x+1 x+s−t g(a, s)dads and g(a, s) = g(x + 1, s + x + 1 − a) exp ((x + 1 − a)µ U (x, t)), then by changing variables, one gets dv, so that

Then as
which leads to the result, as the following equality is veried from the denition in Equation ( 7):

Discussion
Exposure-to-risk interpretation.The equality (10) can be interpreted as follows: for each individual attaining exact age x at time t + v, its contribution to the exposure-to-risk in the lower triangle is 1−exp((v−1)µ L (x,t)) , which depends on the unobserved death rate to be estimated.This contrasts with classical methods which compute approximations of the exposure-to-risk based on observables.At rst order, assuming µ L (x, t) << 1, one recovers that and the related interpretation that the contribution of any individual which attained exact age x at time t + v and living through the lower triangle is simply 1 − v as it can be measured in the Lexis diagram.

12/22
Biased birthday density.The formula derived in (11) shows that the birthdays density at some age x is exponentially biased through H(x, t) compared to the initial birthdays distribution (at age zero).This is true in general in the triangle model for the piecewise constant death rate (Assumption 1), as well as in the period table for which the cumulative death rate dierence matrix reduces to H(x, t) = x−1 y=0 µ(y, t − x + y + 1) − µ(y, t − x + y) where µ(x, t) denotes the period death rate for the square (x, t).Moreover, as one expects in general some mortality improvement over the years, age being xed, one may be interested in interpreting the case H(x, t) < 0 -in this situation, one sees that the initial birthdays distribution is distorted to the highest birthdays (youngest individuals) in the cohort as age goes.This demonstrates how even in a discrete time specication, individuals in the same cohort may experience dierent death rates over life (more precisely they pass through the same rates but do not 'spend the same time' in each triangle or square, so that the resulting survival functions are dierent).However, it is interesting to note that for the cohort table, which by denition assumes that µ(y, t − x + y + 1) = µ(y, t − x + y), the H matrix vanishes, so that the initial birthdays distribution perfectly propagates towards highest ages.
Closed population assumption.Due to the renormalization in the nal result (8), the death rate relates to the closest annual population estimate; therefore, the assumption that the population is closed is only local in terms of population count, as the population estimate N may include population ow eects.Also, the assumption of a closed population implies here that the birthdays distribution at some age is obtained as a transformation of the initial birth distribution -to this extent the assumption applies globally in each cohort.
Link with estimates of the Human Mortality Database.It is worth mentioning that at the time of writing, the Human Mortality Database released an update on February 2018, including in particular a revision of exposure calculation based on monthly birth counts.We now make the link with both the new Version 6 and the old Version 5 of the HMD Methods Protocol.
From (10), it can be shown by performing a rst order expansion in µ L (x, t) that where and

13/22
Let us denote by B t−x the random variable with values in [0, 1] that represents the time of birth in the year t − x, with mean m t−x := E [B t−x ] and variance Under the assumption H(x, t) = 0, that is no mortality improvement between the youngest and oldest individuals within the same cohort, one can write Note again that the assumption H(x, t) = 0 is not consistent with the piecewise constant death rate assumption on Lexis triangles, nor with the framework underlying the period tables.Now, if one uses (6) and replaces µ L (x, t) = D L (x,t) E L (x,t) by its zero order approximation one nally obtains the formula (51) displayed in the Version 6 in the HMD methods protocol: Finally, if one assumes births to be uniformly distributed, then m t−x = 1 2 and σ 2 t−x = 1/12 so that the classical formula in Version 5 methods protocol is recovered (see Appendix E therein for the original derivation):

Numerical results
Based on Proposition 1, one can exhibit a recursive and implicit scheme for computing the death rates, as described below.
Algorithm 1.For age x starting at zero: (i) Solve Equation ( 8) to estimate the death rate µ L (x, t) for the lower triangles of any available year t, (ii) Then based on the previous estimates, solve Equation ( 9) to infer the death rate µ U (x, t) for the upper triangles of any available year t, (ii) Compute the value for H(x + 1, t) = H(x, t − 1) + µ U (x, t) − µ L (x, t − 1) for all possible years t, let x ← x + 1 and go to step (i) .
Remark 1.Note that the method is past dependent -this is natural as any change in past death rates modify the future birthdays distribution in the cohort.This way, 14/22 any revision of past death or population count at (x, t), which may occur in practice, requires the re-use of the methodology which will provide an update of the mortality rates at (y, t + y − x) for y ≥ x.
In Figures 5 to 8, we depict the death rate estimates obtained with the method developed in this paper applied to French data sourced from the Human Mortality Database (annual population estimates, Figure 1 and number of deaths in Lexis triangles, Figure 2) and the Human Fertility Database (births by months, Figure 3).The number of births by month are used to approximate the Laplace transform of the birthdays distribution which is used in the inference process.
The results are compared with estimates as they would be classically computed based on annual observables (see Wilmoth et al. (2007) and Boumezoued (2016) for further details): .
Each gure includes on the right the ratio between the new and the old estimate, which helps quantify the dierences between both.First, the ratio is for several age classes close to one, which indicates that the new estimate does not dier much from the classical one, in other words that the classical approximation is valid.However, one sees strong deviations for specic ages in time, and this translates over time and ages, so that it appears that the anomalies belong to specic generations.As displayed, relative discrepancies between the two estimates can reach up to around +/-20%.To assess this specicity, we depict in Figure 9 mortality improvement rates separated between upper and lower triangles as Clearly, the isolated cohort eects disappear in the new mortality tables: mainly the diagonals around 1915 and 1920, and to a lower extent those born around 1940; note that this indeed corresponds to the shocks in birth numbers as illustrated in Figure 3, which conrms from a mathematical perspective the previous contributions by Richards (2008)

Concluding remarks
In this paper, we proposed an inference strategy for general population mortality tables based on the derivation of formulas in the Lexis diagram, which relate the death rate with annual observables and the intra-year distribution of birthdays over ages.The method therefore uses monthly birth counts to rene classical mortality estimates.The new mortality tables show better features, including the fact that previous anomalies in the form of isolated cohort eects disappear, which conrms from a mathematical perspective the previous contributions by Richards (2008), Cairns et al. (2016) and Boumezoued (2016).
Several topics remain to be addressed to strengthen the methodology.First, it 20/22

Figure 1 :
Figure 1: Population estimates for France by year for one-year age classes extracted from the Human Mortality Database

Figure 2 :
Figure 2: Death counts in Lexis triangles extracted from the Human Mortality Database

Figure 3 :
Figure 3: Number of birth by month extracted from the Human Fertility Database

Figure 4 :
Figure 4: Population used (in grey) for the computation of the cohort death rate (left) and period death rate (right) for age 64 and year 2009.

Figure 8 :Figure 9 :
Figure8: Left: death rates estimated based on the new inference method (in black), and compared to estimates using the standard method based on annual population records (in red).Right: ratio between new and old estimates.Top: Upper triangle.Bottom: Lower triangle.

LT mortality rate at age 60 − Ratio Year
Boumezoued (2016)16))andBoumezoued (2016).Figure7: Left: death rates estimated based on the new inference method (in black), and compared to estimates using the standard method based on annual population records (in red).Right: ratio between new and old estimates.Top: Upper triangle.Bottom: Lower triangle.