3.1 Estimation of the modified DeMoivre hazard
The Bulgarian mortality pattern for males and females during 1992–93 has already been depicted in
Figure 1. In this Section we use the modified DeMoivre hazard function in combination with unobserved
heterogeneity in order to evaluate whether the differential increase in mortality by age between males and
females in Bulgaria can plausibly be explained by a selection process in which the higher
male mortality level leads to a quicker selection of the male population towards low-frailty
individuals.
Our analyses are based on a unique database for Bulgaria that links
the death records for the period 5th of December 1992 – 31st of December 1993 to the
census of 4th of December 1992. This dataset is the first comprehensive population-based
data on mortality in Bulgaria that includes a broad array of socioeconomic information for
individuals who are at risk of death. The linkage of the death certificates to the census records is
carried out using a personal identification number (PID) included in all official records of an
individual in Bulgaria. The linkage between the census and the death registration is of high
quality, and in total 92.67 per cent of all death certificates are linked to the census records.
Among the linked deaths, 95.07 per cent are based on the PID number. Only 4.93 per cent of
the deaths are linked using other identification variables (e.g., place and region of residence,
birth day, sex, education, marital status) because the PID number is missing or incomplete.
A relatively small fraction of 7.33 per cent of all deaths could not be linked to the census
records. [Note 3]
The subsequent analyses include all individuals in Bulgaria who are at least 40 years old at census and
are below age 100 either at death or on 31st December 1993. Our data thus comprise 1.87 million males of
whom 57,221 die during the observation period, and 2.09 million females of whom 45,831 die during the
13 months after the census.
We first estimate a standard piecewise-constant hazard model (with constant hazards in two-year age
intervals) separately for males and females in order to obtain a nonparametric estimate for the
observed mortality pattern. The respective estimates have already been depicted in Figure
1(a), and they will also be included in subsequent Figures for comparison with our parametric
estimates.
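As a minimal sketch of the piecewise-constant estimator described above: within each age interval, the maximum-likelihood estimate of a constant hazard is the number of deaths divided by the person-years of exposure in that interval. The record layout (entry age, exit age, death indicator) and the function name are assumptions made for illustration; they are not taken from the original analysis.

```python
from collections import defaultdict

def piecewise_constant_hazard(records, start=40.0, width=2.0):
    """Occurrence/exposure rates in age intervals of length `width`.

    records: iterable of (entry_age, exit_age, died) tuples, where `died`
    is True if the person died at exit_age and False if censored there.
    Returns {interval_start: deaths / person_years}.
    """
    deaths = defaultdict(int)
    exposure = defaultdict(float)
    for entry_age, exit_age, died in records:
        if exit_age <= entry_age:
            continue  # no time at risk, nothing to add
        # Start in the interval that contains the entry age.
        lo = start + int((entry_age - start) // width) * width
        last = lo
        while lo < exit_age:
            hi = lo + width
            # Person-years lived inside [lo, hi).
            exposure[lo] += min(exit_age, hi) - max(entry_age, lo)
            last = lo
            lo = hi
        if died:
            # The death is counted in the last interval with exposure.
            deaths[last] += 1
    return {lo: deaths[lo] / exposure[lo] for lo in sorted(exposure)}

# Three hypothetical individuals observed for part of the follow-up window.
example = [(40.0, 41.0, False), (40.5, 41.0, True), (41.0, 43.5, True)]
print(piecewise_constant_hazard(example))
```

Fitting the estimator separately by sex, as in the text, would amount to calling the function once on the male records and once on the female records.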
In Table 1 we report the results of different parametric specifications. Model 0 is a standard Gompertz
model without unobserved heterogeneity that allows for male-female differences in both parameters a and
b. These estimates have been used for the fitted Gompertz hazard curve in Figure 1(a). The
estimates reveal that the ‘level-parameter’ a for females is only about 0.27 times the
respective parameter for males, while the ‘slope-parameter’ b for females exceeds that of
males by 0.025. The latter difference implies that the relative increase of mortality rates with
each year of age is about 2.5 percentage points higher for females than for males, and it leads to
the strong convergence between the male and female mortality rates in Bulgaria that was
discussed above.
Models 1–4 in Table 1 include a Gamma-distributed frailty and are based on a modified DeMoivre
hazard function. Except for Model 4, where the maximum attainable age w is estimated from the data,
these models assume a w which is equal to Madame Calment’s age at death. The parameters a, b, and s^{2}
are specified as in equations (9–11) using sex as the only covariate.
The simplest specification, Model 1, estimates the male and female mortality pattern using identical
parameter values b and s^{2} for both sexes, while the parameter a is allowed to vary between males and
females in order to capture the different mortality levels. That is, the model allows for different
levels of mortality by sex, but it assumes an equal ‘slope-parameter’ b across sexes. The model therefore
attempts to explain the differential increase in mortality with age merely by the selection hypothesis, i.e.,
the fact that the male population faces a more rapid selection towards low-frailty individuals due to the
higher overall male mortality level. Most importantly, the model yields an estimate of
s^{2} = exp(−.6186) = 0.54, indicating quite substantial heterogeneity in the population. According to
this estimate, about 28 per cent of the population at age 40 have a frailty of z ≤ .5 and 9.7 per
cent have a frailty of z ≥ 2.
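The frailty shares quoted above follow from the frailty distribution itself. The sketch below assumes the standard parameterisation of Gamma frailty with mean 1 and variance s^{2} (shape 1/s^{2}, scale s^{2}) and uses a hand-written power series for the incomplete Gamma function. The function names are illustrative and not part of the original analysis.

```python
import math

def gamma_cdf(x, shape, scale):
    """P(Z <= x) for a Gamma(shape, scale) variable, using the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    t = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= t / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(t) - t - math.lgamma(shape))

def frailty_shares(variance, low=0.5, high=2.0):
    """Shares with frailty <= low and >= high, assuming Gamma frailty
    with mean 1 and the given variance."""
    shape, scale = 1.0 / variance, variance
    return gamma_cdf(low, shape, scale), 1.0 - gamma_cdf(high, shape, scale)

variance = math.exp(-0.6186)  # Model 1 estimate of s^2, about 0.54
share_low, share_high = frailty_shares(variance)
print(f"s^2 = {variance:.2f}: P(z <= 0.5) = {share_low:.3f}, P(z >= 2) = {share_high:.3f}")
```

The same function can be called with other variances and thresholds, for instance frailty_shares(1.10, low=0.1), to examine the shares reported for the later models.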
Figure 4(a) shows that this model traces the convergence between the observed male and female
mortality rates with increasing age quite well (full lines), despite the fact that the mortality rates for a
constant frailty z = 1 increase in a parallel fashion (dash-dotted line). Hence, Model 1 attributes a
substantial part of the observed convergence between male and female mortality rates to the differential
strength of the selection process in the male and female population (a formal measurement of the fit of this
model and a comparison with alternative Gompertz specifications are provided in the sensitivity analysis
in Section 3.3 below).
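The selection mechanism behind Model 1 can be illustrated with the standard Gamma-frailty result that the observed population hazard equals the baseline hazard for z = 1 divided by 1 + s^{2} H(x), where H(x) is the cumulative baseline hazard since the starting age. The sketch below is only a hedged illustration: it uses a Gompertz baseline as a stand-in for the modified DeMoivre hazard of equation (5), and all parameter values are hypothetical rather than the estimates in Table 1.

```python
import math

def gompertz_hazard(t, a, b):
    """Baseline hazard for frailty z = 1, t years after the starting age."""
    return a * math.exp(b * t)

def gompertz_cumulative(t, a, b):
    """Cumulative baseline hazard H(t) accumulated since the starting age."""
    return a / b * (math.exp(b * t) - 1.0)

def observed_hazard(t, a, b, variance):
    """Population hazard under mean-one Gamma frailty with given variance."""
    return gompertz_hazard(t, a, b) / (1.0 + variance * gompertz_cumulative(t, a, b))

# Hypothetical values: same slope b and variance, higher level a for males.
A_MALE, A_FEMALE, B, VARIANCE = 0.006, 0.002, 0.10, 0.54

def sex_ratio(t):
    """Male-to-female ratio of observed hazards t years after age 40."""
    return (observed_hazard(t, A_MALE, B, VARIANCE)
            / observed_hazard(t, A_FEMALE, B, VARIANCE))

for t in (0, 20, 40, 60):
    print(f"age {40 + t}: male/female ratio of observed hazards = {sex_ratio(t):.2f}")
```

With an identical b, the hazard ratio at a fixed frailty stays at A_MALE / A_FEMALE at every age, whereas the ratio of the observed hazards declines with age because the higher male hazard removes frail men from the population faster.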
Model 2 in Table 1 provides an extension of the above model and incorporates a potentially different
degree of heterogeneity between the male and female populations. Possible reasons for such a differential
variance in unobserved frailty could be a greater variation in life styles (such as smoking habits or other
risky behaviors) among the male as compared to the female population. Indeed, the estimates in
Table 1 suggest that the male variance of frailty is s_{male}^{2} = exp(−.4876) = 0.61, while the
corresponding variance for females is about 35 per cent smaller (s_{female}^{2} = 0.40). This implies that
about 30 per cent of males, but only 22 per cent of females, have a low frailty of z ≤ .5, and more
than 10 per cent of males, but only about 8 per cent of females, have a high frailty of z ≥ 2 at age
40.
The resulting fit of Model 2 is depicted in Figure 4(b). Since this model allows for an additional
parameter, it fits the Bulgarian mortality pattern slightly better than our earlier model. Differences in the
observed slope of the mortality pattern in Model 2 are again only due to differences in the overall
mortality level and in subsequent differences in the selection process in a heterogeneous population.
The present model therefore suggests that the male population may be more heterogeneous
with respect to various biological or socioeconomic determinants of mortality. The specific
investigation of this issue is beyond the scope of the present paper, but it is feasible on the basis of
our data, which include a broad range of covariates about individuals. Most important in the
present context is that Model 2 further supports our argument that a differential selection process
provides a plausible explanation for the male-female difference in the increase of mortality with
age.
An alternative generalization of our initial estimation is provided in Model 3 in Table 1, where the
parameter b, instead of the variance s^{2}, is allowed to vary across sexes. The coefficients show that the
relative difference in the ‘slope-parameter’ b between males and females is substantially reduced by
incorporating unobserved heterogeneity as compared to the Gompertz model without any frailty
considerations. This finding is again consistent with an important male-female difference in the strength
of selection towards low-frailty individuals in the population. The fit of this model is given in
Figure 4(c), where this model performs slightly better than the two earlier models. However,
this improvement is not surprising since the specification of a separate slope-parameter b for
males and females provides a direct modelling of the differential mortality increase between
sexes.
Finally, Model 4 in Table 1 estimates the maximum attainable age w in addition to the remaining
parameters a, b, and s^{2}. While our earlier models were based on a predetermined w of 122.45 years,
the present estimate reveals a w of slightly above 104 years. Moreover, the variance of the
unobserved frailty has increased to s^{2} = 1.10 (with s^{2} = 1.10, 11 per cent of the population
have a frailty of z ≤ .1, 41 per cent have a frailty of z ≤ .5, and 14 per cent have a frailty of
z ≥ 2). While these parameter estimates yield a ‘best-fitting model’ in Figure 4(d), the estimate
for w is not plausible and it depends strongly on the age at which the data are censored. For
instance, if the data include individuals who survive to age 105 or 110, then the
estimates for w increase to 108 and 111.5, respectively, and those for s^{2} decline to 0.91 and
0.79. The estimates for the parameters a and b are relatively insensitive to changes in the age
range above 100. Hence, while Model 4 provides the best fit of the Bulgarian mortality data
using a modified DeMoivre hazard with no sex differences in the slope parameter b or the
variance of frailty s^{2}, the estimates of this model for w cannot be interpreted in terms of a
maximum lifespan because the estimation was deliberately censored at age 100 in order to
focus on the convergence of male and female mortality in the age range 40–100. Without
survivors to very old ages, however, an estimation that assumes a plausible value for w as in
Models 1–3 seems preferable to the direct estimation of w from the data. It is beyond the scope
of this paper to apply the modified DeMoivre hazard explicitly to reliable mortality data at
ages 100+, but these future applications provide a possibility to estimate interpretable and
realistic values for the maximum attainable age w and the changes of this limit to lifespan over
time.
3.2 Comparison with piecewise-constant hazard with unobserved frailty
In order to assess the empirical plausibility of the modified DeMoivre hazard model we compare the
above results with a nonparametric estimation of the hazard curve. This nonparametric alternative is
feasible if the mortality hazards for males and females, conditional on the frailty z, differ only by a factor
of proportionality. In this case it is possible to combine Gamma-distributed relative frailty with a
piecewise-constant hazard function m^{PW}(x) that does not impose parametric restrictions on the shape of
the mortality pattern [Note 4]. This nonparametric estimation of the baseline hazard m^{PW}(x) can then be
compared to the modified DeMoivre hazard m^{MD}(x) in order to assess the implications of the parametric
assumptions used in the previous section.
The left graph in Figure 5 shows the observed male and female mortality level in Bulgaria along with
the estimated baseline hazard m^{PW}(x) for individuals with a constant frailty z = 1 and the corresponding
observed hazard obtained from the piecewise-constant estimation. The model fits the observed
male and female mortality pattern relatively well, and the fit is comparable to our earlier Model 4 in Figure 4.
The primary question regarding the estimation of this piecewise-constant frailty model
in Figure 5(a) is whether the respective estimates for the variance of the frailty distribution
and the increase of the mortality rates for a constant frailty z = 1 are consistent with our
knowledge about human mortality. In order to investigate this issue, we compare in Figure 5(b) the
estimated baseline hazards m(x) obtained from the piecewise-constant and the modified DeMoivre
model.
The graph reveals that the piecewise-constant specification yields the fastest increasing baseline
hazard across all estimated models, and it also yields the highest variance s^{2} for unobserved frailty
(s^{2} = 1.57). The differences are largest between the piecewise-constant model and the DeMoivre models
with w = 122.45 (Models 1 and 2 in Figures 4 and 5b), while they are only modest when compared to the
DeMoivre model where w is estimated from the data (Model 4 in Figures 4 and 5b). Since the last model
already implied an implausible value of only 104 years for the highest attainable age w, we also
consider the increase of the baseline hazard obtained from the piecewise-constant estimation to be
too steep. Similarly, the variance of the frailty distribution in the piecewise-constant model
seems unrealistically high since it suggests that about 19 per cent of the initial population
have a frailty of z ≤ 0.1 and about 15 per cent have a frailty of z ≥ 2. However,
Iachine et al. [1998] obtained similarly large values for the variance of unobserved frailty from twin
data.
While the nonparametric estimation of the baseline hazard with a piecewise-constant model certainly
has its virtues, it can also lead to estimates of the baseline hazard that are implausible. The
modified DeMoivre model with w set to 122.45 years makes it possible to restrict the increase
of the baseline hazard with age to values that are consistent with observed survival to very
old ages. Moreover, the differences between the observed and estimated mortality pattern in
the simplest DeMoivre model in Figure 4(a) suggest alternative specifications that are not
feasible with the piecewise-constant estimation. For instance, in Models 2 and 3 in Figure 4 we
allow the variance of the frailty distribution or the slope-parameter b to be different between
males and females. Both extensions substantially improve the fit of the model. In our opinion
these extensions provide a more plausible, and probably also more accurate, description of the
mortality dynamics than the piecewise-constant analysis. The modified DeMoivre function
therefore provides a very suitable hazard function for the application of frailty models to adult
ages, and it allows specifications of the parameters that are not available with nonparametric
estimations.
3.3 Sensitivity analysis of estimated parameters
In this Section we provide a sensitivity analysis in order to investigate the extent to which the coefficients
obtained from the modified DeMoivre hazard function depend on the choice of the maximum attainable
age w. Our earlier estimates of Models 1–3 in Table 1 and Figure 4 were based on w = 122.45, i.e.,
Madame Calment’s age at death. In order to evaluate the sensitivity of our estimates with respect to
the choice of w, we re-estimate Models 1 and 2 in Table 1 using a w that ranges from 105
to 150 years (we do not report the sensitivity analysis for Model 3 since it leads to similar
results).
The top-left graph in Figure 6 shows the relative deviation of the estimated coefficients a_{male}, a_{female},
b, and s^{2} as a function of w. The top-right graph in this Figure shows the goodness-of-fit of Model 1 as
a function of w. The goodness-of-fit is calculated as 1 − RSS/SST, where RSS is the residual sum of
squares on the logarithmic scale and SST is the total sum of squared deviations from the mean on the
logarithmic scale.
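A minimal sketch of this goodness-of-fit measure, assuming an unweighted sum over age intervals and hypothetical mortality rates (the function name and the numbers are illustrative only):

```python
import math

def log_scale_fit(observed, fitted):
    """1 - RSS/SST computed on the logarithm of the mortality rates."""
    log_obs = [math.log(m) for m in observed]
    log_fit = [math.log(m) for m in fitted]
    mean_log = sum(log_obs) / len(log_obs)
    # Residual sum of squares between observed and fitted log rates.
    rss = sum((o - f) ** 2 for o, f in zip(log_obs, log_fit))
    # Total sum of squared deviations of observed log rates from their mean.
    sst = sum((o - mean_log) ** 2 for o in log_obs)
    return 1.0 - rss / sst

# Hypothetical observed and fitted rates for four age intervals.
observed = [0.010, 0.021, 0.043, 0.090]
fitted = [0.011, 0.020, 0.045, 0.085]
print(round(log_scale_fit(observed, fitted), 4))
```

A value of 1 corresponds to a perfect fit, and a value of 0 to a model that does no better than a constant log mortality rate.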
The Figure shows that the parameters a and b are relatively insensitive to the specification of w, while
the estimated variance of frailty depends quite strongly on the choice of w. The latter is not very surprising
since w determines the convexity of the hazard function m^{MD}(x) in equation (5), i.e., the hazard for
individuals with a constant frailty z = 1. A low w implies a quite strongly increasing hazard m^{MD}(x) with
age. This in turn results in a higher estimate for s^{2} in order for the model to fit the observed
mortality rates.
The top-right graph in Figure 6 shows a goodness-of-fit analysis of Model 1, i.e., a model that allows
only for level differences in mortality across sexes but no differences in the ‘slope-parameter’ b. The
dash-dotted lines in this graph show on the one hand the fit of the Gompertz model with separate parameters
a and b for males and females (i.e., Model 0 in Table 1), and on the other hand the fit of the Gompertz model
with b constrained to be equal across sexes. The fit of Model 1 lies between these two benchmarks, and it
tends to decrease as the value of w increases. For low values of w, Model 1 fits almost as well as
Model 0 with no parameter restrictions across sexes. With increasing w this goodness-of-fit
declines. Ultimately it approaches the lower dash-dotted line because a rising w renders the
modified DeMoivre hazard more and more like a Gompertz model. However, for values of w
below 140 years the goodness-of-fit of Model 1 is closer to that of the Gompertz model with separate
slope-parameters for males and females than to that of the Gompertz model with equal b for both
sexes.
The sensitivity of the estimates for s^{2}, therefore, is not as severe as the top-left graph in Figure 6 may
suggest. First, the model with w = 122.45, i.e., Madame Calment’s age at death, provides the best-fitting
model that is based on a maximum attainable age w at least as high as the highest age lived by any
person so far. Second, for a quite broad range of plausible choices for w, say, between 115 and 130 years,
the main conclusion of Model 1 remains unaltered: a substantial part of the male-female differences in the
slope of the observed mortality pattern can be explained by a differential strength of the selection process
towards low-frailty individuals that is caused by differences in the overall level of mortality between
males and females.
The bottom-left and bottom-right graphs in Figure 6 show the corresponding sensitivity analysis for
Model 2 in Table 1. Similar to Model 1, the estimates for a and b are not very sensitive with respect to the
choice of w, while the estimates for s_{male}^{2} and s_{female}^{2} change substantially with w. The analysis in
the bottom-right graph, however, reveals that these changing estimates for the s^{2}-parameters leave the
goodness-of-fit of the model almost unaffected.
The choice of w in Model 2 is thus not essential for the main conclusion of the analysis: a
modified DeMoivre hazard model with only one ‘slope-parameter’ b for both males and females
provides a very good description of the Bulgarian mortality pattern. Moreover, this model
attributes a substantial part of the differential male-female increase in mortality by age to a
differential strength of the selection process which is caused by the higher overall level of male
mortality.
Our preferred choice for w in analyses with the modified DeMoivre hazard is w = 122.45 years, based
on Madame Calment’s age at death. While the specific estimates for s^{2} are sensitive to this choice, the
primary conclusion resulting from the incorporation of frailty in the analysis of Bulgarian mortality is
very robust with respect to this specification.
