Abstract Statistical Background
1 Introduction

Census data for calculating period age-specific fertility rates typically come in one of two forms. For each woman of childbearing age, census questionnaires usually record either the number of children born in the last year (BLY), or the date of the woman's last live birth (DLB). DLB and BLY questions are both common, and some censuses ask both. In recent surveys, the United Nations [12, 13] reported that among 262 national censuses taken between 1965 and 1994 in Africa, Asia, South America, and North America (excluding the USA and Canada), 63 asked DLB questions only, while another 50 asked both DLB and BLY questions (see [8] for more detail). In the most recent round of censuses, nineteen countries, including Kenya, Indonesia, Sudan, Vietnam, Colombia, and Brazil, collected DLB data exclusively.

When estimating fertility rates from census data (still a common situation in many countries, particularly when estimates are for subnational areas), efficient use of DLB data is often an important concern. In principle, DLB data contain more information than BLY data, because a researcher can observe not only the fertility histories of the sampled women in the past year, but also many other births and periods of exposure that occurred more than one year earlier. In practice, demographers generally do not use all of the fertility information inherent in DLB data. Standard procedures for estimating age-specific fertility rates from DLB data merely convert to BLY form:

\[
\mathrm{BLY} =
\begin{cases}
1 & \text{if the DLB falls within the year preceding the census} \\
0 & \text{otherwise}
\end{cases}
\qquad \{1\}
\]


and then utilize this censored version of DLB in all subsequent calculations. Caution with DLB data stems in part from an early history of statistical problems with more ambitious uses (see [10] and [11] for proposed applications; [9] and [14] for critiques), and in part from the fact that DLB data do not provide researchers with histories of uniform lengths for all women.

In a recent paper [8], the first author proposed a new method for consistent estimation of period fertility from DLB information. The essential intuition is to change the unit of analysis from women to woman-years. A sample of N women who report the date of their last live birth will, in general, contain fertility information on many more than N woman-years. For example, a woman who is interviewed on her 32nd birthday and reports that her last live birth occurred 46 months earlier provides information on not one, but four, years of exposure to fertility risks: She had one birth in age interval (28,29], followed by no births in age intervals (29,30], (30,31], and (31,32].
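The change of unit from women to woman-years can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of [8]: the function names `expand_dlb` and `censor_to_bly` are hypothetical, one-year intervals are counted back from the age at census, and only women who report a last birth are handled.

```python
from typing import List, Tuple


def expand_dlb(age_at_census: float,
               months_since_last_birth: int) -> List[Tuple[float, float, int]]:
    """Expand one DLB report into woman-year records (hypothetical helper).

    Each record is (lower_age, upper_age, births) for the half-open age
    interval (lower_age, upper_age], counted back in one-year steps from
    the age at census. The earliest interval contains the reported birth;
    all later intervals contain none.
    """
    if months_since_last_birth < 0:
        raise ValueError("months_since_last_birth must be non-negative")
    # A birth t years before the census falls in the k-th interval back,
    # where k = floor(t) + 1.
    n_years = months_since_last_birth // 12 + 1
    records = []
    for k in range(n_years, 0, -1):  # earliest interval first
        lower = age_at_census - k
        births = 1 if k == n_years else 0
        records.append((lower, lower + 1, births))
    return records


def censor_to_bly(months_since_last_birth: int) -> int:
    """Reduce a DLB report to a born-in-last-year indicator (assumed rule)."""
    return 1 if months_since_last_birth < 12 else 0


# The woman from the text: interviewed on her 32nd birthday,
# last live birth 46 months earlier.
for record in expand_dlb(32, 46):
    print(record)
print("BLY indicator:", censor_to_bly(46))
```

For the woman in the example, the expansion yields four records, one for each of the intervals (28,29] through (31,32], with the single birth in the first. The censored BLY indicator for the same woman is 0, so a BLY-based calculation would retain only her most recent year of exposure.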

The previous paper [8] derived maximum likelihood procedures for estimating fertility models from open-interval data. Like standard BLY calculations based on {1}, DLB estimators are consistent under the strong mathematical assumptions of many formal demographic models (unchanging fertility schedules and identical fertility rates for all women of a given age, regardless of parity). DLB estimators also have low bias under more realistic conditions. Compared with BLY methods, estimators based on the multiple woman-years implicit in open-interval DLB data have much lower sampling variability. Thus, when basic fertility information comes from DLB data, it is possible to produce far more accurate fertility estimates from small samples or for small populations.

The earlier paper [8] derived the mathematical structure for estimating any fertility model from DLB data, but gave examples only for one simple type of fertility schedule (piecewise-constant, with five-year age groups, and no parametric restrictions on the shape of the age schedule). In this paper we demonstrate more fully how to estimate parametric fertility models from DLB data. As a specific example we illustrate maximum likelihood estimation for the M and m parameters in a Coale-Trussell marital fertility model [4].
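For orientation, the Coale-Trussell model [4] is conventionally written in the form below. The symbols r(a), n(a), and v(a) follow common usage in the demographic literature and are an assumption here, since this section does not define them.

```latex
r(a) = M \, n(a) \, \exp\{ m \, v(a) \}
```

Here r(a) is the marital fertility rate at age a, n(a) is a standard natural-fertility schedule, and v(a) is a standard age pattern of departure from natural fertility. M scales the overall level of marital fertility, and m indexes the degree of fertility control.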

We also provide two examples of the type of analysis for which increasing the accuracy of fertility estimates is useful. We use a set of small-area data from the state of Minas Gerais, Brazil, to illustrate the analytical gain from using the full DLB data in place of the censored BLY version. We produce maps and spatial statistics from alternative 1991 estimates of Coale and Trussell's m parameter for the state's 723 municipalities, and show how the increased precision of DLB estimates leads to clearer spatial patterns of fertility control. In addition, we illustrate improvements in regression analysis of fertility when using DLB, rather than the usual BLY, fertility data.


Estimating Parametric Fertility Models
with Open Birth Interval Data
Carl P. Schmertmann
André Junqueira Caetano
© 1999 - 2000 Max-Planck-Gesellschaft ISSN 1435-9871