3. Data and Methods
The 1994 ZDHS, which is the main data source for this study, has a sample of 6128 women living in 230 census enumeration areas ('clusters'), who were interviewed between July and December 1994 [Central Statistical Office 1995]. The survey also includes a 'service availability' module on family planning facilities in or near each enumeration area. These data were not used in this analysis because of a presumably spurious relationship with education. (Another problem is that initiatives to improve access to contraception in some areas, by setting up clinics or outlets or by organizing community-based distribution, may to some extent be a response to fertility or assumed fertility desires; see, e.g., [Angeles, Guilkey and Mroz 1998].)
In the ZDHS, the 10 provinces of Zimbabwe were divided into urban and rural areas (except Bulawayo and Harare, which are almost completely urban) to form 18 strata. A set of 2-30 enumeration areas was intended to be representative of each stratum, but stratum-specific weights had to be used to obtain representativeness at the national level.
The 10 provinces consist of a total of 70 districts. Aggregate data for each of these districts were taken from the 1992 Population Census publications [Central Statistical Office 1993] and linked to the survey file by means of a list of district identifiers for each enumeration area (provided by the Central Statistical Office on request). (Macro data at the enumeration-area level could only have been established by aggregating from the ZDHS, which did not seem worthwhile, because there were fewer than 30 respondents in each area on average.)
3.2 Definition of Education
In this study, five categories were defined for women's educational level: i) no education or incomplete primary education lasting less than 3 full years, ii) incomplete primary education of longer length (3-6 years), iii) complete primary education or incomplete secondary education of less than 2 full years (excluding the 7 required for primary education), iv) secondary education of 2-3 years, and v) secondary education of 4 or more years. These categories for duration of schooling were chosen to fit with the published aggregate data from the 1992 census.
In the census publications, the proportion literate, defined as having 3 or more years of education, was reported for the two sexes separately, whereas the proportions at higher educational levels were for women and men combined. All these data refer to the population 15 years and older. Because a corresponding measure for women in the childbearing ages would be more relevant to this study, a very simple proportional adjustment was made: the proportions of women and men above age 15 with, say, 3-6 years of education were multiplied by a constant factor (i.e., the same for all districts) in order to get a national average proportion equal to that reported for 15-50 year old women in the ZDHS. The proportions in the other educational categories were scaled up or down similarly (using other constants, in order to fit with national ZDHS averages). After this procedure, the sum of proportions over the 5 educational categories in the different districts deviated only slightly from 1 (while it was, of course, exactly 1 at the national level). This was corrected by multiplying all proportions by the same district-specific constant. The result of this approach is that cross-district differences in educational distributions are determined by the census data, while the overall level is determined by the survey data. (The reason why the survey data were not used as the only data source was, of course, the relatively small number of observations even within a district.) Such an adjustment will have some influence on the point estimates of aggregate education effects, but leave little imprint on significance levels.
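The two-step adjustment described above can be illustrated with a minimal sketch. The function name, the data layout and, in particular, the use of population weights to form the national census average are assumptions made for illustration; the text does not state how the national average was computed.

```python
def rescale_to_survey(census, weights, survey_national):
    """Rescale district-level census proportions to survey national levels.

    census: dict mapping district -> list of proportions per education
        category (census data, population aged 15+).
    weights: dict mapping district -> population weight (an assumption;
        used only to form the national census average).
    survey_national: list of national proportions per category from the
        survey (women in the childbearing ages).
    """
    n_cat = len(survey_national)
    total_w = sum(weights.values())

    # National census average for each category (weighted, by assumption).
    national = [
        sum(weights[d] * census[d][k] for d in census) / total_w
        for k in range(n_cat)
    ]

    # Step 1: one constant factor per category, identical for all districts.
    factors = [survey_national[k] / national[k] for k in range(n_cat)]

    adjusted = {}
    for d, props in census.items():
        scaled = [p * f for p, f in zip(props, factors)]
        # Step 2: district-specific constant so the proportions sum to 1.
        total = sum(scaled)
        adjusted[d] = [x / total for x in scaled]
    return adjusted
```

Because the category factors are common to all districts, the relative differences between districts still come from the census, while the overall level comes from the survey.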
The main representation of aggregate educational level in this study was the average years of schooling. In addition, measures of breadth and depth of education, defined as proportion literate and average years of schooling among the literate, were used. Unfortunately, the very strong correlation between the latter two variables did not allow estimation of separate effects. They were instead considered as two alternative specifications. When these averages were calculated, the years of schooling in the five different categories defined above were set to 1, 4.5, 7.5, 9.5 and 12.
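As a minimal sketch of how the three aggregate measures follow from a district's educational distribution, the calculation below uses the category values 1, 4.5, 7.5, 9.5 and 12 given above and treats categories ii to v (3 or more years) as literate. The function and variable names are hypothetical.

```python
# Years of schooling assigned to the five categories i) to v), as in the text.
YEARS = [1.0, 4.5, 7.5, 9.5, 12.0]


def education_measures(props):
    """Return (average years, proportion literate, average years among literate).

    props: proportions in categories i) to v), summing to 1.
    Literacy is defined as 3 or more years of education, i.e. categories ii) to v).
    """
    mean_years = sum(p * y for p, y in zip(props, YEARS))
    breadth = sum(props[1:])  # proportion literate
    if breadth == 0:
        return mean_years, 0.0, None  # depth is undefined without literates
    depth = sum(p * y for p, y in zip(props[1:], YEARS[1:])) / breadth
    return mean_years, breadth, depth
```

For a hypothetical district with proportions 0.2, 0.2, 0.3, 0.2 and 0.1, this gives an average of 6.45 years, a proportion literate of 0.8, and about 7.81 years among the literate.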
3.3 Discrete-Time Birth Rate Models
Discrete-time hazard regression models for the period January 1990 - June 1994 were estimated from the birth histories in ZDHS. Each person contributed a series of 3-month observation intervals. Tests showed this to be a sufficiently short interval. First and higher-order birth rates were modelled separately, in recognition of the widely different individual decisions and social processes involved in entry into parenthood and subsequent parity transitions. In first-birth models, follow-up started at age 14, unless this age was reached before 1990. After first birth, multi-episode models for the transition into a higher parity level were estimated, with observation intervals running from the time of first birth, or from 1990, if the first birth was before that.
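The construction of 3-month observation intervals for the first-birth models can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: dates are taken to be century-month codes (months since January 1900, with January 1900 = 1), intervals are aligned to the start of the observation window rather than to the exact date the woman turned 14, and all names are hypothetical.

```python
WINDOW_START = 90 * 12 + 1  # January 1990 as a century-month code (1081)
WINDOW_END = 94 * 12 + 6    # June 1994 as a century-month code (1134)
INTERVAL = 3                # interval length in months


def first_birth_intervals(birth_cmc, first_birth_cmc=None):
    """Expand one woman into 3-month records for a first-birth model.

    birth_cmc: the woman's own date of birth (century-month code).
    first_birth_cmc: date of her first birth, or None if childless.
    Returns a list of dicts, one per interval at risk, with event = 1
    in the interval containing the first birth.
    """
    age14 = birth_cmc + 14 * 12
    risk_start = max(WINDOW_START, age14)
    # Women whose first birth precedes the start of follow-up contribute nothing.
    if first_birth_cmc is not None and first_birth_cmc < risk_start:
        return []

    rows = []
    for start in range(WINDOW_START, WINDOW_END + 1, INTERVAL):
        if start < age14:
            continue  # not yet 14 at the start of this interval
        end = start + INTERVAL - 1
        event = first_birth_cmc is not None and start <= first_birth_cmc <= end
        rows.append({
            "start": start,
            "age_years": (start - birth_cmc) // 12,
            "event": int(event),
        })
        if event:
            break  # follow-up in the first-birth model ends at the birth
    return rows
```

A childless woman aged 14 or more throughout the period contributes 18 intervals; a binary regression on the stacked records then estimates the discrete-time birth rate.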
Generally, such a retrospective approach strongly restricts the number of variables that can be included in the models. Many attitudes and characteristics at interview may differ from those earlier in the 5-year observation period, and, even more critically, they may partly be a response to births. Fortunately, educational level is not a very problematic variable. In principle, the relatively low fertility found during 1990-1994 among, for example, women who reported at interview that they had taken some secondary education does not only reflect a causal effect of such factors as the skills obtained at secondary school. It may also reflect that pregnancy or childbirth among those with less education inhibited their entry into or continuation of secondary school. However, only estimates from first-birth models suffer from such a simultaneity bias, and only at certain ages and for certain educational levels. For example, the estimated effects of a short secondary education will be biased only for the early teenage years, when attending the first part of secondary school is still an option. Primary education takes place too early to be influenced by childbearing, and higher-order birth models are not hampered by such problems, because, at this stage, few women would take further education anyway. Models that included both enrollment and educational level (lagged one year) were also tried, on the assumption that school attendance is continuous from age 6. These gave very similar results.
3.4 Migration as a Complicating Factor
In birth rate models for 1990-94, it would be most relevant to include the educational distribution in the district in which the women lived during that period or before (because their surroundings at an earlier age may have had a lasting influence on, for example, their attitudes). Fortunately, the data available in this study came close to allowing this. They provided information about the educational distribution of the district in which the women lived at interview in 1994, and where the large majority must also have lived during the previous five years. As explained above, the educational data were for a combination of 1992 (measurement of district variation) and 1994 (measurement of overall level).
It is more of a problem that a woman's individual education is determined by characteristics of her place of residence as a child or adolescent. These are not known, except that it is reported whether she largely lived in a city or town or in the countryside up to age 12. Her current place of residence, which is the basis for this analysis, may even be a consequence of her education. For example, having some years of schooling presumably makes the cities more attractive to a woman who grew up in the countryside. In other words, the effect of investments in a woman's education may actually be larger than indicated by estimates from models where urban/rural residence at time of interview is included as a control (assuming a fertility-inhibiting effect of both education and urban residence).
Another complicating factor is that the effects of aggregate education, net of individual education, are confounded not only by unobserved aggregate characteristics, as explained above (section 2.3), but also by unobserved individual characteristics due to selective migration. For example, poorly educated women who have moved to (or remained in) a place where the general educational level is high may be different from those who have moved to (or remained in) a place where few are educated.
3.5 Models for Fertility Desires, Post-Partum Susceptibility and Contraceptive Use
Special interest is, of course, attached to the birth rates, but the data also allow a check of how important determinants, such as fertility desires, post-partum susceptibility and contraceptive use, are influenced by aggregate education.
Logistic models for the probability of wanting another child within two years, for the probability of being susceptible, and for the probability of using modern contraception were estimated for women who were married (including those who reported less formal unions) at interview, and who were non-pregnant and had at least one child at that time. A considerable proportion of these women, including those in monogamous unions, reported that their husband lived elsewhere, but they were not considered a separate group in the analysis.
Susceptibility was defined as having had intercourse during the last month before interview and not being amenorrhoeic [Note 7]. The susceptibility models were further restricted to mothers of children younger than 2 years.
Two types of contraceptive-use models were estimated. One of the models was restricted to women who had been sexually active during the last month, who were neither infecund nor amenorrhoeic at interview, and who reported that they did not want another child within the next two years (although those who want a child that soon may also use contraception, in order to avoid an immediate pregnancy). In the other model, there were no such conditions. Many studies, including that by Amin et al. [Amin, Diamond and Steel 1996], have been based exclusively on a model of the latter type, which means that one cannot know whether low contraceptive use is due to little need for contraception or a poor ability to satisfy a substantial need.
3.6 Multilevel Models
Individuals who live in the same enumeration area may share some unobserved characteristics, which means that the standard regression assumption of independent observations is not reasonable. So-called multilevel models have been developed to handle these problems, and have been applied in many demographic research projects during the last few years. In this study, random terms at the enumeration-area or district level were added to the constant term in some models to check the robustness of the main conclusions [Note 8]. This simple version of the multilevel model was estimated in MLwiN [Goldstein et al. 1998].
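For orientation, a model with a random term added to the constant can be written in the generic random-intercept form below. The notation is introduced here for illustration only, and the logit link and the normal distribution of the random term are conventional assumptions rather than details stated in the text: $p_{ij}$ is the probability of the outcome for woman $i$ in enumeration area or district $j$, $\mathbf{x}_{ij}$ her individual and aggregate covariates, and $u_j$ the random term shared by all women in unit $j$.

```latex
\mathrm{logit}(p_{ij}) = \beta_0 + \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2)
```

Setting $\sigma_u^2 = 0$ recovers the ordinary single-level model, so comparing the two indicates how sensitive the estimated effects are to unobserved characteristics shared within units.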
A Search for Aggregate-Level Effects of Education on Fertility, Using Data from Zimbabwe
© 2000 Max-Planck-Gesellschaft ISSN 1435-9871