|
In the United States, census counts are used to
apportion congressional seats to states, and to draw the boundaries of electoral
districts within states (“redistricting”). The counts also enter the
formulas for allocating tax funds to states, counties, cities, and smaller
jurisdictions. Thus, the census has some effect on the distribution of power and
money [Skerry 2000]. Controversy over proposed statistical adjustments of
population counts from decennial censuses has stimulated an extended program of
demographic research over the past twenty years [see 4]. These issues have been brought
to the fore again by the current Director of the U.S. Census Bureau,
Kenneth Prewitt.
In considering whether to
certify adjusted or unadjusted counts as the official census counts, Prewitt
directs attention to the problem of geographical heterogeneity in quality of
coverage, which limits the accuracy of small-area estimates. He acknowledges the
renewed importance of data from 1990, recognizing that decisions will be made
before much of the data from the 2000 Census evaluation process becomes
available. He favors certifying the adjusted counts, barring unforeseen
developments when the data are collected and analyzed [Prewitt 2000].
Because the U.S. Supreme Court ruled that federal law mandates the use of
unadjusted population counts for apportionment, the impact of certification will
fall on the use of census data for redistricting within states and on the allocation
of tax funds to state and substate jurisdictions [Brown et al. 1999].
A large data set for studying geographical
heterogeneity in quality of coverage for substate areas as well as for states
was assembled around 1990 by the U.S. Census Bureau in its P-12 Evaluation
Project. However, most analysis was directed toward state-by-state
heterogeneity. In this paper, we analyze the scale of substate heterogeneity as
revealed by the P-12 data, to provide scientific background for the political
decisions at stake in the Prewitt report.
The issue of heterogeneity should be viewed in the broader statistical context of
small-area estimation. Classical statistical sampling theory is about inferences
upward from the part to the whole, from sample to population. Accuracy is
limited by the size of the sample: sampling error shrinks roughly in proportion to
the inverse square root of the sample size. In small-area estimation, the situation is different. The aim is to
make inferences sideways from a few parts to all other parts. The plan for the
U.S. Census in 2000 calls for extrapolating sideways from a sample of
12,000 block clusters to separate estimates of census undercount for tens of
thousands of local areas and each of 5 million inhabited Census blocks. Accuracy is limited not only by
sample size but also, fundamentally, by the amount of heterogeneity from local area to local
area. The square root law ceases to apply, even if all data processing can be done without
error.
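The contrast between upward and sideways inference can be illustrated with a minimal sketch. It assumes a toy variance-components model, not the probability model used in this study: the error of a sideways estimate for one local area is taken to combine a sampling term, which shrinks with sample size n, and a heterogeneity term, which does not. The function name and the values of sigma and tau are hypothetical and chosen only for illustration.

```python
import math


def small_area_rmse(n, sigma, tau):
    """Root-mean-square error for one local area under a toy model.

    n     : sample size
    sigma : unit-level sampling standard deviation (hypothetical value)
    tau   : standard deviation of true local rates around the pooled
            rate, i.e. local heterogeneity (hypothetical value)
    """
    # Sampling variance falls as 1/n; heterogeneity variance is fixed.
    return math.sqrt(sigma**2 / n + tau**2)


# Compare upward inference (tau = 0) with sideways inference (tau > 0)
# as the sample size is repeatedly quadrupled.
for n in (1_000, 4_000, 16_000, 64_000):
    upward = small_area_rmse(n, sigma=0.2, tau=0.0)
    sideways = small_area_rmse(n, sigma=0.2, tau=0.01)
    print(f"n={n:>6}  upward={upward:.4f}  sideways={sideways:.4f}")
```

Under these assumptions, quadrupling the sample halves the error when tau is zero, whereas with any positive tau the error can never fall below tau, however large the sample.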
It is standard practice to apply uniform ratio estimators and other small-area techniques only after
stratifying on available variables like age, sex, and race [Ghosh and Rao 1994]. Stratification
removes some heterogeneity, but the residual heterogeneity that remains eventually imposes
diminishing returns on the gains in accuracy achievable from a larger sample. Before 1990, little was
known about levels of residual heterogeneity and the pace of diminishing returns to sample size. Since
then, interest in census adjustment has led to a series of studies in the United States [see 4], principally
focused on state-to-state heterogeneity in various indices of enumeration difficulty. The U.S. Census
Bureau created, in its P-12 Evaluation Project, a unique data set suitable for studying local as well as
state-level heterogeneity. The present study exploits P-12 to derive the first, albeit somewhat tentative,
measurements of residual heterogeneity for local areas containing on the order of 10,000 people
each.
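A uniform ratio estimator applied within strata can be sketched as follows. This is a simplified illustration, not the Census Bureau's production procedure: the stratum labels, the counts, and the function names are all hypothetical. One adjustment factor per stratum is estimated from the pooled sample and then applied uniformly to the census counts of every local area.

```python
def adjustment_factors(sample):
    """Estimate one uniform adjustment factor per stratum.

    sample maps stratum -> (estimated_true_count, census_count),
    both pooled over all sampled block clusters (hypothetical inputs).
    """
    return {s: true / census for s, (true, census) in sample.items()}


def synthetic_estimate(area_counts, factors):
    """Apply each stratum's uniform factor to one area's census counts."""
    return sum(count * factors[s] for s, count in area_counts.items())


# Hypothetical pooled sample: stratum A is undercounted by 5 percent,
# stratum B by 2 percent.
sample = {"stratum_A": (1050.0, 1000.0), "stratum_B": (2040.0, 2000.0)}
factors = adjustment_factors(sample)

# Hypothetical local area with 400 people counted in A and 600 in B.
area = {"stratum_A": 400, "stratum_B": 600}
estimate = synthetic_estimate(area, factors)
print(f"adjusted count for the area: {estimate:.1f}")
```

Residual heterogeneity enters because the factor is assumed to be the same everywhere within a stratum. If the true factor for stratum A in this particular area were 1.10 rather than the pooled 1.05, the estimate would miss about 20 people in that stratum no matter how large the sample used to estimate the pooled factor.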
The measurements of heterogeneity in this study provide a benchmark for assessing small-area
undercount estimation in the census. The issues are summarized in [Prewitt 2000], with extensive
references; for another perspective, see [Brown et al. 1999]. An underlying probability model is useful for
distinguishing the effects of geographical heterogeneity treated here from other components of error
[Freedman, Stark and Wachter 2000]. Besides playing a role in discussions of Census adjustment, the
measurements in the present study bear on the likely accuracy of small-area estimation in many other
applications. They follow, in an American context, on the new scientific interest in structural properties of
geographical heterogeneity kindled by [Le Bras 1993].
Several questions are frequently asked about research in this area. (i) Why study indices of
enumeration difficulty rather than undercounts themselves? (ii) Can residual heterogeneity not be
eliminated by finer stratification? (iii) What about other data sets?
(i) The Census does not measure its own undercount. Surveys that do measure undercounts, large as they
are, are much too small to measure heterogeneity at any fine geographical scale. Problems with data
quality in the 1990 Post-Enumeration Survey (PES) also restrict its usefulness for appraising
heterogeneity. Data from the 2000 PES, renamed “Accuracy and Coverage Evaluation” (ACE), will not be
available for some time, and research projects to assess the data quality in ACE have uncertain completion
dates.
(ii) Possibilities for finer stratification are limited. For Census Bureau purposes, only variables recorded
for all respondents on Census short forms are usable for stratification. Moreover, there is little evidence to
show that doubling or tripling the number of post-strata would achieve any marked reduction in
heterogeneity; we return to this point below.
(iii) Other publicly available data sets known to us lack one or another key feature of P-12. The U.S. Public
Use Microdata Samples (PUMS) only identify geographical location down to “Public Use Microdata
Areas” (PUMAs) with more than 100,000 people each. The Census Bureau’s Summary Tabulation Files
have precise geography but little cross-classification by stratifying variables. The other large U.S. surveys,
like the Current Population Survey, are much smaller than P-12. Similar limitations of one kind or another
apply to data sets collected in other developed countries. The data for French communes achieve
geographical resolution an order of magnitude finer than P-12, but lack stratification variables [Le
Bras 1993].
Every silver lining has its cloud, and P-12 is no exception. The P-12 data were aggregated by the
Bureau in a data-dependent way into “superblocks,” in order to protect the confidentiality of the
respondents. Superblocks range in size from a city block in Manhattan to some large swath of rural
Wyoming. The data we have are based on superblocks: our summary statistics show the heterogeneity in
these units, thereby averaging across a full spectrum of more familiar geography. The data suggest,
however, that a typical superblock represents a locality whose order of size is 10,000 inhabitants, and our
results are best interpreted on that geographic scale. More formal arguments are postponed to the
Appendix.
|