Volume 15 - Article 1 | Pages 1-20

A simulation-based assessment of the bias produced when using averages from small DHS clusters as contextual variables in multilevel models

By Øystein Kravdal

Date received: 03 Jan 2006
Date published: 18 Jul 2006
Word count: 4522
Keywords: bias, clustering, Demographic and Health Surveys (DHS), measurement error, multilevel model, simulation


There is currently much interest in the importance of community institutions and resources for individual mortality and fertility. DHS data may seem to be a valuable source for such multilevel analyses. For example, researchers may consider including in their models the average education within the sample (cluster) of approximately 25 women interviewed in each primary sampling unit (PSU). However, this sample mean is only a proxy for the theoretically more interesting average among all women in the PSU, and, in principle, the estimated effect of the sample mean may differ markedly from the effect of the true PSU average.
Fortunately, simulation experiments show that the bias is in fact fairly small, less than 14%, when education effects on first birth timing are estimated from DHS surveys in sub-Saharan Africa. If other data are used, or if the focus is turned to independent variables other than education, the bias may, of course, be very different. In some situations, it may be even smaller; in others, it may be unacceptably large. This depends on the size of the clusters, and on how the independent variables are distributed within and across communities. Some general advice is provided.
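
To make the mechanism concrete, the following is a minimal sketch of how a small-cluster sample mean can attenuate an estimated contextual effect. It is not the simulation design used in the article: it assumes a simple linear outcome rather than a first-birth timing model, normally distributed education, and PSUs large enough that the true PSU average equals the community-level draw. All function names, parameter values, and variance components are hypothetical choices made for illustration only.

```python
import random


def ols_two(x1, x2, y):
    """Return the two OLS slopes for y ~ 1 + x1 + x2 (centred normal equations)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate(n_clusters=3000, n_sampled=25, sd_between=1.0, sd_within=1.5,
             beta_ind=0.5, beta_ctx=1.0, seed=1):
    """Compare the contextual slope using the true PSU mean vs. the sample mean.

    All parameter values are illustrative assumptions, not taken from the article.
    """
    rng = random.Random(seed)
    x, true_mean, sample_mean, y = [], [], [], []
    for _ in range(n_clusters):
        mu = rng.gauss(0.0, sd_between)  # assumed true PSU-level average
        xs = [mu + rng.gauss(0.0, sd_within) for _ in range(n_sampled)]
        xbar = sum(xs) / n_sampled       # proxy: mean among sampled women
        for xi in xs:
            x.append(xi)
            true_mean.append(mu)
            sample_mean.append(xbar)
            # Outcome depends on own education and on the TRUE PSU average.
            y.append(beta_ind * xi + beta_ctx * mu + rng.gauss(0.0, 1.0))
    _, ctx_true = ols_two(x, true_mean, y)     # slope on the true PSU mean
    _, ctx_proxy = ols_two(x, sample_mean, y)  # slope on the sample mean
    # Reliability of the sample mean as a measure of the true PSU mean.
    reliability = sd_between ** 2 / (sd_between ** 2 + sd_within ** 2 / n_sampled)
    return ctx_true, ctx_proxy, reliability


if __name__ == "__main__":
    ctx_true, ctx_proxy, reliability = simulate()
    print("slope on true PSU mean:  ", round(ctx_true, 3))
    print("slope on sample mean:    ", round(ctx_proxy, 3))
    print("reliability of the proxy:", round(reliability, 3))
```

Under these assumptions, the slope on the sample mean should be close to the true contextual effect multiplied by the reliability of the sample mean. The shortfall therefore grows when clusters are smaller or when more of the variation in the independent variable lies within rather than between communities, which is the dependence on cluster size and on the within- and across-community distribution noted above.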

Author's Affiliation

Øystein Kravdal - Universitetet i Oslo, Norway

