Volume 51 - Article 41 | Pages 1299–1350  

Using online genealogical data for demographic research: An empirical examination of the FamiLinx database

By Andrea Colasurdo, Riccardo Omenti

Abstract

Background: Online genealogies are promising data sources for demographic research, but their limitations are understudied. This paper takes a critical approach to evaluating the potential strengths and weaknesses of using online genealogical data for population studies. We focus on the FamiLinx dataset, which contains demographic information and kinship ties across multiple countries and centuries.

Objective: We propose novel measures to assess the completeness and quality of demographic variables in the FamiLinx data at both the individual and the familial level over the 1600–1900 period. Using Sweden as a test country, we investigate how the age–sex distribution and the mortality levels of the digital population extracted from FamiLinx diverge from those of the registered population.

Methods: We employ descriptive statistics, negative binomial regression modeling, and standard life table techniques for our measures of completeness and quality.
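
As an illustration of the "standard life table techniques" mentioned above, the following minimal Python sketch converts age-specific death rates into survivorship and life expectancy. It is not the authors' implementation: the function name `life_table`, the assumption that deaths occur on average at the midpoint of each closed age interval, the treatment of the open-ended last interval, and the input rates are all hypothetical choices made for illustration.

```python
def life_table(mx, widths):
    """Build a simple period life table.

    mx     : age-specific death rates, one per age interval (last one open-ended)
    widths : interval widths in years; the last entry is ignored (open interval)

    Assumption (illustrative): deaths fall on average at the midpoint of each
    closed interval, so q = n*m / (1 + 0.5*n*m).
    """
    rows = []
    lx = 1.0  # radix: survivors at the start of the first interval
    for i, m in enumerate(mx):
        if i == len(mx) - 1:
            # Open-ended interval: everyone dies, person-years = l / m
            qx = 1.0
            Lx = lx / m
        else:
            n = widths[i]
            qx = min(n * m / (1.0 + 0.5 * n * m), 1.0)
            # Survivors live n years; those who die live n/2 years on average
            Lx = n * lx * (1.0 - qx) + 0.5 * n * lx * qx
        rows.append({"lx": lx, "qx": qx, "Lx": Lx})
        lx *= 1.0 - qx

    # Accumulate person-years from the oldest age down to get life expectancy
    Tx = 0.0
    for row in reversed(rows):
        Tx += row["Lx"]
        row["ex"] = Tx / row["lx"] if row["lx"] > 0 else 0.0
    return rows


# Made-up rates for two intervals (not taken from FamiLinx or Swedish registers)
table = life_table(mx=[0.1, 0.5], widths=[1, None])
e0 = table[0]["ex"]
```

Running the same function on rates from a genealogical population and from a registered population would give two life expectancy series that can be compared directly, which is the kind of contrast the Results paragraph reports.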

Results: Both the missingness and the accuracy of demographic information in FamiLinx are selective. When one demographic variable is available, researchers can effectively anticipate the availability of other demographic information. The completeness and quality of demographic variables within kinship networks are markedly higher for individuals with more complete and accurate demographic information. Populations from FamiLinx display lower mortality levels than the registered population, and their representativeness improves towards the end of the 19th century.

Contribution: This study sheds new light on the opportunities and challenges of harnessing online genealogies for demographic research. Although this data source offers much promise, its usability in population studies depends on the quality, completeness, and selectivity of its recorded demographic information.

Most recent similar articles in Demographic Research

Using household death questions from surveys to assess adult mortality in periods of health crisis: An application for Peru, 2018–2022
Volume 51 - Article 8    | Keywords: adult mortality, data quality, household surveys, Peru

The quality of fertility data in the web-based Generations and Gender Survey
Volume 49 - Article 3    | Keywords: accuracy, data quality, fertility, Generations and Gender Survey (GGS)

How does the demographic transition affect kinship networks?
Volume 48 - Article 32    | Keywords: demographic transition, demographically dense ages, kinship network, net reproductive rate, time-invariant model

Mexican mortality 1990‒2016: Comparison of unadjusted and adjusted estimates
Volume 44 - Article 30    | Keywords: data quality, demography, Human Mortality Database (HMD), life expectancy, life tables, Mexico, mortality

Evaluating interviewer manipulation in the new round of the Generations and Gender Survey
Volume 43 - Article 50    | Keywords: data quality, Generations and Gender Survey (GGS), interviewer effects, retrospective histories, survey methods