Volume 40 - Article 40 | Pages 1153–1166

IRS county-to-county migration data, 1990–2010

By Mathew Hauer, James Byars


Date received: 24 Aug 2018
Date published: 07 May 2019
Word count: 2782
Keywords: database, Internal Revenue Service (IRS), migration, R
Updated Items: The text of the publication is unchanged. The original data file contained two errors that have been fixed in the CSV file now available for download: 1) Some users noticed anomalies in the 1998 inflows for the states of WV and PA. For these states only, in this year only, and for in-migrants only, the IRS did not explicitly declare the state and county variables as characters, so the codes were imported as 421 (state = 42, county = 001) instead of 42001. These are now corrected in the CSV file. 2) The 1996 data were missing some states. The IRS capitalized the names of some of the raw files (e.g., “CO967DEO.XLS” rather than “co967dco.xls”), and those files were not imported. This affected only 1996 out-migration, for a subset of states. The authors have fixed this error in the new CSV file.
Additional files: demographic-research.40-40 (zip file, 3 MB)
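
The two import problems described under Updated Items can be illustrated with a minimal sketch. It is written in Python for illustration only; the authors' actual processing script is in R and is not reproduced here. The function names make_fips and find_xls are hypothetical, and the sketch assumes only what the note states: that a county code is a 2-digit state code followed by a 3-digit county code, and that raw file names may differ in capitalization.

```python
def make_fips(state, county):
    """Build a 5-character county code from its state and county parts.

    Hypothetical helper: zero-pads the state to 2 digits and the county
    to 3 digits, so the parts survive being read as numbers.
    """
    return f"{int(state):02d}{int(county):03d}"


def find_xls(filenames):
    """Return the names ending in .xls, ignoring capitalization.

    Hypothetical helper: a case-sensitive match on ".xls" would skip
    files such as "CO967DEO.XLS".
    """
    return [name for name in filenames if name.lower().endswith(".xls")]


# Joining the numeric parts without padding loses the leading zeros.
naive = str(42) + str(1)      # "421"
padded = make_fips(42, 1)     # "42001"

# Both capitalizations of the raw file name are picked up.
found = find_xls(["CO967DEO.XLS", "co967dco.xls", "readme.txt"])
```

Padding at the point where the code is assembled keeps the result correct whether the parts arrive as numbers or as strings, which is one way to guard against source files that declare their column types inconsistently.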


Background: The Internal Revenue Service’s (IRS) county-to-county migration data is an incredible resource for understanding migration in the United States. Produced annually since 1990 in conjunction with the US Census Bureau, the data represents 95% to 98% of the tax-filing universe and their dependents, making it one of the largest sources of migration data. However, any analysis using the IRS migration data must process at least seven legacy formats of this public data across more than 2,000 data files – a serious burden for migration scholars.

Objective: To produce a single, flat data file containing complete county-to-county IRS migration flow data and to make the computer code to process the migration data freely available.

Methods: This paper uses R to process more than 2,000 IRS migration files into a single, flat data file for use in migration research.

Contribution: To encourage and facilitate the use of this data, we provide a single, standardized, flat data file containing county-to-county one-year migration flows for the period 1990–2010 (containing 163,883 dyadic county pairs resulting in 3.2 million county-year observations totaling over 343 million migrants) and provide the full R script to download, process, and flatten the IRS migration data.

Authors' Affiliations

Mathew Hauer - Florida State University, United States of America
James Byars - University of Georgia, United States of America

