Volume 40 - Article 40 | Pages 1153–1166

IRS county-to-county migration data, 1990‒2010

By Mathew Hauer, James Byars

Date received: 24 Aug 2018
Date published: 07 May 2019
Word count: 2782
Keywords: database, Internal Revenue Service (IRS), migration, R
DOI: 10.4054/DemRes.2019.40.40
Updated Items: The text of the publication has not changed. The original data file contained two errors that have been fixed in the csv file now available for download: 1) Some users noticed odd values in the 1998 inflows for the states of WV and PA. For these states only, this year only, and in-migrants only, the IRS did not explicitly declare the state and county variables as characters, which led to them being imported as 421 (for state = 42, county = 001) instead of 42001. These are now corrected in the csv. 2) The 1996 data were missing some states. The IRS capitalized some of the raw file names (i.e., “CO967DEO.XLS” rather than “co967dco.xls”), so those files were not imported. This affected only 1996 out-migration for a subset of states. The authors have fixed this error in the new csv file.
Additional files: demographic-research.40-40 (zip file, 3 MB)
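
The first correction in the update note concerns county codes that lost their leading zeros when state and county were read as numbers. The following is a minimal sketch of that failure and one way to avoid it, written in Python for illustration rather than taken from the authors' R script; the function name `make_fips` is hypothetical.

```python
def make_fips(state, county):
    """Build a 5-character county code from its state and county parts.

    Each part is zero-padded as text (2 digits for the state, 3 for the
    county), so numeric inputs such as 42 and 1 still give "42001".
    """
    return f"{int(state):02d}{int(county):03d}"


# Joining the unpadded numbers reproduces the symptom described in the note.
naive = f"{42}{1}"

# Padding each part before joining gives the intended code.
fixed = make_fips(42, 1)

print(naive, fixed)
```

Storing the combined code as text rather than as an integer also keeps leading zeros for states whose codes begin with 0.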
 

Abstract

Background: The Internal Revenue Service’s (IRS) county-to-county migration data is an incredible resource for understanding migration in the United States. Produced annually since 1990 in conjunction with the US Census Bureau, the IRS migration data represents 95% to 98% of the tax-filing universe and their dependents, making it one of the largest sources of migration data. However, any analysis using the IRS migration data must process at least seven legacy formats of this public data across more than 2,000 data files, a serious burden for migration scholars.

Objective: To produce a single, flat data file containing complete county-to-county IRS migration flow data and to make the computer code to process the migration data freely available.

Methods: This paper uses R to process more than 2,000 IRS migration files into a single, flat data file for use in migration research.
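
When thousands of raw files are collected by name, letter case matters: the update note for this article reports that files named in upper case were skipped during import. A hedged sketch of case-insensitive file selection follows, in Python for illustration only (the authors' pipeline is in R); `find_raw_files` and the sample listing are assumptions, with the two `.xls` names taken from the update note.

```python
def find_raw_files(names, extension=".xls"):
    """Return the names ending in `extension`, ignoring letter case."""
    ext = extension.lower()
    return sorted(n for n in names if n.lower().endswith(ext))


# Sample directory listing; "readme.txt" is a made-up non-data file.
listing = ["co967dco.xls", "CO967DEO.XLS", "readme.txt"]

# A case-sensitive check silently drops the upper-case file.
case_sensitive = [n for n in listing if n.endswith(".xls")]

# Comparing lower-cased names keeps both data files.
case_insensitive = find_raw_files(listing)

print(case_sensitive)
print(case_insensitive)
```

Comparing the number of files found against the number expected for each year is a cheap additional check for this kind of silent omission.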

Contribution: To encourage and facilitate the use of this data, we provide a single, standardized, flat data file containing county-to-county one-year migration flows for the period 1990–2010 (containing 163,883 dyadic county pairs resulting in 3.2 million county-year observations totaling over 343 million migrants) and provide the full R script to download, process, and flatten the IRS migration data.
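
To make the counts in this description concrete, the sketch below shows how dyadic pairs, county-year observations, and total migrants relate in a flat file with one row per origin, destination, and year. It is illustrative Python, not the authors' code; the column names and every row are made up, and treating pairs as directed (origin to destination) is an assumption.

```python
import csv
import io

# Made-up rows for illustration; the real file's column names may differ.
FLAT = """origin,destination,year,migrants
13059,13121,1990,120
13059,13121,1991,135
13121,13059,1990,98
42001,54001,1998,12
"""


def summarize(flat_csv_text):
    """Count directed county pairs, rows, and migrants in a flat file."""
    rows = list(csv.DictReader(io.StringIO(flat_csv_text)))
    # A pair is counted once, however many years it appears in.
    pairs = {(r["origin"], r["destination"]) for r in rows}
    return {
        "dyadic_pairs": len(pairs),
        "county_year_observations": len(rows),
        "migrants": sum(int(r["migrants"]) for r in rows),
    }


summary = summarize(FLAT)
print(summary)
```

Reading the codes as text, as `csv.DictReader` does by default, avoids the leading-zero problem that numeric import can introduce.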

Authors' Affiliations

Mathew Hauer - Florida State University, United States of America
James Byars - University of Georgia, United States of America

