Replication materials for:
“Summertime, and the livin’ is easy: Winter and summer pseudoseasonal
life expectancy in the United States”
to appear in Demographic Research

This file prepared by Tina Ho & Andrew Noymer. noymer@uci.edu

PREAMBLE:
This project uses two languages, IDL and Stata.
IDL is further described at:
http://www.harrisgeospatial.com/ProductsandTechnology/Software/IDL.aspx
and is very similar to MATLAB. GDL is syntax-compatible with IDL:
http://gnudatalanguage.sourceforge.net/
Stata is further described at:
http://www.stata.com/
and will be more familiar to the social science community.

FILES IN THIS ZIPFILE:
-------------------------
flowchart.txt         THIS FILE
deaths.zip            described at (i) below
exposure.zip          described at (ii) below
misc_programs.zip     described at (iii) below
misc_data.zip         described at (iv) below

All files w/extension "*.do" are Stata programs
All files w/extension "*.pro" are IDL programs
All files w/extension "*.tsv" are tab-sep-value ASCII data

Source data:
--------------------
(i) Deaths: deaths.zip
this file contains male_all_causes.txt and female_all_causes.txt,
which are tab-delimited ASCII files of deaths by 22 age groups and by
month. Extracted from NCHS data:
www.cdc.gov/nchs/data_access/vitalstatsonline.htm
www.nber.org/data/vital-statistics-mortality-data-multiple-cause-of-death.html
The publicly available USA mortality data have monthly time resolution.

(ii) Exposures: exposure.zip
this file contains exposure_22groups_monthly.txt, exposures by 22 age
groups and by month. These have been interpolated from HMD annual
exposure data. The IDL code that does this is
interpolate-denominators.pro (this takes as input: leap-year.txt [this
is just a table of leap years so that the program can know how many
days to budget for February] and hmd_1x1 [these are 1x1 exposures,
straight from the HMD website] and gives as output:
exposure_1x1_monthly.txt).
Also: make_22_groups.do, a Stata program that condenses
exposure_1x1_monthly.txt into exposure_22groups_monthly.txt.

(iii) misc_programs.zip contains:

(A) collapse_by_season_v01.do
Stata program. Takes as input: female_all_causes.txt &
male_all_causes.txt [see (i)] and outputs numerators_pseudoyear.dta,
which is the deaths, arranged by pseudoyear, in Stata format.
wide_2_long_v01.do takes this file as input and outputs
numerators_long.dta, the same data, re-shaped, in Stata format.

(B) collapse_exposure_by_season_v01.do
Stata program. Takes as input: exposure_22groups_monthly.txt [from
(ii)] and outputs: exposure_pseudoyear.dta, which is a Stata-format
dataset with exposure shaped into pseudoseasons.

(C) calc_e0.do
takes as input exposure_pseudoyear.dta (from B) and
numerators_long.dta (from A), and outputs summer_e0.csv and
winter_e0.csv, which are pseudoseasonal life expectancy data. These
(along with E0per.csv, which are the HMD data for calendar years) are
source data for Figures 2 & 3 of the paper.

(D) extract_heatmap_v00a.do
takes as input exposure_pseudoyear.dta (from B) and
numerators_long.dta (from A) and outputs: heatmap_male.tsv
heatmap_female.tsv, to make Figure 4.

(E) PH_v010.pro
takes as input: males.tsv females.tsv and calculates proportional
hazard data (Figures 5-7 and proportional hazard information in the
text body). These input files come from extract_data.do, which, in
turn, takes as input canon_dataset.dta, which is produced by
make_canon_dataset_v00.do, which itself takes as input
numerators_long.dta and exposure_pseudoyear.dta (see D).

(F) gompertz_example_v04.do
takes as input numerators_long.dta and exposure_pseudoyear.dta (see D)
and performs Poisson-Gompertz regression. The output of this gets
cut+pasted from Stata to gompertz_example_results.tsv, which
gompertz_example_v03.pro will convert to Table 1.

(G) interpolate-denominators.pro & make_22_groups.do
See (ii), above.

(iv) misc_data.zip:
as explained above:
males.tsv
females.tsv
canon_dataset.dta
exposure_pseudoyear.dta
numerators_long.dta
numerators_pseudoyear.dta
exposure_1x1_monthly.txt
gompertz_example_results.tsv
summer_e0.csv
winter_e0.csv
E0per.csv
heatmap_male.tsv
heatmap_female.tsv
hmd_1x1
as well as:
pct_positive.tsv : CDC flu data, obtained from the source listed in
the paper. Used to make Figure 1.

--------------------------------------------------------------------------------

NARRATIVE EXPLANATION:
---------------------
To do everything from scratch:

Start at collapse_by_season_v01.do, which is run in Stata, and which
needs as input female_all_causes.txt & male_all_causes.txt (from
deaths.zip). This will produce numerators_pseudoyear.dta; then run
wide_2_long_v01.do, which will make numerators_long.dta.

Then run collapse_exposure_by_season_v01.do in Stata, which takes as
input exposure_22groups_monthly.txt (from exposure.zip). This outputs
exposure_pseudoyear.dta.

Then run calc_e0.do in Stata. You have now reproduced the data for
Figs 2 & 3 and the corresponding parts of the paper.

Then run extract_heatmap_v00a.do in Stata. You have now reproduced the
data for Figure 4 and the corresponding parts of the paper.

Then run make_canon_dataset_v00.do in Stata. This creates
canon_dataset.dta. Then run extract_data.do in Stata. This creates
males.tsv females.tsv. Then run PH_v010.pro in IDL. You have now
reproduced the data for Figs 5-7 and the corresponding parts of the
paper.

Then run gompertz_example_v04.do in Stata. Its output is cut+pasted
from Stata into gompertz_example_results.tsv (see F). Run
gompertz_example_v03.pro in IDL to make Table 1.

To re-create Figure 1, use the data in pct_positive.tsv.

You have now replicated the entire paper.

The above narrative assumes you start with
exposure_22groups_monthly.txt as the exposures data. If, in turn, you
wish to reproduce these, use interpolate-denominators.pro in IDL,
which starts with the raw HMD annual exposure data, followed by
make_22_groups.do in Stata (see ii).

EOF.
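
ILLUSTRATIVE SKETCH (not part of the replication materials):
The run order described above can be written down as a small
dependency map. The Python snippet below is only a sketch: the names
DEPENDS_ON and run_order are hypothetical, and the map simply
transcribes the input/output relationships stated in this file. It
assumes Python 3.9 or later (for the standard-library graphlib
module) and does not run Stata or IDL; it only prints an order in
which the programs could be run by hand.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each program -> the programs whose outputs it reads,
# as described in this file. Raw inputs (deaths.zip, exposure.zip) are omitted.
NUMERATORS_AND_EXPOSURE = {
    "wide_2_long_v01.do",                  # makes numerators_long.dta
    "collapse_exposure_by_season_v01.do",  # makes exposure_pseudoyear.dta
}

DEPENDS_ON = {
    "collapse_by_season_v01.do": set(),
    "wide_2_long_v01.do": {"collapse_by_season_v01.do"},
    "collapse_exposure_by_season_v01.do": set(),
    "calc_e0.do": set(NUMERATORS_AND_EXPOSURE),                 # Figs 2 & 3
    "extract_heatmap_v00a.do": set(NUMERATORS_AND_EXPOSURE),    # Figure 4
    "make_canon_dataset_v00.do": set(NUMERATORS_AND_EXPOSURE),
    "extract_data.do": {"make_canon_dataset_v00.do"},
    "PH_v010.pro": {"extract_data.do"},                         # Figs 5-7
    "gompertz_example_v04.do": set(NUMERATORS_AND_EXPOSURE),
    "gompertz_example_v03.pro": {"gompertz_example_v04.do"},    # Table 1
}


def run_order(depends_on):
    """Return the programs in an order where every input is made first."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, program in enumerate(run_order(DEPENDS_ON), start=1):
        tool = "IDL" if program.endswith(".pro") else "Stata"
        print(f"{step:2d}. {program} ({tool})")
```

Any order the sketch prints is valid, but it may differ from the
order given in the narrative, because several programs depend only on
numerators_long.dta and exposure_pseudoyear.dta and can be run in any
sequence relative to one another.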