Readme file for "COVID-19 Risk Factors and Mortality among Native Americans"

Fumiya Uchikoshi is responsible for this document.

/* ------- Overview ------- */

This is a set of replication files for the paper titled "COVID-19 Risk Factors and Mortality among Native Americans." The replication uses one R Markdown file, six R files, and one Stata do file. The analysis was run with RStudio Version 1.2.5042 and Stata/MP version 17.0.

/* ------- List of replication codes (See "codes" folder) ------- */

- Replication_ DR5365.Rmd: the main R Markdown file, which integrates multiple data sources and produces the outputs, including figures.
- brfss.do: a do file that imports the SAS-formatted BRFSS datasets and prepares them for analysis in the R Markdown file above.
- ACS2015-2019.R: an R file that imports ACS microdata from IPUMS and prepares them for analysis in the R Markdown file above.
  - Note that the apply function in the acs package may conflict with the base R apply function.
- standardization: a series of R functions originally developed by Goldstein and Atherwood (2020). The author edited some of them and wrote new code to obtain estimates that the original functions did not provide (e.g., state-level SMRs). These functions are as follows.
  - create_deaths_by_race_and_state_DJk.R
  - standardization_functions.R
  - get_standardization_by_state_fun.R
  - create_Nijk.R
  - allcause_dataclean.R

/* ------- Replication process ------- */

1. Download the following files from the CDC and Census web pages
  - CDC
    - Provisional COVID-19 Death Counts by Sex, Age, and State
      https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-S/9bhg-hcku
    - Provisional Death Counts for Coronavirus Disease (COVID-19): Distribution of Deaths by Race and Hispanic Origin
      https://data.cdc.gov/NCHS/Provisional-Death-Counts-for-Coronavirus-Disease-C/pj7m-y5uh
    - Provisional COVID-19 Death Counts in the United States by County
      https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-in-the-United-St/kn79-hsxy
    - Get the age distribution of COVID-19 deaths
      - Use Provisional_COVID-19_Death_Counts_by_Sex__Age__and_State_Jan21.csv
      - Create deaths_by_age_Di_Jan21.csv
  - Census Bureau
    - https://www.census.gov/newsroom/press-kits/2020/population-estimates-detailed.html
    - cc-est2019-alldata.csv (county-level estimates)
      * Note: change Doña Ana County in NM to Dona Ana County

2. Create datasets using ACS microdata from IPUMS
  - Download the microdata from https://usa.ipums.org/usa/ (see the separate file titled "ACS microdata.txt")
  - Run ACS2015-2019.R to create the following files
    - acs_homeland_tab.csv
    - acs_homeland_n.csv
    - age_state_race_ACS2015-2019.csv
    - age_state_race_ACS2015-2019_US.csv

3. Create datasets using BRFSS microdata
  - Download data from https://www.cdc.gov/brfss/index.html
  - Run brfss.do to import the SAS-format data into dta format and create descriptive statistics in csv format
  - Variables used in this study are as follows.
    - state: State FIPS code
    - race: computed race-ethnicity grouping
    - hlthpln1: have any health care coverage
    - checkup1: checkup (2 or more years)
    - cvdcrhd4: heart disease
    - asthma3: asthma
    - chcocncr: cancer (excluding skin cancer)
    - chccopd1: COPD
    - diabete3: diabetes
    - obese: obese (BMI>30)
    - smoker3: smoking
    - chckidny: kidney disease
    - age: person's age
    - llcpwt: weight
    - year: year of survey

4. Run chunks in "Revision.Rmd"
  - Chunk cdcdata: create datasets for estimating SMRs (including all-cause mortality)
    - create_deaths_by_race_and_state_DJk.R produces the following files
      * clean_cdc_race_jan_21.csv
      * clean_cdc_race_count_jan_21.csv
    - allcause_dataclean.R produces the following file
      * perc_DJk_all.csv
    - create_Nijk.R produces the exposure file
      * Nijk.csv
  - Chunk agedist: create the average age for each racial group
  - Chunk tribe_county: create the following two files
    - able_tribe_ACS2015-2019.csv: proportion of American Indians and Alaska Natives among the Native American population (used for footnote 1)
    - Res_Home_Dif.csv: the median difference between the % of Native Americans in homelands and the % in reservations (used for footnote 4)
  - Chunk table1: create Table 1
  - Chunk cdcviz: create Figures 1, 2, and 3
  - Chunk brfss: create Table 2 and Appendix tables for BRFSS data, and the state-level estimates for potential explanatory factors (brfss_ByAge.csv)
  - Chunk acs: create Table 2 and Appendix tables for ACS data
  - Chunk correlation: create Figures 4 and 5

/* ------- Data (these files can be found in the "Data" folder) ------- */

- Raw data
  - CDC
    * Provisional_COVID-19_Death_Counts_in_the_United_States_by_County_Jan21ed.csv
    * Provisional_Death_Counts_for_Coronavirus_Disease__COVID-19___Distribution_of_Deaths_by_Race_and_Hispanic_Origin_Jan21.csv
    * Provisional_COVID-19_Death_Counts_by_Sex__Age__and_State_Jan21.csv
    * MultipleCauseofDeathRace2019.txt
    * MultipleCauseofDeathCounty2019.txt
    * MultipleCauseofDeathAge2019.txt
    * The last three files contain all-cause mortality in 2019 by race, county, and age, downloaded from the CDC WONDER page (https://wonder.cdc.gov/).
  - Census
    * cc-est2019-alldata.csv
    * nc-est2019-syasexn.xlsx (for BRFSS age standardization)
  - ACS 2015-2019
    * As mentioned earlier, please download the ACS microdata from IPUMS.
  - BRFSS 2011-2019
    * All raw data can be found on the BRFSS website (https://www.cdc.gov/brfss/index.html).
      The Stata do file (brfss.do) allows you to download these data.
- Edited data
  - CDC
    * clean_cdc_race_jan_21.csv
    * clean_cdc_race_count_jan_21.csv
    * deaths_by_age_Di_Jan21.csv
    * deaths_by_age_Di_AllCause2019
    * perc_DJk_all.csv
  - Census
    * Nijk.csv
  - ACS
    * acs_home_tab.csv (estimates by race and state, including the US)
    * acs_home_n.csv (number of persons and households by race and state, including the US)
    * age_state_race_ACS2015-2019.csv (average age by race for each state)
    * age_state_race_ACS2015-2019_US.csv (average age by race)
  - BRFSS
    * desc_race*i*.csv (national-level counts by race)
    * brfss_count_state*j*_race*i*.csv (state-level counts by race)
    * brfss_ByAge.csv (age-race-state estimates for predictors)
    * brfss_merge_rist_r.dta (BRFSS merged data)
    * brfss_2011_2019_merge_health_r.dta (BRFSS merged data used to create the number of health conditions)
- Other
  - FIPS.csv: US state FIPS codes
  - State_Openness_Jan_16states.xlsx: state openness score
  - Reservation.xlsx: % Native Americans on reservation
  - Frontline.xlsx: % Frontline workers by race and state
  - countycode.csv (used to link multiple data sources based on county information)

/* ------- Output figures and tables (file names correspond to the names in the paper) ------- */

- Figure1.pdf
- Figure2.pdf
- Figure3.pdf
- Figure4.pdf
- Figure5.pdf
- Appendix_Figure1.pdf
- Appendix_Figure2.pdf
- Table1.xlsx
- Table2_ACS.csv
- Table2_BRFSS.csv
- Appendix_Table1_ACS.csv
- Appendix_Table1_BRFSS.csv
- Appendix_Table2.csv
- Correlation_SMR.csv

/* ------- Notes ------- */

- Our analysis of SMRs is based on the data that the CDC released on January 21, 2021.
- For the estimation of SMRs, we relied on the code developed by Goldstein and Atherwood (2020).
- For the proportion of frontline workers, we used the estimates developed by Goldman et al. (2021).
- The proportion of Native Americans living on reservations and the state openness scores were collected by one of the authors (KLB).
  Reservation data are based on ACS 5-year estimates available on My Tribal Area (https://www.census.gov/tribal/).
- Change county names if there are any inconsistencies between the Census and CDC data (e.g., DeKalb County in IN).
- Add 0 to New York City in the death statistics by race.

References

Goldman, N., Pebley, A.R., Lee, K., Andrasfay, T., and Pratt, B. (2021). Racial and ethnic differentials in COVID-19-related job exposures by occupational standing in the US. PLOS ONE 16(9):e0256085. doi:10.1371/journal.pone.0256085.

Goldstein, J.R. and Atherwood, S. (2020). Improved Measurement of Racial/ethnic Disparities in COVID-19 Mortality in the United States. medRxiv. doi:10.1101/2020.05.21.20109116.

Codebook reference

- ACS
  - https://usa.ipums.org/usa/volii/codebooks.shtml
- BRFSS
  - 2011: https://www.cdc.gov/brfss/annual_data/2011/pdf/CODEBOOK11_LLCP.pdf
  - 2012: https://www.cdc.gov/brfss/annual_data/2012/pdf/CODEBOOK12_LLCP.pdf
  - 2013: https://www.cdc.gov/brfss/annual_data/2013/pdf/CODEBOOK13_LLCP.pdf
  - 2014: https://www.cdc.gov/brfss/annual_data/2014/pdf/CODEBOOK14_LLCP.pdf
  - 2015: https://www.cdc.gov/brfss/annual_data/2015/pdf/CODEBOOK15_LLCP.pdf
  - 2016: https://www.cdc.gov/brfss/annual_data/2016/pdf/CODEBOOK16_LLCP.pdf
  - 2017: https://www.cdc.gov/brfss/annual_data/2017/pdf/codebook17_llcp-v2-508.pdf
  - 2018: https://www.cdc.gov/brfss/annual_data/2018/pdf/codebook18_llcp-v2-508.pdf
  - 2019: https://www.cdc.gov/brfss/annual_data/2019/pdf/codebook19_llcp-v2-508.HTML
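
Illustrative sketch of the SMR calculation

The SMRs referred to in the replication steps and notes are taken here to be standardized mortality ratios obtained by indirect standardization; that reading is an assumption, since this file does not spell out the abbreviation. The sketch below is a generic illustration of that idea with made-up numbers. It is not the code in standardization_functions.R, and the functions by Goldstein and Atherwood (2020) may differ in their details. The object names (N, ref_deaths, ref_pop, observed) and all values are hypothetical. The sketch applies reference age-specific death rates to each group's population by age to obtain expected deaths, then divides observed deaths by expected deaths.

```r
# Generic indirect standardization with toy numbers (all values hypothetical).

# Population by age group (rows) and race group (columns) for one state,
# playing the role of an exposure table such as Nijk.csv.
N <- matrix(c(1000, 800, 200,     # group A, by age
              4000, 3000, 500),   # group B, by age
            nrow = 3,
            dimnames = list(age  = c("0-44", "45-64", "65+"),
                            race = c("A", "B")))

# Reference (e.g., national) deaths and population by age group.
ref_deaths <- c(50, 500, 4000)
ref_pop    <- c(100000, 50000, 20000)
ref_rate   <- ref_deaths / ref_pop   # age-specific death rates

# Expected deaths per race group if each group experienced the reference rates.
# ref_rate has one value per row of N, so N * ref_rate scales each age row.
expected <- colSums(N * ref_rate)

# Observed deaths per race group.
observed <- c(A = 97, B = 132)

# SMR: observed deaths divided by expected deaths.
smr <- observed / expected
print(round(smr, 2))
```

In this toy case group A has 48.5 expected deaths and an SMR of 2, meaning twice as many deaths as its age structure alone would predict, while group B has 132 expected deaths and an SMR of 1.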