Readme for the code used in "Decomposing delayed first marriage and birth across cohorts: The role of increased employment instability among men in Japan"

Author: Ryota Mugiyama
Date: 2025-02-23

The .zip folder contains:

- The do-files and R code used to conduct the analysis. Specifically,
  - _master.do: This do-file combines all do-files used in the analyses.
  - 1_genData: This folder contains do-files that construct variables.
    - expand_data.do: Construct person-year data.
    - demographics.do: Construct demographic variables.
    - job_history.do: Construct job history related variables.
    - status.do: Construct employment status variable.
    - school_hitsory.do: Construct school history related variables.
    - educ.do: Construct educational attainment variables.
    - student.do: Revise employment status and educational attainment variables by introducing an "enrollment in school" category.
    - cumyears.do: Construct years (or share) of experience in each employment status variable.
    - marriage_history.do: Construct marriage history related variables.
    - child_history.do: Construct child history related variables.
    - timing_mar_birth.do: Construct timing of first marriage and first birth and the transitions.
  - 2_sample: This folder contains do-files that select the analytical sample from the whole sample.
    - selectsample.do: Select and construct analytical samples.
  - 3_analysis: This folder contains do-files that replicate the analyses I conducted.
    - summaryStat.do: Export summary statistics (Table 1 and Table A-9).
    - survival.do: Export Kaplan-Meier survival estimates and smoothed hazard function (Figure 2 and Figure A-3).
    - lifetable.do: Calculate life table for checking specific survival rates.
    - distStatus.do: Calculate age-specific distribution of employment status (Figure 3; visualization is conducted in R).
    - logit_modelspecification.do: Estimate various logit models for comparing age and cohort specification (Table A-2).
    - Estimate discrete-time logit models (Table 2, A-3, A-7, 3, and A-8).
    - Estimate discrete-time logit models with interactions between age and cohort (Table A-4).
    - Show correlation matrix and estimate logit models using share variables (Table A-5 and A-6).
    - Calculate estimated hazard and counterfactual survival rates (Figure 4 and 5; visualization is conducted in R).
  - 4_analysis_R: This folder contains R scripts that replicate the analyses (mainly visualization) I conducted.
    - fig1_macrostat.R: Construct trends in national statistics regarding delayed marriage and birth and increased employment instability (Figure 1 and Figure A-1).
    - fig3_statuplot.R: Construct figures that show the age-specific distribution of employment status.
    - fig4_5_counterfactual.R: Visualize the counterfactual survival plots and decomposition results.

The data were analyzed using Stata/MP 18.0, R version 4.3.1, and RStudio version 2024.12.0+467.

For any inquiries, please feel free to contact the author.

The data I used come from the Social Stratification and Mobility (SSM) Survey in Japan. I thank the 2015 SSM Survey Management Committee for allowing me to use the SSM data. The data are also available from the Center for Social Research and Data Archives, Institute of Social Science, the University of Tokyo (https://csrda.iss.u-tokyo.ac.jp/english). Users can download the data for research or educational purposes if their applications are approved.

The project folder should be organized as follows.

- code_replication (this folder is set as the "working directory")
  - results: The analytical results are exported to this folder.
  - data: The dataset should be located in this folder.

The original dataset contains 7,817 respondents. After the sample restrictions, the analytical sample consists of three samples, which are combined into one integrated dataset:

- Sample 1: This sample is used for transition to first marriage. 33,824 person-year observations from 2,507 respondents.
- Sample 2: This sample is used for transition to first birth. 40,779 person-year observations from 2,507 respondents.
- Sample 3: This sample is used for transition from first marriage to first birth. 8,101 person-year observations from 1,763 respondents.

The variables I used:

- meibo_1: Gengo (Showa, Heisei, or Seireki).
- meibo_2: Birth year when respondents chose Showa or Heisei.
- meibo_3: Birth year when respondents chose Seireki.
- q1_2_5: Age at the time of survey.
- q1_1: Sex.
- q8_a: Employment status at first job.
- q9_*_c_4: Employment status at *th job.
- q8_2: Number of firms respondents are employed at first job. If they are not employed, this takes 0.
- q9_*_c_1: Number of firms respondents are employed at *th job. If they are not employed, this takes 0.
- q9_*_c_7: Age at beginning *th job.
- q18_5: Whether respondents have enrolled in high school.
- q19_d: Whether respondents graduated from high school.
- q20_*: Whether respondents were enrolled in *th school.
- q20_*_a: Whether respondents graduated from *th school.
- q20_*_b_1: Age at enrollment in *th school after leaving high school.
- q20_*_b_2: Number of years enrolled in *th school.
- q25: Current marital status.
- q33: Whether the current marriage is first marriage or remarriage.
- q26: Age at marriage when current marriage is first marriage.
- q34: Age at first marriage when respondents are divorced or separated.
- sq1: Age at first marriage when the current marriage is a remarriage.
- dq13_*_2a: Gengo of year of birth of *th child (Showa or Heisei).
- dq13_*_2b: Year of birth of *th child when respondents chose Showa or Heisei.
- dq13_*_3: Whether *th child is respondent's biological child or stepchild.

Note for API of Japanese National Statistics (relating to Figure 1, Appendix Table A-1, Figure A-1, and Figure A-2):

To construct these figures, you have to register your own API ID here: https://www.e-stat.go.jp/mypage/user/preregister.
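
Note on Gengo-coded years (relating to meibo_1 to meibo_3 and dq13_*_2a/dq13_*_2b in the variable list above):

Years recorded with a Gengo have to be converted to Western calendar years before ages and timings can be computed. The following is a minimal Python sketch of that conversion, not the code in the do-files. It relies only on the calendar facts that Showa 1 is 1926 and Heisei 1 is 1989. The function name `to_western_year` and the string labels for the Gengo are hypothetical; the actual SSM variables may be coded differently (for example, as numeric codes) and would need to be mapped first.

```python
# Offsets between Japanese era years and Western (Seireki) calendar years.
# Showa 1 = 1926 and Heisei 1 = 1989, so western year = offset + era year.
ERA_OFFSETS = {"Showa": 1925, "Heisei": 1988}


def to_western_year(gengo, era_year=None, seireki_year=None):
    """Return the Western calendar year, or None if it cannot be derived.

    gengo: hypothetical string label ("Showa", "Heisei", or "Seireki").
    era_year: year within the era (the role played by meibo_2 or dq13_*_2b).
    seireki_year: Western year reported directly (the role played by meibo_3).
    """
    if gengo == "Seireki":
        return seireki_year
    offset = ERA_OFFSETS.get(gengo)
    if offset is None or era_year is None:
        # Unknown era label or missing year: leave the value missing.
        return None
    return offset + era_year


print(to_western_year("Showa", era_year=50))          # 1975
print(to_western_year("Heisei", era_year=2))          # 1990
print(to_western_year("Seireki", seireki_year=1968))  # 1968
```

Returning None for an unknown label or a missing year keeps missing values explicit rather than silently producing an incorrect year.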