Mini data dictionary (Name -> Label)
• pidp -> cross-wave person identifier (public release)
• union -> Type of union
• union2 -> Cohabitation or Marriage (tv)
• union3 -> Cohabitation/Marriage/Premarital cohabitation (tv)
• spell -> (no label)
• date -> (no label)
• end -> End of union by type
• n -> time
• n2 -> time squared
• n3 -> time cubic
• y2 -> (no label)
• u -> start date
• e -> end date
• age -> Time varying age
• u_year -> Year of union
• u_year_sq -> Year of union – squared
• e_year -> Year of end
• sex_dv -> Sex, derived
• doby_dv -> DOB: Year, derived
• dur_prem -> Year of first union (name looks like a duration; keep a note if needed)
• year_first_union -> Year of first union
• year_first_union_sq -> Year of first union – squared
• year_first_union_cu -> Year of first union – cubic
• family16_v3 -> Family status at 16
• cohort1…cohort7 -> Cohort of birth, by decade
• ISEI_avg_z -> Normalised score of ISEI avg
• dom_class4 -> Dominant class: social class of parent with higher class
• dom_educ -> Dominant education: education of the more highly educated parent
• ethnicity_m -> Ethnicity m
• ethnicity_p -> Ethnicity p
• ethnicity -> Ethnicity
• education_fix -> (no label)
• education_var -> RECODE of education_time_varying2 (Time varying education)
• age_sq -> Age squared
• age_c -> Age groups
• doby_cat -> Date of birth – Year groups
• outcome1 -> parenthood outside union (first birth out of union)
• outcome2 -> parenthood within cohabitation (first birth in cohabitation)
• outcome3 -> parenthood within marriage (first birth in marriage)
• outcome0 -> transition to parenthood (any first birth)
• outcome4 -> transition to parenthood (any first birth; duplicate of outcome0)

What the do-file reproducibility_figure1 does
• Defines outcomes
   • outcome1 = parenthood outside union
   • outcome2 = parenthood within cohabitation
   • outcome3 = parenthood within marriage
   • outcome0 and outcome4 = any first birth (transition to parenthood)
• Loops over four outcomes (o = 1…4) to:
1. Load data.dta.
2.
Keep ages 12–54 and birth cohorts 1940–1989.
3. Select the last observation per person (and retain valid births).
4. Drop missing SES info, compute duration = date - doby_dv.
5. Set survival time with entry=15 and exit=60 and compute hazards.

Expected output
Hazards are computed on a yearly basis. This figure displays age-specific hazards of transitioning to first parenthood between ages 15 and 50, based on discrete-time survival models. The four panels correspond to overall first birth (A), out-of-union first birth (B), cohabiting first birth (C), and marital first birth (D), each stratified by parental occupational class. Although the analysis includes the full 15–50 age range, the plotted curves are naturally constrained to ages where events are sufficiently frequent to estimate hazards reliably.

What the do-file reproducibility_fig2 does
1. Loads the data (use "$reproducibility/data/data.dta", clear) and applies variable labels so the file is self-describing if metadata are lost.
2. Defines outcome meanings (used downstream):
   • outcome1 = parenthood outside union
   • outcome2 = parenthood within cohabitation
   • outcome3 = parenthood within marriage
   • outcome0 and outcome4 = any first birth (transition to parenthood)
3. Model loop: Panel A (any first birth). Fits discrete-time logit models (via xtlogit/reg as specified in the loop) for the transition to parenthood, stratified by parental class (dom_class4).
   • Sample restrictions: keep ages ≥ 15 (upper bound sometimes < 50) and cohorts up to 1989.
   • Drops records with missing SES (dom_class4 and dom_educ) and invalid sex.
   • Saves marginal predicted probabilities (Stata margins) by dom_class4 to margins/xtlogit_dom_class4_m*.dta.
   • Builds a scatter+CI plot named y for Panel A.
4. Model loop: Panels B–D (by union context). Repeats the estimation for each context (y=1..3 = out of union / cohabitation / marriage), saving margins to margins/xtlogit_y{1,2,3}_dom_class4_*.dta.
Aggregates margins across models, rescales to percent, and generates three plots y1, y2, y3 (Panels B–D).
5. Combine panels. Uses grc1leg to assemble a 1×4 figure (y y1 y2 y3) with a common y-axis.

How to run
1. Open Stata, set the working directory to the project root, and define the globals shown above.
2. Install required packages (once).
3. Run:
   log using "$reproducibility/logs/run_fig2.smcl", replace
   do code/reproducibility_fig2.do
   log close

Interpretation / caption (for the paper)
Figure 2 displays annual probabilities of transitioning to first parenthood between ages 15 and 50, estimated with discrete-time survival models. Panels show overall first birth (A), out-of-union first birth (B), cohabiting first birth (C), and marital first birth (D), each stratified by parental occupational class. Although the analysis covers the full 15–50 age range, the plotted curves are naturally limited to ages where events are sufficiently frequent to estimate probabilities reliably.

Notes & assumptions
• The script references control sets via locals (e.g., `n_`m'', `demo_`m'', `edu_`m'', `edu_par'). If these locals are not defined elsewhere in your project, Stata treats them as empty and models run with the available terms.
• The code saves intermediate .dta files in margins/.
• If your raw outcome4 is not already “any first birth,” the script coerces it from components where indicated.

reproducibility_fig3.do: What this script does

Purpose
Produces Figure 3: predicted annual probabilities (discrete-time hazards) of transition to first birth by union status and parents’ occupational class, across birth cohorts.
Panels:
• B: first birth out of union
• C: first birth in cohabitation
• D: first birth in marriage
(Shaded ribbons are 95% CIs.)

1. Load data & basic cleaning
   • Drops records with missing SES (dom_class4 and dom_educ) and invalid sex.
   • Keeps ages 15–49 (age >= 15 & age < 50).
   • Creates n_spell (union order indicator) for descriptives.
2.
Define cohort scale
   • cohort = doby_dv - 1960 (so 0 = 1960).
   • Quadratic and cubic transforms are created; the models used here rely on linear and quadratic cohort terms (see “Model spec”).
3. Model estimation (by outcome y = 1..3)
   • Outcome meanings: y=1 out-of-union birth; y=2 cohabiting birth; y=3 marital birth.
   • Cohort windows tested via restrictions: r=2 corresponds to 1940–1989 cohorts (used in this script).
   • Model specification chosen for plotting: M=5.
   • Robust standard errors.
   • Margins: predicted probabilities by dom_class4 at cohort grid 1940, 1945, …, 1990 (i.e., cohort = -20(5)30), saved to margins/xtlogit_y{1,2,3}_dom_class4_m5_r2_coh.dta.
   • (Optional) Exports coefficient tables via outreg2 to /output/reg.
4. Post-processing & scaling for plots
   • Converts margins and CI bounds to percent (×100).
   • For the out-of-union panel (y==1) the code divides the plotted values by 2 before scaling (a presentation choice already in the script). If you do not intend this halving, remove the line replace ... = .../2 if y==1.
5. Graph construction
   • For each panel (B, C, D) draws shaded CI bands (one band per class: Low-skilled working, Skilled working, Lower-middle, Upper-middle) over the cohort axis.
   • Labels the cohort axis using the _at positions to show 1940 … 1990.
6. Combine panels
   • graph combine y1 y2 y3, ycommon row(1) builds the 1×3 figure.

Model specification (concise)
• Discrete-time logit (via xtlogit call; the script uses robust SEs).
• Key term: ib1.dom_class4##c.cohort##c.cohort, giving class-specific cohort profiles with a quadratic trend.
• Controls grouped under locals:
   • `n_`m'2' = time functions (e.g., age/tenure polynomials, as defined elsewhere in the project).
   • `demo_`m'' = demographic controls (defined elsewhere).
(If these locals are not defined upstream, Stata treats them as empty and the model runs with the available terms.)
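
As an illustration only, the key term and the margins grid described above could be written roughly as follows. This is a minimal sketch, not the script's actual model: it assumes person-year data with outcome1 as a 0/1 event indicator, leaves out the control locals, and writes to a hypothetical file name (example_y1_coh.dta) so it cannot overwrite the real margins files.

```stata
* Minimal sketch under stated assumptions; NOT the full M=5 specification.
* Assumes person-year data and an existing margins/ folder.
use "$reproducibility/data/data.dta", clear

* Cohort scale centred on 1960 (0 = born in 1960); drop any earlier copy
capture drop cohort
generate cohort = doby_dv - 1960

* Declare the person identifier so xtlogit can group person-years
xtset pidp

* Class-specific linear and quadratic cohort profiles, robust SEs
* (time-function and demographic controls deliberately omitted here)
xtlogit outcome1 ib1.dom_class4##c.cohort##c.cohort, vce(robust)

* Predicted probabilities by class on the 1940, 1945, ..., 1990 grid
* (example_y1_coh.dta is a hypothetical name used only for this sketch)
margins dom_class4, at(cohort = (-20(5)30)) ///
    saving("margins/example_y1_coh.dta", replace)
```

Writing the cohort variable twice in the factorial term makes Stata build the squared term and its interaction with dom_class4 itself, so margins varies cohort and its square together when it moves along the grid.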
Outputs
• Intermediate margins files: margins/xtlogit_y{1,2,3}_dom_class4_m5_r2_coh.dta
• Final figure in the Graph window: predicted probabilities by cohort and class for each union context (Panels B–D).

Interpretation
Each panel shows how the annual probability of first birth varies across birth cohorts (x-axis) for each parental class (four shaded ribbons). Differences across ribbons indicate socio-economic gradients; curvature over cohorts captures period/cohort shifts. CIs reflect uncertainty in the predicted probabilities.
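
To check one of the intermediate margins files listed under Outputs before plotting, a short inspection along these lines may help. It is a sketch that assumes the file was written by margins with its saving() option, in which case the estimates and 95% CI bounds are stored in _margin, _ci_lb and _ci_ub; if describe shows different variable names, adjust accordingly. It does not reproduce the halving applied to the out-of-union panel.

```stata
* Sketch only: inspect the out-of-union (y=1) margins file.
* Assumes the working directory is the project root and that the file
* was produced by -margins, saving()- (variable names are an assumption).
use "margins/xtlogit_y1_dom_class4_m5_r2_coh.dta", clear

* Check which variables the file actually contains
describe

* Rescale the predicted probabilities and CI bounds to percent
foreach v of varlist _margin _ci_lb _ci_ub {
    replace `v' = `v' * 100
}

* One block of rows per cohort grid position
list _at _margin _ci_lb _ci_ub, sepby(_at)
```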