Read me. Data Management

"Educational outcomes in stepfamilies: A comparative analysis of cohabitation and remarriage"
Anna Tegunimataka & Jonas Helgertz

Access to the data can be applied for via Statistics Denmark (Data for research).
Stata version 17 is used for the analysis.

STUDY OVERVIEW

This project links Danish population registers to construct child–parent–stepparent panels (1986–2017) and merges grade records and education registers to analyze ninth-grade outcomes by post-separation family structure (cohabitation vs remarriage). The final analysis file is grade_analysis.dta.

SOURCE DATA

The study combines several Danish population and administrative registers covering all individuals born in Denmark since the mid-1980s. The key sources are:

  Population registers (BEF86_17, DOD2016): provide information on personal identifiers, sex, date and country of birth, biological parent IDs, deaths, civil and marital status, cohabitation links, municipality, and region of residence.
  Education registers (UDFK2017): contain students’ final (FP9) exam grades in Danish, mathematics, and English, as well as school and institution identifiers.
  Parental education register (UDDA86_17) and a corresponding mapping key (educ_key): used to assign the highest completed level of education to parents and stepparents.

DATA PREPARATION

1. Building the population and family structure

All individuals born in Denmark from 1986 onward are included. Duplicate IDs are removed, and biological mothers and fathers are linked. For each child, information on both parents’ birth year, country of birth, and civil status is merged. The data are then expanded into a child–year panel, giving one record per child for every year up to age 16.

2. Adding information on family formation and separation

Using annual updates from the population register, the dataset captures when parents marry, separate, or form new unions.
Spouse and cohabitation IDs are harmonized so that each individual’s partnership spells are continuous and mutually consistent. Variables are created to mark: the year and age when parents first separate or divorce; the year when a mother or father starts living with a new partner (stepmother or stepfather entry); and the duration and type of each new union (cohabitation or remarriage).

3. Linking children, siblings, and stepparents

Each parent’s children are identified to construct full siblings (same mother and father), half-siblings (one shared biological parent), and step-siblings (children of a parent’s new partner). Death dates are used to ensure that no family ties appear after a parent’s death.

4. Merging education outcomes

Children’s final compulsory-school grades in math are added from the education register. Grades are standardized within each exam year (mean = 0, SD = 1) to make scores comparable across cohorts. School and region identifiers are retained for later fixed-effects controls.

5. Adding parental and stepparent education

Education records are merged for both biological parents and current stepparents. Each person’s highest attained education is grouped into three levels:

  1 = Primary/lower secondary
  2 = Upper secondary
  3 = Tertiary

For stepparents, education refers to the partner present in the household during the child’s upbringing.

6. Derived descriptive variables

Additional variables are generated to describe family background and timing:

  Mother’s age at childbirth
  Firstborn indicator
  Years since parental separation
  Years remarried or cohabiting, in both continuous and categorical form
  Number of biological siblings
  Age at divorce (and an indicator for divorce before child age 16)

7. Sample restrictions

To ensure comparability and valid family information, the following exclusions are applied:

  Children born before 1986
  Cases where parents were not cohabiting at birth
  Families where a parent has more than one new partnership before the child turns 16
  Children who gained a stepparent at age 1 (likely data noise)
  Cases with missing parental education, or where parents were not divorced but a step age was recorded
  Intact marriages where the two parents reside in different municipalities
  Observations without valid grades or duplicate records

After all merges and exclusions, the final analysis file (grade_analysis.dta) includes one observation per child with complete information on family structure, education background, and standardized test outcomes.

**** ANALYSIS ****

Table 2 – Family type and test performance

These models examine the association between maternal and paternal family status and children’s standardized grades. Family status is captured by the variables m_remarrgroups and f_remarrgroups, which classify each biological parent as:

  0 = still with biological partner (no separation)
  1 = single after separation
  2 = cohabiting with new partner
  3 = remarried

The interaction i.m_remarrgroups#i.f_remarrgroups allows all combinations of maternal and paternal post-separation statuses.

Control variables: mother’s age at childbirth (mother_age_birth), firstborn indicator (first_born), child sex (koen), number of biological siblings (biosib_cat), parental education (m_highest_edu, f_highest_edu), age at divorce (divage), and regional fixed effects (i.reg).

****

Table 3 – Duration of Post-Separation Relationships

These models test whether the length of time a mother or father has lived with a new partner after separation is associated with children’s academic outcomes.
Duration is measured as the number of years spent in the mother’s or father’s new cohabiting or marital relationship (m_years_repartnered, f_years_repartnered) before the child turns 16. Duration is grouped into the following five categories:

  0 years = No new partner before the child’s age 16
  1–2 years = Short duration
  3–5 years = Medium duration
  6–9 years = Long duration
  10 or more years = Very long duration

(Coded in the dataset as m_dur_cat and f_dur_cat.)

Separate interactions are estimated for maternal (i.m_remarrgroups#i.m_dur_cat) and paternal (i.f_remarrgroups#i.f_dur_cat) durations. All models include the same covariates as in Table 2. The dependent variables are standardized ninth-grade math (std_math)
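
Illustrative sketch (not the authors' code): the within-year grade standardization from step 4 and the duration grouping used for Table 3 could be written in Stata roughly as follows. It runs on simulated data only. The names exam_year, math_raw, yr_mean, and yr_sd, the simulated distributions, and the numeric codes 0 to 4 for m_dur_cat are assumptions made for this example; the actual register variable names and category codes may differ.

```stata
* Illustrative sketch on simulated data; not the authors' code.
* exam_year, math_raw, yr_mean, yr_sd and the 0-4 codes are assumptions.
version 17
clear
set seed 20240101
set obs 5000

* Simulated inputs (hypothetical names and distributions)
generate exam_year = 2002 + floor(runiform() * 16)
generate math_raw = rnormal(7, 3)
generate m_years_repartnered = floor(runiform() * 16)

* Standardize grades within exam year (mean 0, SD 1 in each year)
bysort exam_year: egen double yr_mean = mean(math_raw)
bysort exam_year: egen double yr_sd = sd(math_raw)
generate double std_math = (math_raw - yr_mean) / yr_sd
drop yr_mean yr_sd

* Group years repartnered into the five duration bands
recode m_years_repartnered (0 = 0 "No new partner") ///
    (1/2 = 1 "Short") (3/5 = 2 "Medium") ///
    (6/9 = 3 "Long") (10/max = 4 "Very long"), ///
    generate(m_dur_cat)

* Inspect the result
tabulate m_dur_cat
tabstat std_math, by(exam_year) statistics(mean sd)
```

The same recode can be applied to f_years_repartnered to obtain f_dur_cat. Using recode with generate() leaves the continuous duration variable untouched, so both the continuous and the categorical form stay available.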