************************************************************
************************************************************
***                                                      ***
***        Do-file for working with pairfam data         ***
***               FIXED EFFECTS-REGRESSION               ***
***                ANCHOR DATA WAVES 1-11                ***
***                     Release 13.0                     ***
***                                                      ***
***                       May 2022                       ***
***                                                      ***
***        Authors: Josef Brüderl, Kristin Hajek         ***
***                                                      ***
************************************************************
************************************************************

/*
This do-file demonstrates how to perform a Fixed-Effects (FE) regression with
pairfam data. It shows how to prepare the data, how to define an estimation
sample and how to perform the analysis.
It uses waves 1 to 11; for the handling of waves 12 and 13, see the Data Manual.

The example is built according to
   Brüderl/Ludwig (2015) Fixed-Effects Panel Regression.
   In: H. Best and C. Wolf (eds.) The Sage Handbook of Regression Analysis
   and Causal Inference. Sage.
The effect of marriage (the event) on life satisfaction (the outcome) is analyzed.
*/

***************************************************************************
***                           PRELIMINARIES                             ***
***************************************************************************
clear all
set more off                                     // tells Stata not to pause for --more-- messages
set maxvar 10000                                 // increases maximal number of variables
global inpath `""insert your datapath here""'    // directory of original data


***************************************************************************
***                      EXTRACTING THE VARIABLES                       ***
***************************************************************************
cd $inpath
use id wave demodiff cohort sat6 age sex_gen relstat mardur hhincnet using anchor1.dta, clear   // extract variables from anchor1

* NOTE: A Stata problem might arise with the command "use varlist using filename" for users of the English version:
*       If the data set was saved with German labels as the main labels, the English labels will be lost when opening the data with that command.
*       Quick fix: Open each data set separately, change to English labels with the command "label language en" and save the data.

label language de       // use German labels
* label language en     // use English labels (optional)

* Looping syntax for combining the data sets:
* create panel dataset (long form) & suppress label notices (via quietly)
forvalues w = 2/11 {
    quietly append using anchor`w'.dta, keep(id wave demodiff cohort sat6 age sex_gen relstat mardur hhincnet)
}
quietly append using anchor1_DD.dta, keep(id wave demodiff cohort sat6 age sex_gen relstat mardur hhincnet)   // add DemoDiff (DD) sample

* Customize data set
order id wave sat6 relstat mardur sex_gen age hhincnet
sort id wave


***************************************************************************
***                          DATA PREPARATION                           ***
***************************************************************************
* Drop anchors with a sex change
drop if sex_gen == -4

* Define missings
mvdecode _all, mv(-1/-11=.)
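
* Optional check (a minimal sketch, not part of the original example):
* after mvdecode, the negative pairfam missing codes should have been turned
* into system missings. misstable summarize reports how many values are
* missing per variable; by default, variables without missings are not listed.
misstable summarize sat6 relstat mardur hhincnet age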

* Life Satisfaction
rename sat6 happy                                // rename sat6 to happy
tab happy, missing

* Household income
summ hhincnet, d
replace hhincnet = . if hhincnet < 100           // recode implausible values to missing
replace hhincnet = hhincnet / 1000               // rescale household income to thousands

* Sex
gen woman = sex_gen==2                           // dummy for women

* Marriage indicator (0=never-married 1=married 2=divorced, widowed)
recode relstat 1/3=0 4/5=1 6/11=2, into(marry)
lab var marry "Marriage"

* Correcting wave for the first DemoDiff survey
tab wave demodiff, miss                          // DD started in years 2009/10!
recode wave (1=2) if demodiff==1                 // therefore set starting year of DD to pairfam wave 2 (2009/10)


***************************************************************************
***                         SAMPLE DEFINITION                           ***
***************************************************************************
* Because with FE estimation we are comparing the life satisfaction
* before and after marriage, we need a sample where persons were never
* married when first observed. Therefore, the estimation sample is restricted:
*  - only persons who were never married when first observed
*  - all person-years (pys) after the end of the first marriage
*    (separation, death) are excluded

***** I) EXCLUDE PYS WITH MISSINGS ON THE VARIABLES OF INTEREST *****
gen help=0
replace help=1 if missing(happy,marry,age,woman,hhincnet)
keep if help==0
drop help

***** II) ONLY PERSONS WHO WERE NEVER MARRIED WHEN FIRST OBSERVED ARE KEPT *****
bysort id (wave): gen pynr = _n                  // person-year ID (within person)
gen help=0
replace help=1 if marry>0 & pynr==1
bysort id (wave): replace help = sum(help)       // ==1 for all pys of those not never-married when first observed
keep if help==0
drop help

***** III) ALL PYS AFTER THE END OF THE FIRST MARRIAGE ARE EXCLUDED *****
gen help=0
replace help=1 if marry>1                        // flag pys in which the first marriage has ended (divorced, widowed)
bysort id (wave): replace help=sum(help)         // flag all following pys (could be a second marriage)
keep if help==0                                  // all flagged pys are dropped
drop help

* Some checks
tab marry, missing                               // should be only 0 or 1
tab marry if pynr==1, missing                    // should be only 0s

***** IV) ONLY THOSE WITH AT LEAST 2 OBSERVATIONS ARE KEPT *****
bysort id: gen pycount = _N                      // # of person-years (within person)
tab pycount if pynr==1                           // length of the panels
keep if pycount>1


***************************************************************************
***                         ANALYZING THE DATA                          ***
***************************************************************************
* Define the data to be panel data
xtset id wave

***** SOME DESCRIPTIVE INFORMATION ON THE PANEL DATA *****
xtdes, pattern(30)                               // patterns in the data
xttrans marry, freq                              // information on transitions


***** COMPARING SIMPLE MODELS *******************************************
* Pooled OLS
reg happy marry age woman, vce(cluster id)
est store POLS

* Random-Effects Regression
xtreg happy marry age woman, re vce(cluster id) theta
est store RE

* Fixed-Effects Regression
xtreg happy marry age woman, fe vce(cluster id)
est store FE
xtreg happy marry age woman hhincnet, fe vce(cluster id)
est store FE1

* Table of estimation results
estimates table POLS RE FE FE1, b(%7.2f) star stfmt(%6.0f) stats(N N_clust) ///
    keep(marry age woman hhincnet)


***** DUMMY IMPACT FUNCTION IN THE FE MODEL *****************************
* Creating an event-centered time scale
* A) Using the event history information from "mardur"
recode mardur .=0 0/12=1 13/24=2 25/36=3 37/48=4 49/60=5 61/72=6 73/84=7 85/96=8 97/108=9 109/max=10, into(mc)

* B) Alternative strategy via the panel variable "marry"
bysort id: egen treat = max(marry)               // indicator for treatment group
bysort id: egen h1 = min(wave) if marry==1       // marriage year (no marriage==.)
bysort id: egen h2 = min(h1) if treat==1         // copy the marriage year to all pys
gen h3 = wave - h2 + 1 if treat==1               // event-centered time scale for treated
tab mc h3, m                                     // by using the panel information we make some (small) errors


******** Estimation ******
* If needed download coefplot package:
* ssc install coefplot

* 1) Model with event history information
xtreg happy i.mc age, fe vce(cluster id)
coefplot, keep(*.mc) vertical yline(0) xline(1) recast(line) lwidth(thick) lcolor(blue) ///
    ciopts(recast(rline) lpattern(dash) lwidth(medthick) lcolor(green)) ///
    coeflabels(1.mc="0" 2.mc="12" 3.mc="24" 4.mc="36" 5.mc="48" 6.mc="60" 7.mc="72" 8.mc="84" 9.mc="96" 10.mc="108") ///
    ylabel(-.2(.1).8, grid angle(0) labsize(medium) format(%3.1f)) ///
    xtitle("Months after marriage", size(medlarge) margin(0 0 0 2)) ///
    ytitle("Change in happiness", size(medlarge)) name(dummy1, replace)

* 2) Model with panel information
recode h3 min/0=0, into(ym)                      // years before marriage are reference
recode ym .=0                                    // not-treated have now a 0 on ym
xtreg happy i.ym age, fe vce(cluster id)
coefplot, keep(*.ym) vertical yline(0) xline(1) recast(line) lwidth(thick) lcolor(blue) ///
    ciopts(recast(rline) lpattern(dash) lwidth(medthick) lcolor(green)) ///
    coeflabels(1.ym="0" 2.ym="1" 3.ym="2" 4.ym="3" 5.ym="4" 6.ym="5" 7.ym="6" 8.ym="7" 9.ym="8" 10.ym="9") ///
    ylabel(-.2(.1).8, grid angle(0) labsize(medium) format(%3.1f)) ///
    xtitle("Years after marriage", size(medlarge) margin(0 0 0 2)) ///
    ytitle("Change in happiness", size(medlarge)) name(dummy2, replace)

* 3) Now with anticipation (-1 dummy)
drop ym
recode h3 min/-1=-1, into(ym)                    // years before -1 are reference
replace ym = ym + 1                              // bring all to positive values
recode ym .=0                                    // not-treated have now a 0 on ym
xtreg happy i.ym age, fe vce(cluster id)
coefplot, keep(*.ym) vertical yline(0) xline(2) recast(line) lwidth(thick) lcolor(blue) ///
    ciopts(recast(rline) lpattern(dash) lwidth(medthick) lcolor(green)) ///
    coeflabels(1.ym="-1" 2.ym="0" 3.ym="1" 4.ym="2" 5.ym="3" 6.ym="4" 7.ym="5" 8.ym="6" 9.ym="7" 10.ym="8" 11.ym="9") ///
    ylabel(-.2(.1).8, grid angle(0) labsize(medium) format(%3.1f)) ///
    xtitle("Years before / after marriage", size(medlarge) margin(0 0 0 2)) ///
    ytitle("Change in happiness", size(medlarge)) name(dummy3, replace)
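
* Optional (a minimal sketch, not part of the original example): show the
* three impact functions side by side for comparison. This assumes the graphs
* dummy1, dummy2 and dummy3 created above are still in memory; the graph name
* "impact" is an arbitrary choice.
graph combine dummy1 dummy2 dummy3, rows(1) ycommon name(impact, replace)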