************************************************************
************************************************************
***                                                      ***
***        Do-file for working with pairfam data         ***
***               FIXED-EFFECTS REGRESSION               ***
***                ANCHOR DATA WAVES 1-11                ***
***                     Release 12.0                     ***
***                                                      ***
***                       May 2021                       ***
***                                                      ***
***        Authors: Josef Brüderl, Kristin Hajek         ***
***                                                      ***
************************************************************
************************************************************

/*
This do-file demonstrates how to perform a Fixed-Effects (FE) regression
with pairfam data. It shows how to prepare the data, how to define an
estimation sample and how to perform the analysis.
It uses waves 1 to 11; for the handling of wave 12, see the Data Manual.

The example is built according to
Brüderl/Ludwig (2015) Fixed-Effects Panel Regression. In: H. Best and C. Wolf (eds.)
The Sage Handbook of Regression Analysis and Causal Inference. Sage.

The effect of marriage (the event) on life satisfaction (the outcome) is analyzed.
If you need further help, do not hesitate to contact: support@pairfam.de
*/


***************************************************************************
*** PRELIMINARIES ***
***************************************************************************
clear all
set more off                                  // tells Stata not to pause for --more-- messages
set maxvar 10000                              // increases maximal number of variables
global inpath `""insert your datapath here""' // directory of original data


***************************************************************************
*** EXTRACTING THE VARIABLES ***
***************************************************************************
cd $inpath
use id wave demodiff cohort sat6 age sex_gen relstat mardur hhincnet using anchor1.dta, clear // extract variables from anchor1

* NOTE: A Stata problem might arise with the command "use varlist using filename" for users of the English version:
* If the data set was saved with German labels as the main labels, the English labels will be lost when opening the data with that command.
* Quick fix: Open each data set separately, change to English labels with the command "label language en" and save the data.
label language de                             // use German labels
* label language en                           // use English labels (optional)

* Loop for combining the data sets: creates the panel data set (long form).
* "quietly" suppresses the label notices. (forvalues replaces the out-of-date "for num" syntax.)
forvalues w = 2/11 {
    quietly append using anchor`w'.dta, keep(id wave demodiff cohort sat6 age sex_gen relstat mardur hhincnet)
}
quietly append using anchor1_DD.dta, keep(id wave demodiff cohort sat6 age sex_gen relstat mardur hhincnet) // add DemoDiff (DD) sample

* Customize data set
order id wave sat6 relstat mardur sex_gen age hhincnet
sort id wave


***************************************************************************
*** DATA PREPARATION ***
***************************************************************************
* Drop anchors with a sex change
drop if sex_gen == -4

* Define missings
mvdecode _all, mv(-1/-11=.)
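
* Optional sanity checks (an addition to the original do-file; a minimal sketch).
* After mvdecode, none of the missing codes -1 to -11 should remain in sat6.
* The second check only reports duplicates; that each anchor appears at most
* once per wave is an assumption here, not something the do-file states.
assert !inrange(sat6, -11, -1)                // system missings (.) pass this check
duplicates report id wave                     // assumed: only unique id-wave combinations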

* Life satisfaction
rename sat6 happy                             // rename sat6 to happy
tab happy, missing

* Household income
summ hhincnet, d
replace hhincnet = . if hhincnet < 100        // recode implausible values to missing
replace hhincnet = hhincnet / 1000            // rescale household income to 1,000s

* Sex
gen woman = sex_gen==2                        // dummy for women

* Marriage indicator (0=never married, 1=married, 2=divorced/widowed);
* after the sample restrictions below only 0 and 1 remain
recode relstat 1/3=0 4/5=1 6/11=2, into(marry)
lab var marry "Marriage"

* Correcting wave for the first DemoDiff survey
tab wave demodiff, miss                       // DD started in 2009/10!
recode wave (1=2) if demodiff==1              // therefore set the first DD wave to pairfam wave 2 (2009/10)


***************************************************************************
*** SAMPLE DEFINITION ***
***************************************************************************
* Because FE estimation compares life satisfaction before and after
* marriage, we need a sample of persons who were never married when
* first observed. Therefore, the estimation sample is restricted:
* - only persons who were never married when first observed
* - all person-years (pys) after the end of the first marriage (separation, death) are excluded

***** I) EXCLUDE PYS WITH MISSINGS ON THE ANALYSIS VARIABLES *****
gen help=0
replace help=1 if missing(happy,marry,age,woman,hhincnet)
keep if help==0
drop help

***** II) ONLY PERSONS WHO WERE NEVER MARRIED WHEN FIRST OBSERVED ARE KEPT *****
bysort id (wave): gen pynr = _n               // person-year ID (within person)
gen help=0
replace help=1 if marry>0 & pynr==1
bysort id (wave): replace help = sum(help)    // ==1 for all pys of those not never-married when first observed
keep if help==0
drop help

***** III) ALL PYS AFTER THE END OF THE FIRST MARRIAGE ARE EXCLUDED *****
gen help=0
replace help=1 if marry>1                     // flag pys after the first marriage ended (divorced, widowed)
bysort id (wave): replace help=sum(help)      // flag all following pys as well (could be a second marriage)
keep if help==0                               // all pys after the end of the first marriage are dropped
drop help

* Some checks
tab marry, missing                            // should be only 0 or 1
tab marry if pynr==1, missing                 // should be only 0s

***** IV) ONLY THOSE WITH AT LEAST 2 OBSERVATIONS ARE KEPT *****
bysort id: gen pycount = _N                   // # of person-years (within person)
tab pycount if pynr==1                        // length of the panels
keep if pycount>1


***************************************************************************
*** ANALYZING THE DATA ***
***************************************************************************
* Define the data to be panel data
xtset id wave

***** SOME DESCRIPTIVE INFORMATION ON THE PANEL DATA *****
xtdes, pattern(30)                            // patterns in the data
xttrans marry, freq                           // information on transitions

***** COMPARING SIMPLE MODELS *******************************************
* Pooled OLS
reg happy marry age woman, vce(cluster id)
est store POLS

* Random-Effects Regression
xtreg happy marry age woman, re vce(cluster id) theta
est store RE

* Fixed-Effects Regression (the time-constant variable woman drops out of the FE models)
xtreg happy marry age woman, fe vce(cluster id)
est store FE
xtreg happy marry age woman hhincnet, fe vce(cluster id)
est store FE1

* Table of estimation results
estimates table POLS RE FE FE1, b(%7.2f) star stfmt(%6.0f) stats(N N_clust) ///
    keep(marry age woman hhincnet)

***** DUMMY IMPACT FUNCTION IN THE FE MODEL *****************************
* Creating an event-centered time scale
* A) Using the event history information from "mardur" (marriage duration in months; 0 = not married)
recode mardur .=0 0/12=1 13/24=2 25/36=3 37/48=4 49/60=5 61/72=6 73/84=7 85/96=8 ///
    97/108=9 109/max=10, into(mc)

* B) Alternative strategy via the panel variable "marry"
bysort id: egen treat = max(marry)            // indicator for treatment group
bysort id: egen h1 = min(wave) if marry==1    // marriage year (no marriage==.)
bysort id: egen h2 = min(h1) if treat==1      // copy the marriage year to all pys
gen h3 = wave - h2 + 1 if treat==1            // event-centered time scale for the treated (1 = first wave observed married)
tab mc h3, m                                  // by using the panel information we make some (small) errors

******** Estimation ******
* If needed, download the coefplot package:
* ssc install coefplot

* 1) Model with event history information
xtreg happy i.mc age, fe vce(cluster id)
coefplot, keep(*.mc) vertical yline(0) xline(1) recast(line) lwidth(thick) lcolor(blue) ///
    ciopts(recast(rline) lpattern(dash) lwidth(medthick) lcolor(green)) ///
    coeflabels(1.mc="0" 2.mc="12" 3.mc="24" 4.mc="36" 5.mc="48" 6.mc="60" 7.mc="72" 8.mc="84" 9.mc="96" 10.mc="108") ///
    ylabel(-.2(.1).8, grid angle(0) labsize(medium) format(%3.1f)) ///
    xtitle("Months after marriage", size(medlarge) margin(0 0 0 2)) ///
    ytitle("Change in happiness", size(medlarge)) name(dummy1, replace)

* 2) Model with panel information
recode h3 min/0=0, into(ym)                   // years before marriage are the reference
recode ym .=0                                 // the not-treated now have a 0 on ym
xtreg happy i.ym age, fe vce(cluster id)
coefplot, keep(*.ym) vertical yline(0) xline(1) recast(line) lwidth(thick) lcolor(blue) ///
    ciopts(recast(rline) lpattern(dash) lwidth(medthick) lcolor(green)) ///
    coeflabels(1.ym="0" 2.ym="1" 3.ym="2" 4.ym="3" 5.ym="4" 6.ym="5" 7.ym="6" 8.ym="7" 9.ym="8" 10.ym="9") ///
    ylabel(-.2(.1).8, grid angle(0) labsize(medium) format(%3.1f)) ///
    xtitle("Years after marriage", size(medlarge) margin(0 0 0 2)) ///
    ytitle("Change in happiness", size(medlarge)) name(dummy2, replace)

* 3) Now with anticipation (-1 dummy)
drop ym
recode h3 min/-1=-1, into(ym)                 // years before -1 are the reference
replace ym = ym + 1                           // shift all values to be non-negative
recode ym .=0                                 // the not-treated now have a 0 on ym
xtreg happy i.ym age, fe vce(cluster id)
coefplot, keep(*.ym) vertical yline(0) xline(2) recast(line) lwidth(thick) lcolor(blue) ///
    ciopts(recast(rline) lpattern(dash) lwidth(medthick) lcolor(green)) ///
    coeflabels(1.ym="-1" 2.ym="0" 3.ym="1" 4.ym="2" 5.ym="3" 6.ym="4" 7.ym="5" 8.ym="6" 9.ym="7" 10.ym="8" 11.ym="9") ///
    ylabel(-.2(.1).8, grid angle(0) labsize(medium) format(%3.1f)) ///
    xtitle("Years before / after marriage", size(medlarge) margin(0 0 0 2)) ///
    ytitle("Change in happiness", size(medlarge)) name(dummy3, replace)
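

***** ILLUSTRATION: EVENT-CENTERED TIME SCALE ON A TOY PANEL *****
* This block is an addition to the original do-file. It is a minimal sketch
* of how strategy B (via the panel variable "marry") builds the time scale.
* The ids, waves and values below are made up for illustration only.
* The analysis data are preserved first and restored afterwards.
preserve
clear
input id wave marry
1 1 0
1 2 0
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
end
bysort id: egen treat = max(marry)            // person 1 marries, person 2 never does
bysort id: egen h1 = min(wave) if marry==1    // first wave observed married (only set on married pys)
bysort id: egen h2 = min(h1) if treat==1      // copy that wave to all pys of the treated
gen h3 = wave - h2 + 1 if treat==1            // 1 = first wave observed married
list id wave marry treat h2 h3, sepby(id) noobs

* Checks on the toy data: person 1 runs from -1 to 2, person 2 stays missing
assert h3 == wave - 2 if id==1
assert missing(h3) if id==2
restore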