log using match2002, text replace
* by Jean Roth, 2003-01-30, jroth@nber.org
* This program takes a minute or two to run per year
set mem 750m
set more 1
local latest_year=2003

* Write the one-line dictionary header that names the raw data file
! rm -f /tmp/match.raw match.dct matchhead
! echo >matchhead dictionary using /tmp/match.raw

* match2 reads, cleans and saves one year of data.
*   argument 1 = four-digit year
*   argument 2 = two-digit year suffix of the raw file name
*   argument 3 = suffix of the dictionary body file name
* Any further arguments on the call lines are not referenced inside match2.
program define match2
    * Decompress the raw file; before 1994, translate "-" and "A" as well
    if `1' < 1994 {
        !zcat /home/data/morg/raw/morg`2'.Z |tr "\-A" " " >/tmp/match.raw
    }
    if `1' >= 1994 {
        !zcat /home/data/morg/raw/morg`2'.Z >/tmp/match.raw
    }
    *if `1' >= 1994 {!zcat /homes/nber/jroth/morg/morg`2'.Z >/tmp/match.raw}

    * Build the dictionary: header line plus the body for this group of years
    ! cat /home/data/morg/sources/matchhead /home/data/morg/sources/match`3'.dbd >./match.dct
    quietly infile using match if mage>15&mage!=.
    #delimit cr

    * Miscellaneous (record keeping) variables
    gen id = _n
    if `1' == 1994 {
        replace hhid = hhid94
        drop hhid94
    }
    if `1' == 1995 {
        replace hhid = hhid94 if intmonth<9
        drop hhid94
    }
    if `1' > 1995 & `1' <= 1997 {
        drop hhid94
    }

    * Person Match
    unab vlist: _all
    sort `vlist'
    generate int mym = `1' * 10 + mminsamp
    label var mym "Match year and month-in-sample"

    display "Sort vars"
    local sort_stem mym intmonth state hhid hhnum
    * The local sort is still empty here, so the next three locals hold
    * only the trailing key variables for each format
    local sortnf `sort' lineno
    local sortf `sort' famnum lineno
    local sort94 `sort' famnum lineno serial
    if ( `1' < 1984 ) {
        local sort `sort_stem' `sortnf'
    }
    if ( `1' >= 1984 & `1' < 1994 ) {
        local sort `sort_stem' `sortf'
    }
    if ( `1' >= 1994 ) {
        local sort `sort_stem' `sort94'
    }
    display "sort `sort' id"
    sort `sort' id

    * Number the observations within each merge key
    by `sort' : gen dup = _n
    display "Removing duplicates to avoid creating extra observations"
    ** WARNING: If merge variable duplicates aren't eliminated from
    ** the master database too, then extra observations will be created
    ** re: http://www.stata.com/support/faqs/data/merge.html
    tab dup
    drop if dup>1
    drop dup
    saveold /home/data/morg/match/match`1'4.dta, replace

    * Month-in-sample 8 file: drop the month-4 records
    drop if mminsamp==4
    drop mminsamp
    * In the first year of a new format, re-sort on the previous format's keys
    if ( `1' == 1984 ) {
        sort `sort_stem' `sortnf'
    }
    if ( `1' == 1994 ) {
        sort `sort_stem' `sortf'
    }
    saveold /home/data/morg/match/match`1'8.dta, replace
    clear

    * Month-in-sample 4 file: drop the month-8 records
    use /home/data/morg/match/match`1'4.dta
    drop if mminsamp==8
    drop mminsamp
    saveold /home/data/morg/match/match`1'4.dta, replace
    clear
    ! rm -f match.dct /tmp/match.raw
end

* Each call to match2 below processes one year of the data. The data must be
* decompressed, and possibly have dashes converted to blanks.
* Then a dictionary for the particular year is prepared by
* concatenating a one-line header (with the file name) to a generic
* dictionary body that covers several years of data that used the
* same format.
* Lastly, the data is read, modified, summarized and saved.
match2 1979 79 79_83
match2 1980 80 79_83 794 818
match2 1981 81 79_83 804 828
match2 1982 82 79_83 814 838
match2 1983 83 79_83 824 848
match2 1984 84 84_88 834 858
match2 1985 85 84_88 844 868
match2 1986 86 84_88 854 878
match2 1987 87 84_88 864 888
match2 1988 88 84_88 874 898
match2 1989 89 89_93 884 908
match2 1990 90 89_93 894 918
match2 1991 91 89_93 904 928
match2 1992 92 89_93 914 938
match2 1993 93 89_93 924 948
match2 1994 94 94_97 934 958
match2 1995 95 94_97 944 968
match2 1996 96 94_97 954 978
match2 1997 97 94_97 964 988
match2 1998 98 98 974 998
match2 1999 99 98 984 008
match2 2000 00 98 994 018
match2 2001 01 98 004 028
match2 2002 02 98 014 038
match2 2003 02 98 024 048