2015-01-10

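Stata: looping over observations with years stored as a string variable

I am trying to write a loop that generates and fills in a dummy variable indicating whether a person was a member of a particular party in a given year. My data are long, and each observation is a person-year. They look like the following.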

X1     X2     X3     
AR, 1972-1981  PDC, 1982-1986  PFL, 1986-. 
MD, 1966-1980  PMDB, 1980-1988  PSB, 1988-. 
MD, 1966-1968  AR, 1968-1980  PDS, 1980-1985 

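The party comes before the comma; the years during which the person was a member of that party come after it. Any help would be greatly appreciated!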

My code so far is:

rename X1 XA 
rename X2 XB 
rename X3 XC 

foreach var of varlist XA XB XC{ 
    split `var', parse (,) 
} 
tabulate XA1, gen(p) 
Please share what you have already tried that doesn't work.

foreach var of varlist X1 X2 X3 { split `var', parse(,) } tabulate X1, gen() – user4438802

Oh, sorry, I thought this was a Python question. Still, I suggest updating the question with your code; that usually helps to get answers.

Answers

Here is one way to do it. I had to make an assumption about what the missing end year in X3 corresponds to, so you will need to change that.

/* Enter Data */ 
clear 

input str20 X1 str20 X2 str20 X3     
"AR, 1972-1981"  "PDC, 1982-1986"  "PFL, 1986-." 
"MD, 1966-1980"  "PMDB, 1980-1988"  "PSB, 1988-." 
"MD, 1966-1968"  "AR, 1968-1980"  "PDS, 1980-1985" 
end 

compress 

/* Split X1,X2,X3 into party, start year and end year and create 3 ID variables that we need later */ 
forvalues v=1/3 { 
    split X`v', parse(", " "-") 
    gen id`v'=_n 
} 

/* Makes years numeric, and get rid of messy original data */ 
destring X12 X13 X22 X23 X32 X33, replace 
replace X33 = 1990 if missing(X33) // enter your survey year here 
drop X1 X2 X3 

/* stack the spells on top of each other */ 
stack (id1 X11 X12 X13) (id2 X21 X22 X23) (id3 X31 X32 X33), into(id party year1 year2) clear 
drop _stack 

/* Put the data into long format and fill in the gaps */ 
reshape long year, i(id party) j(p) 
drop p 
/* need this b/c people can be in more than one party in a given year */ 
egen idparty = group(id party), label 
xtset idparty year 
tsfill 
/* carryforward is a user-written command; if needed: ssc install carryforward */ 
carryforward id party, replace 
drop idparty 

/* create party dummies */ 
tab party, gen(DD_) 

/* rename the dummies to have party affiliation at the end instead of numbers */ 
foreach var of varlist DD_* { 
    levelsof party if `var'==1, local(party) clean 
    rename `var' ind_`party' 
} 

drop party 

/* get back down to one person-year observation */ 
collapse (max) ind_*, by(id year) 

list id year ind_*, sepby(id) noobs 
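
The fill-in value for an open-ended spell is the one assumption in the code above that each reader has to set. As a minimal sketch, assuming a hypothetical local macro `lastyear` that holds the final year covered by your data, the hard-coded `replace X33 = 1990` line could be generalized to every end-year variable. It would go after the `destring` step and before `drop X1 X2 X3`:

```stata
* Sketch only: lastyear is a hypothetical macro, set it to your own cutoff
local lastyear 1990

* X13, X23 and X33 are the end-year variables created by split and destring
foreach v of varlist X13 X23 X33 {
    replace `v' = `lastyear' if missing(`v')
}
```

Run this in the same do-file as the rest of the code, since a local macro is not available once the do-file (or the selected block of lines) has finished executing.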
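Following Dimitriy's lead (and interpretation), here is a slightly different way. I make a different assumption about the missing end date: I truncate the series at the last known year.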

clear 
set more off 

input /// 
str15 (XA     XB     XC)     
"AR, 1972-1981"  "PDC, 1982-1986"  "PFL, 1986-." 
"MD, 1966-1980"  "PMDB, 1980-1988" "PSB, 1988-." 
"MD, 1966-1968"  "AR, 1968-1980" "PDS, 1980-1985" 
end 

list 

*----- what you want? ----- 

// main 
stack X*, into(X) clear 
bysort _stack: gen id = _n 
order id, first 

split X, parse (, -) 
rename (X1 X2 X3) (party sdate edate) 

destring ?date, replace 
gen diff = edate - sdate + 1 
expand diff 

bysort id party: replace sdate = sdate[1] + _n - 1 

drop _stack X edate diff 

// create indicator variables 
tabulate party, gen(y) 

// fix years with two or more parties 
levelsof party, local(lp) clean 
collapse (sum) y*, by(id sdate) 

// rename 
unab ly: y* 
rename (`ly') (`lp') 

list, sepby(id) 
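
Because the code above collapses with `(sum)`, a quick check that the renamed party variables are still 0/1 indicators may be worthwhile. The following is a sketch under two assumptions: it runs in the same do-file, so that the local `lp` still holds the party names, and no person has two spells in the same party. The variable `nparties` is a hypothetical helper name.

```stata
* Each party variable should be 0 or 1 after the collapse
foreach v of varlist `lp' {
    assert inlist(`v', 0, 1)
}

* Count memberships per person-year and list the overlap years
egen nparties = rowtotal(`lp')
list id sdate nparties if nparties > 1, sepby(id)
```

With the example data, the listed rows are the years in which one spell ends and the next begins, since both parties are recorded for that year.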