2015-10-26
0

How to change the order of a data string based on date

I have a string variable that looks like this:

var_name 
25-DEC-99: A11, B14, C89; 28-FEB-94: A27, B94, C30 
01-APR-11: A25, B82, C65 
04-JUL-09: A21, B55, C26; 12-MAR-03: A11, B72, C68; 08-JUN-11: A62, B47, C82 
12-JUN-00: A77, B19, C73; 03-JUL-12: A99, B04, C54 
27-OCT-15: A22, B95, C08 

And so on. My goal is to split these strings into separate variables. The variable names would be v1_date, v1_A, v1_B, v1_C, v2_date, v2_A, v2_B, v2_C, v3_date, v3_A, v3_B, v3_C.

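I can do this with split var_name, p(";"), renaming the results to v1, v2, and v3, and then using split again. The problem is that I want v1, v2, and v3 to be in chronological order by date, and the data are not currently arranged that way. How can I make the date of v1 earlier than that of v2, and the date of v2 earlier than that of v3? For example, in the first observation, I want 25-DEC-99: A11, B14, C89 to be associated with v2 and 28-FEB-94: A27, B94, C30 to be associated with v1.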

Answers

1

The following gets you close, I believe. It uses split and reshape:

clear 
set more off 

input /// 
str100 myvar 
"25-DEC-99: A11, B14, C89; 28-FEB-94: A27, B94, C30" 
"01-APR-11: A25, B82, C65" 
"04-JUL-09: A21, B55, C26; 12-MAR-03: A11, B72, C68; 08-JUN-11: A62, B47, C82" 
"12-JUN-00: A77, B19, C73; 03-JUL-12: A99, B04, C54" 
"27-OCT-15: A22, B95, C08" 
end 

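* Split on ";" into one variable per dated entry (myvar1, myvar2, ...)
split myvar, p(;)
drop myvar

* Go long: one row per dated entry within each original observation
gen obs = _n
reshape long myvar, i(obs)
drop if missing(myvar)

* Separate the date (myvar1) from the cell values (myvar2)
split myvar, p(:)
drop myvar

* Convert to a Stata daily date; two-digit years resolve to a year no later than 2020
gen myvar11 = date(myvar1, "DMY", 2020)
format %td myvar11

drop myvar1
rename (myvar11 myvar2) (mydate mycells)
order mydate, before(mycells)

* Number the entries chronologically within each observation
bysort obs (mydate) : gen neworder = _n
drop _j

* Back to wide, now indexed by chronological order
reshape wide mydate mycells, i(obs) j(neworder)

list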

You can loop over the mycells variables if you need to split them further.
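One way such a loop might look is sketched below. It assumes the wide data set produced by the code above is still in memory (variables mydate1, mycells1, mydate2, mycells2, ...), that every non-empty cell holds exactly three comma-separated items, and that no other variable name starts with "part". The target names v1_date, v1_A, and so on are taken from the question; the temporary stub part is only an illustrative choice.

```stata
* Sketch: split each mycells# variable into its three comma-separated items
foreach v of varlist mycells* {
    * numeric suffix of the variable name, e.g. "2" from mycells2
    local j : subinstr local v "mycells" ""

    * part1, part2, part3 are temporary names (an assumption of this sketch)
    split `v', parse(",") generate(part)

    * remove blanks left around the pieces
    foreach p of varlist part* {
        replace `p' = trim(`p')
    }

    * use the names requested in the question
    rename (part1 part2 part3) (v`j'_A v`j'_B v`j'_C)
    rename mydate`j' v`j'_date
    drop `v'
}

list
```

If some cells can hold fewer or more than three items, the rename line would need adjusting, since split creates only as many pieces as it finds.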

+0

This is what the OP asked for, but my prediction is that the data structure will prove awkward.

+0

@NickCox I agree. The original poster can keep a panel structure by dropping the last reshape.

1

In general, please consider using dataex (SSC) to create simple data examples.

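You don't give all of the (not trivial) code you used to split the variable. As it happens, I don't think your variable names are easy to work with, so I have recreated the split in my own way. Once your data are split, sorting by date is easy, but I have stopped short of reshape wide, as I suspect that a long structure is easier to work with.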

clear 
input str80 data 
"25-DEC-99: A11, B14, C89; 28-FEB-94: A27, B94, C30" 
"01-APR-11: A25, B82, C65" 
"04-JUL-09: A21, B55, C26; 12-MAR-03: A11, B72, C68; 08-JUN-11: A62, B47, C82" 
"12-JUN-00: A77, B19, C73; 03-JUL-12: A99, B04, C54" 
"27-OCT-15: A22, B95, C08" 
end 

* One variable per dated entry: x1, x2, ...
split data, p(;) gen(x)

local j = 1
gen work = ""
foreach x of var x* {
    * text before the colon is the date
    replace work = substr(`x', 1, strpos(`x', ":") - 1)
    gen date`j' = daily(work, "DMY", 2050)
    * text after the colon holds the three cell values
    replace work = substr(`x', strpos(`x', ":") + 1, .)
    split work, p(,)
    rename (work1 work2 work3) (vA`j' vB`j' vC`j')
    local ++j
}

drop work
drop x*
drop data

gen id = _n
* long layout, then renumber the entries chronologically within id
reshape long date vA vB vC, i(id) j(which)
drop if missing(date)
bysort id (date): replace which = _n
list, sepby(id)

     +--------------------------------------+
     | id   which    date    vA    vB    vC |
     |--------------------------------------|
  1. |  1       1   12477   A27   B94   C30 |
  2. |  1       2   14603   A11   B14   C89 |
     |--------------------------------------|
  3. |  2       1   18718   A25   B82   C65 |
     |--------------------------------------|
  4. |  3       1   15776   A11   B72   C68 |
  5. |  3       2   18082   A21   B55   C26 |
  6. |  3       3   18786   A62   B47   C82 |
     |--------------------------------------|
  7. |  4       1   14773   A77   B19   C73 |
  8. |  4       2   19177   A99   B04   C54 |
     |--------------------------------------|
  9. |  5       1   20388   A22   B95   C08 |
     +--------------------------------------+
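
The date column in this listing shows raw Stata daily dates, that is, days since 01jan1960 (12477 is 28 February 1994). As a minimal sketch, assuming the long data set from the code above is still in memory, the following shows how one might display the dates readably, trim any blanks that split left on the pieces, and, only if the wide layout is really needed, reshape back so that suffix 1 is the earliest date within each id.

```stata
* Display the numeric daily dates in readable form, e.g. 28feb1994
format date %td

* Pieces after the first may carry a leading blank; remove it
foreach v of varlist vA vB vC {
    replace `v' = trim(`v')
}

list, sepby(id)

* Optional: wide layout (date1 vA1 vB1 vC1 date2 ...), in date order within id
reshape wide date vA vB vC, i(id) j(which)
```

The wide step is optional here because sorting, by-group calculations, and merges are usually simpler on the long layout.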