2016-09-16 60 views

Create columns to binarise dates in R

I have a dataset:

Date        Customer ID  Customer  Delivery City  Category
31/12/2015  14057267     a         NewCity        Software - System Infrastructure
31/12/2015  14057267     a         NewCity        Software - Information/Data Management
31/12/2015  14057267     a         NewCity        Software - Information/Data Management
31/12/2015  14057267     b         NewCity        Software - Information/Data Management
31/12/2015  14057267     b         OldCity        Software - Information/Data Management
31/12/2015  14057267     c         OldCity        Software - Information/Data Management
31/12/2015  14057267     c         OldCity        Software - Information/Data Management

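I want to create new columns based on the date: if the largest day of the month is 31, I need one column per day, 31 in total. Each column holds 0 or 1 depending on the day in the Date column. For example, if the day is 01, then X_1 = 1 and the remaining columns X_2 ... X_31 = 0. I want to binarise the date this way, and likewise I want to create X_a, X_b, X_c for the customer names, which will also take the values 0 and 1.

Can anyone help?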


Could you please provide a sample of your data ('dput(head(your_data))') and the expected output?

Answers


How about the following (only two columns of the data frame are shown):

# initial dataframe 
head(df) 
# Date  Customer 
#1 01/12/2015  b 
#2 02/12/2015  c 
#3 03/12/2015  a 
#4 04/12/2015  b 
#5 05/12/2015  b 
#6 06/12/2015  b 

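# take the two-digit day from the dd/mm/yyyy string
df$X <- substring(as.character(df$Date), 1, 2) 
# build one 0/1 column per day present in the data ("-1" drops the intercept),
# then drop the helper column X, which is column 3 in this two-column example
df <- cbind.data.frame(df, model.matrix(~X-1, df))[-3] 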

# final dataframe 
head(df) 
# Date  Customer X01 X02 X03 X04 X05 X06 X07 X08 X09 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 
#1 01/12/2015  b 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#2 02/12/2015  c 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#3 03/12/2015  a 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#4 04/12/2015  b 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#5 05/12/2015  b 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#6 06/12/2015  b 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
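
The question also asks for X_a, X_b, X_c columns for the customer names, and the same model.matrix idea would seem to carry over. This is only a minimal sketch: the three-row data frame is made up for illustration, and the helper name `cust` and the `X_` renaming step are assumptions, not part of the original answer.

```r
# Hypothetical sample data mirroring the two columns shown above
df <- data.frame(
  Date = c("01/12/2015", "02/12/2015", "03/12/2015"),
  Customer = c("b", "c", "a"),
  stringsAsFactors = FALSE
)

# One indicator column per customer; "- 1" drops the intercept,
# so every customer level gets its own column
cust <- model.matrix(~ Customer - 1, df)

# model.matrix names the columns Customera, Customerb, ...;
# rename them to the X_a, X_b, X_c form asked for in the question
colnames(cust) <- sub("^Customer", "X_", colnames(cust))

out <- cbind(df, cust)
print(out)
```

Like the day columns, this only produces a column for each customer that actually occurs in the data.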

Thanks for sending this through, but I have already found a convenient way: for (level in unique(B2B_Data$Day)) { B2B_Data[paste("Day", level, sep = "_")] <- ifelse(B2B_Data$Day == level, 1, 0) } – user6016731
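
For reference, the loop from this comment can be run end to end as follows. This is a sketch under assumptions: the comment does not show how the Day column was created, so the three-row B2B_Data and the as.Date/format step that derives Day are made up for illustration.

```r
# Hypothetical stand-in for B2B_Data with dd/mm/yyyy date strings
B2B_Data <- data.frame(
  Date = c("01/12/2015", "02/12/2015", "02/12/2015"),
  stringsAsFactors = FALSE
)

# Assumed derivation of Day: parse the date and keep the day of month as an integer
B2B_Data$Day <- as.integer(format(as.Date(B2B_Data$Date, format = "%d/%m/%Y"), "%d"))

# Add one 0/1 column per distinct day value, named Day_1, Day_2, ...
for (level in unique(B2B_Data$Day)) {
  B2B_Data[paste("Day", level, sep = "_")] <- ifelse(B2B_Data$Day == level, 1, 0)
}

print(B2B_Data)
```

As with the model.matrix approach, only days that occur in the data get a column; days that never appear would have to be added separately if all 31 columns are needed.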