2015-06-18

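R: nested sorting

I am a beginner at R programming (I just finished the Coursera course) and I can't get this nested loop to work.

I have a CSV structured like this (the real file has 108 columns):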

Type      Status  Campaign Name  Group             Budget  Budget Type  Bids
Campaign  Active  Burritos                         500     Daily
Campaign  Active  Tacos                            400     Daily
Group     Active  Burritos       Bean Burritos                          0.5
Group     Active  Burritos       Beef Burritos                          0.5
Group     Paused  Burritos       Chicken Burritos                       0.5
Group     Active  Tacos          Beef Tacos                             0.5
Group     Active  Tacos          Chicken Tacos                          0.5
Group     Paused  Tacos          Fish Tacos                             0.5

I want to reorder the table by campaign name and drop the paused rows:

Type      Status  Campaign Name  Group             Budget  Budget Type  Bids
Campaign  Active  Burritos                         500     Daily
Group     Active  Burritos       Bean Burritos                          0.5
Group     Active  Burritos       Beef Burritos                          0.5
Campaign  Active  Tacos                            400     Daily
Group     Active  Tacos          Beef Tacos                             0.5
Group     Active  Tacos          Chicken Tacos                          0.5

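I am trying to do this with a series of for loops, but I keep running into errors. I am fairly sure the rbind is wrong. I also think there is a mistake where I create temp.ds and temp.group.ds. The loops themselves may be wrong too.

Here is my code: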

ds <- do.call(rbind, lapply(list.files(path=directory, full.names=TRUE), read.table, header=TRUE, sep="\t", fileEncoding="UTF-16LE", fill = TRUE, quote = "")) 

valid.campaign <- ds[ which(ds$Status == "Active" & ds$Type == "Campaign"), ] 

new.ds <- NULL 

for(campaign in valid.campaign$Type) { 
    temp.ds <- valid.campaign[,campaign] 
    valid.group <- ds[ which(ds$Status == "Active" & ds$Type == "Group"), ] 

    for (group in valid.group$Type) { 
    temp.group.ds <- valid.group[,group] 
    temp.ds <-rbind(temp.ds, temp.group.ds) 
    rm(temp.group.ds) 
    } 

    if (exists("new.ds")) new.ds <- rbind(new.ds,temp.ds) 
    else new.ds <- temp.ds 
    rm(temp.ds) 
    } 
new.ds 
} 
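Since R is an interpreted language, you can run the code line by line. That should let you find the line that throws the error. As a side note: you should try to post reproducible code on Stack Overflow. – cryo111

Try 'library(dplyr); ds %>% arrange(CampaignName) %>% filter(Status != "Paused")' – Khashaa

Can you post your data? – Hav0k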

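Answers

The dplyr and magrittr packages are both excellent for this kind of problem. Specifically, dplyr's arrange function orders the rows, and dplyr's filter function lets you drop rows: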

ds %<>% arrange(CampaignName, Group) %>% filter(Status != 'Paused') 
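
For anyone who wants to try this without the original file, here is a minimal sketch on toy data copied from the question. The column names (CampaignName in particular) are assumptions; with read.table the real header would probably come through under a different name. It also uses plain assignment rather than %<>%, since %<>% is only available once magrittr itself is attached.

```r
library(dplyr)

# Toy data mirroring the table in the question.
# Column names are assumed for illustration, not taken from the real file.
ds <- data.frame(
  Type = c("Campaign", "Campaign", "Group", "Group",
           "Group", "Group", "Group", "Group"),
  Status = c("Active", "Active", "Active", "Active",
             "Paused", "Active", "Active", "Paused"),
  CampaignName = c("Burritos", "Tacos", "Burritos", "Burritos",
                   "Burritos", "Tacos", "Tacos", "Tacos"),
  Group = c("", "", "Bean Burritos", "Beef Burritos",
            "Chicken Burritos", "Beef Tacos", "Chicken Tacos", "Fish Tacos"),
  stringsAsFactors = FALSE
)

# Sort by campaign, then by group, then drop the paused rows.
# The campaign rows have an empty Group, so they sort ahead of their groups.
result <- ds %>%
  arrange(CampaignName, Group) %>%
  filter(Status != "Paused")

print(result)
```

Sorting first and filtering second gives the same rows as filtering first; filtering first just sorts fewer rows.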

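In base R I would use the following two lines of code. The first sorts, the second subsets. There are of course ways to wrap this into a one-liner, but I think it is more readable this way: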

ds = ds[order(ds$Campaign_Name, ds$Group),] 
ds = ds[which(ds$Status != "Paused"),] 

This gives us:

      Type Status Campaign_Name         Group Budget Budget_Type Bids
1 Campaign Active      Burritos                  500       Daily   NA
3    Group Active      Burritos Bean Burritos     NA              0.5
4    Group Active      Burritos Beef Burritos     NA              0.5
2 Campaign Active         Tacos                  400       Daily   NA
6    Group Active         Tacos    Beef Tacos     NA              0.5
7    Group Active         Tacos Chicken Tacos     NA              0.5
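
The one-liner mentioned above could look something like the sketch below. It rebuilds a cut-down version of the question's table as toy data (only four columns, with names assumed to match the output shown), so it runs on its own.

```r
# Toy data mirroring the question; only the columns needed for the sort.
# Column names are assumed to match the printed output above.
ds <- data.frame(
  Type = c("Campaign", "Campaign", "Group", "Group",
           "Group", "Group", "Group", "Group"),
  Status = c("Active", "Active", "Active", "Active",
             "Paused", "Active", "Active", "Paused"),
  Campaign_Name = c("Burritos", "Tacos", "Burritos", "Burritos",
                    "Burritos", "Tacos", "Tacos", "Tacos"),
  Group = c("", "", "Bean Burritos", "Beef Burritos",
            "Chicken Burritos", "Beef Tacos", "Chicken Tacos", "Fish Tacos"),
  stringsAsFactors = FALSE
)

# Sort and subset in a single expression: order() returns the row
# permutation, and subset() then keeps the rows that are not paused.
result <- subset(ds[order(ds$Campaign_Name, ds$Group), ], Status != "Paused")

print(result)
```

The original row names are kept, which is why they come out as 1, 3, 4, 2, 6, 7. One thing to watch: this relies on the campaign rows having an empty string in Group. If those cells are read in as NA instead, order() puts NA last by default, and passing na.last = FALSE to order() should keep each campaign row above its groups.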