Filter a data frame on multiple conditions
I need to update an issue spreadsheet that has 1000 rows.
I have two datasets:
DF1
Company.ID1        TMC1
ABC company        QBT
BCD company        G W TMC
jb hi fi           QBT
ABC company        GW TMC
FB Company         AMEX
LL company         AMEX
j k                QBT
k. l company       TP oil
1 to 1 lts         TP oil
2 in 1 pty ltd.    AMEX
DF2
DRA    Company.ID2        TMC2      Status
11     2 in 1 pty ltd.    AMEX      sent
12     1 to 1 lts         TP oil    produce
13     BCD company        ACE       sent
14     k. l company       TP oil    sent
15     jb hi fi           QBT       produce
16     ABC company        QBT       sent
17     j k                QBT       sent
18     FB Company         AMEX      sent
19     facebook pty       QBT       sent
20     2 in 1 pty ltd.    AMEX      produce
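What I am trying to achieve: for each value of df1$Company.ID1, first look for it in df2$Company.ID2. If there is a match and its df1$TMC1 also matches df2$TMC2, then check df2$Status in that row. If df2$Status == 'sent', create a new column df1$new and return the df2$DRA value; if df2$Status == 'produce', df1$new should contain 'delete'.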
Example:
"ABC company" from df2$Company.ID2 exists in df1$Company.ID1. The df1$TMC1 for ABC company (QBT) matches df2$TMC2, and df2$Status == 'sent' in that row. Therefore df1$new <- 16.
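The ABC company case can be checked directly as a single lookup. This is a minimal sketch, assuming a three-row excerpt of DF2 rebuilt by hand with character (not factor) columns; `df2_excerpt`, `hit` and `abc_dra` are illustrative names, not part of the original data.

```r
# Excerpt of DF2 (rows with DRA 13, 15 and 16), rebuilt by hand for illustration
df2_excerpt <- data.frame(
  DRA         = c(13L, 15L, 16L),
  Company.ID2 = c("BCD company", "jb hi fi", "ABC company"),
  TMC2        = c("ACE", "QBT", "QBT"),
  Status      = c("sent", "produce", "sent"),
  stringsAsFactors = FALSE
)

# All three conditions must hold in the same row of DF2
hit <- df2_excerpt$Company.ID2 == "ABC company" &
  df2_excerpt$TMC2 == "QBT" &
  df2_excerpt$Status == "sent"

# DRA number of the matching row
abc_dra <- df2_excerpt$DRA[hit]
abc_dra  # 16
```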
I would greatly appreciate your help. This would save a lot of time that I could use for other productive work. Thanks.
dput(DF1)
structure(list(Company.ID1 = structure(c(3L, 4L, 7L, 3L, 5L,
9L, 6L, 8L, 1L, 2L), .Label = c("1 to 1 lts", "2 in 1 pty ltd.",
"ABC company", "BCD company", "FB Company", "j k ", "jb hi fi",
"k. l company", "LL company"), class = "factor"), TMC1 = structure(c(4L,
2L, 4L, 3L, 1L, 1L, 4L, 5L, 5L, 1L), .Label = c("AMEX", "G W TMC",
"GW TMC", "QBT", "TP oil"), class = "factor")), .Names = c("Company.ID1",
"TMC1"), class = "data.frame", row.names = c(NA, -10L))
dput(DF2)
structure(list(DRA = 11:20, Company.ID2 = structure(c(2L, 1L,
4L, 9L, 8L, 3L, 7L, 6L, 5L, 2L), .Label = c("1 to 1 lts", "2 in 1 pty ltd.",
"ABC company", "BCD company", "facebook pty", "FB Company", "j k ",
"jb hi fi", "k. l company"), class = "factor"), TMC2 = structure(c(2L,
4L, 1L, 4L, 3L, 3L, 3L, 2L, 3L, 2L), .Label = c("ACE", "AMEX",
"QBT", "TP oil"), class = "factor"), Status = structure(c(2L,
1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L), .Label = c("produce", "sent"
), class = "factor")), .Names = c("DRA", "Company.ID2", "TMC2",
"Status"), class = "data.frame", row.names = c(NA, -10L))
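# What I tried (it compares row i of df1 only with row i of df2):
df1$new <- NA
for (i in seq_len(nrow(df1))) {
  # as.character() is needed because the factor columns have different level sets
  if (as.character(df1$Company.ID1[i]) == as.character(df2$Company.ID2[i]) &&
      as.character(df1$TMC1[i]) == as.character(df2$TMC2[i]) &&
      df2$Status[i] == 'sent') {
    df1$new[i] <- 'sent'
  } else {
    df1$new[i] <- 'delete'
  }
}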
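However, a company from df1$Company.ID1 may appear more than once under the same name in df2$Company.ID2, and the matching rows may be at different positions in the two data frames.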
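One way to see which company/TMC pairs occur more than once in DF2 is to build a combined key and look for repeats. This is only a sketch on a hand-built excerpt of DF2 (`df2_excerpt`, `key2`, `dup_keys` and `dup_rows` are illustrative names). In the sample data the pair "2 in 1 pty ltd." / AMEX appears twice, once as 'sent' and once as 'produce'.

```r
# Excerpt of DF2 (rows with DRA 11, 16 and 20), rebuilt by hand for illustration
df2_excerpt <- data.frame(
  DRA         = c(11L, 16L, 20L),
  Company.ID2 = c("2 in 1 pty ltd.", "ABC company", "2 in 1 pty ltd."),
  TMC2        = c("AMEX", "QBT", "AMEX"),
  Status      = c("sent", "sent", "produce"),
  stringsAsFactors = FALSE
)

# One key per row combining company name and TMC
key2 <- paste(df2_excerpt$Company.ID2, df2_excerpt$TMC2, sep = " | ")

# Keys that occur more than once, and every row carrying such a key
dup_keys <- unique(key2[duplicated(key2)])
dup_rows <- df2_excerpt[key2 %in% dup_keys, ]
dup_rows$DRA  # 11 20
```

Rows like these need an explicit rule for which status wins before a single value can be written to df1$new.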
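My expected output is as follows:
1. Match the name of company X from df1$Company.ID1 against df2$Company.ID2.
2. If it matches, check whether company X's df1$TMC1 matches df2$TMC2.
3. If 1 and 2 are true, check whether company X has df2$Status == 'sent'.
4. If that is TRUE, create a new column df1$new, get the DRA number from df2$DRA, and store it for company X.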
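The steps above could be sketched in base R with `match()` on a combined company/TMC key instead of a row-by-row loop. This is a hedged sketch, not a definitive answer. It rebuilds the sample data as character columns, and it assumes that when the same key has both a 'sent' and a 'produce' row (as for "2 in 1 pty ltd." / AMEX), the 'sent' row wins. It also assumes that rows with no match in df2 should stay NA. The names `key1`, `key2`, `sent_dra` and `only_produce` are illustrative.

```r
# Sample data from the question, rebuilt by hand as character columns
df1 <- data.frame(
  Company.ID1 = c("ABC company", "BCD company", "jb hi fi", "ABC company",
                  "FB Company", "LL company", "j k ", "k. l company",
                  "1 to 1 lts", "2 in 1 pty ltd."),
  TMC1 = c("QBT", "G W TMC", "QBT", "GW TMC", "AMEX", "AMEX", "QBT",
           "TP oil", "TP oil", "AMEX"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  DRA = 11:20,
  Company.ID2 = c("2 in 1 pty ltd.", "1 to 1 lts", "BCD company",
                  "k. l company", "jb hi fi", "ABC company", "j k ",
                  "FB Company", "facebook pty", "2 in 1 pty ltd."),
  TMC2 = c("AMEX", "TP oil", "ACE", "TP oil", "QBT", "QBT", "QBT", "AMEX",
           "QBT", "AMEX"),
  Status = c("sent", "produce", "sent", "sent", "produce", "sent", "sent",
             "sent", "sent", "produce"),
  stringsAsFactors = FALSE
)

# Steps 1 and 2: one key per row combining company name and TMC.
# as.character() keeps this working for factor columns; trimws() drops stray spaces.
key1 <- paste(trimws(as.character(df1$Company.ID1)),
              as.character(df1$TMC1), sep = " | ")
key2 <- paste(trimws(as.character(df2$Company.ID2)),
              as.character(df2$TMC2), sep = " | ")

is_sent    <- df2$Status == "sent"
is_produce <- df2$Status == "produce"

# Steps 3 and 4: DRA of the first 'sent' row with the same key (NA if none)
sent_dra <- df2$DRA[is_sent][match(key1, key2[is_sent])]

# Assumption: 'delete' applies only when the key has a 'produce' row and no 'sent' row
only_produce <- is.na(sent_dra) & key1 %in% key2[is_produce]

df1$new <- as.character(sent_dra)
df1$new[only_produce] <- "delete"
df1
```

The new column is character because it mixes DRA numbers with the text 'delete'. With the sample data, ABC company / QBT gets "16" and jb hi fi / QBT gets "delete".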
Thanks
@pierre lafortune Thank you – Chemjong