2017-03-17

Merging R data frames

I have an R data frame, df_big:

Candidate Status 
A   1 
B   10 
C   12 
D   15 
E   25 

I have a second data frame, df_small:

Candidate_1 Candidate_2 
A    C 
B    E  
C    D 

I want to merge df_small and df_big to get df_final, which should look like this:

Candidate_1 Candidate_2  Status_1  Status_2 
A    C     1   12 
B    E     10   25 
C    D     12   15 

I have tried something along these lines:

df_small_1 = merge(x=df_small,y = df_big,by.x = "Candidate_1",by.y="Candidate") 

df_small_2 = merge(x=df_small,y = df_big,by.x = "Candidate_2",by.y="Candidate") 

However, I don't know how to combine df_small_1 and df_small_2 with df_small.

Like `df_final = merge(x = merge(x = df_small, y = df_big, by.x = "Candidate_2", by.y = "Candidate"), y = df_big, by.x = "Candidate_1", by.y = "Candidate")` – HubertL

It's easier if you reshape to long form first: `library(tidyverse); df_small %>% gather(var, Candidate) %>% left_join(df_big)` – alistaire

Answers

1

You need to join twice, once for each of the two candidates' statuses:

df_result <- merge(x=df_small, y=df_big, by.x="Candidate_1", by.y="Candidate") 
df_result <- merge(x=df_result, y=df_big, by.x="Candidate_2", by.y="Candidate") 
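
Note that two plain merges name the status columns Status.x and Status.y and put Candidate_2 first. The following is a minimal sketch of one way to get the column names and order shown in the question, assuming the example data is built with character columns. The suffixes argument and the final reordering step are additions for illustration, not part of the original answer.

```r
# Example data from the question (stringsAsFactors = FALSE is an assumption)
df_big <- data.frame(Candidate = c("A", "B", "C", "D", "E"),
                     Status = c(1, 10, 12, 15, 25),
                     stringsAsFactors = FALSE)
df_small <- data.frame(Candidate_1 = c("A", "B", "C"),
                       Candidate_2 = c("C", "E", "D"),
                       stringsAsFactors = FALSE)

# First join: status of Candidate_1 (column is still called "Status")
df_result <- merge(x = df_small, y = df_big,
                   by.x = "Candidate_1", by.y = "Candidate")

# Second join: status of Candidate_2; the clashing "Status" columns
# become Status_1 (from x) and Status_2 (from y)
df_result <- merge(x = df_result, y = df_big,
                   by.x = "Candidate_2", by.y = "Candidate",
                   suffixes = c("_1", "_2"))

# merge() sorts by the key and puts it first, so restore the wanted layout
df_final <- df_result[order(df_result$Candidate_1),
                      c("Candidate_1", "Candidate_2", "Status_1", "Status_2")]
rownames(df_final) <- NULL
print(df_final)
```

Ordering by Candidate_1 happens to reproduce the row order of this example. If the original row order of df_small matters in general, it would need to be tracked explicitly, for instance with a temporary row-number column.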

0

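Merging is an expensive operation. You can do better by skipping merge and looking up the needed rows by index instead. I have benchmarked the merge and non-merge solutions. This answer also returns the columns in the requested order.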

doit <- function(df_small, df_big) 
{ 

    # Row positions in df_big for each candidate, in df_small's row order. 
    # match() keeps each Candidate_1/Candidate_2 pair aligned; a candidate 
    # missing from df_big yields NA. 
    indx1 <- match(df_small[["Candidate_1"]], df_big[["Candidate"]]) 

    indx2 <- match(df_small[["Candidate_2"]], df_big[["Candidate"]]) 

    # Copy the matching rows and return the result 
    data.frame(Candidate_1 = df_big[indx1, "Candidate"], Candidate_2 = df_big[indx2, "Candidate"], 
               Status_1 = df_big[indx1, "Status"], Status_2 = df_big[indx2, "Status"]) 

} 

# Merge twice, once per candidate column 
doit_merge <- function(df_small, df_big) 
{ 
    df_result <- merge(x=df_small, y=df_big, by.x="Candidate_1", by.y="Candidate") 
    df_result <- merge(x=df_result, y=df_big, by.x="Candidate_2", by.y="Candidate") 
    df_result 
} 

library(microbenchmark) 

# benchmark results 
microbenchmark(
    doit(df_small, df_big) , 
    doit_merge(df_small, df_big) 
) 

Results

Unit: microseconds 
                         expr      min       lq     mean    median        uq      max neval cld 
       doit(df_small, df_big)  676.570  758.472 1077.203  834.0115  978.9315 4591.473   100   a 
 doit_merge(df_small, df_big) 1329.327 1449.205 1986.995 1612.3940 2021.9070 5966.780   100   b
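
Timings aside, it is worth confirming that an index-based lookup returns the pairing the question asks for. Below is a minimal sketch using a named vector as the lookup table. The name status_by_candidate is hypothetical, and the data is assumed to hold character (not factor) candidates. It illustrates the indexing idea and is not the code that was benchmarked above.

```r
# Example data from the question (stringsAsFactors = FALSE is an assumption)
df_big <- data.frame(Candidate = c("A", "B", "C", "D", "E"),
                     Status = c(1, 10, 12, 15, 25),
                     stringsAsFactors = FALSE)
df_small <- data.frame(Candidate_1 = c("A", "B", "C"),
                       Candidate_2 = c("C", "E", "D"),
                       stringsAsFactors = FALSE)

# Lookup table: names are candidates, values are their statuses
status_by_candidate <- setNames(df_big$Status, df_big$Candidate)

# Index the lookup table by name; as.character() guards against factor
# columns, which would otherwise be used as integer positions
df_final <- data.frame(
  Candidate_1 = df_small$Candidate_1,
  Candidate_2 = df_small$Candidate_2,
  Status_1 = unname(status_by_candidate[as.character(df_small$Candidate_1)]),
  Status_2 = unname(status_by_candidate[as.character(df_small$Candidate_2)]),
  stringsAsFactors = FALSE
)
print(df_final)
```

Because the lookup follows df_small row by row, the row order and the Candidate_1/Candidate_2 pairing are preserved without any sorting step. A candidate that is absent from df_big gets NA as its status.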