First row occurrence

-5

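I have two variables, a and amount, and I want to keep only the row where each value of a first occurs.

I want the result in one variable.

Expected output: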

a amount 
112 12000 
113 14000 
114 18000 
115 19000

Source

2016-05-13 111111

Have you tried any of these? –

http://stackoverflow.com/questions/34042294/
or http://stackoverflow.com/questions/19451032/r-returning-first-row-of-group
or http://stats.stackexchange.com/questions/7884/fast-ways-in-r-to-get-the-first-row-of-a-data-frame-grouped-by-an-identifier
or http://stackoverflow.com/questions/19424762/efficiently-selecting-top-number-of-rows-for-each-unique-value-of-a-column-in-a
or http://stackoverflow.com/questions/13279582/select-only-the-first-rows-for-each-unique-value-of-column-in-r
– thelatemail

To get the first row for each value of a, we can use

library(data.table) 
setDT(df1)[, head(.SD, 1), by = a]

Or a faster variant (contributed by @Symbolix)

setDT(df1)[df1[, .I[1L], by = a]$V1]
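To see why this variant works, here is a minimal sketch on made-up data. The name dt and the amounts in rows 2 and 5 are assumptions for illustration, since the original input is not shown in the thread. In data.table, .I holds the row numbers of the table, so .I[1L] evaluated by group yields the position of the first row of each group.

```r
library(data.table)

# Hypothetical input: the question's data is not shown, so the repeated
# rows (2 and 5) and their amounts are invented for illustration.
dt <- data.table(
  a      = c(112, 112, 113, 114, 114, 115),
  amount = c(12000, 13000, 14000, 18000, 18500, 19000)
)

# .I is the vector of row numbers; .I[1L] by group gives the row number
# of the first row in each group, returned in a column named V1.
first_idx <- dt[, .I[1L], by = a]$V1

# Subset the table by those row numbers.
first_rows <- dt[first_idx]
print(first_rows)
```

Because only row numbers are computed per group, this avoids materialising .SD for every group, which is presumably where the speed-up comes from.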

Or use unique

unique(setDT(df1), by = "a") 
# a amount 
#1: 112 12000 
#2: 113 14000 
#3: 114 18000 
#4: 115 19000

Or with dplyr

library(dplyr) 
df1 %>% 
    group_by(a) %>% 
    slice(1)

Or use summarise with first

df1 %>% 
    group_by(a) %>% 
    summarise(amount = first(amount))

Or with base R

aggregate(.~a, df1, head, 1) 
# a amount 
#1 112 12000 
#2 113 14000 
#3 114 18000 
#4 115 19000

Source

2016-05-13 04:53:32 akrun

I suspect avoiding '.SD' is faster: 'dt[dt[, .I[1], by = a]$V1]'? – SymbolixAU

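These are all legitimate answers, covering both base R and package-based solutions, so I don't know why this was downvoted. I can only imagine it is some kind of bias on the voters' part. – akrun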

You can use duplicated, which flags the values that have already appeared earlier. You can then use the ! operator to drop those rows

df[!duplicated(df$a), ] 


# a amount 
#1 112 12000 
#3 113 14000 
#4 114 18000 
#6 115 19000
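As a minimal sketch of what happens step by step, assuming a six-row input (the original data is not shown in the thread, so the repeated rows 2 and 5 and their amounts are invented here), duplicated returns a logical vector that is TRUE for every value already seen earlier:

```r
# Hypothetical input: rows 2 and 5 repeat a value of a; their amounts
# are made up for illustration.
df <- data.frame(
  a      = c(112, 112, 113, 114, 114, 115),
  amount = c(12000, 13000, 14000, 18000, 18500, 19000)
)

# TRUE marks a value of a that has already appeared in an earlier row.
dup <- duplicated(df$a)
print(dup)

# Negating the mask keeps only the first occurrence of each value.
first_rows <- df[!dup, ]
print(first_rows)
```

The original row names (1, 3, 4, 6) are preserved, which is why the output above is not numbered 1 to 4.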

Or

You can also use match together with unique

df[match(unique(df$a), df$a), ] 

# a amount 
#1 112 12000 
#3 113 14000 
#4 114 18000 
#6 115 19000
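The match approach can be unpacked the same way. This is a sketch on hypothetical data (the repeated rows and their amounts are assumptions, as the original input is not shown): match returns, for each unique value, the position of its first match in df$a.

```r
# Hypothetical input: rows 2 and 5 repeat a value of a; their amounts
# are made up for illustration.
df <- data.frame(
  a      = c(112, 112, 113, 114, 114, 115),
  amount = c(12000, 13000, 14000, 18000, 18500, 19000)
)

# For each distinct value of a, find the index of its first occurrence.
idx <- match(unique(df$a), df$a)
print(idx)

# Index the data frame by those positions.
first_rows <- df[idx, ]
print(first_rows)
```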

Source

2016-05-13 04:58:29
