Determine position of ith element in a vector

I have a vector: a <- rep(sample(1:5, 20, replace=T))

I determine how often each value occurs:

tabulate(a)

Now I would like to determine the positions of the most frequently occurring values.

Say the vector is:

[1] 3 3 3 5 2 2 4 1 4 2 5 1 2 1 3 1 3 2 5 1

tabulate returns:

[1] 5 5 5 2 3

Now I determine the highest value returned by tabulate with max(tabulate(a))

This returns

[1] 5

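There are 3 values with a frequency of 5. I would like to know the positions of these values in the tabulate output.

That is, the first three entries of my list.

Source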

2013-08-30 ghb

Please work on your title. *Determine position of ith element in a vector* should have a one-letter answer... – flodel

Perhaps it is easier to work with table:

x <- table(a) 
x 
# a 
# 1 2 3 4 5 
# 5 5 5 2 3 
names(x)[x == max(x)] 
# [1] "1" "2" "3" 
which(a %in% names(x)[x == max(x)]) 
# [1] 1 2 3 5 6 8 10 12 13 14 15 16 17 18 20
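
Note that the last call returns positions in a itself, not positions in the table. If the positions within the table output are what is wanted, which() can be applied to the table directly. A minimal sketch, assuming the example vector from the question (the names vals and pos are only illustrative):

```r
# Example vector taken from the question
a <- c(3, 3, 3, 5, 2, 2, 4, 1, 4, 2, 5, 1, 2, 1, 3, 1, 3, 2, 5, 1)
x <- table(a)

# The most frequent values, converted back from character names to integers
vals <- as.integer(names(x)[x == max(x)])

# Their positions within the table output (names dropped for readability)
pos <- unname(which(x == max(x)))

vals
pos
```

Here the values and the positions coincide only because every integer from 1 to 5 occurs in a.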

Alternatively, there is a similar approach with tabulate:

x <- tabulate(a) 
sort(unique(a))[x == max(x)]
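
One caveat that may be worth checking: tabulate() produces one bin for every integer from 1 up to max(a), including bins with a count of zero, while sort(unique(a)) only contains the values that actually occur. The two line up in the question's example because all of 1 to 5 are present. A hedged sketch with a made-up vector a2 that has gaps shows the difference, and also shows that which() alone returns the positions in the tabulate output:

```r
# Hypothetical vector with gaps: the values 2 and 4 never occur
a2 <- c(1, 3, 3, 5, 5)
x2 <- tabulate(a2)          # counts for bins 1..5, zeros for 2 and 4

# Indexing the sorted unique values with a longer logical vector
# misaligns the two and introduces an NA
naive <- sort(unique(a2))[x2 == max(x2)]

# Positions of the maximum count in the tabulate output; for positive
# integer data the position is also the value itself
pos <- which(x2 == max(x2))

x2
naive
pos
```

For positive integer input this avoids both table() and the sort(unique()) step.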

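Here are some benchmarks on numeric and character vectors. The difference in performance is more pronounced with numeric data.

Sample data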

set.seed(1) 
a <- sample(20, 1000000, replace = TRUE) 
b <- sample(letters, 1000000, replace = TRUE)

Functions to benchmark

t1 <- function() { 
    x <- table(a) 
    out1 <- names(x)[x == max(x)] 
    out1 
} 

t2 <- function() { 
    x <- tabulate(a) 
    out2 <- sort(unique(a))[x == max(x)] 
    out2 
} 

t3 <- function() { 
    x <- table(b) 
    out3 <- names(x)[x == max(x)] 
    out3 
} 

t4 <- function() { 
    x <- tabulate(factor(b)) 
    out4 <- sort(unique(b))[x == max(x)] 
    out4 
}

Results

library(rbenchmark) 
benchmark(t1(), t2(), t3(), t4(), replications = 50) 
#   test replications elapsed relative user.self sys.self user.child sys.child
# 1 t1()           50  30.548   24.244    30.416    0.064          0         0
# 2 t2()           50   1.260    1.000     1.240    0.016          0         0
# 3 t3()           50   8.919    7.079     8.740    0.160          0         0
# 4 t4()           50   5.680    4.508     5.564    0.100          0         0
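
As a reading aid, the relative column appears to be each elapsed time divided by the fastest one. A small sketch that recomputes it from the elapsed values reported above (the vector names are only illustrative):

```r
# Elapsed times (seconds for 50 replications) copied from the output above
elapsed <- c(t1 = 30.548, t2 = 1.260, t3 = 8.919, t4 = 5.680)

# Ratio of each run to the fastest run, rounded to three decimals
relative <- round(elapsed / min(elapsed), 3)
relative
```

On this data the tabulate version on numeric input (t2) is roughly 24 times faster than the table version (t1), while for character input the gap between t3 and t4 is much smaller.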

Source

2013-08-30 19:32:03 A5C1D2H2I1M1N2O1R2T1

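Yes, but I am trying to avoid table because it is very slow – ghb

Yes, the first output "1" "2" "3" is exactly what I am looking for – ghb

If you point out that I did not mention that I want to avoid table, I can accept your answer. However, if you have any idea how to avoid it, I would also welcome any help. – ghb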
