2014-02-11

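How to quickly get the array of multiplicities

What is the fastest way to take an array A and output both unique(A) (i.e., the set of unique elements of A) and the multiplicity array, whose i-th entry is the number of times the i-th entry of unique(A) occurs in A?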

That's a mouthful, so here is an example. Given A=[1 1 3 1 4 5 3], I want:

  1. unique(A)=[1 3 4 5]
  2. mult = [3 2 1 1]

This can be done with a tedious for loop, but I would like to know whether there is a way that takes advantage of MATLAB's array nature.
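For reference, the loop-based baseline mentioned above might look like the following minimal sketch. The variable names uA and mult are chosen here for illustration only.

```matlab
A = [1 1 3 1 4 5 3];

uA = unique(A);                 % sorted unique values of A
mult = zeros(size(uA));         % one counter per unique value

for k = 1:numel(uA)
    % count how many elements of A equal the k-th unique value
    mult(k) = sum(A == uA(k));
end
```

Each pass scans all of A, so the cost grows with numel(A) times numel(uA); the vectorized answers try to avoid exactly that.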

Answers

uA = unique(A); 
mult = histc(A,uA); 

Or:

uA = unique(A); 
mult = sum(bsxfun(@eq, uA(:).', A(:))); 
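As a quick sanity check, both variants can be run on the example from the question. The third variant is an assumption and not part of the original answer: histc is flagged as not recommended in newer MATLAB releases, and histcounts (available from R2014b) with an extra Inf edge should give the same counts.

```matlab
A = [1 1 3 1 4 5 3];
uA = unique(A);                               % [1 3 4 5]

% Variant 1: uA acts as the bin edges; each value of A equals one edge
% and is counted in the bin that starts at that edge.
mult1 = histc(A, uA);

% Variant 2: build a numel(A)-by-numel(uA) logical matrix whose entry (r,c)
% is true when A(r) == uA(c), then sum down each column.
mult2 = sum(bsxfun(@eq, uA(:).', A(:)), 1);

% Variant 3 (assumption, not from the answer): histcounts needs one more
% edge than bins, so Inf closes the last bin.
mult3 = histcounts(A, [uA Inf]);

disp(isequal(mult1, mult2, mult3));           % all three should agree
```

The bsxfun variant materializes the full comparison matrix, which is one plausible reason it falls behind as the array grows.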

Benchmark

N = 100; 
A = randi(N,1,2*N); %// size 1 x 2*N 

%// Luis Mendo, first approach 
tic 
for iter = 1:1e3; 
    uA = unique(A); 
    mult = histc(A,uA); 
end 
toc 

%// Luis Mendo, second approach  
tic 
for iter = 1:1e3; 
    uA = unique(A); 
    mult = sum(bsxfun(@eq, uA(:).', A(:))); 
end 
toc 

%// chappjc 
tic 
for iter = 1:1e3; 
    [uA,~,ic] = unique(A); % uA(ic) == A 
    mult = accumarray(ic(:),1); % ic(:) is a column whether unique returns ic as a row or a column 
end 
toc 

Results with N = 100

Elapsed time is 0.096206 seconds. 
Elapsed time is 0.235686 seconds. 
Elapsed time is 0.154150 seconds. 

Results with N = 1000

Elapsed time is 0.481456 seconds. 
Elapsed time is 4.534572 seconds. 
Elapsed time is 0.550606 seconds. 
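The tic/toc loops above could also be written with timeit (available from R2013b), which handles warm-up and repetition itself. This is only a sketch under that assumption; the variable names are illustrative and the accumarray approach is split into its two steps so that no helper function is needed.

```matlab
N = 1000;
A = randi(N, 1, 2*N);                      % same test data shape as above

% timeit runs each handle several times and returns a typical time in seconds
tHistc  = timeit(@() histc(A, unique(A)));
tBsxfun = timeit(@() sum(bsxfun(@eq, reshape(unique(A), 1, []), A(:)), 1));

% accumarray approach, timed in two parts:
tUnique3 = timeit(@() unique(A), 3);       % unique called with three outputs
[~, ~, ic] = unique(A);
tAccum   = timeit(@() accumarray(ic(:), 1));

fprintf('histc: %.3g s\nbsxfun: %.3g s\nunique (3 outputs) + accumarray: %.3g s\n', ...
        tHistc, tBsxfun, tUnique3 + tAccum);
```

Absolute numbers will differ by machine and MATLAB release, so only the relative ordering is worth comparing.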
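Do you have any opinion as to which of those two is faster? – Lepidopterist

@GregorianFunk I don't know... Also, it may depend on the size of 'A'. Sometimes one solution is the fastest for small sizes but not for large ones. Give them a try! –

@GregorianFunk I did some tests (see the edited answer). The first one is clearly faster. chappjc's answer is very close. –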

[uA,~,ic] = unique(A); % uA(ic) == A 
mult = accumarray(ic(:),1); % ic(:) is a column whether unique returns ic as a row or a column 

accumarray is very fast. Unfortunately, unique gets slower when called with 3 outputs.
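To see why this works, it may help to trace the index vector on the example from the question. This is a small illustrative walk-through, not a benchmark.

```matlab
A = [1 1 3 1 4 5 3];

% ic(k) is the position in uA of the value A(k), so uA(ic) reproduces A
[uA, ~, ic] = unique(A);
disp(uA);                        % 1 3 4 5
disp(ic(:).');                   % 1 1 2 1 3 4 2

% accumarray adds the value 1 into slot ic(k) for every k,
% so slot j ends up holding the number of occurrences of uA(j)
mult = accumarray(ic(:), 1);
disp(mult.');                    % 3 2 1 1
```

Because ic always holds positive integer positions, this approach does not require A itself to contain positive integers.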


Late addition:

uA = unique(A); 
mult = nonzeros(accumarray(A(:),1,[],@sum,0,true)); % A is used directly as subscripts, so it must hold positive integers 
S = sparse(A,1,1); 
[uA,~,mult] = find(S); 

I found this elegant solution in an old Newsgroup thread.
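A short trace on the example from the question shows what the two lines do. The sketch assumes A contains only positive integers, since its values are used as row indices of the sparse matrix.

```matlab
A = [1 1 3 1 4 5 3];

% sparse(i, j, v) sums the values of entries that share the same (i, j),
% so row r of this max(A)-by-1 column ends up holding the count of r in A
S = sparse(A, 1, 1);

% find returns the row indices of the nonzeros (the unique values, in
% ascending order) and the stored values (their multiplicities)
[uA, ~, mult] = find(S);

disp(uA.');                      % 1 3 4 5
disp(mult.');                    % 3 2 1 1
```

Both outputs come back as column vectors, unlike the row vectors produced by the histc variant.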

Tested with Luis Mendo's benchmark, N = 1000:

Elapsed time is 0.228704 seconds. % histc 
Elapsed time is 1.838388 seconds. % bsxfun 
Elapsed time is 0.128791 seconds. % sparse 

(On my machine, accumarray results in Error: Maximum variable size allowed by the program is exceeded.)