2010-07-15 90 views
5

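Numpy interconversion between multidimensional and linear indices

I'm looking for a fast way to interconvert between linear and multidimensional indexing in Numpy.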

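To make my usage concrete, I have a large collection of N particles, each assigned 5 float values (dimensions), giving an Nx5 array. I then bin each dimension using numpy.digitize with an appropriate choice of bin boundaries, to assign each particle a bin in the 5-dimensional space.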

import numpy

N = 10
ndims = 5
p = numpy.random.normal(size=(N,ndims))
# One array of bin boundaries per dimension (7 edges -> 6 bins)
bbnds = ndims*[None]
for idim in range(ndims):
    bbnds[idim] = numpy.array([-float('inf')]+[-2.,-1.,0.,1.,2.]+[float('inf')])

binassign = ndims*[None]
for idim in range(ndims):
    binassign[idim] = numpy.digitize(p[:,idim],bbnds[idim]) - 1

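binassign then contains rows that correspond to the multidimensional index. If I then want to convert the multidimensional index to a linear index, I think I would want to do something like: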

linind = numpy.arange(6**5).reshape(6,6,6,6,6) 

This gives a look-up table that maps each multidimensional index to a linear index. You can then go back using:

mindx = numpy.unravel_index(x,linind.shape) 
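
To make the look-up table and its inverse concrete, here is a small sketch. The index `mi` is a made-up example for a single particle, not something taken from the data above.

```python
import numpy

shape = (6, 6, 6, 6, 6)
linind = numpy.arange(6**5).reshape(shape)

# Hypothetical multidimensional index for one particle (illustrative only)
mi = (1, 0, 5, 2, 3)

# Forward: look up the linear index stored at that position
x = linind[mi]

# Backward: recover the multidimensional index from the linear one
mindx = numpy.unravel_index(x, linind.shape)
```

Here `x` comes out as 1491 (1*1296 + 0*216 + 5*36 + 2*6 + 3), and `mindx` holds the same five values as `mi`.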

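Where I'm having difficulty is figuring out how to take binassign (the Nx5 array), which contains the multidimensional index in each row, and convert it to a 1d linear index by using it to slice the linear indexing array linind.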

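If anyone has a one (or several) line indexing trick for going back and forth between multidimensional and linear indices in a way that vectorizes the operation over all N particles, I would appreciate your insight.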

Answers

3

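Although I very much like EOL's answer, I wanted to generalize it to a non-uniform number of bins along each direction, and also to highlight the difference between C- and F-style ordering. Here is an example solution: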

import numpy

ndims = 5
N = 10

# Define bin boundaries (a random number of bins per dimension)
binbnds = ndims*[None]
nbins = []
for idim in range(ndims):
    binbnds[idim] = numpy.linspace(-10.0,10.0,numpy.random.randint(2,15))
    binbnds[idim][0] = -float('inf')
    binbnds[idim][-1] = float('inf')
    nbins.append(binbnds[idim].shape[0]-1)

# Total number of bins in the ndims-dimensional space
nstates = numpy.cumprod(nbins)[-1]

# Define variable values for N particles in ndims dimensions
p = numpy.random.normal(size=(N,ndims))

# Assign to bins along each dimension
binassign = ndims*[None]
for idim in range(ndims):
    binassign[idim] = numpy.digitize(p[:,idim],binbnds[idim]) - 1

# Shape (ndims, N): one column of bin indices per particle
binassign = numpy.array(binassign)

# multidimensional array with elements mapping from multidim to linear index 
# Two different arrays for C vs F ordering 
linind_C = numpy.arange(nstates).reshape(nbins,order='C') 
linind_F = numpy.arange(nstates).reshape(nbins,order='F') 
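
To see what the two orderings mean on a case small enough to read by eye, here is a sketch with a made-up 2x3 shape (the names `lin_C` and `lin_F` are only for this illustration):

```python
import numpy

shape = (2, 3)  # hypothetical small shape, for illustration only

# C order: the last index varies fastest
lin_C = numpy.arange(6).reshape(shape, order='C')
# [[0 1 2]
#  [3 4 5]]

# F order: the first index varies fastest
lin_F = numpy.arange(6).reshape(shape, order='F')
# [[0 2 4]
#  [1 3 5]]
```

The same multidimensional index therefore maps to different linear indices: position (0, 1) holds 1 in C order but 2 in F order.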

Now do the conversion:

# Fast conversion to linear index 
b_F = numpy.cumprod([1] + nbins)[:-1] 
b_C = numpy.cumprod([1] + nbins[::-1])[:-1][::-1] 

box_index_F = numpy.dot(b_F,binassign) 
box_index_C = numpy.dot(b_C,binassign) 

And check for correctness:

# Check the dot-product result against the look-up tables
print('Checking correct mapping for each particle F order')
for k in range(N):
    ii = box_index_F[k]
    jj = linind_F[tuple(binassign[:,k])]
    print('particle %d %s (%d %d)' % (k,ii == jj,ii,jj))

print('Checking correct mapping for each particle C order')
for k in range(N):
    ii = box_index_C[k]
    jj = linind_C[tuple(binassign[:,k])]
    print('particle %d %s (%d %d)' % (k,ii == jj,ii,jj))
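
As a cross-check that avoids building the look-up tables at all, the dot-product weights can be compared against NumPy's built-in numpy.ravel_multi_index. This is a sketch of an alternative technique, not part of the solution above; the bin counts and bin assignments are made-up values chosen so the arithmetic can be followed by hand.

```python
import numpy

nbins = [3, 4, 2]  # hypothetical non-uniform bin counts

# Hypothetical bin assignments, shape (ndims, N): one column per particle
binassign = numpy.array([[0, 2, 1],
                         [3, 0, 2],
                         [1, 1, 0]])

# Positional weights for each ordering
b_F = numpy.cumprod([1] + nbins)[:-1]              # [1, 3, 12]
b_C = numpy.cumprod([1] + nbins[::-1])[:-1][::-1]  # [8, 2, 1]

box_index_F = numpy.dot(b_F, binassign)
box_index_C = numpy.dot(b_C, binassign)

# Built-in equivalents
ref_F = numpy.ravel_multi_index(binassign, tuple(nbins), order='F')
ref_C = numpy.ravel_multi_index(binassign, tuple(nbins), order='C')

print(numpy.array_equal(box_index_F, ref_F))
print(numpy.array_equal(box_index_C, ref_C))
```

Both comparisons should print True. By default numpy.ravel_multi_index also raises an error on out-of-range indices, which the plain dot product does not.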

And for completeness, if you want to go back from the 1d index to the multidimensional index in a fast, vectorized way:

# Integer division (//) by the weights, then wrap by the bin counts
print('Convert C-style from linear to multi')
x = box_index_C.reshape(-1,1)
bassign_rev_C = x//b_C % nbins

print('Convert F-style from linear to multi')
x = box_index_F.reshape(-1,1)
bassign_rev_F = x//b_F % nbins

And check again:

print('Check C-order')
for k in range(N):
    ii = tuple(binassign[:,k])
    jj = tuple(bassign_rev_C[k,:])
    print(ii==jj,ii,jj)

print('Check F-order')
for k in range(N):
    ii = tuple(binassign[:,k])
    jj = tuple(bassign_rev_F[k,:])
    print(ii==jj,ii,jj)
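
The reverse direction can likewise be compared against the built-in numpy.unravel_index, which takes an `order` argument. The sketch below uses made-up bin counts and linear indices, and the names `manual` and `builtin` are only for illustration.

```python
import numpy

nbins = [3, 4, 2]  # hypothetical bin counts
b_C = numpy.cumprod([1] + nbins[::-1])[:-1][::-1]  # C-order weights [8, 2, 1]
box_index_C = numpy.array([7, 17, 12])             # hypothetical linear indices

# Manual decode: integer-divide by each weight, then wrap by the bin count
manual = box_index_C.reshape(-1, 1) // b_C % nbins  # shape (N, ndims)

# Built-in decode: returns one array per dimension, so stack and transpose
builtin = numpy.array(
    numpy.unravel_index(box_index_C, tuple(nbins), order='C')).T

print(numpy.array_equal(manual, builtin))
```

The integer division matters here: with true division the modulo would act on floats and the recovered indices would no longer be whole numbers.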
4

You can simply compute the index of each bin:

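nbins = 6  # bins per dimension (7 boundaries in the question's bbnds)
box_indices = numpy.dot(nbins**numpy.arange(ndims), binassign)

The scalar product simply computes 1*x0 + 6*x1 + 6*6*x2 + ..., which NumPy's dot() does very efficiently. The base must be the number of bins per dimension (6 in the question), not the number of dimensions, so that every bin gets a distinct index.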
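
Here is a small worked sketch of this positional weighting, assuming 6 bins per dimension as in the question's reshape(6,6,6,6,6), so the base is the bin count. The two particles' bin assignments are made up for illustration.

```python
import numpy

ndims = 5
nbins = 6  # bins per dimension, as in the question's reshape(6,6,6,6,6)

# Hypothetical bin assignments, shape (ndims, N): one column per particle
binassign = numpy.array([[1, 5],
                         [0, 5],
                         [5, 5],
                         [2, 5],
                         [3, 5]])

weights = nbins**numpy.arange(ndims)  # [1, 6, 36, 216, 1296]
box_indices = numpy.dot(weights, binassign)

# The first dimension gets the smallest weight, i.e. Fortran-style ordering
linind_F = numpy.arange(nbins**ndims).reshape((nbins,)*ndims, order='F')
print(box_indices[0] == linind_F[tuple(binassign[:, 0])])
```

The first particle maps to 1 + 0 + 180 + 432 + 3888 = 4501, and the second, which sits in the last bin of every dimension, maps to 6**5 - 1 = 7775.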

+0

Thanks, I generalized your suggestion in my own solution. – JoshAdel 2010-07-16 00:46:10