2011-06-21

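3D interpolation of NumPy arrays without SciPy

I'm writing a plugin for an application that ships NumPy, but not SciPy. My plugin needs to interpolate data from one regular 3D grid onto another regular 3D grid. When running from source, this can be done very efficiently with scipy.ndimage or, if the user doesn't have SciPy installed, with a weave-generated .pyd that I wrote. Unfortunately, neither option is available if the user is running the binary distribution.

I've written a simple trilinear interpolation routine in Python that gives the correct result, but for the array sizes I'm using it takes a long time (~5 minutes). I'm wondering whether there is a way to speed it up using only the functionality in NumPy. Just like scipy.ndimage.map_coordinates, it takes a 3D input array and an array holding the x, y and z coordinates of each point to be interpolated.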

import numpy as np

def trilinear_interp(input_array, indices):
    """Evaluate the input_array data at the indices given"""

    output = np.empty(indices[0].shape)
    x_indices = indices[0]
    y_indices = indices[1]
    z_indices = indices[2]
    for i in np.ndindex(x_indices.shape):
        # Integer corner at or below the point (cast to int so it can index)
        x0 = int(np.floor(x_indices[i]))
        y0 = int(np.floor(y_indices[i]))
        z0 = int(np.floor(z_indices[i]))
        x1 = x0 + 1
        y1 = y0 + 1
        z1 = z0 + 1
        # Check if xyz1 is beyond array boundary:
        if x1 == input_array.shape[0]:
            x1 = x0
        if y1 == input_array.shape[1]:
            y1 = y0
        if z1 == input_array.shape[2]:
            z1 = z0
        # Fractional offsets of the point within its cell
        x = x_indices[i] - x0
        y = y_indices[i] - y0
        z = z_indices[i] - z0
        output[i] = (input_array[x0,y0,z0]*(1-x)*(1-y)*(1-z) +
                     input_array[x1,y0,z0]*x*(1-y)*(1-z) +
                     input_array[x0,y1,z0]*(1-x)*y*(1-z) +
                     input_array[x0,y0,z1]*(1-x)*(1-y)*z +
                     input_array[x1,y0,z1]*x*(1-y)*z +
                     input_array[x0,y1,z1]*(1-x)*y*z +
                     input_array[x1,y1,z0]*x*y*(1-z) +
                     input_array[x1,y1,z1]*x*y*z)

    return output

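Obviously the function is so slow because of the for loop over every point in 3D space. Is there some way to use slicing or vectorization magic to speed it up? Thanks.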
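For reference, the `indices` argument described above could be built along these lines when resampling one regular grid onto another. This is only a sketch: the grid sizes are made up, and it assumes both grids cover the same extent so that their corner points coincide.

```python
import numpy as np

# Hypothetical sizes: a 10x12x14 source grid resampled onto 4x5x6 points.
src_shape = (10, 12, 14)
dst_shape = (4, 5, 6)

# One 1-D axis per dimension, spanning the full source index range
# [0, n - 1] so that both grids share their corner points.
axes = [np.linspace(0, n - 1, m) for n, m in zip(src_shape, dst_shape)]

# indices[0], indices[1] and indices[2] hold the fractional x, y and z
# source coordinates of every destination point.
indices = np.array(np.meshgrid(*axes, indexing="ij"))

print(indices.shape)
```

The resulting array has one leading axis of length 3 followed by the destination grid shape, which is the layout `trilinear_interp` expects.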

Answers


It turns out that this is quite easy to vectorize.

import numpy as np

def trilinear_interp(input_array, indices):
    """Evaluate the input_array data at the indices given"""
    x_indices = indices[0]
    y_indices = indices[1]
    z_indices = indices[2]

    # Integer corner below each point. astype(int) truncates, which equals
    # floor for the non-negative coordinates expected here.
    x0 = x_indices.astype(int)
    y0 = y_indices.astype(int)
    z0 = z_indices.astype(int)
    x1 = x0 + 1
    y1 = y0 + 1
    z1 = z0 + 1

    # Check if xyz1 is beyond array boundary and clamp to the last index:
    x1[x1 == input_array.shape[0]] = input_array.shape[0] - 1
    y1[y1 == input_array.shape[1]] = input_array.shape[1] - 1
    z1[z1 == input_array.shape[2]] = input_array.shape[2] - 1

    # Fractional offsets of each point within its cell
    x = x_indices - x0
    y = y_indices - y0
    z = z_indices - z0
    output = (input_array[x0,y0,z0]*(1-x)*(1-y)*(1-z) +
              input_array[x1,y0,z0]*x*(1-y)*(1-z) +
              input_array[x0,y1,z0]*(1-x)*y*(1-z) +
              input_array[x0,y0,z1]*(1-x)*(1-y)*z +
              input_array[x1,y0,z1]*x*(1-y)*z +
              input_array[x0,y1,z1]*(1-x)*y*z +
              input_array[x1,y1,z0]*x*y*(1-z) +
              input_array[x1,y1,z1]*x*y*z)

    return output
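One way to sanity-check a vectorized version like the one above: trilinear interpolation should reproduce exactly any field that is linear in x, y and z. The sketch below assumes a compact restatement of the same idea that loops over the 8 cell corners. The name `trilinear` and the test field are illustrative only, not part of the original code.

```python
import itertools
import numpy as np

def trilinear(input_array, indices):
    """Compact restatement of the vectorized approach (hypothetical name)."""
    indices = np.asarray(indices, dtype=float)
    # Last valid index per axis, shaped to broadcast against indices
    upper = np.array(input_array.shape).reshape((3,) + (1,) * (indices.ndim - 1)) - 1
    lo = np.clip(np.floor(indices).astype(int), 0, upper)
    hi = np.minimum(lo + 1, upper)   # clamp the upper corner at the boundary
    frac = indices - lo
    output = np.zeros(indices.shape[1:])
    # Accumulate the weighted contribution of each of the 8 corners
    for corner in itertools.product((0, 1), repeat=3):
        idx = tuple(hi[a] if c else lo[a] for a, c in enumerate(corner))
        weight = 1.0
        for a, c in enumerate(corner):
            weight = weight * (frac[a] if c else 1 - frac[a])
        output += input_array[idx] * weight
    return output

# A field that is linear in each coordinate: f(x, y, z) = x + 2y + 3z
shape = (5, 6, 7)
gx, gy, gz = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
field = gx + 2.0 * gy + 3.0 * gz

# Random fractional points inside the grid
rng = np.random.default_rng(0)
points = rng.uniform(0, 1, size=(3, 1000)) * (np.array(shape)[:, None] - 1)
expected = points[0] + 2.0 * points[1] + 3.0 * points[2]

print(np.allclose(trilinear(field, points), expected))
```

The same check can be pointed at any other implementation by swapping the function being called.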

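Thanks very much for this post, and for following up on it. I've built on your vectorization to get another speed boost (at least with the data I'm working with)!

I'm working on image correlation, so I interpolate many different sets of coordinates within the same input_array.

Unfortunately I've made it a little more complicated, but if I can explain what I've done, the extra complication should a) justify itself and b) become clear. Your last line (output = ...) still has to look up a fair amount of data at non-contiguous locations in input_array, which makes it relatively slow.

Suppose my 3D data has size NxMxP. I decided to do the following: if I can get an (8 x (NxMxP)) matrix of pre-computed grey values for each point and its nearest neighbours, and I can also compute an ((NxMxP) x 8) matrix of coefficients (your first coefficient in the example above is (1-x)(1-y)(1-z)), then I can just multiply them together and I'm home free!

A nice bonus for me is that I can pre-compute the grey-value matrix and recycle it!

Here is a sample bit of code (pasted from two different functions, so it may not work out of the box, but it should serve as a good source of inspiration):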

import numpy

# Assumed dtype: the original snippet used DATA_DTYPE without defining it.
DATA_DTYPE = numpy.float64

def precut_2x2x2(input_array):
    # (Illustrative function name: the snippet was pasted from two functions.)
    # Pre-split the volume: for every cell, store the grey values of its
    # 8 corners along the last axis. Compute this once and recycle it.
    n, m, p = input_array.shape
    input_array_precut_2x2x2 = numpy.zeros((n-1, m-1, p-1, 8), dtype=DATA_DTYPE)
    input_array_precut_2x2x2[ :, :, :, 0 ] = input_array[ 0:n-1, 0:m-1, 0:p-1 ]
    input_array_precut_2x2x2[ :, :, :, 1 ] = input_array[ 1:n  , 0:m-1, 0:p-1 ]
    input_array_precut_2x2x2[ :, :, :, 2 ] = input_array[ 0:n-1, 1:m  , 0:p-1 ]
    input_array_precut_2x2x2[ :, :, :, 3 ] = input_array[ 0:n-1, 0:m-1, 1:p   ]
    input_array_precut_2x2x2[ :, :, :, 4 ] = input_array[ 1:n  , 0:m-1, 1:p   ]
    input_array_precut_2x2x2[ :, :, :, 5 ] = input_array[ 0:n-1, 1:m  , 1:p   ]
    input_array_precut_2x2x2[ :, :, :, 6 ] = input_array[ 1:n  , 1:m  , 0:p-1 ]
    input_array_precut_2x2x2[ :, :, :, 7 ] = input_array[ 1:n  , 1:m  , 1:p   ]
    return input_array_precut_2x2x2

def trilinear_interpolator_speedup(input_array, coords):
    # adapted from http://stackoverflow.com/questions/6427276/3d-interpolation-of-numpy-arrays-without-scipy
    # 2012.03.02 - heavy modifications, to vectorise the final calculation... it is now superfast.

    # IMPORTANT: input_array is the pre-split (N-1)x(M-1)x(P-1)x8 array built above.
    # coords is a 3xK array; it may hold non-integer positions (that's kind of
    # the idea), whereas coords_0 holds integer values.

    # work on a float copy so the caller's array is not modified
    coords = numpy.array(coords, dtype=DATA_DTYPE)

    # bring out-of-range points back to the extremities, separately per axis;
    # the original volume has one more sample per axis than the pre-split array
    n_cells = numpy.array(input_array.shape[0:3]).reshape(3, 1)
    coords = numpy.clip(coords, 0, n_cells)

    # a big array to hold all the coefficients for the trilinear interpolation
    all_coeffs = numpy.zeros((8, coords.shape[1]), dtype=DATA_DTYPE)

    # the "floored" coordinates x, y, z, capped at the last cell
    coords_0 = numpy.minimum(coords.astype(int), n_cells - 1)

    # make the input coordinates "local" to their cell
    coords = coords - coords_0

    # Calculate one minus these values, in order to be able to do a one-shot
    # calculation of the coefficients.
    one_minus_coords = 1 - coords

    # calculate those coefficients (same corner order as the pre-split array).
    all_coeffs[0] = one_minus_coords[0] * one_minus_coords[1] * one_minus_coords[2]
    all_coeffs[1] = coords[0]           * one_minus_coords[1] * one_minus_coords[2]
    all_coeffs[2] = one_minus_coords[0] * coords[1]           * one_minus_coords[2]
    all_coeffs[3] = one_minus_coords[0] * one_minus_coords[1] * coords[2]
    all_coeffs[4] = coords[0]           * one_minus_coords[1] * coords[2]
    all_coeffs[5] = one_minus_coords[0] * coords[1]           * coords[2]
    all_coeffs[6] = coords[0]           * coords[1]           * one_minus_coords[2]
    all_coeffs[7] = coords[0]           * coords[1]           * coords[2]

    # multiply 8 greyscale values * 8 coefficients, and sum them across the "8 coefficients" direction
    output_array = (input_array[coords_0[0], coords_0[1], coords_0[2]].T * all_coeffs).sum(axis=0)

    # and return it...
    return output_array

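I didn't split up the x, y and z coordinates, because there seemed to be no point in merging them again afterwards. There may be something in the code above that assumes cubic data (N = M = P), but I don't think so...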
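To illustrate the "pre-compute once, recycle many times" idea, here is a minimal self-contained sketch. The names `precut` and `interp_precut` are hypothetical, and the corner order follows `itertools.product`, so it differs from the order used in the code above. It assumes `coords` is a 3xK array of positions inside the original grid.

```python
import itertools
import numpy as np

# The 8 (dx, dy, dz) corner offsets of a cell
CORNERS = list(itertools.product((0, 1), repeat=3))

def precut(volume):
    """Stack the 8 corner values of every cell along a trailing axis."""
    n, m, p = volume.shape
    return np.stack(
        [volume[dx:n - 1 + dx, dy:m - 1 + dy, dz:p - 1 + dz]
         for dx, dy, dz in CORNERS],
        axis=-1,
    )

def interp_precut(cells, coords):
    """Trilinear interpolation from a pre-split (N-1, M-1, P-1, 8) array."""
    coords = np.asarray(coords, dtype=float)
    last_cell = np.array(cells.shape[:3]).reshape(3, 1) - 1
    # Cell index per point, capped so points on the upper faces use the last cell
    base = np.clip(np.floor(coords).astype(int), 0, last_cell)
    frac = coords - base
    # One coefficient per corner and point, shape (K, 8)
    coeffs = np.stack(
        [np.prod([frac[a] if c else 1 - frac[a] for a, c in enumerate(corner)],
                 axis=0)
         for corner in CORNERS],
        axis=-1,
    )
    values = cells[base[0], base[1], base[2]]   # one contiguous row of 8 per point
    return np.einsum("ki,ki->k", values, coeffs)

rng = np.random.default_rng(1)
volume = rng.random((6, 7, 8))
cells = precut(volume)                 # computed once...
for _ in range(3):                     # ...and reused for several coordinate sets
    coords = rng.uniform(0, 1, size=(3, 500)) * (np.array(volume.shape)[:, None] - 1)
    result = interp_precut(cells, coords)
```

The trade-off is memory: the pre-split array holds roughly eight times as many values as the original volume, so this pays off mainly when the same volume is sampled repeatedly.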

Let me know what you think!