
Python: combining argsort and masking to get the nearest values in a moving window

I have some code for calculating missing values in an image, based on neighbouring values within a 2D circular window. It also uses the values from one or more temporally-adjacent images at the same locations (i.e. the same 2D window shifted along the third dimension).

For each location that is missing, I need to calculate the value based not necessarily on all the values available in the whole window, but only on the spatially-nearest n cells that do have values (in both images / Z-axis positions), where n is some value less than the total number of cells in the 2D window.

At the moment it is much faster to calculate from everything in the window, because my means of sorting the data to get the nearest n cells with data is the slowest part of the function, since it has to be repeated every time even though the distances in terms of window coordinates do not change. I am not sure this is necessary, and feel I must be able to get the sorted distances once and then mask them in the process of only selecting the available cells.

Here is my code for selecting the data to use within a window around the gap cell location:

import numpy as np 

# radius will in reality be ~100 
radius = 2 
y,x = np.ogrid[-radius:radius+1, -radius:radius+1] 
dist = np.sqrt(x**2 + y**2) 
circle_template = dist > radius 

# this will in reality be a very large 3 dimensional array 
# representing daily images with some gaps, indicated by 0s 
dataStack = np.zeros((2,5,5)) 
dataStack[1] = (np.random.random(25) * 100).reshape(dist.shape) 
dataStack[0] = (np.random.random(25) * 100).reshape(dist.shape) 

testdata = dataStack[1] 
alternatedata = dataStack[0] 
random_gap_locations = (np.random.random(25) * 30).reshape(dist.shape) > testdata 
testdata[random_gap_locations] = 0 
testdata[radius, radius] = 0 

# in reality we will go through every gap (zero) location in the data 
# for each image and for each gap use slicing to get a window of 
# size (radius*2+1, radius*2+1) around it from each image, with the 
# gap being at the centre i.e. 
# testgaplocation = [radius, radius] 
# and the variables testdata, alternatedata below will refer to these 
# slices 

locations_to_exclude = np.logical_or(circle_template, 
                                     np.logical_or(testdata == 0, alternatedata == 0)) 
# the places that are inside the circular mask and where both images 
# have data 
locations_to_include = ~locations_to_exclude 
number_available = np.count_nonzero(locations_to_include) 

# we only want to do the interpolation calculations from the nearest n 
# locations that have data available, n will be ~100 in reality 
number_required = 3 

available_distances = dist[locations_to_include] 
available_data = testdata[locations_to_include] 
available_alternates = alternatedata[locations_to_include] 

if number_available > number_required: 
    # In this case we need to find the closest number_required of elements, based 
    # on distances recorded in dist, from available_data and available_alternates 
    # Having to repeat this argsort for each gap cell calculation is slow and feels 
    # like it should be avoidable 
    sortedDistanceIndices = available_distances.argsort(kind='mergesort', axis=None) 
    requiredIndices = sortedDistanceIndices[0:number_required] 
    selected_data = np.take(available_data, requiredIndices) 
    selected_alternates = np.take(available_alternates , requiredIndices) 
else: 
    # we just use available_data and available_alternates as they are... 
    pass 
# now do stuff with the selected data to calculate a value for the gap cell 

This works, but fully half of the total time of the function is taken up by the argsort of the masked spatial distance data (~900 µs out of a total of 1.4 ms; and this function will be run tens of billions of times, so it is an important difference!).

I am sure I must be able to just do this argsort once, outside of the function, when the spatial distance window is originally set up, and then include those sort indices in the masking, to get the first howManyToCalculate indices without having to re-do the sort. The answer probably involves putting the various bits of data we are extracting into a record array, but I cannot figure out how, if so. Can anyone see how I can make this part of the process more efficient?


Your code is really hard to read... You may want to read [PEP8](http://legacy.python.org/dev/peps/pep-0008/) and follow it: it facilitates sharing code with other Python programmers. – Jaime 2014-10-16 16:15:51


I agree with Jaime that this is hard to read, especially the code, but the description also leaves some room for interpretation. So I won't venture an answer, but here are some tools I would check out if I were you (if I understand your problem at least vaguely correctly). [sklearn.feature_extraction.image.extract_patches](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/image.py#L238) can give you views on your patches, which you can then mask. It creates a copy, so be careful about memory issues. – eickenberg 2014-10-16 19:22:27
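A minimal sketch of the patch-extraction idea from that comment, using the public helper extract_patches_2d (the linked extract_patches is its lower-level relative); the array contents and sizes here are made up for illustration:

import numpy as np 
from sklearn.feature_extraction.image import extract_patches_2d 

image = np.random.random((5, 5)) * 100   # stand-in for one 2D image 
diam = 3                                 # window size, i.e. radius*2+1 

# every diam x diam patch of the image, shape (n_patches, diam, diam) 
patches = extract_patches_2d(image, (diam, diam)) 

# e.g. mask out the corners of every patch with a circular template 
y, x = np.ogrid[-1:2, -1:2] 
circle = np.sqrt(x**2 + y**2) > 1 
masked = np.ma.masked_array(patches, mask=np.broadcast_to(circle, patches.shape)) 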


You might also be interested in a seemingly completely unrelated function which compensates for missing values using dilation. It won't give you your exact result, but it may be a good proxy: [nilearn.masking._extrapolate_out_img](https://github.com/nilearn/nilearn/blob/fd7e7a7186dca43d0ef5ebd19990b0751d476bda/nilearn/masking.py#L65) – eickenberg 2014-10-16 19:27:04
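To give the flavour of that dilation idea (the nilearn helper is more careful; this sketch just takes the neighbourhood maximum with scipy.ndimage, and the image and gap here are made up):

import numpy as np 
from scipy import ndimage 

img = np.random.random((5, 5)) * 100 
img[2, 2] = 0                            # a gap, indicated by 0 

# propagate neighbouring values into the gap with a grey dilation, 
# keeping the original values wherever data was present 
dilated = ndimage.grey_dilation(img, size=(3, 3)) 
filled = np.where(img == 0, dilated, img) 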

Answers


So, you want to do the sort outside of the loop:

sorted_dist_idcs = dist.argsort(kind='mergesort', axis=None) 

Then using some variables from the original code, this is what I could come up with, though it still feels like a major round-trip...

# reorder the inclusion mask into distance order, then keep only the 
# sorted flat indices whose locations are usable 
loc_to_incl_sorted = locations_to_include.take(sorted_dist_idcs) 
sorted_dist_idcs_to_incl = sorted_dist_idcs[loc_to_incl_sorted] 

required_idcs = sorted_dist_idcs_to_incl[:number_required] 
selected_data = testdata.take(required_idcs) 
selected_alternates = alternatedata.take(required_idcs) 

Note that required_idcs refers to locations in testdata and not available_data as in the original code. And in this snippet I used take for the purpose of conveniently indexing the flattened arrays.
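As a minimal illustration of that flattened-index pattern (values made up), take with flat indices is equivalent to indexing the raveled array:

import numpy as np 

a = np.arange(25).reshape(5, 5) 
flat_idcs = np.array([7, 0, 13])   # flat (raveled) positions 
np.take(a, flat_idcs)              # array([ 7,  0, 13]) 
a.ravel()[flat_idcs]               # same result 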


Thanks! This answered the question as posed in my edited version, so I have accepted the answer. However, it didn't quite work against the 'real' data when the gap is < radius from the edge of the data. In those cases testdata and alternatedata are taken as slices of the original (diam * diam) size, but trimmed to make sure they don't overlap the edges of the data. The numbers in sorted_dist_idcs_to_incl then no longer correspond to the indices of the trimmed array slices. I have worked out a solution based on your input though, which I will post shortly. – harrygibson 2014-10-20 12:12:30


@moarningsun – thanks for the comment and answer. These got me on the right track, but they didn't quite work for me when the gap is < radius from the edge of the data: in this case I use a window around the gap cell which is 'trimmed' to the data bounds. In that situation the indices reflect the 'full' window and therefore can't be used to select cells from the bounded window.

Unfortunately I edited that part of my code out when I clarified the original question, but it has turned out to be relevant.

I have realised that if you use argsort again on the output of argsort, you get ranks; i.e. the position each item would have when the overall array was sorted. We can safely mask these and then take the smallest num_required of them (and do this on a structured array to get the corresponding data at the same time).
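A minimal sketch of that double-argsort trick (values made up):

import numpy as np 

vals = np.array([30., 10., 20.]) 
order = vals.argsort()     # array([1, 2, 0]): indices that would sort vals 
ranks = order.argsort()    # array([2, 0, 1]): each item's position in the sort 
# ranks stays aligned with vals, so it can be masked just like the data 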

This means another sort inside the loop, but in fact we can use a partition rather than a full sort, because all we need are the smallest num_required items. If num_required is substantially less than the number of data items, this is much faster than doing the argsort.

For example, with num_required = 80 and num_available = 15000, the full argsort takes ~900 µs whereas argpartition followed by index and slice to get the first 80 takes ~110 µs. We still need to do the argsort to get the ranks at the outset (rather than just partitioning based on the distances) in order to get the stability of the mergesort, and thereby get the 'right ones' when the distances are not unique.
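A small sketch of the distinction (sizes as above, data made up): argpartition only guarantees that the k smallest items come first, not that they are sorted among themselves, which is all that is needed here:

import numpy as np 

ranks = np.random.permutation(15000)   # stand-in for the precomputed ranks 
k = 80 

by_sort = ranks.argsort()[:k]                 # pays for a full sort 
by_part = np.argpartition(ranks, k)[:k]       # much cheaper when k << n 

# both select the same k smallest ranks, just possibly in a different order 
assert set(ranks[by_sort]) == set(ranks[by_part]) 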

My code as it stands below now runs in ~610 µs on real data, including the actual calculations that aren't shown here. I am happy with that now, but there seem to be several other apparently-minor factors that can have an influence on the runtime which is hard to understand.

For example, putting circle_template in the structured array alongside ranks, dist, and another field not shown here, doubles the overall function runtime (even if we never access circle_template in the loop!). Even worse, using np.partition on the structured array with order=['ranks'] increases the overall function runtime by almost 2 orders of magnitude compared with using np.argpartition as shown below!

import numpy as np 

# radius will in reality be ~100 
radius = 2 
y,x = np.ogrid[-radius:radius+1, -radius:radius+1] 
dist = np.sqrt(x**2 + y**2) 
circle_template = dist > radius 
ranks = dist.argsort(axis=None, kind='mergesort').argsort().reshape(dist.shape) 
diam = radius * 2 + 1 

# this will in reality be a very large 3 dimensional array 
# representing daily images with some gaps, indicated by 0s 
dataStack = np.zeros((2,5,5)) 
dataStack[1] = (np.random.random(25) * 100).reshape(dist.shape) 
dataStack[0] = (np.random.random(25) * 100).reshape(dist.shape) 

# putting circle_template in this array too doubles overall function runtime! 
fullWindowArray = np.zeros((diam,diam), dtype=[('ranks', ranks.dtype.str), 
                                               ('thisdata', dataStack.dtype.str), 
                                               ('alternatedata', dataStack.dtype.str), 
                                               ('dist', dist.dtype.str)]) 
fullWindowArray['ranks'] = ranks 
fullWindowArray['dist'] = dist 

testdata = dataStack[1] 
alternatedata = dataStack[0] 
random_gap_locations = (np.random.random(25) * 30).reshape(dist.shape) > testdata 
testdata[random_gap_locations] = 0 
testdata[radius, radius] = 0 

# in reality we will loop here to go through every gap (zero) location in the data 
# for each image 
gapz, gapy, gapx = 1, radius, radius 
desLeft, desRight = gapx - radius, gapx + radius+1 
desTop, desBottom = gapy - radius, gapy + radius+1 
extentB, extentR = dataStack.shape[1:] 

# handle the case where the gap is < search radius from the edge of 
# the data. If this is the case, we can't use the full 
# diam * diam window 
dataL = max(0, desLeft) 
maskL = 0 if desLeft >= 0 else abs(dataL - desLeft) 
dataT = max(0, desTop) 
maskT = 0 if desTop >= 0 else abs(dataT - desTop) 
dataR = min(desRight, extentR) 
maskR = diam if desRight <= extentR else diam - (desRight - extentR) 
dataB = min(desBottom,extentB) 
maskB = diam if desBottom <= extentB else diam - (desBottom - extentB) 

# get the slice that we will be working within 
# ranks and dist are already populated; circle_template is kept separate 
boundedWindowArray = fullWindowArray[maskT:maskB,maskL:maskR] 
boundedWindowArray['alternatedata'] = alternatedata[dataT:dataB, dataL:dataR] 
boundedWindowArray['thisdata'] = testdata[dataT:dataB, dataL:dataR] 

# circle_template is indexed with the same mask bounds as the window, 
# since it is not stored in the structured array (see note above) 
locations_to_exclude = np.logical_or(circle_template[maskT:maskB, maskL:maskR], 
                                     np.logical_or(boundedWindowArray['thisdata'] == 0, 
                                                   boundedWindowArray['alternatedata'] == 0)) 
# the places that are inside the circular mask and where both images 
# have data 
locations_to_include = ~locations_to_exclude 
number_available = np.count_nonzero(locations_to_include) 

# we only want to do the interpolation calculations from the nearest n 
# locations that have data available, n will be ~100 in reality 
number_required = 3 

data_to_use = boundedWindowArray[locations_to_include] 

if number_available > number_required: 
    # argpartition seems to be v fast when number_required is 
    # substantially < data_to_use.size 
    # But partition on the structured array itself with order=['ranks'] 
    # is almost 2 orders of magnitude slower! 

    reqIndices = np.argpartition(data_to_use['ranks'], number_required)[:number_required] 
    data_to_use = np.take(data_to_use,reqIndices) 
else: 
    # we just use data_to_use as it is... 
    pass 
# now do stuff with the selected data to calculate a value for the gap cell
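For reference, a minimal sketch of the two approaches compared above, with made-up sizes (the timings will vary; the point is only what each call operates on):

import numpy as np 

n, k = 15000, 80 
arr = np.zeros(n, dtype=[('ranks', 'i8'), ('thisdata', 'f8')]) 
arr['ranks'] = np.random.permutation(n) 

# slow path: partition the whole structured array by the 'ranks' field 
smallest_slow = np.partition(arr, k, order=['ranks'])[:k] 

# fast path: argpartition just the 'ranks' field, then take the records 
idcs = np.argpartition(arr['ranks'], k)[:k] 
smallest_fast = np.take(arr, idcs) 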