2017-02-16 43 views
0

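Find the closest key in a defaultdict

I use defaultdicts to store lists of values, where the keys are the periods in which the values can be observed. When looking up periods from a list of all periods of interest, I want to find the closest period that exists in my defaultdict (note: not all periods are stored in the defaultdict).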

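However, since defaultdicts are not sorted, the approach below does not return the right values.

Is there a different way to return the closest available key of a defaultdict?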

from collections import defaultdict 
import numpy as np 

def_dict = defaultdict(list) 
# entries that will be stored in the defaultdict 
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]} 

# store items from regular dict in defaultdict 
for k, v in reg_dict.items(): 
    def_dict[k] = v 

# Lookup periods 
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8] 

for period in periods: 

    # this approach does not return the right keys as defaultdicts are not sorted 
    closest_key = np.abs(np.array(list(def_dict.keys())) - period).argmin() 

    print("period: ", period, " - looked up key: ", closest_key) 

This returns the following:

period: -1 - looked up key: 0 
period: 0 - looked up key: 0 
period: 1 - looked up key: 0 
period: 2 - looked up key: 1 
period: 3 - looked up key: 1 
period: 4 - looked up key: 2 
period: 5 - looked up key: 2 
period: 6 - looked up key: 2 
period: 7 - looked up key: 2 
period: 8 - looked up key: 2 
+2

1) You don't really need a 'defaultdict'; an 'OrderedDict' would do the job, and 2) why don't you sort the keys? Could you edit your post to show the expected output?

+0

argmin returns the index of the key, not the key itself, so the result is correct. If you want the value, use 'min(closest_key)'.

Answers

1

As I understand it, you want output similar to this?

[0, 0, 0, 2, 2, 5, 5, 5, 5, 5] 

For that, the logic would be

closest_key = [min(def_dict.keys(), key = lambda x: abs(x - p)) for p in periods] 

Specifying the optional key argument of Python's built-in functions is very useful in situations like this.
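To illustrate how key drives the lookup, here is a minimal sketch of the same idea wrapped in a helper. The function name nearest_key and the empty-mapping check are my own assumptions for illustration, not part of the answer above. Note that on a tie, min returns the first minimal element in iteration order.

```python
def nearest_key(mapping, target):
    """Return the key of `mapping` closest to `target` (hypothetical helper)."""
    if not mapping:
        raise ValueError("mapping is empty")
    # min() ranks the keys by their distance to target, not by their own value
    return min(mapping, key=lambda k: abs(k - target))


data = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
print(nearest_key(data, 4))        # 5
print(data[nearest_key(data, 7)])  # ['k', 'h']
```

This works unchanged on a defaultdict, since iterating over any dict yields its keys.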

1

I agree with @septra that you need the Euclidean distance, but this can be achieved with numpy as well:

from collections import defaultdict 
import numpy as np 

def_dict = defaultdict(list) 
# entries that will be stored in the defaultdict 
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]} 

# store items from regular dict in defaultdict 
for k, v in reg_dict.items(): 
    def_dict[k] = v 

periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8] 
a = list(def_dict.keys()) 
for period in periods: 
    closest_key = np.sqrt(np.power(np.add(a, -period),2)).argmin() 
    # OR closest_key = np.abs(np.add(a, -period)).argmin() 

    print("period: ", period, " - looked up key: ", a[closest_key]) 
2

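With an OrderedDict and sorted keys, you could use a binary search. For a large number of keys, the lookup would be much faster than your current approach.

Since you want the closest key, you need to find the rightmost key below x and the leftmost key above x. Once you have found the index i of the rightmost key below x, the other candidate (the leftmost key above x) will be at index i+1.

You need to make sure that both indices are still inside your array.

Finally, you just need to compute the distance from each of these two values to x.

Relevant tools: bisect and np.searchsorted.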
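The steps above could be sketched with the standard-library bisect module roughly as follows. The helper name closest_sorted_key is hypothetical, and the sample data is borrowed from the question. Because bisect_left returns the index of the leftmost key that is not below x, the two candidates sit at i-1 and i here rather than at i and i+1.

```python
from bisect import bisect_left


def closest_sorted_key(sorted_keys, x):
    """Return the element of sorted_keys nearest to x (hypothetical helper)."""
    if not sorted_keys:
        raise ValueError("no keys to search")
    # i is the index of the leftmost key >= x, so i - 1 is the rightmost key < x
    i = bisect_left(sorted_keys, x)
    # keep only the candidate indices that fall inside the list
    candidates = [sorted_keys[j] for j in (i - 1, i) if 0 <= j < len(sorted_keys)]
    # pick the candidate with the smaller distance; ties go to the lower key
    return min(candidates, key=lambda k: abs(k - x))


reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]}
keys = sorted(reg_dict)  # sort once, then reuse for every lookup
periods = [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
print([closest_sorted_key(keys, p) for p in periods])
# [0, 0, 0, 2, 2, 5, 5, 5, 5, 5]
```

Sorting is done once up front; each lookup then costs O(log n) instead of a scan over all keys.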

1

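As Eric says, to do this efficiently you should use a binary search. However, if the number of keys is small, a simple linear search may be sufficient. There is no need to use a defaultdict or an OrderedDict; just sort the keys.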

import numpy as np 

# entries 
reg_dict = {0: ["a", "b"], 2: ["c", "d"], 5: ["k", "h"], -3: ["i", "l"]} 

keys = np.array(sorted(reg_dict.keys())) 
print('keys', keys) 

# Lookup periods 
periods = np.arange(-1, 9) 

for period in periods: 
    closest_key = keys[np.abs(keys - period).argmin()] 
    print("period: ", period, " - looked up key: ", closest_key) 

Output

keys [-3 0 2 5] 
period: -1 - looked up key: 0 
period: 0 - looked up key: 0 
period: 1 - looked up key: 0 
period: 2 - looked up key: 2 
period: 3 - looked up key: 2 
period: 4 - looked up key: 5 
period: 5 - looked up key: 5 
period: 6 - looked up key: 5 
period: 7 - looked up key: 5 
period: 8 - looked up key: 5
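If the number of keys does grow, the binary search mentioned earlier could be done for all periods at once with np.searchsorted. This is only a sketch under my own assumptions: the keys array is sorted and holds at least two keys, and ties are resolved toward the lower key.

```python
import numpy as np

keys = np.array([-3, 0, 2, 5])   # must already be sorted, with at least two keys
periods = np.arange(-1, 9)

# index of the leftmost key >= each period, clipped so idx - 1 and idx are both valid
idx = np.clip(np.searchsorted(keys, periods), 1, len(keys) - 1)
left, right = keys[idx - 1], keys[idx]

# choose the nearer neighbour for every period; ties go to the lower key
closest = np.where(periods - left <= right - periods, left, right)
print(closest.tolist())
# [0, 0, 0, 2, 2, 5, 5, 5, 5, 5]
```

Clipping the indices handles periods that lie below the smallest key or above the largest one, so no separate bounds check is needed.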