Finding the maximum match in a list

I have been trying to find the maximum result in a list, using the confidence value.

Example lists:

[[{u'categories': [u'health-beauty'], u'confidence': 0.3333333333333333}, 
{u'categories': [u'activities-events'], u'confidence': 0.6666666666666666}]]

should return the activities-events dictionary.

[[{u'categories': [u'home-garden'], u'confidence': 0.3333333333333333}, 
{u'categories': [u'None of These'], u'confidence': 0.3333333333333333}, 
{u'categories': [u'toys-kids-baby'], u'confidence': 0.3333333333333333}]]

should return all three, because they are equal.

[[{u'categories': [u'entertainment'], u'confidence': 1.0}]]

should return entertainment.

I tried using Python's max function:

seq = [x['confidence'] for x in d[0]] 
max(seq)

but that only returns the value, not the matching dictionary.

Source

2012-11-12 AlexZ

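What rule defines the "maximum result"? – 2012-11-12 08:30:41

Updated the question. Thanks @Tichodroma, I will go ahead and do that. – AlexZ

The question, and what you want, should be stated clearly. – raton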

You can find the maximum confidence, as in your own example, and then use filter to build a list of all records that have that maximum:

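max_conf = max(x['confidence'] for x in d[0])
# list() keeps the result a list on Python 3, where filter returns an iterator
max_records = list(filter(lambda x: x['confidence'] == max_conf, d[0]))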

As pointed out in the comments below, filter can be replaced with a list comprehension:

max_records = [x for x in d[0] if x['confidence'] == max_conf]
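
As a minimal sketch, the two steps above can be wrapped in a small helper that returns every record tied for the highest confidence. The name top_matches is hypothetical, and returning an empty list for empty input is an assumption rather than part of the original answer.

```python
def top_matches(records):
    """Return every record whose 'confidence' equals the maximum."""
    if not records:
        # Assumption: empty input yields an empty result instead of an error.
        return []
    max_conf = max(r['confidence'] for r in records)
    return [r for r in records if r['confidence'] == max_conf]


d = [[{'categories': ['home-garden'], 'confidence': 0.3333333333333333},
      {'categories': ['None of These'], 'confidence': 0.3333333333333333},
      {'categories': ['toys-kids-baby'], 'confidence': 0.3333333333333333}]]

tied = top_matches(d[0])
print([r['categories'] for r in tied])
```

Exact equality works here because the tied values are identical floats. If the confidences came from separate computations, a tolerance-based comparison might be needed instead.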

Source

2012-11-12 08:47:37 aquavitae

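You probably mean: `max_conf = max(x['confidence'] for x in d[0]); result = [x for x in d[0] if x['confidence'] == max_conf]` – jfs

No, I meant to use the filter function, although I see that I got it wrong, so I will correct it. Of course, a list comprehension is another way. – aquavitae

It is still incorrect: `max(d[0], key=lambda x: x['confidence'])` returns the whole dictionary, not just the `'confidence'` part. – jfs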

max(d[0], key=lambda x: x['confidence'])

This returns the whole element of d[0] that has the highest confidence value.

Another way:

import operator as op

# itemgetter looks up a dictionary key; attrgetter would look for an attribute and fail on a dict
max(d[0], key=op.itemgetter('confidence'))

Source

2012-11-12 08:28:45 eumiro

I was really hoping it would return all three in the 0.3333 example. But this will do. Thanks. – AlexZ

sorted(d[0], key=lambda k: k['confidence'])[-1]

One more approach. It also returns the whole element of d[0] that has the highest confidence value.
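
To illustrate why single-result approaches like these do not cover the tie case from the question, here is a small sketch. It relies on two documented behaviours: max returns the first maximal item it encounters, and sorted is stable, so sorted(...)[-1] picks the last of the tied items. The helper name by_confidence is hypothetical.

```python
records = [
    {'categories': ['home-garden'], 'confidence': 0.3333333333333333},
    {'categories': ['None of These'], 'confidence': 0.3333333333333333},
    {'categories': ['toys-kids-baby'], 'confidence': 0.3333333333333333},
]


def by_confidence(record):
    # Sort/compare key: the record's confidence value.
    return record['confidence']


# Both calls return a single record even though all three are tied.
first_tied = max(records, key=by_confidence)
last_tied = sorted(records, key=by_confidence)[-1]

print(first_tied['categories'])
print(last_tied['categories'])
```

With a three-way tie, the two calls return different records, and neither returns all three.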

Source

2012-11-12 08:42:51 alexvassel

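If you want to retrieve all matches with the highest confidence, max is not the right choice. You first need to sort the list by confidence (you can use sorted for this, with operator.itemgetter to retrieve the key), then group the elements by confidence (you can use itertools.groupby). Finally, return the group with the highest confidence: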

from itertools import groupby 
from operator import itemgetter 
groups = groupby(sorted(inlist[0], key = itemgetter(u'confidence'), reverse = True), 
       key = itemgetter(u'confidence')) 
[e[u'categories'] for e in next(groups)[-1]]

Example

>>> inlist = [[{u'categories': [u'health-beauty'], u'confidence': 0.3333333333333333}, {u'categories': [u'activities-events'], u'confidence': 0.6666666666666666}]]
>>> groups = groupby(sorted(inlist[0], key = itemgetter(u'confidence'), reverse = True), key = itemgetter(u'confidence'))
>>> [e[u'categories'] for e in next(groups)[-1]]
[[u'activities-events']]
>>> inlist = [[{u'categories': [u'home-garden'], u'confidence': 0.3333333333333333}, {u'categories': [u'None of These'], u'confidence': 0.3333333333333333}, {u'categories': [u'toys-kids-baby'], u'confidence': 0.3333333333333333}]]
>>> groups = groupby(sorted(inlist[0], key = itemgetter(u'confidence'), reverse = True), key = itemgetter(u'confidence'))
>>> [e[u'categories'] for e in next(groups)[-1]]
[[u'home-garden'], [u'None of These'], [u'toys-kids-baby']]
>>> inlist = [[{u'categories': [u'entertainment'], u'confidence': 1.0}]]
>>> groups = groupby(sorted(inlist[0], key = itemgetter(u'confidence'), reverse = True), key = itemgetter(u'confidence'))
>>> [e[u'categories'] for e in next(groups)[-1]]
[[u'entertainment']]
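
The same idea can be sketched as a reusable function that returns the whole records rather than only their categories. The name highest_confidence_group is hypothetical, and the empty-input behaviour is an assumption rather than something the answer specifies.

```python
from itertools import groupby
from operator import itemgetter


def highest_confidence_group(records):
    """Return all records sharing the highest 'confidence' value."""
    get_conf = itemgetter('confidence')
    # Sort descending so the first group produced by groupby is the maximum.
    ordered = sorted(records, key=get_conf, reverse=True)
    for _, group in groupby(ordered, key=get_conf):
        return list(group)
    # Assumption: an empty input produces an empty list.
    return []


inlist = [[{'categories': ['health-beauty'], 'confidence': 0.3333333333333333},
           {'categories': ['activities-events'], 'confidence': 0.6666666666666666}]]

best = highest_confidence_group(inlist[0])
print([r['categories'] for r in best])
```

Sorting makes this O(n log n), whereas finding the maximum and then filtering is two linear passes. For lists this small, the difference is unlikely to matter.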

Source

2012-11-12 08:55:28 Abhijit
