2013-04-10 25 views
3

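I'm writing an Evil Hangman game in Python and I'm stuck. I'm trying to figure out how to sort words into families. For example, given a list of words like the one below, how do I put them into "families"?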

ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX 

Each word falls into one of several families, depending on where the E is:

- - - -, containing ALLY, COOL, GOOD 
- E - -, containing BETA and DEAL 
- - E -, containing FLEW and IBEX 
E - - E, containing ELSE 
- - - E, containing HOPE. 

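Is there a way to use a dictionary to map out which words belong to which family? Our class hasn't covered dictionaries yet, but I read ahead and believe it's possible. The file I'm using has about 170,000 words; the list above is just a simple example.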

+2

Are they all 4-letter words? – 2013-04-10 21:42:21

+1

What do you mean when you say 'families'? – That1Guy 2013-04-10 21:51:33

Answers

3
from itertools import groupby 

words = ['ALLY', 'BETA', 'COOL', 'DEAL', 'ELSE', 'FLEW', 'GOOD', 'HOPE', 'IBEX'] 
e_locs = sorted(([c == 'E' for c in w], i) for i, w in enumerate(words)) 
result = [[words[i] for x, i in g] for k, g in groupby(e_locs, lambda x: x[0])] 

Result:

>>> result 
[['ALLY', 'COOL', 'GOOD'], ['HOPE'], ['FLEW', 'IBEX'], ['BETA', 'DEAL'], ['ELSE']] 
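
One detail worth spelling out: `groupby` only merges adjacent items that share a key, which is why the list is sorted first. Here is a minimal sketch of the difference on a shortened word list; the helper name `e_pattern` is made up for this illustration and is not part of the answer above.

```python
from itertools import groupby

words = ['ALLY', 'BETA', 'COOL', 'DEAL']

def e_pattern(word):
    # Hypothetical helper: a tuple of booleans marking where the E's are.
    return tuple(c == 'E' for c in word)

# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_groups = [list(g) for _, g in groupby(words, key=e_pattern)]

# Sorting by the same key first makes equal keys adjacent.
sorted_groups = [list(g) for _, g in
                 groupby(sorted(words, key=e_pattern), key=e_pattern)]

print(unsorted_groups)
print(sorted_groups)
```

The second call yields one group per family; the first yields a new group every time the pattern changes from one word to the next.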

Here is a version that also keeps track of where the E's are:

words = ['ALLY', 'BETA', 'COOL', 'DEAL', 'ELSE', 'FLEW', 'GOOD', 'HOPE', 'IBEX'] 
result = {} 
for word in words: 
    key = ' '.join('E' if c == 'E' else '-' for c in word) 
    if key not in result: 
        result[key] = []
    result[key].append(word) 

Result:

>>> import pprint
>>> pprint.pprint(result)
{'- - - -': ['ALLY', 'COOL', 'GOOD'],
 '- - - E': ['HOPE'],
 '- - E -': ['FLEW', 'IBEX'],
 '- E - -': ['BETA', 'DEAL'],
 'E - - E': ['ELSE']}

To pick the largest family (using the first version, where result is a list of lists):

>>> max(result, key=len) 
['ALLY', 'COOL', 'GOOD'] 

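To pick the largest family with the second version, you can simply pass result.values() instead of result. To get a tuple of the E positions together with the family, use the following: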

>>> max(result.items(), key=lambda k_v: len(k_v[1])) 
('- - - -', ['ALLY', 'COOL', 'GOOD']) 
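
For the game itself the guessed letter will not always be E, so the same idea can be wrapped in a function that takes the letter as a parameter. This is only a sketch: the names `split_into_families` and `largest_family` are invented here, and ties are broken by whichever family was created first, which may not be the rule your assignment asks for.

```python
def split_into_families(words, letter):
    """Map each reveal pattern (e.g. '-E--') to the words that produce it."""
    families = {}
    for word in words:
        pattern = ''.join(c if c == letter else '-' for c in word)
        families.setdefault(pattern, []).append(word)
    return families

def largest_family(words, letter):
    """Return (pattern, words) for the biggest family for this guess."""
    families = split_into_families(words, letter)
    # max() keeps the first maximal item, so ties go to the family seen first
    # (dicts preserve insertion order in Python 3.7+).
    return max(families.items(), key=lambda item: len(item[1]))

words = ['ALLY', 'BETA', 'COOL', 'DEAL', 'ELSE', 'FLEW', 'GOOD', 'HOPE', 'IBEX']
pattern, remaining = largest_family(words, 'E')
print(pattern, remaining)
```

After each guess you would replace the word list with `remaining` and repeat.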
+0

Awesome, thanks a lot. Now, if I wanted to pick the largest family, how would I do that? – Bob 2013-04-10 22:15:27

+0

'key=len' is better than 'key=lambda x: len(x)'... – 2013-04-10 22:41:46

+0

@BlaXpirit Silly me, thanks! – 2013-04-10 22:48:05

0

Using regular expressions, you could do something like this:

import re 

def into_families(words):
    # add as many families here as you want
    families = {
        '....': re.compile('[^E]{4}'),
        '...E': re.compile('[^E]{3}E'),
        '..E.': re.compile('[^E]{2}E[^E]'),
        '.E..': re.compile('[^E]E[^E]{2}'),
        'E..E': re.compile('E[^E]{2}E'),
    }
    # fullmatch (Python 3.4+) requires the whole word to fit the pattern
    return {k: [w for w in words if r.fullmatch(w)] for k, r in families.items()}

Or, if you want to create the regular expressions dynamically:

def into_families(words):
    # one family name per distinct E pattern, e.g. '.E..'
    family_names = {''.join('E' if x == 'E' else '.' for x in w) for w in words}
    families = {x: re.compile(x.replace('.', '[^E]')) for x in family_names}
    # fullmatch (Python 3.4+) requires the whole word to fit the pattern
    return {k: [w for w in words if r.fullmatch(w)] for k, r in families.items()}
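
One caveat if the word list mixes lengths: `re.match` only anchors at the start of the string, so a four-character pattern also accepts the first four letters of a longer word. A small sketch of the difference, using `re.fullmatch` (available from Python 3.4); the word `BETAS` is added here purely for illustration.

```python
import re

pattern = re.compile('[^E]E[^E]{2}')   # the '.E..' family

words = ['BETA', 'DEAL', 'BETAS', 'HOPE']

# match() succeeds as soon as a prefix of the word fits the pattern.
prefix_hits = [w for w in words if pattern.match(w)]

# fullmatch() requires the entire word to fit.
whole_hits = [w for w in words if pattern.fullmatch(w)]

print(prefix_hits)
print(whole_hits)
```

Filtering the list down to a single word length beforehand avoids the problem as well.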
1
In [1]: from itertools import groupby 

In [2]: import string 

In [3]: words = "ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX".split() 

In [4]: table = str.maketrans('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
    ...:       '????E?????????????????????') 

In [5]: f = lambda w: w.translate(table) 

In [6]: for k,g in groupby(sorted(words, key=f), f): 
    ...:     print(k, list(g))
    ...:  
???? ['ALLY', 'COOL', 'GOOD'] 
???E ['HOPE'] 
??E? ['FLEW', 'IBEX'] 
?E?? ['BETA', 'DEAL'] 
E??E ['ELSE'] 

# to get the biggest group 
In [7]: max((list(g) for _,g in groupby(sorted(words, key=f), f)), key=len) 
Out[7]: ['ALLY', 'COOL', 'GOOD'] 
0
from collections import defaultdict 
import re 

words = 'ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX'.split() 

groups = defaultdict(list) 

for word in words: 
    indices = tuple(m.start() for m in re.finditer('E', word)) 
    groups[indices].append(word) 

for k, v in sorted(groups.items()):
    # rebuild the display pattern from the tuple of E positions
    tpl = ['E' if i in k else '-' for i in range(4)]
    print(' '.join(tpl), ' '.join(v))
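
Since the real word list has about 170,000 entries of varying length, you would probably filter by length while reading it. A rough sketch under a few assumptions: the function name `families_by_length` is made up, and the file is assumed to hold one word per line.

```python
from collections import defaultdict

def families_by_length(lines, length, letter='E'):
    """Group words of the given length by the positions of `letter`."""
    groups = defaultdict(list)
    for line in lines:
        word = line.strip().upper()
        if len(word) != length:
            continue  # skip blank lines and words of other lengths
        indices = tuple(i for i, c in enumerate(word) if c == letter)
        groups[indices].append(word)
    return dict(groups)

# Any iterable of lines works; a list stands in for the file here.
sample = ['ally\n', 'beta\n', 'hello\n', 'else\n', '\n']
print(families_by_length(sample, 4))

# With a real file (the name 'words.txt' is hypothetical):
# with open('words.txt') as fh:
#     families = families_by_length(fh, 4)
```

The keys are tuples of indices, as in the answer above, so `()` is the family of words that do not contain the letter at all.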