2013-11-28 18 views
0

Alphanumeric sorting of a list of lists in Python

I am currently trying to sort a list of lists of the following form:

[["Chr1", "949699", "949700"],["Chr11", "3219", "444949"], 
["Chr10", "699", "800"],["Chr2", "232342", "235345234"], 
["ChrX", "4567", "45634"],["Chr1", "950000", "960000"]] 

Using the built-in sorted(), I get:

[['Chr1', '949699', '949700'], ['Chr1', '950000', '960000'], ['Chr10', '699', '800'], ['Chr11', '3219', '444949'], ['Chr2', '232342', '235345234'], ['ChrX', '4567', '45634']]

But I want "Chr2" to come before "Chr10". My current solution adapts some code from this page: Does Python have a built in function for string natural sort?

My current solution looks like this:

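import re

def naturalSort(l):
    # Digit runs compare as integers; everything else compares case-insensitively.
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    # Split on digit runs and keep them, so "Chr10" becomes ['chr', 10, ''].
    alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
    if l and isinstance(l[0], list):
        # List of lists: build one natural key per field of each row.
        return sorted(l, key=lambda k: [alphanum_key(x) for x in k])
    else:
        return sorted(l, key=alphanum_key)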

which yields the correct order:

[['Chr1', '949699', '949700'], ['Chr1', '950000', '960000'], ['Chr2', '232342', '235345234'], ['Chr10', '699', '800'], ['Chr11', '3219', '444949'], ['ChrX', '4567', '45634']] 

Is there a better way to do this?
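To see why the approach above gives the desired order, it can help to print the keys it compares. This is only a sketch: the standalone `alphanum_key` function below is the inner lambda from the question pulled out for illustration, not a separate library function.

```python
import re

def alphanum_key(text):
    # Same splitting rule as in the question: digit runs become ints,
    # everything else becomes lowercased text.
    return [int(c) if c.isdigit() else c.lower()
            for c in re.split('([0-9]+)', text)]

print(alphanum_key("Chr2"))    # ['chr', 2, '']
print(alphanum_key("Chr10"))   # ['chr', 10, '']
print(alphanum_key("ChrX"))    # ['chrx']

# Lists compare element by element, so 2 < 10 decides the order here,
# whereas the plain strings "Chr10" < "Chr2" compare character by character.
print(alphanum_key("Chr2") < alphanum_key("Chr10"))   # True
```

Because re.split with a capturing group always alternates text and digit runs, the same position in two keys always holds the same type, so the comparison also works on Python 3.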

+0

This is called "natural sorting". –

+0

Ah.. but I think this may not be a dupe, since he tried to write it himself. This question might be a better fit for http://codereview.stackexchange.com, though. – aIKid

+0

I referenced the natural sort page. I am asking specifically how to sort a list of lists. – Megatron

Answers

0

Is it something like this:

In [1]: l = [["Chr1", "949699", "949700"],["Chr11", "3219", "444949"],["Chr10", "699", "800"],["Chr2", "232342", "235345234"],["ChrX", "4567", "45634"],["Chr1", "950000", "960000"]] 

In [2]: sorted(l, key=lambda x: int(x[0].replace('Chr', '')) if x[0].replace('Chr', '').isdigit() else x[0]) 
Out[2]: 
[['Chr1', '949699', '949700'], 
['Chr1', '950000', '960000'], 
['Chr2', '232342', '235345234'], 
['Chr10', '699', '800'], 
['Chr11', '3219', '444949'], 
['ChrX', '4567', '45634']] 

Or a more elegant variant:

sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if re.findall(r'\d+$', x[0]) else x[0]) 
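Note that both keys above return an int for numbered entries and a str for "ChrX". Python 2 can order such mixed values, but Python 3 raises TypeError when it has to compare them. A minimal sketch of one way around that, assuming Python 3, is to return tuples with a leading tag so that numbers and names are never compared directly. The names `chrom_key` and `prefix` are hypothetical and not part of this answer.

```python
def chrom_key(row, prefix="Chr"):
    # Hypothetical helper: sorts on the first field only, like the
    # one-liners above, and strips an optional prefix.
    name = row[0]
    if name.startswith(prefix):
        name = name[len(prefix):]
    # Tag 0 puts numbered entries first; tag 1 groups named ones after them.
    if name.isdecimal():
        return (0, int(name), "")
    return (1, 0, name)

rows = [["Chr1", "949699", "949700"], ["Chr11", "3219", "444949"],
        ["Chr10", "699", "800"], ["Chr2", "232342", "235345234"],
        ["ChrX", "4567", "45634"], ["Chr1", "950000", "960000"]]

for row in sorted(rows, key=chrom_key):
    print(row)
```

Since only the first field is part of the key, rows that share a name keep their input order (sorted() is stable). The same key also accepts bare names such as "1", "11" and "X".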
+0

The input is not always of this form. Sometimes it can also be just "1", "2", "11", "X" without the "Chr" prefix. – Megatron

+0

Changed the sort to something like 'sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if [i for i in x[0] if i.isdigit()] else x[0])' – greg

+0

A more fun variant: 'import re; sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if re.findall(r'\d+$', x[0]) else x[0])' – greg

0

Here is a more compact solution:

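import re
natkey = lambda e: [x or int(y) for x, y in re.findall(r'(\D+)|(\d+)', e)]
# data is the list of lists from the question; a list (not map) keeps the key comparable on Python 3
print(sorted(data, key=lambda item: [natkey(field) for field in item]))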
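One caveat for Python 3: a key that yields an int for some tokens and a str for others raises TypeError as soon as the two meet in the same position, which happens with bare names such as "1" and "X" mentioned in the comments above. A possible variant, offered only as a sketch, tags every token so that numbers and text are never compared with each other. The sample `data` below is made up for illustration.

```python
import re

def natkey(text):
    # Each token becomes a (tag, number, string) tuple: numeric tokens get
    # tag 0 and sort before text tokens, which get tag 1.
    return [(0, int(tok), "") if tok.isdecimal() else (1, 0, tok)
            for tok in re.findall(r'\d+|\D+', text)]

# Made-up sample without the "Chr" prefix.
data = [["11", "3219"], ["X", "4567"], ["2", "232342"],
        ["1", "950000"], ["1", "949699"]]

result = sorted(data, key=lambda item: [natkey(field) for field in item])
print(result)
# [['1', '949699'], ['1', '950000'], ['2', '232342'], ['11', '3219'], ['X', '4567']]
```

The cost is a slightly bulkier key. The benefit is that every field of every row takes part in the comparison and no mixed-type comparison can occur.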