How to sort a text file by the string values of the last column using Python?


I have a problem sorting a large text file. The text file looks like this:

word, two words, 15, 988, anotherword, 99 
also some words, nope, 20, 122, characters, 39 
text, words words, 10, 300, more words, 9

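Each line ends with a newline character (\n).

I want to sort the file in descending order by the integer in the last column.

I am using the following code, which I found here on Stack Overflow: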

from operator import itemgetter

with open('sourcefile.txt') as fin:
    lines = [line.split(',') for line in fin]
lines.sort(key=itemgetter(5), reverse=True)
with open('sortedfile.txt', 'w') as fout:
    for el in lines:
        fout.write('{0}\n'.format(','.join(el)))

The problem with this solution is that the script sorts the last column alphabetically, like this:

word, two words, 15, 988, anotherword, 99 
text, words words, 10, 300, more words, 9 
also some words, nope, 20, 122, characters, 39

What would be a working solution to this problem?

Source

2016-04-14 David De Smedt

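You are sorting strings, so they are ordered alphabetically. If you want a numeric sort, make sure you are sorting integers. To do that, make sure the sixth element of each row in lines is actually an integer, by converting it before you sort:

for row in lines:
    row[5] = int(row[5])

Note that ','.join() only accepts strings, so the converted field has to be turned back with str() before the rows are written out.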
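The difference between the two orderings can be reproduced without touching any file. The following is a minimal sketch, assuming the three sample rows from the question; the helper name last_column_as_int is made up for illustration and is not part of the original code.

```python
# Sample rows copied from the question; in practice they would be read from the file.
rows = [
    "word, two words, 15, 988, anotherword, 99 \n",
    "also some words, nope, 20, 122, characters, 39 \n",
    "text, words words, 10, 300, more words, 9\n",
]


def last_column_as_int(line):
    """Return the last comma-separated field of a line as an integer (hypothetical helper)."""
    # int() ignores surrounding whitespace, including the trailing newline.
    return int(line.split(',')[-1])


# Comparing the last field as a string orders it character by character,
# which reproduces the unwanted result from the question.
by_string = sorted(rows, key=lambda line: line.split(',')[-1], reverse=True)

# Comparing the last field as an integer gives the numeric, descending order.
by_number = sorted(rows, key=last_column_as_int, reverse=True)
```

Because the key function only computes a value to compare, the rows themselves stay strings and can be written back unchanged.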

Source

2016-04-14 13:09:55 Joost

Your code, with some optimizations:

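with open('sourcefile.txt') as fin, open('sortedfile.txt', 'w') as fout:
    # Pair each original line with the integer from its last column.
    lines_and_numbers = [(line, int(line.rsplit(',', 1)[1])) for line in fin]
    # Sort on the integer (element 1) and write back the untouched line (element 0).
    for el in sorted(lines_and_numbers, key=lambda l: l[1], reverse=True):
        fout.write(el[0])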

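I create the list lines_and_numbers, which consists of tuples holding the original line as element 0 and the integer from the last column of that line as element 1.

Then I iterate over this list, sorted by element 1 of each tuple.

This way you don't have to join each split line back together, and you don't need to append another newline, because the original newline is still there.

I also replaced the imported function with a simple lambda expression.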
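The same idea can be wrapped in a small function that passes the integer conversion directly as the sort key instead of building tuples. This is only a sketch under a few assumptions: the function name sort_lines_by_last_int is hypothetical, blank lines are skipped, and a final line without a trailing newline is given one so it does not get glued to the next line after sorting.

```python
def sort_lines_by_last_int(src_path, dst_path):
    """Write the lines of src_path to dst_path, largest last-column integer first."""
    with open(src_path) as fin, open(dst_path, 'w') as fout:
        # Skip blank lines and make sure every kept line ends with exactly one '\n'.
        lines = [line.rstrip('\n') + '\n' for line in fin if line.strip()]
        # rsplit(',', 1) splits only at the last comma; int() tolerates the spaces.
        lines.sort(key=lambda line: int(line.rsplit(',', 1)[1]), reverse=True)
        fout.writelines(lines)
```

Calling sort_lines_by_last_int('sourcefile.txt', 'sortedfile.txt') would then do the whole job for the file names used in the question.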

Source

2016-04-14 13:14:55

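This works like a charm. It took me a while to understand what you did, but I think I get it now! Thanks.

@DavidDeSmedt Please don't write "thank you" comments; click the grey check mark on the left to accept the answer instead. You can learn more about how this site works in the [help] center or on the short [tour] page.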

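You can first split the file contents on newlines, then sort the resulting list by the integer in the last column. [::-1] reverses the list so that it ends up in descending order.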

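import re
with open('sample.txt', 'r') as fin, open('fout.txt', 'w') as fout:
    # splitlines() avoids the empty trailing string that split('\n') would produce
    lines = fin.read().splitlines()
    # Sort ascending on the last run of digits in each line, then reverse with [::-1]
    fout.write('\n'.join(sorted(lines, key=lambda x: int(re.findall(r'\d+', x)[-1]))[::-1]))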

Contents of the output file:

word, two words, 15, 988, anotherword, 99 
also some words, nope, 20, 122, characters, 39 
text, words words, 10, 300, more words, 9
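One detail worth knowing when choosing between [::-1] and reverse=True: they only agree when no two lines share the same number. The sketch below uses three made-up rows (not from the question) to show the difference. sorted() is stable, so reverse=True keeps tied rows in their original order, while reversing an ascending result with [::-1] flips them.

```python
# Hypothetical rows in which two lines share the same last-column value.
rows = ["a, 5", "b, 7", "c, 5"]


def last_int(line):
    """Integer after the last comma of a line."""
    return int(line.rsplit(',', 1)[1])


# Ascending sort followed by a reversal: tied rows come out in reverse input order.
with_slice = sorted(rows, key=last_int)[::-1]

# Descending sort: tied rows keep their input order because the sort is stable.
with_flag = sorted(rows, key=last_int, reverse=True)
```

For the sample file in the question the two forms give the same output, since every last-column value is distinct.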

Source

2016-04-14 13:17:15 Quinn
