For loop to list comprehension

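Hey all, I have some code that reads certain lines from a file, and I was wondering whether it would run faster as a list comprehension or as a generator expression/function. If it would run faster, how would the code look? Still learning Python. Thanks for your help.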

input = open('C:/.../list.txt', 'r') 
output = open('C:/.../output.txt', 'w') 

x=0 

for line in input:
    x = x + 1
    if x > 2 and x < 5:
        output.write(line)

Lines from the list file go into the new file.

Output:

3
4

Source

2011-03-17 John_U262D

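Why is performance the question here? Shouldn't you be learning how to write understandable, maintainable code, and worry about performance only if and when it becomes a problem? – 2011-03-17 19:41:28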

+10 to @David. Also, file I/O is slow no matter how you process the data in memory. – delnan 2011-03-17 19:43:43

No need for a list comprehension.

import itertools
output.write(''.join(itertools.islice(inputfile, 2, 4)))  # 0-based indices 2 and 3, i.e. lines 3 and 4

Source

2011-03-17 19:39:35

Neat. The only potential drawback is that this holds all of the selected data in memory, so if "stop - start" is large... – delnan 2011-03-17 19:45:21
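To illustrate the memory point in the comment above: `writelines` accepts any iterable of strings, so the slice can be handed over directly instead of being joined into one string first. This is only a minimal sketch, assuming Python 3; the `io.StringIO` objects are in-memory stand-ins, not the files from the question.

```python
import io
import itertools

# Stand-in for the input file: six lines holding the numbers 1..6.
inputfile = io.StringIO("1\n2\n3\n4\n5\n6\n")
output = io.StringIO()

# islice yields the selected lines lazily, and writelines writes them
# as it iterates, so no joined intermediate string is built.
output.writelines(itertools.islice(inputfile, 2, 4))

result = output.getvalue()
print(result)
```

With real files, the same call would be `output.writelines(itertools.islice(inputfile, 2, 4))` on the open file objects.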

If you want to do it with a generator:

output.writelines(line for line in input if 2 < int(line) < 5)

Source

2011-03-17 19:46:17 Narcolei

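I don't know about faster, but this will use less memory, since it only handles one item at a time. – theheadofabroom 2011-03-17 19:52:17

@Jochen: Thanks, I fixed it. – Narcolei 2011-03-17 19:55:13

This creates a dependency on the contents of the input file that doesn't exist in the OP's code. – martineau 2011-03-17 20:55:06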

Not faster, but if you want to use a list comprehension:

output.writelines([line for (x, line) in enumerate(input) if 1 < x < 4])

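This assumes you are going by the actual line position in the file rather than a value read from the file (which, judging by how you assign x, is the case).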
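The difference between filtering on line position and filtering on line value can be sketched as follows. This is an illustration only, assuming Python 3; the sample data and the helper names `by_position` and `by_value` are hypothetical and do not come from the thread. `enumerate(..., start=1)` mirrors the 1-based counter `x` in the question.

```python
import io

def by_position(lines):
    # Keep lines 3 and 4 by position, like the counter x in the question.
    return [line for n, line in enumerate(lines, start=1) if 2 < n < 5]

def by_value(lines):
    # Keep lines whose numeric content is 3 or 4.
    # This only works if every line parses as an integer.
    return [line for line in lines if 2 < int(line) < 5]

# Hypothetical file contents whose values do not match their positions.
sample = "10\n20\n30\n40\n50\n"

position_result = by_position(io.StringIO(sample))
value_result = by_value(io.StringIO(sample))

print(position_result)
print(value_result)
```

On this sample the two filters disagree: the position filter keeps the third and fourth lines, while the value filter keeps nothing because no line contains 3 or 4.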

Source

2011-03-17 20:04:19

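You asked specifically about generators versus list comprehensions, but in general there are several ways to approach the problem.

Generator version: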

input = open('input.txt', 'r')
output = open('output.txt', 'w')

def gen():
    for line in input:
        yield "FOO " + line

for l in gen():
    output.write(l)

List comprehension:

output.writelines(["FOO " + line for line in input])  # the brackets build the full list before writing

Iterator style:

class GenClass(object):
    def __init__(self, _in):
        self.input = _in

    def __iter__(self):
        return self

    def next(self):
        line = self.input.readline()
        if len(line) == 0:  # an empty string means end of file
            raise StopIteration
        return "FOO " + line

    __next__ = next  # Python 3 looks up __next__ instead of next

output.writelines(GenClass(input))

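Thoughts:

A list comprehension holds everything in memory.
A list comprehension limits how much code you can fit in (a functional one-liner).
Generators fit naturally into everyday coding practice.
The iterator style probably gives you the most flexibility,
at a slightly higher initialization cost (an object).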
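The memory point in the list above can be checked roughly with `sys.getsizeof`. This is a sketch under assumptions: Python 3 on CPython, made-up in-memory data, and `getsizeof` reporting only the size of the container object itself, not the strings it refers to.

```python
import sys

# Hypothetical data: 10000 short lines.
lines = ["%d\n" % n for n in range(10000)]

as_list = ["FOO " + line for line in lines]  # builds all 10000 strings up front
as_gen = ("FOO " + line for line in lines)   # builds nothing until iterated

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size, gen_size)

# The generator produces items only on demand.
first = next(as_gen)
print(repr(first))
```

The generator object stays small regardless of how many lines it will eventually produce, whereas the list grows with the input.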

Source

2011-03-17 20:21:13 koblas

The way to find out which is fastest is to test it!

In this code I assume you care about the value on each line, not its line number.

from timeit import Timer

def test_comprehension():
    input = open('list.txt')
    output = open('output.txt', 'w')
    # list comprehension used only for its side effect of writing
    [output.write(x) for x in input if int(x) > 2 and int(x) < 5]

def test_forloop():
    input = open('list.txt')
    output = open('output.txt', 'w')

    for x in input:
        if int(x) > 2 and int(x) < 5:
            output.write(x)

if __name__ == '__main__':
    times = 10000

    t = Timer("test_comprehension()", "from __main__ import test_comprehension")
    print("Comprehension: %s" % t.timeit(times))

    t = Timer("test_forloop()", "from __main__ import test_forloop")
    print("For Loop: %s" % t.timeit(times))

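Here I just set up a couple of functions, one that does it with a list comprehension and another that does it as a for loop. The timeit module runs a small piece of code as many times as you specify, times it, and returns how long the runs took. So if you run the code above, you will get output along the lines of:

Comprehension: 0.957081079483
For Loop: 0.956691980362

Disappointingly, it's roughly the same either way.

Source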

2011-03-17 20:32:54

Probably because it's I/O bound... – martineau 2011-03-17 20:56:13
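One way to probe the I/O-bound guess is to time the same filtering on data that is already in memory, so that file access drops out of the measurement. The sketch below assumes Python 3; the data and the function names `with_comprehension` and `with_forloop` are hypothetical, and no particular timings are claimed.

```python
import timeit

# Hypothetical in-memory data: the numbers 1..1000, one per line.
LINES = ["%d\n" % n for n in range(1, 1001)]

def with_comprehension():
    # Filter with a list comprehension.
    return [x for x in LINES if 2 < int(x) < 5]

def with_forloop():
    # Same filter written as an explicit loop.
    kept = []
    for x in LINES:
        if 2 < int(x) < 5:
            kept.append(x)
    return kept

if __name__ == "__main__":
    runs = 200
    print("Comprehension: %s" % timeit.timeit(with_comprehension, number=runs))
    print("For loop: %s" % timeit.timeit(with_forloop, number=runs))
```

If the two numbers still come out close, the `int()` conversions are likely dominating; if they diverge, the file access in the original test was probably masking the difference.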

def copyLines(infname, outfname, lines):
    lines = list(set(lines))  # remove duplicates
    lines.sort(reverse=True)  # smallest line number ends up last, ready for pop()
    with open(infname, 'r') as inf, open(outfname, 'w') as outf:
        try:
            i = 1
            while lines:
                seek = lines.pop()  # next wanted line number, in ascending order
                while i < seek:
                    next(inf)  # skip lines that were not requested
                    i += 1
                outf.write(next(inf))
                i += 1
        except StopIteration:  # hit end of file
            pass

def main():
    copyLines('C:/.../list.txt', 'C:/.../output.txt', range(3, 5))

if __name__ == "__main__":
    main()

Note that it exits as soon as it runs out of requested lines.
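The same early-exit idea can also be sketched as a generator that works on any iterable of lines. This is an illustration, assuming Python 3; `pick_lines` is a hypothetical name, not part of the answer above, and the `io.StringIO` object stands in for a real file. Line numbers are 1-based.

```python
import io

def pick_lines(lines, wanted):
    """Yield the lines whose 1-based numbers are in `wanted`, then stop early."""
    remaining = set(wanted)
    if not remaining:
        return
    last = max(remaining)
    for number, line in enumerate(lines, start=1):
        if number in remaining:
            yield line
        if number >= last:
            break  # nothing left to find, so stop reading

source = io.StringIO("1\n2\n3\n4\n5\n6\n")
picked = list(pick_lines(source, range(3, 5)))

# Reading stopped after line 4, so line 5 is still unread.
leftover = source.readline()
print(picked)
```

Because it yields lines instead of writing them, the result can be passed straight to `writelines` on an output file.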

Source

2011-03-17 22:10:48
