Cancelling the next() function in a Python script

This is
XXX 123gt数1121
12345 FRE 233fre
a problematic file.
It contains
XXX HY 456 EFE
RTG 1215687 FWE
many errors
that I want
to get rid of

I wrote a script for that. Whenever XXX is encountered:

the line is replaced with a custom string (something).
the next line is replaced with another custom string (stg).

Here is the script:

subject = 'problematic.txt'
pattern = 'xxx'
subject2 = 'resolved.txt'
output = open(subject2, 'w')
line1 = 'something'
line2 = 'stg'


with open(subject) as myFile:
    for num, line in enumerate(myFile, 1):  # to get the line number
        if pattern in line:
            print 'found at line:', num
            line = line1  # replace the line containing xxx with 'something'
            output.write(line)
            line = next(myFile, "")  # move to the next line
            line = line2  # replace the next line with 'stg'
            output.write(line)
        else:
            output.write(line)  # save as is
output.close()
myFile.close()

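It works well for the first occurrence of XXX, but not for the subsequent ones. The cause is next(), which advances the iterator, so my script makes its changes in the wrong places.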

Here is the output:

found at line: 3
found at line: 6

instead of:

found at line: 3
found at line: 7

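So the changes are not made in the right places... Ideally, cancelling next() after I replace the line with line2 would solve my problem, but I did not find a previous() function. Anyone? Thanks!!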

2014-01-29 Sara

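It sounds like what you want to do is *peek* at the next line without advancing the loop? – mhlester

The whole point of using a with-statement to open a file is that you do not have to close it explicitly when you are done. In other words, the 'myFile.close()' line is completely unnecessary. :) – iCodez

Don't you need a 'for' loop under the 'with'? Like 'for line in myFile:' – wnnmaw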

When you think you need to look ahead, it is almost always simpler to restate the problem as looking back. In this case, just keep track of the previous line and check whether it matches your target string.

infilename = "problematic.txt"
outfilename = "resolved.txt"

pattern = "xxx"
replace1 = "something"
replace2 = "stg"

with open(infilename) as infile:
    with open(outfilename, "w") as outfile:

        previous = ""

        for linenum, current in enumerate(infile):
            if pattern in previous:
                print "found at line", linenum
                previous, current = replace1, replace2
            if linenum:  # skip the first (blank) previous line
                outfile.write(previous)
            previous = current

        outfile.write(previous)  # write the final line

2014-01-29 21:12:24 kindall

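It works, thanks! I got an error on the line starting with 'with open': _NameError: name 'outfile' is not defined_ – Sara

I have made a few updates to the code... you may have caught a bad edit. In particular, I made a mistake in the syntax of the 'with' statement. – kindall

I still get the same error, but maybe it is a version issue... I went another way. Your code is great! I still ran into one difficulty, though: 'with' opening one file on one line and 'open' opening the other on another line? – Sara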

You can zip the lines to get two pointers at once:

with open(subject) as myFile:
    lines = myFile.readlines()
    for current, following in zip(lines, lines[1:]):
        ...

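Edit: this only demonstrates the idea of zipping lines. For large files, iterate over the file lazily instead, i.e.:

from itertools import tee

with open(subject) as myFile:
    it1, it2 = tee(myFile)  # two independent iterators over the same file
    next(it2, None)         # advance the second iterator by one line
    for current, following in zip(it1, it2):  # in Python 2, use itertools.izip
        ...

Note that the file object is already an iterator, so it needs no iter() wrapper; tee() is only there because zipping the file with itself would consume two lines per step.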
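
For a file too large for readlines(), the pairing can also be done with a small generator that never holds more than two lines in memory. This is only a minimal sketch, not code from the answer above: the helper name pairwise_lines is made up for illustration, and io.StringIO stands in for the real file.

```python
import io


def pairwise_lines(lines):
    """Lazily yield (current, following) pairs; hypothetical helper name."""
    iterator = iter(lines)
    current = next(iterator, None)  # None if the input is empty
    for following in iterator:
        yield current, following
        current = following


# io.StringIO stands in for an open file (assumption made for this demo).
sample = io.StringIO("a\nb\nc\n")
pairs = list(pairwise_lines(sample))
print(pairs)  # [('a\n', 'b\n'), ('b\n', 'c\n')]
```

Note that the last line only ever shows up as the second element of a pair, so a loop that writes out the first element has to write the final line separately.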

2014-01-29 20:47:53

Doesn't readlines read the whole file into memory? – Hyperboreus

For very large files, this is not a good idea. – tzaman

My file is indeed very large (more than 10 GB) – Sara

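This seems to work when the strings to be replaced appear on both odd and even line numbers:

with open('test.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if line == 'apples':  # to be replaced
            print('manzanas')  # replacement 1
            print('y más manzanas')  # replacement 2
            next(f, None)  # skip the following line; the default avoids StopIteration at end of file
            continue
        print(line)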

Sample input:

apples 
pears 
apples 
pears 
pears 
apples 
pears 
pears

Sample output:

manzanas 
y más manzanas 
manzanas 
y más manzanas 
pears 
manzanas 
y más manzanas 
pears

2014-01-29 20:50:30 Hyperboreus

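There is no previous function, because that is not how the iterator protocol works. With generators in particular, the concept of a "previous" element may not even exist.

Instead, you want to iterate over the file with two cursors, zipped together:

from itertools import tee

with open(subject) as f:
    its = tee(f)
    next(its[1], None)  # advance the second iterator past the first line
    for first, second in zip(*its):  # in Python 2, use itertools.izip
        # do something to first and/or second, comparing them appropriately
        pass

The above is just like doing for line in f:, except that you now have the current line in first and the line immediately after it in second.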
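
Because an iterator cannot step backwards, one way to imitate "cancelling" a next() call is to wrap the iterator in an object that accepts items back. The sketch below is an illustration only and is not part of the answer above: the class name PushbackIterator and its push_back method are invented here, and io.StringIO stands in for a file.

```python
import io


class PushbackIterator:
    """Iterator wrapper that can take items back; a hypothetical helper."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._pushed = []  # items handed back by the caller

    def __iter__(self):
        return self

    def __next__(self):
        if self._pushed:
            return self._pushed.pop()
        return next(self._iterator)  # raises StopIteration at the end

    def push_back(self, item):
        """Hand `item` back so that the following next() returns it again."""
        self._pushed.append(item)


lines = PushbackIterator(io.StringIO("one\ntwo\nthree\n"))
first = next(lines)      # 'one\n'
second = next(lines)     # 'two\n'
lines.push_back(second)  # "cancel" the last next()
rest = list(lines)       # 'two\n' is delivered again, then 'three\n'
print(first, rest)
```

A wrapper like this only undoes the reads made through it; any counter kept outside it, such as enumerate, would still have to be adjusted separately.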

2014-01-29 20:51:40 roippi

I would just set a flag indicating that you want to skip the next line, and check for it in the loop instead of using next:

with open(foo) as myFile:
    skip = False
    for line in myFile:
        if skip:
            skip = False
            continue
        if pattern in line:
            output.write("something")
            output.write("stg")
            skip = True
        else:
            output.write(line)

2014-01-29 20:55:39 tzaman

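I am not usually one to complain about drive-by downvotes, but this actually solves the problem (unlike all the 'zip' suggestions...). Care to explain? – tzaman

By the way, how does zip not solve the problem? –

@Guy - 'zip' would be useful if the contents of both the current and the next line were needed together; that is not the case. All that needs to happen is that, when a specific token is encountered, that line and the next one (regardless of content) are replaced by predetermined replacement lines. There is no need to 'zip' anything. – tzaman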
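
The requirement described in the comment above (replace the matching line and whatever follows it) can be packaged as a generator so it is easy to try on a few lines. This is a rough sketch under assumptions: the function name replace_with_flag and the sample data are invented for the demo, and the replacement strings carry their own newlines so the output lines do not run together.

```python
def replace_with_flag(lines, pattern, first, second):
    """Yield lines, swapping each match and the line after it for fixed strings."""
    skip = False
    for line in lines:
        if skip:
            # This is the line right after a match: drop it, whatever it holds.
            skip = False
            continue
        if pattern in line:
            yield first
            yield second
            skip = True
        else:
            yield line


# Made-up sample data for the demo.
source = ["keep 1\n", "xxx bad\n", "junk\n", "keep 2\n", "xxx bad\n", "junk\n"]
result = list(replace_with_flag(source, "xxx", "something\n", "stg\n"))
print(result)
```

If the last line of the input matches, both replacement strings are still emitted even though there is no following line to drop; whether that is wanted depends on the data.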

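You need to buffer the line somehow. This is easy to do for a single line:

class Lines(object):

    def __init__(self, f):
        self.f = f        # file object
        self.prev = None  # buffered line, not yet consumed

    def next(self):
        if self.prev is None:
            # nothing buffered: read a line, or keep None at end of file
            self.prev = next(self.f, None)
        return self.prev

    def consume(self):
        if self.prev is not None:
            # drop the buffered line and read the one after it
            self.prev = next(self.f, None)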

Now you call Lines.next() to get the next line, and Lines.consume() to consume it. A line stays buffered until it is consumed:

>>> f = open("table.py") 
>>> lines = Lines(f) 
>>> lines.next() 
'import itertools\n' 
>>> lines.next()  # same line 
'import itertools\n' 
>>> lines.consume() # remove the current buffered line 
>>> lines.next() 
'\n'     # next line

2014-01-29 20:58:34 michaelmeyer

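Your current code almost works. I believe it correctly identifies and filters out the right lines of the input file, but it reports the wrong line numbers for the matches it finds, because the enumerate generator does not see the skipped lines.

While you could rewrite it in the various ways the other answers suggest, you do not need to make major changes (unless you want to, for other design reasons). Here is the code with the minimal changes needed, with new comments pointing them out:

with open(subject) as myFile:
    gen = enumerate(myFile, 1)  # save the enumerate generator to a variable
    for num, line in gen:  # iterate over it, as before
        if pattern in line:
            print 'found at line:', num
            line = line1
            output.write(line)
            next(gen, None)  # advance the generator and throw away the results
            line = line2
            output.write(line)
        else:
            output.write(line)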
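
To check the line numbers without a real file, the same idea can be tried on in-memory streams. What follows is a minimal Python 3 sketch under assumptions: the function name replace_after_match and the sample text are invented for the demo, io.StringIO stands in for both files, and the replacement strings end in a newline.

```python
import io


def replace_after_match(infile, outfile, pattern, first, second):
    """Copy infile to outfile, replacing each matching line and the one after it.

    Returns the 1-based numbers of the matching lines.
    """
    found = []
    numbered = enumerate(infile, 1)  # keep a handle on the enumerate object
    for num, line in numbered:
        if pattern in line:
            found.append(num)
            outfile.write(first)
            next(numbered, None)  # skip the following line; the counter advances too
            outfile.write(second)
        else:
            outfile.write(line)
    return found


# Made-up sample with matches on lines 3 and 7.
source = io.StringIO("a\nb\nxxx 1\njunk\nc\nd\nxxx 2\njunk\n")
target = io.StringIO()
found = replace_after_match(source, target, "xxx", "something\n", "stg\n")
print(found)  # [3, 7]
print(target.getvalue())
```

Calling next() on the enumerate object rather than on the file is what keeps the counter in step with the lines actually read.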

2014-01-29 21:38:00 Blckknght

This is a great alternative and a great way to answer the question. – kindall

Awesome! This works too :) – Sara
