Appending in Python

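I want to open a file and read it line by line. For each line, I want to use the split() method to break the line into a list of words. Then I want to check every word on every line to see whether it is already in the list and, if it is not, append it to the list. This is the code I wrote.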

fname = raw_input("Enter file name: ")
fh = open(fname)
line1 = list()
for line in fh:
    stuff = line.rstrip().split()
    for word in stuff:
        if stuff not in stuff:
            line1.append(stuff)
print line1

My problem is that when I print line1, it prints about 30 duplicated lists in this format.

['But', 'soft', 'what', 'light', 'through', 'yonder', 'window', 'breaks'],
['But', 'soft', 'what', 'light', 'through', 'yonder', 'window', 'breaks'],
['It', 'is', 'the', 'east', 'and', 'Juliet', 'is', 'the', 'sun'],
['It', 'is', 'the', 'east', 'and', 'Juliet', 'is', 'the', 'sun'],
['Arise', 'fair', 'sun', 'and', 'kill', 'the', 'envious', 'moon'],
['Arise', 'fair', 'sun', 'and', 'kill', 'the', 'envious', 'moon'],

I would like to know why this is happening and how to remove the duplicated words and lists.

Source

2016-03-14 David Asmah

Not sure what you're trying to do, but I have a feeling that `if stuff not in stuff` is going to hurt you at least a little – inspectorG4dget

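Your condition is `if stuff not in stuff:`. I think you meant `if word not in line1:`? If that is not the case, could you explain more clearly what you want to happen? –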

You have `if stuff not in stuff`. If you change that line to `if word not in line1:` and the next line to `line1.append(word)`, your code should work.
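As for why the duplicates appear: a list does not contain itself as an element, so `stuff not in stuff` is always true, and the whole per-line list gets appended once for every word on that line. The corrected loop can be sketched as below. This is only a minimal sketch in Python 3 syntax (the question uses Python 2), and `unique_words` is a hypothetical helper name; it accepts any iterable of lines so it can be tried without a file.

```python
def unique_words(lines):
    """Collect each distinct word once, in order of first appearance."""
    seen = []  # plays the role of line1 in the question
    for line in lines:
        for word in line.rstrip().split():
            # Compare the individual word against the result list,
            # not the per-line list against itself.
            if word not in seen:
                seen.append(word)
    return seen


sample = [
    "But soft what light through yonder window breaks\n",
    "It is the east and Juliet is the sun\n",
]
print(unique_words(sample))
```

With a real file, passing the open file object (for example `unique_words(open(fname))`) should behave the same way, since iterating over a file yields its lines.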

Alternatively, use a set.

fname = raw_input("Enter file name: ")
fh = open(fname)
line1 = set()  # a set silently ignores values it already holds
for line in fh:
    stuff = line.rstrip().split()
    for word in stuff:
        line1.add(word)
print line1

Or even:

fname = raw_input("Enter file name: ")
fh = open(fname)
line1 = set()
for line in fh:
    stuff = line.rstrip().split()
    # merge all of this line's words into the running set at once
    line1 = line1.union(set(stuff))
print line1

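A set only contains unique values (although it has no notion of ordering or indexing), so you do not need to check whether a word has already been seen: the set data type handles that automatically.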
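Because a set does not keep the words in reading order, a variant that does may sometimes be wanted. The sketch below is one possible approach and not part of the answer above: it assumes Python 3.7 or later, where `dict` preserves insertion order, and `ordered_unique_words` is a hypothetical name.

```python
def ordered_unique_words(lines):
    """Return distinct words in order of first appearance."""
    # dict keys are unique and, from Python 3.7 on, keep insertion order,
    # so a dict can stand in for an "ordered set" here.
    words = {}
    for line in lines:
        for word in line.split():
            words.setdefault(word, None)
    return list(words)


print(ordered_unique_words(["It is the east\n", "and Juliet is the sun\n"]))
```

If order does not matter, the plain set version is shorter, and `sorted(line1)` gives an alphabetical list when a stable order is all that is needed.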

Source

2016-03-14 18:09:54
