The problem here is that text_file.readlines() returns a list of lines that still include their line endings.
So you get something like this back:
>>> old_list
['This is 1st line\n', 'This is 2nd line\n',
'This is 3rd line\n', 'This is 4th line\n',
'This is 5th line\n']
Then, in this line:
temp3 = [x.strip() for x in new_files if x.strip() not in old_list]
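you compare each file path in new_files, stripped, against the entries in old_list, which all still end with a newline, so of course they never match. (You also never use the set s you created, although that is more of a performance issue.)
What you really want is to strip the entries from old_list, not the ones in new_files: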
old_list = text_file.readlines()
s = set(item.rstrip() for item in old_list)
temp3 = [x for x in new_files if x not in s]
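To see the mismatch in isolation, here is a minimal sketch that uses io.StringIO as a stand-in for the real text file. The file names are made up for illustration and are not from the question.

```python
import io

# Stand-in for the text file on disk; the names are hypothetical.
text_file = io.StringIO("a.shp\nb.shp\n")
new_files = ["a.shp", "b.shp", "c.shp"]

old_list = text_file.readlines()  # each entry keeps its trailing '\n'

# Original comparison: stripped names are checked against unstripped lines,
# so nothing ever matches and every name is reported as new.
broken = [x.strip() for x in new_files if x.strip() not in old_list]

# Fixed comparison: strip the lines read from the file instead.
s = set(item.rstrip() for item in old_list)
fixed = [x for x in new_files if x not in s]

print(broken)
print(fixed)
```

With the original comparison every name comes back as new; with the stripped set only the name missing from the list is reported.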
Putting it all together, and condensing it a bit:
import os

def diff_dir_with_filelist(directory, filepath):
    # Names currently in the directory (os.listdir returns bare names, not full paths)
    new_files = os.listdir(directory)
    with open(filepath, 'r') as text_file:
        old_list = text_file.readlines()
    # Strip line endings so the entries compare equal to the directory names
    old_files = set(item.rstrip() for item in old_list)
    return [x for x in new_files if x not in old_files]

results = diff_dir_with_filelist("C:\\Users\\INstokes\\Desktop\\CityPts\\", "C:\\Users\\INstokes\\Desktop\\CityPts\\file.txt")
print(results)
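If you want to try the function without touching real data, the following sketch runs it against a throwaway directory. It assumes the list file holds bare file names, one per line, since os.listdir returns names without the directory part. The temporary directories and file names are placeholders, and the list file is kept outside the scanned directory so that it does not show up in the result itself.

```python
import os
import tempfile

def diff_dir_with_filelist(directory, filepath):
    new_files = os.listdir(directory)
    with open(filepath, 'r') as text_file:
        # Strip line endings while reading
        old_files = set(line.rstrip() for line in text_file)
    return [x for x in new_files if x not in old_files]

with tempfile.TemporaryDirectory() as directory, \
        tempfile.TemporaryDirectory() as list_dir:
    # Three files on disk, only two of them recorded in the list file.
    for name in ("a.txt", "b.txt", "c.txt"):
        open(os.path.join(directory, name), "w").close()
    filepath = os.path.join(list_dir, "file.txt")
    with open(filepath, "w") as f:
        f.write("a.txt\nb.txt\n")
    # os.listdir order is arbitrary, so sort for a stable result.
    results = sorted(diff_dir_with_filelist(directory, filepath))

print(results)
```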
How are the file names in old_files formatted? Do they contain the full file path or only the name? – wnnmaw