How to compare two CSV files in Python 3
I have the following code:
import csv
import subprocess
from subprocess import check_output

# Writing the pacman command output to file in csv format
sysApps = check_output(["pacman", "-Qn"])
sysAppsCSV = csv.DictReader(sysApps.decode('ascii').splitlines(),
                            delimiter=' ', skipinitialspace=True,
                            fieldnames=['name', 'version'])  # Thanks to https://stackoverflow.com/a/8880768/5565713 jcollado
with open('pacman.csv', 'w') as csvfile:
    rows_sys = csv.writer(csvfile)
    rows_sys.writerow(sysAppsCSV)

# Writing the pip command output in csv format
pipApps = check_output(["pip", "list"])
pipAppsCSV = csv.DictReader(pipApps.decode('ascii').splitlines(),
                            delimiter=' ', skipinitialspace=True,
                            fieldnames=['name', 'version'])  # Thanks to https://stackoverflow.com/a/8880768/5565713 jcollado
with open('pip.csv', 'w') as csvfile:
    rows_pip = csv.writer(csvfile)
    rows_pip.writerow(pipAppsCSV)

# Comparing the files
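I want to compare the two files. They do not have to be files; the contents of the variables already created would work as well. As the result I want the difference taken from pip.csv, that is, I want to know what is in pip.csv and not in pacman.csv. The example here does not apply to my case, but I would like to output the result in a similar way, by listing the name and version.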
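One way to express that comparison without going through files is sketched below. It is only a sketch under assumptions: the helper names `parse_packages` and `only_in_first` are hypothetical, the sample strings stand in for the real command output, and every line is assumed to have the form "name version".

```python
def parse_packages(text):
    """Parse lines of the form 'name version' into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            packages[parts[0]] = parts[1]
    return packages


def only_in_first(first, second):
    """Return the entries of `first` whose name does not appear in `second`."""
    return {name: version for name, version in first.items()
            if name not in second}


# Hypothetical sample data standing in for the pacman and pip output.
pacman_output = "alpha 1.0\nbeta 2.0\n"
pip_output = "alpha 1.0\nGamma 3.1\nbeta 2.0\nDelta 0.4\n"

pacman_apps = parse_packages(pacman_output)
pip_apps = parse_packages(pip_output)
pip_only = only_in_first(pip_apps, pacman_apps)

# List name and version of everything that only pip knows about.
for name in sorted(pip_only):
    print(name, pip_only[name])
```

Keeping the data as dictionaries keyed by package name means the version stays attached to each name while the comparison itself only looks at the names.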
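Edit: @Greg Sadetsky, thanks for your suggestion. I used your example to simplify my code, but it does not solve my problem, because I cannot compare the lists this way. I have made some progress, but I still do not get the desired output: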
import csv
import subprocess
from subprocess import check_output

# Initializing variables
results_sys = ""
results_pip = ""

# Running the linux commands
sys_apps = set(check_output(["pacman", "-Qn"]).splitlines())
pip_apps = set(check_output(["pip", "list"]).splitlines())

# Saving the outputs of the commands in to a CSV format
for row in sys_apps:
    result = row.decode('ascii').split(sep=" ")
    with open('pacman.csv', 'a') as csvfile:
        rows_sys = csv.writer(csvfile)
        rows_sys.writerow(result)

for row in pip_apps:
    result = row.decode('ascii').split(sep=" ")
    with open('pip.csv', 'a') as csvfile:
        rows_sys = csv.writer(csvfile)
        rows_sys.writerow(result)

# Opening the files and comparing the results
with open('pacman.csv', 'r') as pacmanCSV:
    sys_apps = pacmanCSV.readlines()
    for row in sys_apps:
        apps = row.split(",")
        results_sys = results_sys + " " + apps[0]

with open('pip.csv', 'r') as pipCSV:
    pip_apps = pipCSV.readlines()
    for row in pip_apps:
        apps = row.split(",")
        results_pip = results_pip + " " + apps[0]

results_final = "List of apps installed from pip:\n################################"

for val in results_pip:
    if val not in results_sys:
        results_final = results_final + "\n" + val

print(results_final)

When I run this code, the output contains only a few uppercase letters, for example: Imgur
OK, so after reading about sets I tried this:
r1 = set(results_pip)
r2 = set(results_sys)
print(r1 - r2)
But I get a similar result: only the uppercase first letters appear.
http://stackoverflow.com/questions/15864641/python-difflib-comparing-files http://stackoverflow.com/questions/977491/comparing-2-txt-files-using-difflib-in-python – Sam
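The problem is that 'results_sys' and 'results_pip' are both strings that you keep appending to (i.e. 'results_sys + " " + apps[0]'). When you iterate with 'for val in results_pip', you walk through the string one letter at a time... which is not what you want to do. I will edit my answer with a solution for your new version –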
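The behaviour described in this comment can be reproduced with a small sketch. The package names below are made-up sample data, and the list names `pip_names` and `sys_names` are hypothetical stand-ins for the first CSV column of each file.

```python
# Hypothetical sample names standing in for the first column of each CSV.
pip_names = ["requests", "Flask", "six"]
sys_names = ["requests", "six", "flake8"]

# What the question's code effectively does: glue all names into one string.
results_pip = ""
for name in pip_names:
    results_pip = results_pip + " " + name

results_sys = ""
for name in sys_names:
    results_sys = results_sys + " " + name

# Iterating over a string yields single characters, so this compares
# letters, not package names. Only characters that never occur anywhere
# in results_sys survive, which here is just the capital "F".
leftover_letters = [ch for ch in results_pip if ch not in results_sys]
print(leftover_letters)

# Keeping the names in lists (or sets) compares whole names instead.
pip_only = set(pip_names) - set(sys_names)
print(sorted(pip_only))
```

Calling `set()` on the concatenated strings has the same effect as the loop, since a set built from a string is a set of its characters; building the sets from the lists of names is what gives a per-package difference.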