Python - path problem with code I am using from here

I am using the code from http://pythoncentral.io/finding-duplicate-files-with-python/ to find duplicate files in a folder.

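These are my first steps in Python (I come from VBA for Excel), so my question may be a simple one, but I have tried several things without success. After running the code, I get this message: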

-f is not a valid path, please verify 
An exception has occurred, use %tb to see the full traceback.

%tb produces:

SystemExit        Traceback (most recent call last) 
<ipython-input-118-31268a802b4a> in <module>() 
    11    else: 
    12     print('%s is not a valid path, please verify' % i) 
---> 13     sys.exit() 
    14   printResults(dups) 
    15  else: 

SystemExit:

The code I am using is:

# dupFinder.py
import os, sys
import hashlib

def findDup(parentFolder):
    # Dups in format {hash:[names]}
    dups = {}
    for dirName, subdirs, fileList in os.walk(parentFolder):
        print('Scanning %s...' % dirName)
        for filename in fileList:
            # Get the path to the file
            path = os.path.join(dirName, filename)
            # Calculate hash
            file_hash = hashfile(path)
            # Add or append the file path
            if file_hash in dups:
                dups[file_hash].append(path)
            else:
                dups[file_hash] = [path]
    return dups


# Joins two dictionaries
def joinDicts(dict1, dict2):
    for key in dict2.keys():
        if key in dict1:
            dict1[key] = dict1[key] + dict2[key]
        else:
            dict1[key] = dict2[key]


def hashfile(path, blocksize = 65536):
    afile = open(path, 'rb')
    hasher = hashlib.md5()
    buf = afile.read(blocksize)
    while len(buf) > 0:
        hasher.update(buf)
        buf = afile.read(blocksize)
    afile.close()
    return hasher.hexdigest()


def printResults(dict1):
    results = list(filter(lambda x: len(x) > 1, dict1.values()))
    if len(results) > 0:
        print('Duplicates Found:')
        print('The following files are identical. The name could differ, but the content is identical')
        print('___________________')
        for result in results:
            for subresult in result:
                print('\t\t%s' % subresult)
            print('___________________')

    else:
        print('No duplicate files found.')


if __name__ == '__main__':
    path='C:/DupTestFolder/' #this is the path to analyze for duplicated files
    if len(sys.argv) > 1:
        dups = {}
        folders = sys.argv[1:]
        for i in folders:
            # Iterate the folders given
            if os.path.exists(i):
                # Find the duplicated files and append them to the dups
                joinDicts(dups, findDup(i))
            else:
                print('%s is not a valid path, please verify' % i)
                sys.exit()
        printResults(dups)
    else:
        print('Usage: python dupFinder.py folder or python dupFinder.py folder1 folder2 folder3')

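I tried the path with and without the trailing "/" at the end, but the result is the same.

I am running this in Jupyter with Python 3.

Thanks in advance for your help!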


2017-09-15 Pegaso

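The path variable is not used in your code.

All you do is iterate over sys.argv[1:], which holds the script's arguments, and treat each argument as a directory path.

In a Windows console, you can try: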

python dupFinder.py C:\DupTestFolder

That should work.
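
If the hard-coded path is meant to act as a fallback, one option is to use it only when no folders are passed on the command line. This is a minimal sketch, not part of the original dupFinder.py: the helper name folders_to_scan is hypothetical, and the default is the C:/DupTestFolder/ value from the question.

```python
import sys

DEFAULT_FOLDER = 'C:/DupTestFolder/'  # default taken from the question


def folders_to_scan(argv, default=DEFAULT_FOLDER):
    """Return the folders named on the command line, or [default] if none.

    argv is expected to look like sys.argv: the script name first,
    followed by zero or more folder paths.
    (Hypothetical helper, not part of the original script.)
    """
    folders = argv[1:]
    return folders if folders else [default]


# No arguments: fall back to the default folder.
print(folders_to_scan(['dupFinder.py']))
# Arguments given: use them as-is.
print(folders_to_scan(['dupFinder.py', 'C:/A', 'C:/B']))
```

In the script, the loop would then iterate over folders_to_scan(sys.argv) instead of sys.argv[1:].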


2017-09-15 03:34:40

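I am running in Jupyter, but the solution I see, python dupFinder.py C:\DupTestFolder, does not work here. I tried it and the result was: invalid syntax. I also tried giving the full path to dupFinder.py, but that did not help either: python C:\AnacondaProjects\dupFinder.py C:\DupTestFolder also returns invalid syntax. I am sure I am doing something wrong, but I cannot figure out what. – Pegaso

Thanks! I can now run the code. I had to do two things: 1. Save dupFinder.py to the same folder where my Python installation runs, in my case C:\Users\Pepe – Pegaso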

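sys.argv works in a command-line window, where the script is started with arguments. It does not work that way in a Jupyter notebook, unless you find an equivalent command to use inside the notebook.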
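
Inside a Jupyter kernel, sys.argv holds the kernel's own launch arguments (typically -f followed by a connection file), which is consistent with the "-f is not a valid path" message in the question. One way around this is to skip sys.argv and call the functions directly from a cell. The following is a compact, self-contained sketch of that idea; the names hash_file and find_dups are illustrative rewrites of the question's functions, and the demo runs on a temporary folder rather than on C:/DupTestFolder/.

```python
import hashlib
import os
import tempfile


def hash_file(path, blocksize=65536):
    """Return the MD5 hex digest of a file, read in blocks."""
    hasher = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(blocksize), b''):
            hasher.update(block)
    return hasher.hexdigest()


def find_dups(folder):
    """Map each content hash to the list of file paths under folder."""
    dups = {}
    for dir_name, _subdirs, file_list in os.walk(folder):
        for filename in file_list:
            path = os.path.join(dir_name, filename)
            dups.setdefault(hash_file(path), []).append(path)
    return dups


# Demo on a throwaway folder; in a notebook you would pass your own
# folder instead, e.g. find_dups('C:/DupTestFolder/').
with tempfile.TemporaryDirectory() as demo:
    for name, content in [('a.txt', b'same'), ('b.txt', b'same'), ('c.txt', b'other')]:
        with open(os.path.join(demo, name), 'wb') as f:
            f.write(content)
    # Keep only groups with more than one file, reduced to base names.
    groups = [sorted(os.path.basename(p) for p in paths)
              for paths in find_dups(demo).values() if len(paths) > 1]

print(groups)  # [['a.txt', 'b.txt']]
```

IPython's %run magic can also run a script file with arguments from a notebook cell, which may be closer to the command-line usage shown in the first answer.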


2017-09-15 04:40:49 Cece

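Thanks! I can now run the code. I had to do two things:

1. Save dupFinder.py to the same folder where my Python installation runs, in my case C:\Users\Pepe
2. Open a cmd window from Anaconda (which opens the cmd window in the folder where Python runs). I presume I could do the same by opening a command window and navigating (with the cd command) to that folder.

Finally, I ran python dupFinder.py C:\DupTestFolder.

Now I need to work out how to save the results to a .txt file for future use; I will search for that before posting. Thanks for your help!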
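
For the follow-up about saving the results, one possible approach is to write the duplicate groups to a text file instead of printing them. This is only a sketch: save_results is a hypothetical helper, the sample dictionary is made-up data in the {hash: [paths]} format that findDup builds, and the output goes to the system temp folder under the assumed name duplicates.txt.

```python
import os
import tempfile


def save_results(dups, out_path):
    """Write each group of duplicate paths to out_path, one path per line.

    dups is a dict in the {hash: [paths]} format built by findDup.
    Returns the number of duplicate groups written.
    """
    groups = [paths for paths in dups.values() if len(paths) > 1]
    with open(out_path, 'w', encoding='utf-8') as out:
        for paths in groups:
            for path in paths:
                out.write(path + '\n')
            out.write('___________________\n')
    return len(groups)


# Made-up sample data, for illustration only.
sample = {
    'hash1': ['C:/DupTestFolder/a.txt', 'C:/DupTestFolder/b.txt'],
    'hash2': ['C:/DupTestFolder/c.txt'],
}
out_file = os.path.join(tempfile.gettempdir(), 'duplicates.txt')
written = save_results(sample, out_file)
print(written, 'group(s) written to', out_file)
```

From a command window, redirecting the script's output, as in python dupFinder.py C:\DupTestFolder > results.txt, is another way to capture what printResults prints.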


2017-09-16 17:32:13 Pegaso
