获取文件系统上市的Python

可能重复：
Create a tree-style directory listing in Python 获取文件系统上市的Python

我想分析文件系统并打印结果作为格式的文件。我的第一个实现可以简单地使用纯文本，但稍后我想要合并HTML。

我使用的是基本的方法来收集包含在每个文件夹，我可以返回，并发送至文本处理器中的文件：

def gather_tree(source): 
     tree = {} 
     for path, dirs, files in os.walk(source): 
      tree[path] = [] 
      for file in files: 
       tree[path].append(file) 
     return tree

显然，这里的问题是，我产生一个结构没有深度的概念，我想我需要能够正确格式化足够的空间和嵌套列表。

我目前非常基本的打印模式如下：

def print_file_tree(tree): 
    # Prepare tree 
    dir_list = tree.keys() 
    dir_list.sort() 

    for directory in dir_list: 
     print directory 
     for f in tree[directory]: 
      print f

我一种新的数据结构，并希望一些输入！

来源

2012-09-26 james_dean

嗯，你的树是不是一个真正的树，因为你已经注意到:)树由**节点组成**有**父母**和**孩子**。维基百科关于[图论]（http://en.wikipedia.org/wiki/Graph_theory）的文章可能是第一个好读物。但是，什么数据结构最适合取决于你想要对数据做什么 - 也许你甚至不需要树。你的文本处理器会做什么样的工作？ –

首先，我希望能够以合适的格式打印文件，例如XML和HTML。然后，我想收集有关lising中包含的文件类型的信息。 –

然后这似乎是密切相关的[这个问题]（http://stackoverflow.com/questions/2104997/os-walk-python-xml-representation-of-a-directory-structure-recursion）（虽然我会使用[lxml]（http://lxml.de/）来生成XML树）。 –

如果您计划从您的数据创建XML，你实际上将不得不创建一个树状结构。这可以是一个中间树，您可以构建，注释并可能重复或遍历，然后转换为用于创建XML的ElementTree。或者您可以使用lxml的ElementTree API直接构建ElementTree。

无论哪种方式，使用os.walk()是不是要走的路。这听起来可能与直觉相反，但重点是：os.walk()序列化（展平）文件系统树，以便您可以轻松地迭代它，而不必处理编写这样做的递归函数。在你的情况下，你要想要保留树结构，因此如果你自己编写递归函数会容易得多。

这是如何使用lxml建立一个ElementTree一个例子。

（此代码是松散的基础上@MikeDeSimone's answer到类似的问题）

import os 
from lxml import etree 


def dir_as_tree(path): 
    """Recursive function that walks a directory and returns a tree 
    of nested etree nodes. 
    """ 
    basename = os.path.basename(path) 
    node = etree.Element("node") 
    node.attrib['name'] = basename 
    # Gather some more information on this path here 
    # and write it to attributes 
    # ... 
    if os.path.isdir(path): 
     # Recurse 
     node.tag = 'dir' 
     for item in sorted(os.listdir(path)): 
      item_path = os.path.join(path, item) 
      child_node = dir_as_tree(item_path) 
      node.append(child_node) 
     return node 
    else: 
     node.tag = 'file' 
     return node 

# Create a tree of the current working directory 
cwd = os.getcwd() 
root = dir_as_tree(cwd) 

# Create an element tree from the root node 
# (in order to serialize it to a complete XML document) 
tree = etree.ElementTree(root) 

xml_document = etree.tostring(tree, 
           pretty_print=True, 
           xml_declaration=True, 
           encoding='utf-8') 
print xml_document

输出示例：

<?xml version='1.0' encoding='utf-8'?> 
<dir name="dirwalker"> 
    <dir name="top1"> 
    <file name="foobar.txt"/> 
    <dir name="sub1"/> 
    </dir> 
    <dir name="top2"> 
    <dir name="sub2"/> 
    </dir> 
    <dir name="top3"> 
    <dir name="sub3"> 
     <dir name="sub_a"/> 
     <dir name="sub_b"/> 
    </dir> 
    </dir> 
    <file name="topfile1.txt"/> 
    <file name="walker.py"/> 
</dir>

来源

2012-09-26 22:48:04

获取文件系统上市的Python

回答

相关问题