Opening the files in a folder one after another

I'm new to programming with Python, but I have been given the task of writing a script that writes out all IDs for which type = 0 or type = 1 occurs. The XML file looks like this example:

<root> 
<bla1 type="0" id = "1001" pvalue:="djdjd"/> 
<bla2 type="0" id = "1002" pvalue:="djdjd" /> 
<bla3 type="0" id = "1003" pvalue:="djdjd"/> 
<bla4 type="0" id = "1004" pvalue:="djdjd"/> 
<bla5 type="0" id = "1005" pvalue:="djdjd"/> 
<bla6 type="1" id = "1006" pvalue:="djdjd"/> 
<bla7 type="0" id = "1007" pvalue:="djdjd"/> 
<bla8 type="0" id = "1008" pvalue:="djdjd"/> 
<bla9 type="1" id = "1009" pvalue:="djdjd"/> 
<bla10 type="0" id = "1010" pvalue:="djdjd"/> 
<bla11 type="0" id = "1011" pvalue:="djdjd"/> 
<bla12 type="0" id = "1009" pvalue:="djdjd"/> 

</root>

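So the first thing the code does is basically replace ':=' with '=', because the ':=' caused an error when uploading my XML. After that, it notes down the IDs with type 0 and the IDs with type 1. This works perfectly for a single XML file. Unfortunately, I have more than one file, so I need something like a loop that always opens the next XML file in the folder (the names differ) and adds the new IDs to the ones collected from the previous XML files. So basically it should keep adding the new IDs found in each new XML file.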

import re  # used to extract the id value from each line

xml_file = 'ID3.xml'  # name of the XML file, which must be in the same folder as the .py file

with open(xml_file, 'r') as my_file:  # open the XML file
    xml_lines = my_file.readlines()  # read the file into a list of strings

# Replace ':=' with '='; the slice [1:] skips the first line of the file
fixed_lines = [line.replace(':=', '=') for line in xml_lines[1:]]

id_0 = []  # ids with type="0"
id_1 = []  # ids with type="1"
id_one2zero = []  # ids that occur under both types

for line in fixed_lines:
    if not line.strip():
        continue  # skip empty lines
    match = re.search(r'\bid\s*=\s*"([^"]*)"', line)  # find the id attribute
    if 'type="0"' in line and match:  # check for type
        id_0.append(match.group(1))
    elif 'type="1"' in line and match:  # check for type
        id_1.append(match.group(1))
    else:
        print("Unknown type occurred")  # printed for any line without type="0" or type="1"
        # (Remember: the first line of the XML file is ignored)

for id_value in id_0:  # check for ids that occur under both types
    if id_value in id_1:
        id_one2zero.append(id_value)

print(id_0)
print(id_1)
with open('write.xml', 'w') as f:
    f.write('whatever\n')
print('<end>')

Source

2017-07-23 Mueller

A simple way to solve this is the os.walk() function. With it you can visit all files in a directory, even recursively.

Here is an example of how to use it:

import os

for root, dirs, files in os.walk("your/path"):
    for file in files:
        path = os.path.join(root, file)  # full path of the current file
        # process your file here, e.g. open(path)

If your directory contains files other than XML files, you can use file.endswith(".xml") to make sure you only process the XML files.
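As a minimal sketch of that filter: the helper name find_xml_files and the demo file names below are made up for illustration, and the demo uses a temporary folder only so that it runs anywhere.

```python
import os
import tempfile


def find_xml_files(folder):
    """Return the sorted full paths of all .xml files below `folder`."""
    xml_paths = []
    for root, dirs, files in os.walk(folder):
        for file in files:
            if file.endswith(".xml"):  # skip everything that is not an XML file
                xml_paths.append(os.path.join(root, file))
    return sorted(xml_paths)


# Demo in a throwaway folder (hypothetical file names).
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("ID1.xml", "ID2.xml", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    found = [os.path.basename(p) for p in find_xml_files(demo_dir)]
    print(found)
```

Sorting the result is a design choice here: os.walk() does not promise any particular order for the file names, so sorting keeps the processing order predictable.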

Source

2017-07-23 13:00:19

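Thank you very much for your comment! I will test it as soon as possible :-) – Mueller

...so I guess I would then have all the XML files... like in an array, right? How do I tell Python to process one and then move on to the next, and so on? Could you give me an example for my case, where I have 3 XML files I want to work with? Let's say ID1.xml, ID2.xml and ID3.xml are in the folder. The names of the XML files differ, but the structure is always the same. Do I need a loop? – Mueller

Yes, you need a loop. I tried to explain this in my example, where I provided a loop that works. 'os.walk()' gives you, for each directory, a list of the files in it (in your case '["ID1.xml", "ID2.xml", "ID3.xml"]'), and 'for file in files:' loops over all the files in that list. Whatever you write where I put my comment '# process your file' gets executed for every file. –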
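Putting the thread together, the loop for the three-file case might look roughly like the sketch below. It is not the asker's final script. The function name collect_ids, the regular expressions and the sample file contents are assumptions made for illustration, and skipping IDs that were already collected is one interpretation of "adding the new IDs". Lines are matched as text rather than parsed as XML, so the ':=' in the files does not need to be replaced first.

```python
import os
import re
import tempfile

# Hypothetical patterns: they accept both '=' and ':=' and optional spaces.
TYPE_RE = re.compile(r'\btype\s*:?=\s*"([^"]*)"')
ID_RE = re.compile(r'\bid\s*:?=\s*"([^"]*)"')


def collect_ids(folder):
    """Gather ids from every .xml file in `folder`, grouped by type "0" or "1"."""
    ids_by_type = {"0": [], "1": []}
    for root, dirs, files in os.walk(folder):
        for file in sorted(files):  # ID1.xml, ID2.xml, ID3.xml, ...
            if not file.endswith(".xml"):
                continue
            with open(os.path.join(root, file)) as handle:
                for line in handle:
                    type_match = TYPE_RE.search(line)
                    id_match = ID_RE.search(line)
                    if not (type_match and id_match):
                        continue  # e.g. the <root> line or an empty line
                    ids = ids_by_type.get(type_match.group(1))
                    # Assumption: an id already collected is not added again.
                    if ids is not None and id_match.group(1) not in ids:
                        ids.append(id_match.group(1))
    return ids_by_type


# Demo with three small made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as demo_dir:
    samples = {
        "ID1.xml": '<root>\n<bla1 type="0" id = "1001" pvalue:="djdjd"/>\n</root>\n',
        "ID2.xml": '<root>\n<bla6 type="1" id = "1006" pvalue:="djdjd"/>\n</root>\n',
        "ID3.xml": '<root>\n<bla1 type="0" id = "1001" pvalue:="djdjd"/>\n'
                   '<bla2 type="0" id = "1002" pvalue:="djdjd"/>\n</root>\n',
    }
    for name, text in samples.items():
        with open(os.path.join(demo_dir, name), "w") as handle:
            handle.write(text)
    result = collect_ids(demo_dir)
    print(result["0"])  # ids with type="0" from all three files
    print(result["1"])  # ids with type="1" from all three files
```

The lists in ids_by_type live outside the file loop, which is what lets each new file add to the IDs found in the earlier ones.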
