Creating an Excel file from Python

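My project is to process various Excel files. To do this, I want to create a single file containing some of the data from the earlier files, so that I have my own database. The goal is to produce charts from this data, all of it automatically.

I wrote this program in Python. However, it takes 20 minutes to run. How can I optimize it? Also, some of the files contain the same variables, so I would like the final file not to repeat them. How can I do that?

Here is my program: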

import os 
import xlrd 
import xlsxwriter 
from xlrd import open_workbook 

wc = xlrd.open_workbook("U:\\INSEE\\table-appartenance-geo-communes-16.xls") 
sheet0=wc.sheet_by_index(0) 

# Create the output workbook (closed explicitly by bdd.close() at the end)

bdd = xlsxwriter.Workbook('U:\\INSEE\\Department61.xlsx')
dept61 = bdd.add_worksheet('deprt61')

folder_path = "U:\\INSEE\\2013_telechargement2016" 

col=8 
constante3=0 
lastCol=0 
listeV = list() 

for path, dirs, files in os.walk(folder_path): 
    for filename in files:    
     filename = os.path.join(path, filename)   
     wb = xlrd.open_workbook(filename)  # the second positional argument is logfile, not a file type
     sheet1 = wb.sheet_by_index(0)   
     lastRow=sheet1.nrows   
     lastCol=sheet1.ncols   
     colDep=None 
     firstRow=None 
     for ligne in range(0,lastRow):     
      for col2 in range(0,lastCol):      
       if sheet1.cell_value(ligne, col2) == 'DEP': 
        colDep=col2 
        firstRow=ligne 
        break 
      if colDep is not None: 
       break 
     col=col-colDep-2-constante3 
     constante3=0 
     for nCol in range(colDep+2,lastCol): 
        constante=1 
        for ligne in range(firstRow,lastRow): 
          if sheet1.cell(ligne, colDep).value=='61': 
            Q=(sheet1.cell(firstRow, nCol).value in listeV) 
            if Q==False: 
              V=sheet1.cell(firstRow, nCol).value 
              listeV.append(V) 
              dept61.write(0,col+nCol,sheet1.cell(firstRow, nCol).value) 
              for ligne in range(ligne,lastRow): 
                if sheet1.cell(ligne, colDep).value=='61': 
                  dept61.write(constante,col+nCol,sheet1.cell(ligne, nCol).value) 
                constante=constante+1 

            elif Q==True: 
              constante3=constante3+1 # I have a problem here. I would like to count the number of variables that already exists but I find huge numbers. 
        break 
     col=col+lastCol 

bdd.close()

Thanks in advance for your help. :)

Source

2017-05-02 Jen

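imho, the whole block of code after 'for filename in files:' needs to be indented one level, except for 'bdd.close()', for the loop to make sense. I've made the edit. If that's wrong, edit it again. – aneroid

This may be too broad for SO, so here are some pointers on where you can optimize. Maybe add a screenshot of a sample sheet.

wrt `if sheet1.cell_value(ligne, col2) == 'DEP':` Can 'DEP' occur more than once in a sheet? If it is guaranteed to occur only once, then break out of both loops as soon as you have the values for colDep and firstRow. Do this by adding a break to end the inner loop, then checking a flag value and breaking out of the outer loop before it iterates again. Like this: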

colDep = None  # initialise to None
firstRow = None  # initialise to None
for ligne in range(0, lastRow):
    for col2 in range(0, lastCol):
        if sheet1.cell_value(ligne, col2) == 'DEP':
            colDep = col2
            firstRow = ligne
            break  # break out of the `col2 in range(0,lastCol)` loop
    if colDep is not None:  # or just `if colDep:` if colDep will never be 0.
        break  # break out of the `ligne in range(0,lastRow)` loop
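
The same early exit can also be obtained by wrapping the search in a function and returning as soon as the header is found. This is only a minimal sketch: it assumes the sheet has already been read into a list of rows (with xlrd, for example, via sheet1.row_values(i)), and the name find_header and the sample data are made up for illustration.

```python
def find_header(rows, label='DEP'):
    """Return (row_index, col_index) of the first cell equal to label, or None."""
    for ligne, row in enumerate(rows):
        for col2, value in enumerate(row):
            if value == label:
                # Returning leaves both loops at once, so no flag is needed.
                return ligne, col2
    return None


# Made-up sample: two empty rows above the header row.
rows = [
    ['', '', ''],
    ['', '', ''],
    ['CODGEO', 'DEP', 'VAR_A'],
    ['61001', '61', 100.0],
]
print(find_header(rows))  # (2, 1)
```

Checking the result against None, rather than relying on truthiness, keeps the code correct when 'DEP' sits in row 0 or column 0.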

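I think the loop in the block where you write to bdd, `for ligne in range(0,lastRow):`, should start at firstRow, because you know that rows 0 to firstRow-1 of sheet1 are empty: you have just read them while looking for the header.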
```
for ligne in range(firstRow, lastRow): 
```
This avoids wasting time reading the empty rows above the header.
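
Going one step further, the rows that belong to the department could be located once and then reused for every column, instead of rescanning the DEP column for each variable. The following is a rough sketch on plain Python lists, not a drop-in replacement for the program in the question. The function names and the sample data are hypothetical, and a set stands in for listeV so that variables already copied from an earlier file are skipped cheaply.

```python
def rows_for_department(rows, first_row, col_dep, dep='61'):
    """Indices of the data rows whose DEP cell equals dep, scanned once."""
    return [i for i in range(first_row + 1, len(rows))
            if rows[i][col_dep] == dep]


def copy_new_columns(rows, first_row, col_dep, seen, dep='61'):
    """Return {header: values} for the columns whose header is not yet in seen."""
    wanted = rows_for_department(rows, first_row, col_dep, dep)
    result = {}
    # As in the question, data columns are assumed to start at col_dep + 2.
    for n_col in range(col_dep + 2, len(rows[first_row])):
        header = rows[first_row][n_col]
        if header in seen:
            continue  # variable already copied from an earlier file
        seen.add(header)
        result[header] = [rows[i][n_col] for i in wanted]
    return result


# Made-up sample: header in row 1, DEP in column 1.
rows = [
    ['', '', '', '', ''],
    ['CODGEO', 'DEP', 'LIBGEO', 'VAR_A', 'VAR_B'],
    ['61001', '61', 'x', 1.0, 2.0],
    ['14001', '14', 'y', 3.0, 4.0],
    ['61002', '61', 'z', 5.0, 6.0],
]
seen = {'VAR_A'}  # pretend VAR_A came from a previous file
new = copy_new_columns(rows, first_row=1, col_dep=1, seen=seen)
print(new)  # {'VAR_B': [2.0, 6.0]}
```

The number of variables skipped for a file is then simply the number of headers that were already in seen, which avoids counting inside the row loop.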

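Other notes for cleaner code:

- Use the `with xlsxwriter.Workbook('U:\\INSEE\\Department61.xlsx') as bdd:` syntax for clarity.
  - And always use double backslashes, even when the character after the backslash does not form an escape sequence within the string: 'U:\\INSEE\\Department61.xlsx'
- You have used both sheet1.cell_value() and sheet1.cell().value for your reads. Pick one, unless you need the extended cell information in the value=='61' case.
- Read PEP-8 to learn how to write more readable code.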
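
To illustrate the point about backslashes: a doubled backslash and a raw string spell the same path, whereas a single backslash only works when the next character happens not to form an escape sequence. A small sketch; the raw-string form is an alternative that is not mentioned above, and the path in risky is a hypothetical example.

```python
doubled = 'U:\\INSEE\\Department61.xlsx'
raw = r'U:\INSEE\Department61.xlsx'  # raw string: backslashes are kept literally
print(doubled == raw)  # True

# Hypothetical path: a single backslash followed by 't' silently becomes a tab.
risky = 'U:\table.xls'
print('\t' in risky)  # True
```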

Source

2017-05-02 22:21:51 aneroid

Thanks for your help. I will read it. – Jen

I changed some of my code. However, I still have some problems. – Jen
