2014-08-28 51 views

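Encoding error - xlsxwriter - Python

I am trying to split lines from a file and put them into an Excel file (xlsx). According to PSPad, the file's encoding is 'cp1250'. So, to get the proper characters in the xlsx file, I decode each line from cp1250: line = line.decode("cp1250")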

The problem is that about 3,000 of the 12,000 lines return this error:

'charmap' codec can't decode byte 0x81 in position 25: character maps to <undefined> 

So as the next step I tried decode("utf-8"). I don't know why, but it works better. Only 330 lines return an error:

'utf8' codec can't decode byte 0x8e in position 0: invalid start byte 

Do you have any idea what I am doing wrong?

EDIT: The errors mostly occur on lines that contain "Z" or "S".

Here is the code (at the top of the .py file I have put "# -*- coding: utf-8 -*-"):

import xlsxwriter


def toXls(file):
    workbook = xlsxwriter.Workbook(file)
    worksheet = workbook.add_worksheet()
    a = 0  # index of the last worksheet row written
    x = 0  # number of lines that raised an error
    # Read raw bytes so that the decode() call below is explicit.
    with open("filtrovane.txt", "rb") as f:
        for line in f:
            try:
                # It should be "cp1250" according to the PSPad editor.
                line = line.rstrip(b"\r\n").decode("utf-8")
                # line = line.encode("ISO 8859-2")
                splitted = line.split("::")
                if len(splitted) == 7:
                    try:
                        a = a + 1
                        # Write the seven fields into columns 0..6 of row a.
                        for col, value in enumerate(splitted):
                            worksheet.write(a, col, value)
                    except Exception as e:
                        print("!!%s %d %s" % (line, a, e))
            except Exception as e:
                print(e)
                x = x + 1
    print(x)
    workbook.close()

What happens when you try to save it to a text file instead? Does the same problem occur? – diek 2014-08-29 02:16:02

Answer


There are two examples in the XlsxWriter docs/repo that show how to read UTF-8 and Shift JIS files and convert them to XLSX files.
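The general idea can be sketched as follows. This is not the code from the XlsxWriter examples themselves, only a minimal illustration that assumes Python 3: open the input with an explicit encoding so every line is already text, then write the fields. The helper names `read_rows` and `write_xlsx` are hypothetical; the "::" separator and the seven-field check come from the question.

```python
def read_rows(path, encoding="utf-8", sep="::", width=7):
    """Return the lines of `path` that split into exactly `width` fields."""
    rows = []
    # Passing the encoding to open() decodes each line while reading,
    # so no separate line.decode() call is needed.
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            fields = line.rstrip("\r\n").split(sep)
            if len(fields) == width:
                rows.append(fields)
    return rows


def write_xlsx(rows, out_path):
    """Write each row of text fields to one worksheet row."""
    import xlsxwriter  # third-party package, imported only when needed

    workbook = xlsxwriter.Workbook(out_path)
    worksheet = workbook.add_worksheet()
    for row_index, fields in enumerate(rows):
        # write_row() fills consecutive cells starting at (row_index, 0).
        worksheet.write_row(row_index, 0, fields)
    workbook.close()
```

Usage would be along the lines of `write_xlsx(read_rows("filtrovane.txt", encoding="utf-8"), "out.xlsx")`, where the output file name is made up for illustration.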

The same approach should work for cp1250.
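For cp1250 specifically, the two error messages in the question can be reproduced on single bytes. Byte 0x8e is a valid cp1250 character (Ž) but not a valid UTF-8 start byte, while 0x81 is unassigned in cp1250, so a strict decode rejects it. The sketch below assumes Python 3, and `decode_line` and `undecodable_bytes` are hypothetical helpers, not part of XlsxWriter. It shows one way to find the offending bytes and to decode without raising.

```python
def decode_line(raw, encoding="cp1250"):
    """Decode one raw line, replacing undecodable bytes with U+FFFD."""
    return raw.decode(encoding, errors="replace")


def undecodable_bytes(raw, encoding="cp1250"):
    """List the byte values in `raw` that a single-byte codec cannot decode."""
    bad = []
    for value in sorted(set(raw)):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            bad.append(value)
    return bad


# 0x8e decodes under cp1250 but is rejected by UTF-8.
assert b"\x8e".decode("cp1250") == "\u017d"

# 0x81 has no mapping in cp1250; errors="replace" keeps the rest of the line.
assert decode_line(b"a\x81b") == "a\ufffdb"
assert undecodable_bytes(b"a\x81\x8e") == [0x81]
```

If `undecodable_bytes` reports values for many lines, that may indicate the file is not purely cp1250. Replacing those bytes loses information, so it is a diagnostic aid rather than a fix.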