2010-06-24 154 views
2
  "Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total" 
      "Subject","Nick","D1234","F4321",10,19,29 
      "Unit","HTML","D1234-1","F4321",18,, 
      "Topic","Tags","First Term","F4321",18,, 
      "Subtopic","Review of representation of HTML",,,,, 

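All of the above comes from an Excel worksheet that was converted to CSV; the Python script below reads from that CSV file.

As you will notice, the header row contains seven columns, and the data in the rows below it varies.

I have the following Python script to process this data: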

from django.db import transaction
import sys
import csv
import StringIO

file = sys.argv[1]
no_cols_flag = 0
flag = 0
header_arr = []

print file
f = open(file, 'r')

while (f.readline() != ""):
    for i in [line.split(',') for line in open(file)]:  # split on the separator
        print "==========================================================="
        row_flag = 0
        row_d = ""
        for j in i:  # for each token in the split string
            row_flag = 1
            print j

            if j:  # count and clean up non-empty tokens
                no_cols_flag = no_cols_flag + 1
                data = j.strip()
                print j

    break  # the inner loop already walked the whole file

How can I modify the script above so that it reports which column header each piece of data belongs to?

Thanks.

Answers

9

You import the csv module but never use it. Why?

If you do

import csv 
reader = csv.reader(open(file, "rb"), dialect="excel") # Python 2.x 
# Python 3: reader = csv.reader(open(file, newline=""), dialect="excel") 

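you get a reader object that contains everything you need: the first row holds the headers, and each subsequent row holds the data in the corresponding positions.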
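As a rough sketch of how the header row and the data rows line up with a plain csv.reader (Python 3 syntax; the inline sample stands in for the real file, and label_rows is a hypothetical helper name, not part of the csv module):

```python
import csv
import io

# Assumption: a shortened copy of the question's data, kept in memory
# so the sketch runs without a file on disk.
SAMPLE = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
'''


def label_rows(fileobj):
    """Yield a list of (header, value) pairs for each data row."""
    reader = csv.reader(fileobj, dialect="excel")
    header = next(reader)  # the first row holds the column titles
    for row in reader:
        # zip pairs each value with the header in the same position
        yield list(zip(header, row))


labelled = list(label_rows(io.StringIO(SAMPLE)))
for pairs in labelled:
    for name, value in pairs:
        print(f"{name}: {value!r}")
    print("-" * 20)
```

Empty cells come back as empty strings, so every row still has one value per header.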

Possibly even better (if I understand you correctly):

import csv 
reader = csv.DictReader(open(file, "rb"), dialect="excel") # Python 2.x 
# Python 3: reader = csv.DictReader(open(file, newline=""), dialect="excel") 

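A DictReader can be iterated over and returns a sequence of dicts that use the column headers as keys and the data in the following rows as values, so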

for row in reader: 
    print(row) 

will output

{'Name': 'Nick', 'Designation': 'F4321', 'Type': 'Subject', 'Total': '29', 'First-term assessment': '10', 'Second-term assessment': '19', 'Description': 'D1234'} 
{'Name': 'HTML', 'Designation': 'F4321', 'Type': 'Unit', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'D1234-1'} 
{'Name': 'Tags', 'Designation': 'F4321', 'Type': 'Topic', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'First Term'} 
{'Name': 'Review of representation of HTML', 'Designation': '', 'Type': 'Subtopic', 'Total': '', 'First-term assessment': '', 'Second-term assessment': '', 'Description': ''} 
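Since every value arrives as a string (and empty cells as ''), a follow-up step might convert the assessment columns to numbers. This is only a sketch in Python 3: to_int_or_none and read_scores are hypothetical helper names, and treating blank cells as None is an assumption about what the data means.

```python
import csv
import io

# Assumption: the question's data, kept in memory for a self-contained run.
SAMPLE = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
"Topic","Tags","First Term","F4321",18,,
"Subtopic","Review of representation of HTML",,,,,
'''

NUMERIC_COLUMNS = ("First-term assessment", "Second-term assessment", "Total")


def to_int_or_none(text):
    """Return int(text), or None when the cell is missing or blank."""
    if text is None or not text.strip():
        return None
    return int(text)


def read_scores(fileobj):
    """Read rows keyed by column header, with numeric columns converted."""
    rows = []
    for row in csv.DictReader(fileobj, dialect="excel"):
        for column in NUMERIC_COLUMNS:
            row[column] = to_int_or_none(row[column])
        rows.append(row)
    return rows


scores = read_scores(io.StringIO(SAMPLE))
for row in scores:
    print(row["Type"], row["Name"], row["Total"])
```

Each value is looked up by its column header, which is what the question asks for; the conversion step is optional.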
+0

I have fixed the indentation in an edit. – Hulk 2010-06-24 07:03:14

+3

In Python 2.x, *always* open the file in binary mode ('rb' or 'wb', as appropriate). – 2010-06-24 11:07:49

+0

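@John Machin: Why? The csv module's documentation doesn't mention this, and I have never run into a problem opening files without the 'b' flag. Some examples use it, some don't. You may very well be right, but I would like to know the rationale behind it. – 2010-06-24 11:47:51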
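One commonly cited rationale, sketched here in Python 3 terms: the csv module writes its own "\r\n" line endings and needs newlines inside quoted fields left untouched, so the file layer should not translate newlines. In Python 2 that meant the 'b' flag; in Python 3 the equivalent is newline="". The snippet below uses an in-memory buffer in place of a real file, and the sample values are made up for illustration.

```python
import csv
import io

# newline="" disables newline translation, mirroring open(path, newline="")
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, dialect="excel")
writer.writerow(["Name", "Note"])
writer.writerow(["Nick", "first line\nsecond line"])  # newline inside a field

raw = buffer.getvalue()
print(repr(raw))  # the writer itself terminates each row with "\r\n"

# Reading the untranslated text back recovers the embedded newline intact.
rows = list(csv.reader(io.StringIO(raw, newline=""), dialect="excel"))
print(rows)
```

Because the writer already emits "\r\n", a text-mode file on Windows that translates "\n" to "\r\n" would end up writing "\r\r\n", which is the kind of problem the binary-mode advice avoids. On platforms without newline translation the difference often goes unnoticed.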