2013-07-05 22 views

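Python 3 code to read a CSV file, manipulate it, then create a new file... works, but looking for improvements

This is my first post here. I am trying to learn some Python, using Python 3 and numpy.

I did a few tutorials, then decided to dive in and try a small project that I might find useful at work, since that is a good way for me to learn.

I have written a program that reads data from a CSV file with a few rows of headers. I then want to extract certain columns from that file based on their header names and write them back out to a new CSV file in a particular format.

The program works fine and does what I want, but as I am a newbie I would like some tips on how to improve my code.

My main data file (CSV) is about 57 columns wide and about 36 rows deep, so it is not big.

It works fine, but I am looking for advice and improvements.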

import csv 
import numpy as np 

#make some empty lists..at least I think that's what this does 
A=[] 
B=[] 
keep_headers=[] 

#open the main data csv file 'map.csv'...need to check what 'r' means 
input_file = open('map.csv','r') 

#read the contents of the file into 'data' 
data=csv.reader(input_file, delimiter=',') 

#skip the first 2 header rows as they are junk 
next(data) 
next(data) 

#read in the next line as the 'header' 
headers = next(data) 

#Now read in the numeric data (float) from the main csv file 'map.csv' 
A=np.genfromtxt('map.csv',delimiter=',',dtype='float',skip_header=5) 

#Get the length of a column in A 
Alen=len(A[:,0]) 

#now read the column header values I want to keep from 'keepheader.csv' 
keep_headers=np.genfromtxt('keepheader.csv',delimiter=',',dtype=str) 

#Get the length of keep headers....i.e. how many headers I'm keeping. 
head_len=len(keep_headers) 

#Now loop round extracting all the columns with the keep header titles and 
#append them to array B 
i=0 
while i < head_len: 
    #use index to find the appropriate column number. 
    item_num=headers.index(keep_headers[i]) 
    i=i+1 

    #append the selected column to array B 
    B=np.append(B,A[:,item_num]) 

#now reshape the B array (use the row count Alen rather than a hard-coded 36) 
B=np.reshape(B,(head_len,Alen)) 

#now transpose it as that's the format I want. 
B=np.transpose(B) 

#save the array B back to a new csv file called 'cmap.csv' 
np.savetxt('cmap.csv',B,fmt='%.3f',delimiter=",") 
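
The loop, reshape and transpose steps above could probably be collapsed into a single indexing step. This is only a minimal sketch: the header names and numbers below are made up and stand in for the contents of map.csv and keepheader.csv, which are not shown in the question.

```python
import numpy as np

# Hypothetical stand-ins for the header row and numeric block of map.csv.
headers = ["time", "speed", "temp", "load"]
A = np.array([[0.0, 10.0, 20.0, 30.0],
              [1.0, 11.0, 21.0, 31.0],
              [2.0, 12.0, 22.0, 32.0]])

# Hypothetical contents of keepheader.csv, in the desired output order.
keep_headers = ["load", "speed"]

# Look up the column number of each wanted header once.
col_idx = [headers.index(name) for name in keep_headers]

# Indexing with a list of integers pulls the columns out in the order
# requested, already shaped (rows, kept columns), so no np.append,
# reshape or transpose is needed.
B = A[:, col_idx]
```

Here B has one row per data row and one column per kept header, so it can be passed to np.savetxt as it is.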

Thanks.


http://codereview.stackexchange.com/ – Vaandu

Answers


You can simplify your code considerably by using more of numpy's functionality.

A = np.loadtxt('stack.txt',skiprows=2,delimiter=',',dtype=str) 
keep_headers=np.loadtxt('keepheader.csv',delimiter=',',dtype=str) 

headers = A[0,:] 
cols_to_keep = np.in1d(headers, keep_headers) 

B = A[1:,cols_to_keep].astype(float) 
np.savetxt('cmap.csv',B,fmt='%.3f',delimiter=",") 
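
To make the mask approach easier to try out, here is a small self-contained sketch. The file contents are invented and read from an in-memory buffer, and it uses np.isin, which is the newer spelling of np.in1d in recent NumPy releases; treat it as an illustration under those assumptions rather than a drop-in replacement.

```python
import io
import numpy as np

# Hypothetical file contents: two junk rows, a header row, then numeric rows.
raw = io.StringIO(
    "junk\n"
    "junk\n"
    "time,speed,temp,load\n"
    "0,10,20,30\n"
    "1,11,21,31\n"
)

# Read everything after the junk rows as strings; row 0 is the header row.
A = np.loadtxt(raw, skiprows=2, delimiter=",", dtype=str)
headers = A[0, :]

# Hypothetical list of headers to keep (stands in for keepheader.csv).
keep_headers = np.array(["load", "speed"])

# Boolean mask: True for every column whose header is in keep_headers.
cols_to_keep = np.isin(headers, keep_headers)

# Select the masked columns from the data rows and convert them to float.
B = A[1:, cols_to_keep].astype(float)

# Names of the kept columns, in the order they appear in B.
kept_names = headers[cols_to_keep]
```

One thing to be aware of: a boolean mask keeps the columns in the order they appear in the data file, not in the order listed in keep_headers. In this sketch kept_names is speed, load even though keep_headers lists load first. If the output order has to follow the keep list, integer column positions looked up per header name would be needed instead of a mask.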

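Thanks, I probably would never have found the 'in1d' part, which is very handy. I can see that my initial code, although it works, could really be improved. Great. – user2551578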


@user2551578 Thanks. If you think it meets your needs, you can accept this answer... –
