Selecting specific columns from a CSV file

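My code reads the first 28 columns of a text file and formats/removes some of the data. How can I select specific columns instead? The columns I want are 0 to 25 and column 28. What is the best way to do this?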

Thanks in advance!

import csv
import os

my_file_name = os.path.abspath('NVG.txt')
cleaned_file = "cleanNVG.csv"
remove_words = ['INAC-EIM','-INAC','TO-INAC','TO_INAC','SHIP_TO-inac','SHIP_TOINAC']

# The with block closes both files on exit, so no explicit close() is needed.
with open(my_file_name, 'r', newline='') as infile, open(cleaned_file, 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    cr = csv.reader(infile, delimiter='|')
    writer.writerow(next(cr)[:28])  # header row, first 28 columns
    for line in (r[0:28] for r in cr):
        # Skip any row in which a field contains one of the unwanted words
        if not any(remove_word in element for element in line for remove_word in remove_words):
            line[11] = line[11][:5]  # keep only the first 5 characters of column 11
            writer.writerow(line)


2017-02-28 Cesar

Take a look at pandas.

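import pandas as pd

usecols = list(range(26)) + [28]  # column positions 0 to 25, plus 28
# sep='|' matches the pipe-delimited file in the question
data = pd.read_csv(my_file_name, sep='|', usecols=usecols)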

You can also conveniently write the filtered data back to a new file:

with open(cleaned_file, 'w') as f: 
    data.to_csv(f)
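
To make the approach above easier to try out, here is a minimal, self-contained sketch. It assumes a pipe-delimited input like the one in the question; the in-memory sample data and the column names `c0` to `c28` are made up for illustration and are not from the original file. It also passes `index=False` to `to_csv`, on the assumption that the row index is not wanted in the output.

```python
import io

import pandas as pd

# Hypothetical sample: a header plus two data rows, 29 pipe-separated columns each.
header = "|".join(f"c{i}" for i in range(29))
rows = ["|".join(f"r{r}v{i}" for i in range(29)) for r in range(2)]
sample = io.StringIO("\n".join([header] + rows) + "\n")

# Integer entries in usecols are treated as column positions (0-based).
usecols = list(range(26)) + [28]
data = pd.read_csv(sample, sep="|", usecols=usecols, dtype=str)

# Write the selected columns back out; index=False leaves out the row index.
out = io.StringIO()
data.to_csv(out, index=False)
result_lines = out.getvalue().splitlines()
```

With a real file, `sample` would be replaced by the input path and `out` by the output path or an open file handle.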


2017-02-28 20:24:30 Ohjeah

'Pandas' makes data manipulation so simple and doable. +1 from me. –

Exclude column 26 and column 27 from each row:

for row in cr: 
    content = list(filter(lambda x: row.index(x) not in [25,26], row)) 
    # work with the selected columns content


2017-02-28 20:26:02 haifzhan

If you have to call list anyway, why not use a list comprehension here: 'content = [x for x in cr if cr.index(x) not in [25,26]]' – Ohjeah

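You probably meant to filter the row, not the reader. As written, you would exhaust the reader in the first iteration of the for loop. Using find is also wasteful; why not 'enumerate()'? –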

@IljaEverilä Yes, 'row'. Fixed the typo. Thanks! – haifzhan
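
The `enumerate()` suggestion from the comments could look roughly like the sketch below. It assumes the goal stated in the question (keep 0-based columns 0 to 25 and 28, so positions 26 and 27 are dropped); the in-memory sample data is hypothetical. Filtering by position rather than by `row.index(x)` also behaves correctly when a row contains repeated values, since `index()` only ever returns the first match.

```python
import csv
import io

# Hypothetical sample: a header row and one data row with repeated values,
# 29 pipe-separated columns each.
sample = io.StringIO(
    "|".join(f"c{i}" for i in range(29)) + "\n"
    + "|".join(["x"] * 29) + "\n"
)

skip = {26, 27}  # 0-based positions to drop (assumption based on the question)

selected = []
for row in csv.reader(sample, delimiter="|"):
    # Keep each value whose position is not in the skip set.
    content = [value for i, value in enumerate(row) if i not in skip]
    selected.append(content)
```

For this particular selection, plain slicing such as `row[:26] + row[28:29]` would give the same result; the `enumerate()` form is more general when the unwanted positions are scattered.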
