2016-05-24 78 views

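Python: reordering the columns of a CSV file

I am collecting data and saving it to CSV files, but for presentation purposes I want to reorder the columns in each CSV file according to a predefined "order" for that file.

I used this question (write CSV columns out in a different order in Python) as a guide, but I don't understand why I get the error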

writeindices = [name2index[name] for name in writenames] 
KeyError: % Processor Time 

when I run it. Note that the error does not seem to be limited to the string '% Processor Time'.

Where am I going wrong?

Here is my code:

import csv
import operator

CPU_order = ["%" + " Processor Time", "%" + " User Time", "Other"]
Memory_order = ["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"]

def reorder_csv(path, title, input_file):
    # Pick the desired column order for this kind of file
    if title == 'CPU':
        order = CPU_order
    elif title == 'Memory':
        order = Memory_order

    output_file = path + '/' + title + '_reorder' + '.csv'

    writenames = order

    reader = csv.reader(input_file)
    # newline='' lets the csv module control line endings
    writer = csv.writer(open(output_file, 'w', newline=''))

    readnames = next(reader)  # header row
    name2index = dict((name, index) for index, name in enumerate(readnames))
    writeindices = [name2index[name] for name in writenames]
    reorderfunc = operator.itemgetter(*writeindices)
    writer.writerow(writenames)

    for row in reader:
        writer.writerow(reorderfunc(row))

Here is a sample of what the input CSV file looks like:

,CPU\% User Time,CPU\% Processor Time,CPU\Other 
05/23/2016 06:01:51.552,0,0,0 
05/23/2016 06:02:01.567,0.038940741537158409,0.62259056657940626,0.077882481554869071 
05/23/2016 06:02:11.566,0.03900149141703179,0.77956981074955856,0 
05/23/2016 06:02:21.566,0,0,0 
05/23/2016 06:02:31.566,0,1.1695867249963632,0 

Please post the contents of your 'input_file'! **Update:** especially the header row. – schwobaseggl

Answer

Your code works. The problem is that your data has no column named "% Processor Time". Here is the simple data I used:

Other,% User Time,% Processor Time 
o1,u1,p1 
o2,u2,p2 

Here is how I called the code:

reorder_csv('.', 'CPU', open('data.csv')) 

With this setup, everything works fine. Please check your data.
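
One way to check the data is to compare the names you want against the header row before building the index. This is only a sketch: the helper name `missing_columns` is hypothetical (it is not part of the original code), and it assumes the header is the first line of the file. The sample header is copied from the question.

```python
import csv
import io


def missing_columns(input_file, wanted):
    """Return the names in `wanted` that do not appear in the CSV header row."""
    reader = csv.reader(input_file)
    header = next(reader)  # the first row is assumed to be the header
    present = set(header)
    return [name for name in wanted if name not in present]


# Header taken from the sample in the question: every name carries a "CPU\" prefix.
sample = io.StringIO(
    ",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n"
    "05/23/2016 06:01:51.552,0,0,0\n"
)
wanted = ["% Processor Time", "% User Time", "Other"]
missing = missing_columns(sample, wanted)
print(missing)
```

Any name that comes back in the list would raise a KeyError in the name2index lookup.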

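Update

Now that I can see your data, it looks like you have column names such as "CPU\% Processor Time" and want to translate them to "% Processor Time" before writing them out. All you need to do is create name2index like this: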

name2index = dict((name.replace('CPU\\', ''), index) for index, name in enumerate(readnames)) 

The difference is that instead of name you use name.replace('CPU\\', ''), which strips the "CPU\" part from each column name.
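
If the text before the backslash is not exactly "CPU\" (for example, a longer counter path), the replace call will not produce the bare counter name and the lookup will still fail. A minimal sketch of a more tolerant variant keeps only the part after the last backslash. The helper name `strip_prefix` and the header row with a host name in it are assumptions for illustration, not taken from the original data.

```python
def strip_prefix(name):
    """Keep only the text after the last backslash (the whole name if there is none)."""
    return name.rsplit('\\', 1)[-1]


# Hypothetical header row; the host-name prefix is an assumption for illustration.
readnames = [
    '',
    '\\\\HOST\\CPU\\% User Time',
    '\\\\HOST\\CPU\\% Processor Time',
    'CPU\\Other',
]
name2index = {strip_prefix(name): index for index, name in enumerate(readnames)}

writenames = ["% Processor Time", "% User Time", "Other"]
writeindices = [name2index[name] for name in writenames]
print(writeindices)
```

This assumes the counter names themselves never contain a backslash, and that no two columns end in the same name.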

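Update 2

I reworked your code to use csv.DictReader and csv.DictWriter. I also assumed that "CPU\% Privileged Time" should be converted to "Other". If that is not the case, you can fix it in the transformer dictionary.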

import csv
import os

def rename_columns(row):
    """Take a row (dictionary) of data and return a new row with columns renamed."""
    transformer = {
        'CPU\\% User Time': '% User Time',
        'CPU\\% Processor Time': '% Processor Time',
        'CPU\\% Privileged Time': 'Other',
    }
    # Keys not listed in transformer are kept unchanged
    new_row = {transformer.get(k, k): v for k, v in row.items()}
    return new_row

def reorder_csv(path, title, input_file):
    header = dict(
        CPU=["% Processor Time", "% User Time", "Other"],
        Memory=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"],
    )

    reader = csv.DictReader(input_file)
    output_filename = os.path.join(path, '{}_reorder2.csv'.format(title))

    # newline='' lets the csv module control line endings
    with open(output_filename, 'w', newline='') as outfile:
        # Create a new writer where each row is a dictionary.
        # If the row contains extra keys, ignore them
        writer = csv.DictWriter(outfile, header[title], extrasaction='ignore')
        writer.writeheader()
        for row in reader:
            # Each row is a dictionary, not a list
            print(row)  # debug: row as read
            row = rename_columns(row)
            print(row)  # debug: row after renaming
            print()
            writer.writerow(row)
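
To see what the field order and extrasaction='ignore' do without touching real files, here is a small sketch on in-memory data. The sample row values and the `rename` mapping are made up for illustration (the mapping follows the header shown in the question); only csv.DictReader, csv.DictWriter and io.StringIO come from the standard library.

```python
import csv
import io

# In-memory stand-in for the input file; the numeric values are invented.
source = io.StringIO(
    ",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n"
    "05/23/2016 06:01:51.552,0.1,0.2,0.3\n"
)
rename = {
    'CPU\\% User Time': '% User Time',
    'CPU\\% Processor Time': '% Processor Time',
    'CPU\\Other': 'Other',
}
fieldnames = ["% Processor Time", "% User Time", "Other"]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames, extrasaction='ignore', lineterminator='\n')
writer.writeheader()
for row in csv.DictReader(source):
    # Rename known columns; the unnamed timestamp column keeps its '' key
    # and is skipped on write because '' is not in fieldnames.
    writer.writerow({rename.get(k, k): v for k, v in row.items()})

result = out.getvalue()
print(result)
```

Note that the timestamp column is dropped in this sketch because it is not listed in fieldnames. If you want to keep it, add its key (the empty string, given the header above) to the front of the field list.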

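Thanks. My data has text before the backslash (I have updated the question above), but I thought that since I am looking for the given string with "in", it should still work? – Catherine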


With the new name2index using replace, I still get 'KeyError: '% Processor Time'' – Catherine


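I noticed that your CSV is missing a header for the timestamp (the first column). Is that the problem? It would help if you posted a sample of the CSV in its original form. –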