2012-04-13 83 views
0

Get file size and append to a new column of a CSV file

I'm using Python 2.4. For my example, I have a 2-column CSV file.

For example:

HOST, FILE 
server1, /path/to/file1 
server2, /path/to/file2 
server3, /path/to/file3 

I want to get the size of the file listed on each row of the CSV, then add that value as a new column in the CSV, producing:

HOST, PATH, FILESIZE 
server1, /path/to/file1, 6546542 
server2, /path/to/file2, 46546343 
server3, /path/to/file3, 87523 

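I've tried a few approaches but haven't had much success.

The code below runs fileSizeCmd (du -b) on the PATH and correctly outputs the file size, but I haven't figured out how to use that data to add a column to the CSV file.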

import datetime 
import csv 
import os, time 
from subprocess import Popen, PIPE, STDOUT 

now = datetime.datetime.now() 
fileSizeCmd = "du -b" 
SP = " " 

# Try to get disk size and append to another row after entry above 
#st = os.stat(row[3]) 
#except IOError: 
#print "failed to get information about", file 
#else: 
#print "file size:", st[ST_SIZE] 
#print "file modified:", time.asctime(time.localtime(st[ST_MTIME])) 

incsv = open('my_list.csv', 'rb')
try:
    reader = csv.reader(incsv)
    outcsv = open('results/results_' + now.strftime("%m-%d-%Y") + '.csv', 'wb')
    try:
        writer = csv.writer(outcsv)

        for row in reader:
            p = Popen(fileSizeCmd + SP + row[1], shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
            stdout, empty = p.communicate()

            print 'Command: %s\nOutput: %s\n' % (fileSizeCmd + SP + row[1], stdout)

            # Results in bytes example
            #
            # Output:
            # 8589935104  /path/to/file
            #

            # Write 8589935104 to new column of csv FILE

    finally:
        outcsv.close()

finally:
    incsv.close()

Answers

1

A sketch, without error handling:

#!/usr/bin/env python 

import csv 
import os 

filename = "sample.csv" 
# localhost, 01.html.bak 
# localhost, 01.htmlbak 
# ... 

def filesize(filename): 
    # no need to shell out for filesize 
    return os.stat(filename).st_size 

# Note: the with statement needs Python 2.6+ (or 2.5 with
# "from __future__ import with_statement"); on 2.4 use try/finally instead.
with open(filename, 'rb') as handle:
    reader = csv.reader(handle)
    # result is written to sample.csv.updated.csv
    writer = csv.writer(open('%s.updated.csv' % filename, 'wb'))
    for row in reader:
        # need to strip filename, just in case
        writer.writerow(row + [filesize(row[1].strip())])

# result 
# localhost, 01.html.bak,10021 
# localhost, 01.htmlbak,218982 
# ... 
+0

Nice code @miku – 2012-04-13 22:29:58

+0

I can't seem to get this to work with 2.4. I think I've changed your with statement, but I'm still not having much luck – Tommy 2012-04-13 23:41:04

+0

@miku I got this working. Thanks. It does fail if the file doesn't exist, though – Tommy 2012-04-14 00:49:24
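On the last comment, about the script failing when a file doesn't exist: a minimal sketch of one way to keep the loop going. The helper name `safe_filesize` and the empty-string placeholder are assumptions made for illustration, not part of the answer above. It is written for current Python 3; the same try/except should carry over to 2.4, though that is untested here.

```python
import os
import tempfile


def safe_filesize(path, missing=""):
    """Return the size of `path` in bytes, or `missing` if it cannot be stat'ed."""
    try:
        return os.stat(path).st_size
    except OSError:
        # Raised when the file does not exist or is not accessible.
        return missing


# Small demonstration with a temporary file of known size.
handle, tmp_path = tempfile.mkstemp()
os.write(handle, b"12345")
os.close(handle)

size_existing = safe_filesize(tmp_path)
os.remove(tmp_path)
size_missing = safe_filesize(tmp_path)  # the file is gone now
```

Calling `safe_filesize(row[1].strip())` in place of `filesize(...)` would then leave the new column blank for rows whose file is missing, rather than aborting the run.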

0

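You can:

1) Read the contents of the CSV into a list of (server, filename) tuples.

2) Collect the file sizes for this list.

3) Pack each result into another tuple (server, filename, filesize) in another list ("results").

4) Write the results out to a new file.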
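The four steps above could be sketched roughly as follows. The function names (`read_entries`, `collect_sizes`, `write_results`) are made up for this example, and it assumes Python 3, a CSV without a header row, and paths that are readable from the machine running the script.

```python
import csv
import os
import tempfile


def read_entries(csv_path):
    """Step 1: read the CSV into a list of (server, filename) tuples."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle, skipinitialspace=True)
        return [(row[0], row[1].strip()) for row in reader if row]


def collect_sizes(entries):
    """Steps 2 and 3: look up each size and build (server, filename, size) tuples."""
    return [(server, name, os.stat(name).st_size) for server, name in entries]


def write_results(results, out_path):
    """Step 4: write the result tuples to a new CSV file."""
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle).writerows(results)


# Demonstration in a scratch directory (file names are placeholders).
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "file1")
with open(data_path, "w") as handle:
    handle.write("abc")

list_path = os.path.join(workdir, "my_list.csv")
with open(list_path, "w", newline="") as handle:
    handle.write("server1, %s\n" % data_path)

out_path = os.path.join(workdir, "results.csv")
results = collect_sizes(read_entries(list_path))
write_results(results, out_path)
```

Keeping the intermediate list means the input file is closed before any sizes are looked up, at the cost of holding all rows in memory.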

0

First, getting a file's size is much easier than using subprocess (see os.stat):

>>> os.stat('/tmp/file').st_size 
100 

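Second, you're on the right track with your writer object writing to a different file, but you just need to append a column to the row lists you get back from the reader and then feed them to the writer (see here). Something like this: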

>>> writerfp = open('out.csv', 'w') 
>>> writer = csv.writer(writerfp) 
>>> for row in csv.reader(open('in.csv', 'r')): 
...  row.append('column') 
...  writer.writerow(row) 
... 
>>> writerfp.close()
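One detail the snippet above leaves open is the header row in the question's file (HOST, FILE): it needs a column name appended, not a file size. A possible sketch, assuming Python 3; the function name `append_size_column` and its `size_of` parameter are hypothetical, introduced so the demonstration can run on in-memory data with stand-in sizes.

```python
import csv
import io
import os


def append_size_column(in_handle, out_handle, size_of=os.path.getsize):
    """Copy CSV rows from in_handle to out_handle, adding a FILESIZE column."""
    reader = csv.reader(in_handle, skipinitialspace=True)
    writer = csv.writer(out_handle)

    header = next(reader)  # first row holds column names, not a path
    writer.writerow(header + ["FILESIZE"])

    for row in reader:
        writer.writerow(row + [size_of(row[1].strip())])


# Demonstration with in-memory files and a stand-in size lookup
# (the sizes are the example values from the question, not real measurements).
fake_sizes = {"/path/to/file1": 6546542, "/path/to/file2": 46546343}
source = io.StringIO("HOST, FILE\nserver1, /path/to/file1\nserver2, /path/to/file2\n")
target = io.StringIO()
append_size_column(source, target, size_of=fake_sizes.get)
output_lines = target.getvalue().splitlines()
```

With real files, the default `size_of=os.path.getsize` would be used and the handles would come from `open(...)`.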