2014-06-19

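Print rows of a csv file that contain specified keywords

I'm new to Python, but I want to do some data analysis on some csv files. I want to print only the rows of a csv file that contain certain keywords. I use the first block to print all valid rows. From those rows I want to print the ones that include a keyword. Thanks for your help.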

import csv
import sys

csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for f in ['1.csv']:
    reader = csv.reader(open(f, 'rU'), delimiter='|', quotechar='\\')
    for row in reader:
        try:
            print row[2]
            valids += 1
        except:
            invalids += 1
print 'parsed %s records. ignored %s' % (valids, invalids)

With the keywords:

for w in ['ford', 'hyundai','honda', 'jeep', 'maserati','audi','jaguar', 'volkswagen','chevrolet','chrysler']: 

I think I need an if statement to filter the code above, but I have been struggling with this for hours and can't seem to get it to work.

Which column do you want to search for the keywords?

The file is a single-column CSV (so the first one). Thanks – user133474

So you don't need the 'csv' module at all.
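The suggestion in the last comment can be sketched as follows: for a single-column file, each line can be treated as plain text and checked directly, without the csv module. This is a minimal Python 3 sketch, not the commenter's own code. The function name `matching_lines`, the sample data, and the choice of case-insensitive substring matching are assumptions made for illustration.

```python
def matching_lines(lines, keywords):
    """Yield each line that contains at least one keyword.

    Assumption: a line "contains" a keyword if the lowercase keyword
    appears anywhere in the lowercased line (substring match).
    """
    for line in lines:
        text = line.strip()
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            yield text


# Made-up sample data standing in for the lines of a single-column file.
sample_lines = ["2012 Ford Focus\n", "mountain bicycle\n", "Used AUDI A4\n"]
keywords = {"ford", "hyundai", "honda", "jeep", "maserati", "audi",
            "jaguar", "volkswagen", "chevrolet", "chrysler"}

for line in matching_lines(sample_lines, keywords):
    print(line)
```

Because an open file object iterates over its lines, the result of `open('1.csv')` could be passed in place of `sample_lines`.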

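Answer

Your guess is right. All you need to do is filter the rows with an if statement that checks whether each field matches a keyword. Here is how you can do that (I also made some improvements to your code and explained them in the comments):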

import csv
import sys

# First, create a set of the keywords. Membership tests on a set are faster
# than on a list. The curly brackets create a set.
keywords = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi', 'jaguar',
            'volkswagen', 'chevrolet', 'chrysler'}
csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for filename in ['1.csv']:
    # The with statement makes sure that your file is closed automatically,
    # even when an error occurs. This is a common idiom.
    # In addition, in Python 2 CSV files should be opened in 'rb' mode.
    with open(filename, 'rb') as f:
        reader = csv.reader(f, delimiter='|', quotechar='\\')
        for row in reader:
            try:
                print row[2]
                valids += 1
            # Don't use bare except clauses. They catch
            # exceptions you don't want or intend to catch.
            except IndexError:
                invalids += 1
            # The filtering is done here.
            for field in row:
                if field in keywords:
                    print row
                    break
# Prefer the str.format() method over the old style string formatting.
print 'parsed {0} records. ignored {1}'.format(valids, invalids)
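The code above is Python 2. For readers on Python 3, the same filtering idea might look like the sketch below. It is an adaptation under stated assumptions, not the answerer's code: the function name `filter_rows` and the in-memory sample data are hypothetical, the `quotechar` setting is left at the csv default, and fields are stripped and lowercased before comparison. In Python 3 a real file would be opened in text mode with `newline=''` rather than in `'rb'` mode.

```python
import csv
import io

KEYWORDS = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi',
            'jaguar', 'volkswagen', 'chevrolet', 'chrysler'}


def filter_rows(csv_file, keywords, delimiter='|'):
    """Return (matching_rows, total_rows).

    A row matches when any of its fields, stripped and lowercased,
    is equal to one of the keywords (exact match on the whole field).
    """
    matches = []
    total = 0
    for row in csv.reader(csv_file, delimiter=delimiter):
        total += 1
        if any(field.strip().lower() in keywords for field in row):
            matches.append(row)
    return matches, total


# Made-up sample data; with a real file use:
#     with open('1.csv', newline='') as f: ...
sample = io.StringIO("1|x|ford\n2|y|bicycle\n3|z|Audi\n")
matches, total = filter_rows(sample, KEYWORDS)
for row in matches:
    print(row)
print('{0} of {1} rows matched'.format(len(matches), total))
```

Returning the matching rows instead of printing them inside the loop keeps the filter reusable, for example to write the matches to another file.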