2016-06-18 42 views

"'NoneType' object is not subscriptable" or "KeyError:" with openpyxl and ipwhois

I'm working on a Python 3 script that does the following:

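  1. Opens an Excel file in the working directory
  2. Selects the first sheet in the Excel file
  3. Selects all the data in the third column (a range of IP addresses in this case)
  4. Iterates through the IP addresses and calls the whois API for each one
  5. Stores the whois API result for each IP in a variable (as .json)
  6. Parses the result for the name, IP range, and contact information
  7. Writes the values from step 6 to new columns in the Excel file
  8. Saves the Excel file under a new filename in the current directory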

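The current document has a list of 427 unique IP addresses, and the RIPE names in the whois API results are unique (sometimes there are several in the same response). To accommodate this, I iterate over each RIPE name to reach the secondary data set in the ['contact'] list. This works fine as long as the contact list contains the values I want. If it doesn't, I get a NoneType error. I tried to build preventive logic with if statements that assign 'NULL' to my variables when the result == None, but then I get a KeyError on the RIPE name. I'm stuck and need your help. Here is a copy of my code: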

import openpyxl 

from pprint import pprint 
from ipwhois import IPWhois 

wb = openpyxl.load_workbook('Abuse Log with Notes FWC 2016-06-09.xlsx')#change name here to file to be used 
sheet = wb.get_sheet_by_name('Sheet1') #get first sheet in workbook 

#Add new column headings for API results 
sheet['E1'] = 'HOST NAME' 
sheet['F1'] = 'HOST COUNTRY' 
sheet['G1'] = 'IP START' 
sheet['H1'] = 'IP END' 
sheet['I1'] = 'HOST EMAIL' 
sheet['J1'] = 'HOST PHONE' 
sheet['K1'] = 'HOST ADDRESS' 

#Store all start range IP's for Amazon in one list variable 
AmazonStartIPs = [ 
    '54.64.0.0', '54.160.0.0','54.144.0.0', 
    '52.64.0.0','54.208.0.0','54.192.0.0', 
    '54.240.0.0','54.224.0.0','54.72.0.0', 
    '54.176.0.0','52.32.0.0','52.0.0.0', 
    '52.192.0.0','52.84.0.0','53.32.0.0'] 

def checkForAmazon():
    if StartAddress in AmazonStartIPs:
        Name = 'Amazon Web Services - Elastic Compute Cloud'
        CountryCode = 'US'
        AbuseEmail = '[email protected]'
        AbusePhone = '+1-206-266-4064'
        AbuseAddress = ['410 Terry Avenue','North Seattle', 'WA', '98109-5210','UNITED STATES']

iterateColumn = sheet.columns[2]#get all cell values in column C 
currentRowIndex = 2 

for Address in iterateColumn[1:5]:#test range 1:5 to reduce API load
    ip_address = Address.value#set var to value of item in iterateColumn
    IP = IPWhois(ip_address)#store whois call in var of IP
    results = IP.lookup_rdap(depth=1)#call whois and store .json results

    Name = results['network']['name']#set name to IP Host name
    Name=''.join(Name)#formatting for excel

    CountryCode = results['asn_country_code']#var for country code
    CountryCode=''.join(CountryCode)

    StartAddress = results['network']['start_address']#var for IP range Start
    StartAddress=''.join(StartAddress)

    EndAddress = results['network']['end_address']#var for IP range End
    EndAddress = ''.join(EndAddress)

    #write values above to iterable rows in spreadsheet
    sheet.cell(row = currentRowIndex, column = 5).value = Name
    sheet.cell(row = currentRowIndex, column = 6).value = CountryCode
    sheet.cell(row = currentRowIndex, column = 7).value = StartAddress
    sheet.cell(row = currentRowIndex, column = 8).value = EndAddress

    for key in results['objects']:#get unique key values in results object
        r = key#store as var of r to prevent having to call by ripe name

        AbuseEmail = results['objects'][r]['contact']['email'][0]['value']
        if results['objects'][r]['contact']['email'] == None:
            AbuseEmail = 'NULL'
        elif results['objects'] is None:
            AbuseEmail = 'NULL'

        sheet.cell(row = currentRowIndex, column = 9).value = AbuseEmail
        AbuseEmail = ''.join(AbuseEmail)

        if results['objects'][r]['contact']['phone'] == None:
            AbusePhone = 'NULL'
        else:
            AbusePhone = results['objects'][r]['contact']['phone'][0]['value']

        sheet.cell(row=currentRowIndex, column = 10).value = AbusePhone
        AbusePhone = ''.join(AbusePhone)

        if results['objects'][r]['contact']['address'] == None:
            AbuseAddress = 'NULL'
        else:
            AbuseAddress = results['objects'][r]['contact']['address'][0]['value']

        sheet.cell(row=currentRowIndex, column = 11).value = AbuseAddress
        AbuseAddress =''.join(AbuseAddress)

    checkForAmazon()

    currentRowIndex += 1


rowsUpdated = sheet.max_row 
print('{} records have been updated.'.format(rowsUpdated)) 

wb.save('ABUSE_IP_LOG_HOST_DATA.xlsx') 
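
For readers trying to reproduce the two errors without calling the whois API, here is a minimal sketch. The `objects` dictionary is made-up sample data that only imitates the shape the code above expects (handle, then `contact`, then a list of `{'value': ...}` entries), and `describe_lookup` is a hypothetical helper, not part of ipwhois. It suggests why the `== None` check for the email never helps: the chained lookup on the line before it has already raised.

```python
# Hypothetical sample data shaped like results['objects'] in the code above.
objects = {
    'HANDLE-1': {'contact': {
        'email': [{'type': None, 'value': 'abuse@example.net'}],
        'phone': None,            # field present, but no data
    }},
    'HANDLE-2': {'contact': None},  # no contact block at all
    'HANDLE-3': {},                 # 'contact' key missing entirely
}


def describe_lookup(objects, handle, field):
    """Run the same chained lookup as the script and report what happens."""
    try:
        return objects[handle]['contact'][field][0]['value']
    except TypeError as exc:
        # Subscripting None, e.g. None[0] or None['email']
        return 'TypeError: {}'.format(exc)
    except KeyError as exc:
        # A dictionary key that does not exist
        return 'KeyError: {}'.format(exc)


for handle, field in [('HANDLE-1', 'email'), ('HANDLE-1', 'phone'),
                      ('HANDLE-2', 'email'), ('HANDLE-3', 'email')]:
    print(handle, field, '->', describe_lookup(objects, handle, field))
```

A `None` anywhere along the chain gives the TypeError and a missing key gives the KeyError, so any check has to happen before the subscripting, or the exception has to be caught.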

Use `try`/`except` clauses around each spot where a missing value could raise an error. – MattDMo

I added try/except around all of those cases and it solved the problem. Thanks, MattDMo! – Fergus

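I'd suggest separating the code that performs the lookup and converts the result into a usable form from the code that adds it to the worksheet. The way you are using `sheet.cell(...)` is something I would very much like to discourage. –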
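
A rough sketch of what that separation might look like, assuming the result layout used in the question. The names `first_value`, `extract_host_fields`, `write_host_fields`, and `COLUMNS` are hypothetical, and the writer only assumes an object with a `cell(row=..., column=..., value=...)` method, as an openpyxl worksheet has.

```python
COLUMNS = ('name', 'country', 'start', 'end', 'email', 'phone', 'address')


def first_value(entries):
    """Return entries[0]['value'], or None when the list is missing or empty."""
    if not entries:
        return None
    return entries[0].get('value')


def extract_host_fields(results):
    """Convert one lookup result into a flat dict; never touches the sheet."""
    network = results.get('network') or {}
    fields = {
        'name': network.get('name'),
        'country': results.get('asn_country_code'),
        'start': network.get('start_address'),
        'end': network.get('end_address'),
        'email': None, 'phone': None, 'address': None,
    }
    # Keep the first non-empty contact value found across all objects.
    for obj in (results.get('objects') or {}).values():
        contact = (obj or {}).get('contact') or {}
        for field in ('email', 'phone', 'address'):
            if fields[field] is None:
                fields[field] = first_value(contact.get(field))
    return fields


def write_host_fields(sheet, row_index, fields, first_column=5):
    """Write the extracted fields to columns E..K of one row."""
    for offset, key in enumerate(COLUMNS):
        value = fields[key] if fields[key] is not None else 'NULL'
        sheet.cell(row=row_index, column=first_column + offset, value=value)
```

With this split, the parsing can be tested against saved results without opening a workbook, and the 'NULL' placeholder is decided in one place.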

Answer

As MattDMo suggested, I added the following exception handling, and it solved the problem:

try: 
    AbuseEmail = results['objects'][r]['contact']['email'][0]['value'] 
    AbusePhone = results['objects'][r]['contact']['phone'][0]['value'] 
    AbuseAddress = results['objects'][r]['contact']['address'][0]['value'] 
except (KeyError, TypeError): 
    AbuseEmail = 'NULL' 
    AbusePhone = 'NULL' 
    AbuseAddress = 'NULL'
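
One thing to be aware of with a single `try` block: if any one of the three lookups fails, all three values end up as 'NULL', even when the others were available. A possible variation, sketched here with a hypothetical helper `contact_value` and made-up sample data, falls back field by field. `IndexError` is caught as well on the assumption that a contact list could be empty.

```python
def contact_value(results, handle, field, default='NULL'):
    """Return the first value of one contact field, or default if it is absent."""
    try:
        return results['objects'][handle]['contact'][field][0]['value']
    except (KeyError, TypeError, IndexError):
        return default


# Hypothetical result in which only the phone number is missing.
results = {'objects': {'HANDLE-1': {'contact': {
    'email': [{'type': None, 'value': 'abuse@example.net'}],
    'phone': None,
    'address': [{'type': None, 'value': '1 Example Street'}],
}}}}

AbuseEmail = contact_value(results, 'HANDLE-1', 'email')
AbusePhone = contact_value(results, 'HANDLE-1', 'phone')
AbuseAddress = contact_value(results, 'HANDLE-1', 'address')
```

Here the email and address are kept and only the phone falls back to the default.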