Clearing a list on each loop iteration in Python?

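I am trying to loop over several articles on Reddit, go through each article, extract the most relevant entity (by picking the one with the highest relevance score), and add it to the list master_locations: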

from __future__ import print_function 
from alchemyapi import AlchemyAPI 
import json 
import urllib2 
from bs4 import BeautifulSoup 

alchemyapi = AlchemyAPI() 
reddit_url = 'http://www.reddit.com/r/worldnews' 
urls = [] 
locations = [] 
relevance = [] 
master_locations = [] 

def get_all_links(page):
    html = urllib2.urlopen(page).read()
    soup = BeautifulSoup(html)
    for a in soup.find_all('a', 'title may-blank ', href=True):
        urls.append(a['href'])
        run_alchemy_entity_per_link(a['href'])

def run_alchemy_entity_per_link(articleurl):
    response = alchemyapi.entities('url', articleurl)
    if response['status'] == 'OK':
        for entity in response['entities']:
            if entity['type'] in entity == 'Country' or entity['type'] == 'Region' or entity['type'] == 'City' or entity['type'] == 'StateOrCountry' or entity['type'] == 'Continent':
                if entity.get('disambiguated'):
                    locations.append(entity['disambiguated']['name'])
                    relevance.append(entity['relevance'])
                else:
                    locations.append(entity['text'])
                    relevance.append(entity['relevance'])
            else:
                locations.append('No Location')
                relevance.append('0')
        max_pos = relevance.index(max(relevance)) # get nth position of the highest relevancy score
        master_locations.append(locations[max_pos]) # Use n to get nth position of location and store that location name to master_locations
        del locations[0] # RESET LIST
        del relevance[0] # RESET LIST
    else:
        print('Error in entity extraction call: ', response['statusInfo'])

get_all_links('http://www.reddit.com/r/worldnews') # Gets all URLs per article, then analyzes entity 

for item in master_locations: 
    print(item)

But I think that, for some reason, the lists locations and relevance are not being reset. Am I doing something wrong?

The printed result is:

Holland 
Holland 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Johor Bahru

(probably because the lists are not being cleared)
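The symptom can be reproduced without the network calls. Below is a minimal sketch, assuming made-up (name, score) pairs in place of the AlchemyAPI response; the helper name `record_top_location` is hypothetical and only mirrors the end of `run_alchemy_entity_per_link`.

```python
# Module-level lists shared across calls, as in the question.
locations = []
relevance = []
master_locations = []

def record_top_location(entities):
    """entities: (name, score) pairs for one article (made-up sample data)."""
    for name, score in entities:
        locations.append(name)
        relevance.append(score)
    max_pos = relevance.index(max(relevance))
    master_locations.append(locations[max_pos])
    del locations[0]   # removes only the first element
    del relevance[0]   # everything else survives into the next call

record_top_location([("Holland", 0.9), ("Paris", 0.2)])
record_top_location([("Beirut", 0.5)])
record_top_location([("Oslo", 0.4)])
print(master_locations)  # ['Holland', 'Beirut', 'Beirut']
```

The third article only mentions Oslo, yet Beirut is reported again because its stale entry is still in the shared lists.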

Source

2014-09-06 Phillipe Dongwoo Han

I've downvoted because this is a long piece of code, most of it irrelevant, that could have been simplified a lot. http://sscce.org/ – Davidmh 2014-09-06 10:05:46

del list[0] only deletes the first item in the list.

To delete all items, use one of the following:

del list[:]

or

list[:] = []
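A short sketch of the difference, using made-up lists (the variable names are illustrative only). Both `del items[:]` and `items[:] = []` empty the existing list object in place, so every other name bound to that list sees the change too.

```python
first_only = ['a', 'b', 'c']
del first_only[0]            # removes just the first element
print(first_only)            # ['b', 'c']

cleared = ['a', 'b', 'c']
alias = cleared              # second name for the same list object
del cleared[:]               # removes every element, in place
print(cleared, alias)        # [] []

reassigned = ['a', 'b', 'c']
reassigned[:] = []           # slice assignment also empties in place
print(reassigned)            # []
```

Note that naming a variable `list`, as in the snippets above, shadows the built-in type, so a different name is usually safer in real code.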

Source

2014-09-06 08:34:32 falsetru

I tried changing the lists to 'locations[:] = []' and 'relevance[:] = []', but then I get a 'ValueError: max() arg is an empty sequence' error. – 2014-09-06 09:33:25

@PhillipeDongwooHan, guard the two lines before the 'del' statements with 'if relevance:'. – falsetru 2014-09-06 09:35:07

Thanks! That fixed it! But could you briefly explain why this works? Why is the if condition needed? – 2014-09-06 09:51:19
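On why the guard matters: once the lists really are emptied after each article, an article that yields no entities leaves `relevance` empty, and `max()` raises `ValueError` on an empty sequence. An empty list is falsy, so `if relevance:` skips the lookup in that case. A minimal sketch, assuming a hypothetical helper `top_location` that is not part of the original code:

```python
def top_location(locations, relevance):
    """Return the location with the highest score, or None if none were collected."""
    if not relevance:          # empty list is falsy: nothing to compare
        return None
    max_pos = relevance.index(max(relevance))
    return locations[max_pos]

# Without the guard, an empty list triggers the error from the comment above.
try:
    max([])
    raised = False
except ValueError:
    raised = True

print(raised)                                     # True
print(top_location([], []))                       # None
print(top_location(['Paris', 'Oslo'], [0.2, 0.7]))  # Oslo
```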

In your case, don't reuse the lists; just create new ones:

from __future__ import print_function
from alchemyapi import AlchemyAPI
import json
import urllib2
from bs4 import BeautifulSoup

alchemyapi = AlchemyAPI()
reddit_url = 'http://www.reddit.com/r/worldnews'

def get_all_links(page):
    html = urllib2.urlopen(page).read()
    soup = BeautifulSoup(html)
    urls = []
    master_locations = []
    for a in soup.find_all('a', 'title may-blank ', href=True):
        urls.append(a['href'])
        master_locations.append(run_alchemy_entity_per_link(a['href']))
    return urls, master_locations

def run_alchemy_entity_per_link(articleurl):
    response = alchemyapi.entities('url', articleurl)
    if response['status'] != 'OK':
        print('Error in entity extraction call: ', response['statusInfo'])
        return
    # A fresh list per article, so nothing carries over between calls.
    locations_with_relevance = []
    for entity in response['entities']:
        if entity['type'] in ('Country', 'Region', 'City', 'StateOrCountry', 'Continent'):
            if entity.get('disambiguated'):
                location = entity['disambiguated']['name']
            else:
                location = entity['text']
            # float(), not int(): int() would truncate or reject a fractional score.
            locations_with_relevance.append((float(entity['relevance']), location))
        else:
            locations_with_relevance.append((0.0, 'No Location'))
    if not locations_with_relevance:
        # No entities at all: max() would raise ValueError on an empty list.
        return 'No Location'
    return max(locations_with_relevance)[1]

def main():
    _urls, master_locations = get_all_links(reddit_url) # Gets all URLs per article, then analyzes entity

    for item in master_locations:
        print(item)

if __name__ == '__main__':
    main()

When you have several values to store per item, put them in a tuple and keep the tuples in one list, rather than maintaining two or more separate lists.
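A small sketch of the tuple approach, assuming made-up entity dictionaries with the `text` and `relevance` keys used in the code above, and assuming relevance arrives as a string (as the '0' placeholder in the question suggests). Tuples compare element by element, so putting the score first lets `max()` pick the most relevant location directly.

```python
# Made-up sample data shaped like the entities in the question.
entities = [
    {'text': 'Beirut', 'relevance': '0.71'},
    {'text': 'Lebanon', 'relevance': '0.93'},
    {'text': 'Paris', 'relevance': '0.12'},
]

# One list of (score, name) tuples instead of two parallel lists.
pairs = [(float(e['relevance']), e['text']) for e in entities]

best_score, best_location = max(pairs)
print(best_location, best_score)  # Lebanon 0.93
```

Because each score stays attached to its name, there is no index bookkeeping and no pair of lists that can drift out of step.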

Source

2014-09-06 08:52:16 Daniel

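Hmm.. I tried running your code and got 'TypeError: 'list' object is not callable'? – 2014-09-06 09:32:09

@PhillipeDongwooHan: Corrected. In any case, the point is more to read the code and spot the differences. – Daniel 2014-09-06 10:03:10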
