Scraping Wikipedia with wikipedia 1.4.0: how to skip bad results?

I am using the wikipedia package for Python 2.7 to scrape articles, using words taken from a very large dataset.

The code is below:

for node_id in top_k: 
    human_string = label_lines[node_id] 
    score = predictions[0][node_id] 
    print('%s (score = %.5f)' % (human_string, score))  


    # Wiki = wikipedia.page(human_string) 
    # print (Wiki.content) 

    lista.append(human_string) 

for i in xrange(5): 
    wiki = wikipedia.page(lista[i]) 
    print (wiki.content) 
    a = wiki.content 
    #appendowanie = '%s (score = %.5f)' % (human_string, score) 
    # appendowanie = str(human_string) 
    appendFile = open('/home/inception/wikipedia.txt', 'a') 
    appendFile.write('\n\n'+str(i)) 
    appendFile.write(a.encode("utf-8")) 
    appendFile.close()

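I want to take 5 items from the list, look each one up on Wikipedia, and write the whole article to the wikipedia.txt file. Sometimes the Wikipedia lookup raises an error because a word from the list does not match any page. Example error: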

Traceback (most recent call last):
  File "label_image.py", line 68, in <module>
    wiki = wikipedia.page(lista[i])
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 345, in __load
    raise PageError(self.title)
wikipedia.exceptions.PageError: Page id "gracile crown blackbird" does not match any pages. Try another id!


I want to change the script so that it ignores the words the wikipedia scraper cannot load. Is there a way to have a script find all the bad words?


2017-01-18 Piteight

Use a try-except block:

try:
    <get the article>
except wikipedia.exceptions.PageError as e:
    if "does not match any pages" in str(e):
        <ignore the error>
    else:
        # Some other error jumped out, so do not ignore it:
        raise

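Now, this is not 100% reliable, because in theory a page could itself be titled "does not match any pages".

So you really should inspect the exception captured in the variable e and look at the message alone, or check whether it carries an error number or something similar.

I say this because I think PageError() can be raised for more than just a page not being found.

I don't know how the PageError() exception is implemented, but perhaps:

e.msg

or

e.message

would give you the real message, instead of checking inside str(e).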
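One way to sidestep the message check altogether is to filter on the exception type and keep a list of the words that failed, which also covers the "find all the bad words" part of the question. The sketch below is only an illustration: fetch_all, FakePageError and fake_fetch are hypothetical names, and the stand-in fetcher is there so the example runs without network access.

```python
def fetch_all(titles, fetch_page, skippable):
    """Fetch every title; return (contents_by_title, failed_titles)."""
    contents = {}
    failed = []
    for title in titles:
        try:
            contents[title] = fetch_page(title)
        except skippable:
            # Only the listed exception types are skipped;
            # any other exception still propagates to the caller.
            failed.append(title)
    return contents, failed


# Hypothetical offline stand-ins for the real library call and exception.
class FakePageError(Exception):
    pass


def fake_fetch(title):
    if title == "gracile crown blackbird":
        raise FakePageError(title)
    return "Article text for " + title


contents, failed = fetch_all(
    ["magpie", "gracile crown blackbird", "jay"],
    fake_fetch,
    (FakePageError,),
)
print(failed)
```

With the real package, the call would presumably look like fetch_all(lista[:5], lambda t: wikipedia.page(t).content, (wikipedia.exceptions.PageError,)), and the returned failed list is the set of words that could not be loaded.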


2017-01-18 21:44:59 Dalen

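Thanks, I think that's it. I didn't get the 'raise' part: should I put the other error messages in the 'else'? In the if statement I added 'wiki = wikipedia.page(lista[i + 1])' to get the next article. I need to write more complex code. There is one kind of error message that gives me a list of possible Wikipedia articles. I think there should be an option to grab the first one and read the article. – Piteight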
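The "grab the first one" idea could be sketched roughly as follows. This assumes the ambiguity exception exposes its candidate titles as a list in an options attribute, which is my understanding of wikipedia.exceptions.DisambiguationError but is worth confirming in exceptions.py. The names page_or_first_option, FakeAmbiguous and fake_fetch are hypothetical stand-ins so the sketch runs offline.

```python
def page_or_first_option(title, fetch_page, ambiguous_error):
    """Fetch title; if it is ambiguous, retry once with the first candidate."""
    try:
        return fetch_page(title)
    except ambiguous_error as e:
        # Assumption: the exception carries candidate titles in e.options.
        # The retry is not guarded, so it can still raise.
        return fetch_page(e.options[0])


# Hypothetical offline stand-ins for the real library.
class FakeAmbiguous(Exception):
    def __init__(self, options):
        Exception.__init__(self, "ambiguous title")
        self.options = options


def fake_fetch(title):
    if title == "Mercury":
        raise FakeAmbiguous(["Mercury (planet)", "Mercury (element)"])
    return "text of " + title


result = page_or_first_option("Mercury", fake_fetch, FakeAmbiguous)
print(result)
```

The first candidate is not necessarily the intended article, so for a large dataset it may be safer to log such titles for review than to accept the first option blindly.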

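You can write 'raise e' if that looks better to you. But a bare 'raise' with nothing after it re-raises the error that the try block caught. Go to your Python site-packages directory and read wikipedia/exceptions.py to see exactly how PageError() works and which attributes it has in which cases. There is also the documentation. You might also be able to use wikipedia.search() instead of calling page() directly. – Dalen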
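To make the bare 'raise' behaviour concrete, here is a small self-contained sketch. LookupFailed, load and load_or_skip are hypothetical names standing in for PageError and the page lookup; the point is only that the expected failure is swallowed while any other failure is passed on unchanged.

```python
class LookupFailed(Exception):
    """Hypothetical stand-in for a library exception such as PageError."""


def load(title):
    # Hypothetical loader that can fail in two different ways.
    if title == "missing":
        raise LookupFailed("does not match any pages")
    if title == "broken":
        raise LookupFailed("server returned garbage")
    return "text of " + title


def load_or_skip(title):
    try:
        return load(title)
    except LookupFailed as e:
        if "does not match any pages" in str(e):
            return None  # expected failure: ignore it
        raise  # anything else: re-raise the exception that was just caught


print(load_or_skip("ok"))
print(load_or_skip("missing"))
```

Calling load_or_skip("broken") lets the original LookupFailed escape with its message intact, which is what the else branch in the answer is for.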
