2017-08-19

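Proxy scraper converted from Python 2.x to Python 3.x

I'm trying to convert a very simple function from Python 2 to Python 3. It scrapes a web page and returns a list of proxies so that I can use them with a bot on Twitter: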

#!/usr/bin/env python 
#python25 on windows7 
##################################### 
# GPL v2 
# Author: Arjun Sreedharan 
# Email: [email protected] 
##################################### 

import urllib2 
import re 
import os 
import time 
import random 

def main(): 
    request = urllib2.Request("http://www.ip-adress.com/proxy_list/") 
    # request.add_header("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5") 
    #Without Referer header ip-adress.com gives 403 Forbidden 
    request.add_header("Referer","https://www.google.co.in/") 
    f = urllib2.urlopen(request) 

    #outfile = open('outfile.htm','w') 
    str1 = f.read() 
    #outfile.write(str1) 

    # Normally DOT matches any character EXCEPT newline. re.DOTALL makes dot
    # include newline.
    pattern = re.compile('.*<td>(.*)</td>.*<td>Elite</td>.*', re.DOTALL) 
    matched = re.search(pattern,str1) 
    print(matched.group(1)) 
    """ 
    ip = matched.group(1) 
    os.system('echo "http_proxy=http://'+ip+'" > ~/.wgetrc') 
    if random.randint(1,2)==1: 
     os.system('wget --proxy=on -t 1 --timeout=14 --header="User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5" http://funnytweets.in -O /dev/null') 
    else: 
     os.system('wget --proxy=on -t 1 --timeout=14 --header="User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.29 Safari/525.13" http://funnytweets.in -O /dev/null') 
    """ 
if __name__ == '__main__':
    while True:
        main()
        time.sleep(2)

OK, I already know that urllib2 is different in Python 3, but I can't make it work :( Can anyone help? :) Thanks!

The original source on GitHub: https://github.com/arjun024/get-a-random-proxy-address/blob/master/src/getProxyAddress.py

Answers

In Python 3, Request and urlopen are located in the urllib.request module, so you have to change the imports accordingly.

from urllib.request import Request, urlopen 
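One thing the import change alone does not cover: in Python 3, urlopen(...).read() returns bytes, and a str regex pattern cannot be applied to bytes, so the page has to be decoded first. Below is a minimal sketch of the question's main() ported to Python 3. The names fetch_page, extract_elite_proxy, PROXY_LIST_URL and ELITE_PATTERN are my own, and UTF-8 is an assumption about the page encoding; the URL, the Referer header and the pattern come from the question.

```python
import re
from urllib.request import Request, urlopen

# URL taken from the question's script.
PROXY_LIST_URL = "http://www.ip-adress.com/proxy_list/"

# Same pattern as in the question; re.DOTALL lets "." match newlines too.
ELITE_PATTERN = re.compile(r".*<td>(.*)</td>.*<td>Elite</td>.*", re.DOTALL)


def fetch_page(url=PROXY_LIST_URL):
    """Download the page and return it as text (str), not bytes."""
    request = Request(url)
    # The question notes that the site answers 403 without a Referer header.
    request.add_header("Referer", "https://www.google.co.in/")
    with urlopen(request) as response:
        raw = response.read()  # bytes in Python 3
    # Assumption: the page is UTF-8; undecodable bytes are replaced.
    return raw.decode("utf-8", errors="replace")


def extract_elite_proxy(html):
    """Return the captured table cell, or None if the pattern does not match."""
    matched = ELITE_PATTERN.search(html)
    return matched.group(1) if matched else None
```

With these two helpers, the body of the original main() becomes print(extract_elite_proxy(fetch_page())). I have not checked whether the pattern still matches the site's current markup.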

You can make your code compatible with both Python 2 and Python 3 if you catch the ImportError exception raised when importing from urllib2.

try : 
    from urllib2 import Request, urlopen 
except ImportError: 
    from urllib.request import Request, urlopen 

Also keep in mind that URLError and HTTPError are located in urllib.error, if you need them.
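Since the script in the question mentions a 403 Forbidden response, here is a rough sketch of how the exception imports could be combined with the same try/except trick. describe_failure and fetch are hypothetical helper names of mine, not part of the original script; HTTPError is a subclass of URLError, which is why it is checked first.

```python
try:
    # Python 2
    from urllib2 import Request, urlopen, URLError, HTTPError
except ImportError:
    # Python 3
    from urllib.request import Request, urlopen
    from urllib.error import URLError, HTTPError


def describe_failure(exc):
    """Turn a urllib exception into a short message (hypothetical helper)."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, HTTPError):
        return "HTTP error %d" % exc.code
    if isinstance(exc, URLError):
        return "connection problem: %s" % exc.reason
    return str(exc)


def fetch(url, referer="https://www.google.co.in/"):
    """Return the raw response body, or None after printing why it failed."""
    request = Request(url)
    request.add_header("Referer", referer)
    try:
        return urlopen(request).read()
    except URLError as exc:  # also catches HTTPError
        print(describe_failure(exc))
        return None
```

Inside the while True loop of the question this keeps a single failed request from ending the script; whether skipping and retrying is the right reaction is a design choice for the bot.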