2015-07-01 57 views
-1

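For certain reasons I need to send raw HTTP headers to a server. Can Python requests do this? For example, how would I send a raw HTTP request with headers like these: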

GET http://baidu.com/ HTTP/1.1 
Host: baidu.com 
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-US,en;q=0.5 
Accept-Encoding: gzip, deflate 
Connection: keep-alive 

I found that Twisted can do this, but it is a bit complicated.

Answers

2

Using Twisted

from twisted.internet import reactor 
from twisted.web.client import Agent 
from twisted.web.http_headers import Headers 

agent = Agent(reactor) 

d = agent.request(
    b'GET',                 # method and URI must be bytes on Python 3
    b'http://baidu.com/',
    Headers({ 
      'User-Agent': ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'], 
      'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 
      'Accept-Language': ['en-US,en;q=0.5'], 
      'Accept-Encoding': ['gzip, deflate'], 
      'Connection': ['keep-alive'] 
     }), 
    None) 

def Response(null): 
    print('Response received') 

def Shutdown(null): 
    print('Shutting down the reactor now') 
    reactor.stop() 

d.addCallback(Response)  # run Response() once the response headers arrive
d.addBoth(Shutdown)      # then stop the reactor, on success or failure
reactor.run() 

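This is more complicated (especially if you want to "do stuff" with the response), but Twisted is something you should know if you plan to do web or concurrent programming in Python. I hope this helps you; if not, I hope it helps someone else who is struggling with HTTP headers and Twisted.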

Edit - March 7, 2016

Using treq

from __future__ import print_function 
from treq import get 
from twisted.internet.task import react 


def handleResponse(response): 
    """ Callback Function 

    Once the response is received, display the information. 
    This is the part where I suspect people will have the most 
    trouble wrapping their heads around since it's heavily 
    dependent on deferreds (ie. futures or promises). 
    """ 
    print('Code: %s\n' % response.code) 

    print('Simple print:') 
    response.content().addCallback(print)  # simple way to print on py2 & py3 

    text = response.text()           # returns a deferred 
    text.addCallback(displayText)    # the way you should be handling responses, ie. via callbacks 
    return text                      # return the deferred so react() waits for the body 

def displayText(text): 
    """ Callback Function 

    Simply display the text. You would usually do more useful 
    things in this callback, such as manipulating the response 
    text or setting the text to some global or otherwise accessible 
    variable(s). 
    """ 
    print('Deferred print:') 
    print(text) 

def main(reactor): 
    """ 
    This is the main function which will execute a request using the 
    GET method. After getting the response, the response code and content 
    will be displayed. Finally, the twisted reactor will stop (since 
    the react function is being used). 
    """ 
    url = 'http://baidu.com/' 
    header={ 
     'User-Agent': ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'], 
     'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 
     'Accept-Language': ['en-US,en;q=0.5'], 
     'Accept-Encoding': ['gzip, deflate'], 
     'Connection': ['keep-alive']} 

    d = get(url, headers=header) 
    d.addCallback(handleResponse) 
    return d 


react(main)   # run the main function and display results 

The treq package is easier to use than Twisted directly, and it shares many features and much of its syntax with requests.


+0

Thanks, your answer is more useful than I expected. – Hao

+0

You can always upvote ;)

1

You can do it like this:

import requests  

headers = {'Host': 'baidu.com', 
      'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0', 
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 
      'Accept-Language': 'en-US,en;q=0.5', 
      'Accept-Encoding': 'gzip, deflate', 
      'Connection': 'keep-alive'} 

requests.get('http://baidu.com/', headers=headers) 
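
If you want to check which headers the request carries without contacting the server, one option is to build the request and prepare it. This is only a minimal sketch: the header set is shortened for illustration and the variable name `prepared` is my own.

```python
import requests

# Shortened header set for illustration; the full dict above works the same way.
headers = {
    'Host': 'baidu.com',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive',
}

# Build and prepare the request; nothing is sent over the network here.
prepared = requests.Request('GET', 'http://baidu.com/', headers=headers).prepare()

# prepared.headers is a case-insensitive mapping of the headers on this request.
for name, value in prepared.headers.items():
    print('%s: %s' % (name, value))
```

Note that a request prepared this way does not include the defaults a Session would merge in (such as its own User-Agent), so what you see is only what you passed.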
1

The requests.request method (and all of its derivatives, such as requests.get and requests.head) accepts a headers argument. See the requests documentation on custom headers.

You can use it like this:

requests.get('http://baidu.com', headers={'Host':'baidu.com', 
              'Accept-Encoding': 'gzip, deflate', 
              ...})
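
If you already have the raw header block as text, as in the question, you could turn it into a dict rather than retyping it. The following is a rough sketch, not part of requests: `parse_raw_headers` and `RAW_REQUEST` are hypothetical names, and it assumes one header per line with no repeated header names.

```python
# Raw request text, shortened from the question for illustration.
RAW_REQUEST = """\
GET http://baidu.com/ HTTP/1.1
Host: baidu.com
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
"""


def parse_raw_headers(raw):
    """Turn a raw header block into a dict, skipping the request line."""
    headers = {}
    for line in raw.splitlines()[1:]:   # the first line is the request line
        if not line.strip():
            break                       # a blank line ends the header section
        name, _, value = line.partition(':')
        headers[name.strip()] = value.strip()
    return headers


headers = parse_raw_headers(RAW_REQUEST)
print(headers)
```

The resulting dict can then be passed as the headers argument shown above. A repeated header name would overwrite the earlier value in this sketch.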