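How to use scrapy shell with a URL and basic auth credentials?

I want to use scrapy shell to test the response data for a URL that requires basic authentication credentials. I checked the scrapy shell documentation but could not find anything about it. How can I use scrapy shell with a URL and basic auth credentials?

I tried scrapy shell 'http://user:[email protected]', but it didn't work. Does anyone know how I can do this?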

2017-03-16 Rohanil

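Can you share how you log in inside the spider? – eLRuLL

I use [HttpAuthMiddleware](https://doc.scrapy.org/en/latest/topics/downloader-middleware.html#scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware) in the spider, but I want to use the shell rather than a spider. – Rohanil

As long as you run the shell command from the project directory, it will work. The middleware also doesn't need 'user:password' in the URL; it handles the credentials for you –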

If you want to use only the shell, you can do something like this:

$ scrapy shell

and inside the shell:

>>> from w3lib.http import basic_auth_header
>>> from scrapy import Request
>>> auth = basic_auth_header('your_user', 'your_password')  # replace with real credentials
>>> req = Request(url="http://example.com", headers={'Authorization': auth})
>>> fetch(req)

fetch takes the given request and updates the current shell session with its response.
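For anyone wondering what ends up in the Authorization header, here is a minimal standard-library sketch of the Basic scheme. The function name build_basic_auth_header and the UTF-8 default are my own assumptions for illustration; this is not w3lib's implementation, which may differ in encoding and return type.

```python
import base64


def build_basic_auth_header(user, password, encoding="utf-8"):
    """Return the value of an HTTP Basic Authorization header.

    Hypothetical helper for illustration only; not part of Scrapy or w3lib.
    """
    # Basic auth is just "user:password", base64-encoded, prefixed with "Basic ".
    credentials = f"{user}:{password}".encode(encoding)
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = build_basic_auth_header("Aladdin", "open sesame")
print(header)
```

Note that base64 is an encoding, not encryption, so the credentials are only protected if the URL itself is served over HTTPS.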


2017-03-16 02:57:34 eLRuLL

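Thanks. It works. – Rohanil

To be honest, your idea of adding 'user:pass' directly to the URL in the shell looks interesting. I'll try to suggest it or implement it in 'scrapy'. – eLRuLL

Looks like it will be resolved soon: https://github.com/scrapy/scrapy/pull/1466 – eLRuLL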

Yes, with the httpauth middleware.

Make sure HttpAuthMiddleware is enabled in your settings, then just define:

from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    http_user = 'username'
    http_pass = 'password'
    ...

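as class variables on your spider.

Also, if the middleware is enabled in your settings, you don't need to put the login credentials in the URL.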
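As far as I know, HttpAuthMiddleware is already part of Scrapy's default downloader middlewares, so an explicit entry is normally only needed if it was disabled somewhere. A sketch of what enabling it in the project's settings.py might look like, assuming the default priority of 300:

```python
# settings.py (sketch; only needed if the middleware was disabled earlier)
# The priority 300 is assumed to match Scrapy's built-in default ordering.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware": 300,
}
```

Setting the value to None instead of a number would disable the middleware, which is worth checking for if the credentials seem to be ignored.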


2017-03-16 02:46:38

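I want to use the shell rather than a spider – Rohanil

The shell uses the project's resources –

@Rohanil try scrapy shell 'http://www.example.org', and make sure you have included the middleware in your settings and specified the login credentials as class variables, named as in my example –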
