在使用aiohttp将某些同步代码移动到asyncio的过程中。同步代码需要15分钟才能运行,所以我希望能够改善这一点。Python aiohttp/asyncio - 如何处理返回的数据
我有一些工作代码,它从一些网址获取数据并返回每个网址的正文。但这只是针对1个实验室网站,我有70多个实际网站。
因此,如果我有一个循环来创建所有网站的列表,所有的网站将使700个网址在列表中进行处理。现在处理它们我不认为是一个问题?
但是做结果的'东西',我不知道如何编程?我的代码已经可以对每个返回的结果执行“填充”,但我不确定如何针对正确的结果类型进行编程。
当代码运行时,它会处理所有的url,并根据运行时间,返回一个未知的命令?
我需要一个函数来处理任何类型的结果吗?
import asyncio, aiohttp, ssl
from bs4 import BeautifulSoup
def page_content(page):
return BeautifulSoup(page, 'html.parser')
async def fetch(session, url):
with aiohttp.Timeout(15, loop=session.loop):
async with session.get(url) as response:
return page_content(await response.text())
async def get_url_data(urls, username, password):
tasks = []
# Fetch all responses within one Client session,
# keep connection alive for all requests.
async with aiohttp.ClientSession(auth=aiohttp.BasicAuth(username, password)) as session:
for i in urls:
task = asyncio.ensure_future(fetch(session, i))
tasks.append(task)
responses = await asyncio.gather(*tasks)
# you now have all response bodies in this variable
for i in responses:
print(i.title.text)
return responses
def main():
username = 'monitoring'
password = '*********'
ip = '10.10.10.2'
urls = [
'http://{0}:8444/level/15/exec/-/ping/{1}/timeout/1/source/vlan/5/CR'.format(ip,'10.10.0.1'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'10.10.0.1'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'frontend.domain.com'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'planner.domain.com'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'10.10.10.1'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'10.11.11.1'),
'http://{0}:8444/level/15/exec/-/ping/{1}/timeout/1/source/vlan/5/CR'.format(ip,'10.12.12.60'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'10.12.12.60'),
'http://{0}:8444/level/15/exec/-/ping/{1}/timeout/1/source/vlan/5/CR'.format(ip,'lon-dc-01.domain.com'),
'http://{0}:8444/level/15/exec/-/traceroute/{1}/source/vlan/5/probe/2/timeout/1/ttl/0/10/CR'.format(ip,'lon-dc-01.domain.com'),
]
loop = asyncio.get_event_loop()
future = asyncio.ensure_future(get_url_data(urls,username,password))
data = loop.run_until_complete(future)
print(data)
if __name__ == "__main__":
main()
谢谢,我明白你所说的一切,但你在ProcessPoolExecutor部分已经失去了我。我需要一个独立的CPU进程的结果?我该怎么做呢?以及如何按顺序处理它们,还是需要一个处理所有结果的函数? – AlexW