I'm trying to understand how to work with `aiohttp` and `asyncio`. The code below retrieves all websites in `urls` and prints out the "size" of each response.
- Is the error handling within the `fetch` method correct?
- Is it possible to remove the result of a specific URL from `results` in case of an exception, making `return (url, '')` unnecessary?
- Is there a better way than `ssl=False` to deal with a potential `ssl.SSLCertVerificationError`? (A sketch of the kind of alternative I mean follows the code below.)
- Any additional advice on how I can improve my code quality is highly appreciated.
```python
import asyncio

import aiohttp


async def fetch(session, url):
    try:
        async with session.get(url, ssl=False) as response:
            return url, await response.text()
    except aiohttp.client_exceptions.ClientConnectorError as e:
        print(e)
        return (url, '')


async def main():
    tasks = []
    urls = [
        'http://www.python.org',
        'http://www.jython.org',
        'http://www.pypy.org'
    ]
    async with aiohttp.ClientSession() as session:
        while urls:
            tasks.append(fetch(session, urls.pop()))
        results = await asyncio.gather(*tasks)
        [print(f'{url}: {len(result)}') for url, result in results]


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
```
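To make the `ssl=False` question more concrete, this is roughly the kind of alternative I was wondering about: keeping certificate verification enabled but handing aiohttp an explicit `SSLContext`. I'm assuming the verification error comes from the trust store; the `certifi` package here is just one convenient source of a CA bundle, not something my code currently uses.

```python
import asyncio
import ssl

import aiohttp
import certifi  # assumption: any up-to-date CA bundle would work here


# Keep certificate verification on, but verify against an explicit CA file
# instead of switching verification off with ssl=False.
ssl_context = ssl.create_default_context(cafile=certifi.where())


async def fetch(session, url):
    # aiohttp also accepts an SSLContext here, not just True/False.
    async with session.get(url, ssl=ssl_context) as response:
        return url, await response.text()


async def main():
    async with aiohttp.ClientSession() as session:
        url, body = await fetch(session, 'https://www.python.org')
        print(f'{url}: {len(body)}')


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
```

As far as I can tell, the same context could also be passed once via `aiohttp.TCPConnector(ssl=ssl_context)` when creating the session, so it applies to every request.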
Update
- Is there a way I can add tasks to the list from within the "loop"? E.g. add new URLs while scraping a website and finding new subdomains to scrape. (A rough sketch of the pattern I have in mind follows below.)
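Not sure whether this is the idiomatic way, but this is roughly the pattern I had in mind: workers pulling from a shared `asyncio.Queue`, so URLs discovered while scraping can be pushed back onto the queue as new work. The `extract_links` helper is a hypothetical placeholder for whatever parsing I would actually do.

```python
import asyncio

import aiohttp


def extract_links(html):
    # Hypothetical helper: pull links/subdomains out of a page; not implemented here.
    return []


async def worker(session, queue, results, seen):
    while True:
        url = await queue.get()
        try:
            async with session.get(url) as response:
                html = await response.text()
            results[url] = len(html)
            # URLs found while scraping can be fed straight back into the queue.
            for link in extract_links(html):
                if link not in seen:
                    seen.add(link)
                    queue.put_nowait(link)
        except aiohttp.ClientError as e:
            print(e)
        finally:
            queue.task_done()


async def main():
    urls = ['http://www.python.org', 'http://www.jython.org', 'http://www.pypy.org']
    queue = asyncio.Queue()
    seen = set(urls)
    results = {}
    for url in urls:
        queue.put_nowait(url)

    async with aiohttp.ClientSession() as session:
        workers = [asyncio.ensure_future(worker(session, queue, results, seen))
                   for _ in range(3)]
        await queue.join()  # wait until every queued URL has been processed
        for w in workers:
            w.cancel()      # workers loop forever, so stop them once the queue is drained
        await asyncio.gather(*workers, return_exceptions=True)

    for url, size in results.items():
        print(f'{url}: {size}')


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
```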