AIOHttp failing after some requests #2920

Closed
arjuncec opened this issue Apr 6, 2018 · 5 comments

arjuncec commented Apr 6, 2018

Long story short

I fire around 6000 requests concurrently, and after some time the program terminates without any error.

Expected behaviour

The program should run to completion and download all the files.

Actual behaviour

The program terminates without any error message.

Steps to reproduce

I have only one channel, and it has around 6000 files.

import asyncio
import os
import ssl
import sys
import time
from configparser import ConfigParser
import aiohttp
from aiohttp import ClientSession


async def fetch_files(channel):
    global files
    tasks = []
    files = []
    async with ClientSession(connector=aiohttp.TCPConnector(ssl_context=get_ssl_context())) as session:
        try:
            url = config.get(env, 'cron.url.channel.files').replace("${channel}", channel)
            task = asyncio.ensure_future(fetch(url, session, "json"))
            tasks.append(task)
        except Exception as e:
            print("The error occurred while processing ", str(e))
        files = await asyncio.gather(*tasks, return_exceptions=True)
        if len(files) > 0:
            channel_dir = folder_name + "/" + channel
            if not os.path.exists(channel_dir):
                os.makedirs(channel_dir)
            futures = [fetch_file_data(channel, file, session) for file in files[0]]
            await asyncio.ensure_future(asyncio.gather(*futures, return_exceptions=True))


async def fetch(url, session, req_type):
    async with session.get(url) as response:
        print("Response status code for the url ", url, " is ", response.status)
        if response.status == 200:
            if req_type == "json":
                return await response.json()
            else:
                return await response.text()
        else:
            return None


async def fetch_file_data(channel, file, session):
    try:
        print("The url is ", config.get(env, 'cron.url.file'))
        url = config.get(env, 'cron.url.file').replace("${channel}", channel).replace("${file}", file)
        file_path = folder_name + "/" + channel + "/" + file
        print("Writing the file ", file_path)
        print("The channel is ", channel)
        print(url)
        response = await fetch(url, session, "text")
        if response is not None:
            with open(file_path, "w+") as f:
                f.write(response)
    except asyncio.TimeoutError as r:
        # TimeoutError has no .message attribute in Python 3; use repr() instead.
        print("Timeout, skipping ", url, " and the exception is ", repr(r))
    except Exception as e:
        print("Error occurred while processing ", str(e))


def get_env():
    global env
    return "test"


def get_ssl_context():
    global ssl_context
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    ssl_context.load_cert_chain(config.get(env, 'cron.crt.file'), config.get(env, 'cron.key.file'))
    return ssl_context


files = []
env = ""
ssl_context = ""
config = ""
folder_name = ""


def main(args):
    start = time.time()
    global config
    global env
    global files
    global folder_name
    channel_list = []
    files = []
    env = get_env()
    if len(args) >= 3:
        folder_name = args[1]
        if not os.path.exists(folder_name):
            os.makedirs(folder_name)
        for x in args[2:]:
            channel_list.append(x)
        config = ConfigParser()
        config.read('config.ini')
        try:
            loop = asyncio.get_event_loop()
            loop.set_debug(True)
            futures = [fetch_files(channel) for channel in channel_list]
            result = asyncio.ensure_future(asyncio.gather(*futures, return_exceptions=True))
            loop.run_until_complete(result)
        except asyncio.CancelledError:
            print("aiohttp cancelled")
        except Exception as e:
            # Report the actual exception instead of hiding it.
            print("Closed abruptly: ", repr(e))
        finally:
            loop.close()
    else:
        print("Expected more params")
    times = time.time() - start
    print("time ", times, "seconds ie ", times/60, " minutes")

if __name__ == "__main__":
    main(sys.argv)

Your environment

Python 3.5
aiohttp 2.3.9

Member

asvetlov commented Apr 6, 2018

Sorry, without more information I have no idea how to help.
Unfortunately, I cannot just run the provided file: it requires some unknown environment variables etc.
Could you run it in debug mode (PYTHONASYNCIODEBUG=1)?
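For readers unfamiliar with the debug mode mentioned here: it can be switched on from the shell with `PYTHONASYNCIODEBUG=1 python script.py` (the script name is a placeholder) or from code with `loop.set_debug(True)`. The messages go through the standard `asyncio` logger, so it may help to configure logging as well. A minimal sketch, assuming Python 3; `make_debug_loop` is a hypothetical helper, not part of asyncio or aiohttp:

```python
import asyncio
import logging


def make_debug_loop():
    """Create an event loop with asyncio debug mode and verbose logging enabled."""
    # Route asyncio's log records (slow callbacks, unretrieved exceptions) to stderr.
    logging.basicConfig(level=logging.DEBUG)
    logging.getLogger("asyncio").setLevel(logging.DEBUG)
    loop = asyncio.new_event_loop()
    loop.set_debug(True)  # same effect as PYTHONASYNCIODEBUG=1 for this loop
    return loop


if __name__ == "__main__":
    loop = make_debug_loop()
    try:
        # Placeholder coroutine; the real program would run its gather() here.
        loop.run_until_complete(asyncio.sleep(0))
    finally:
        loop.close()
```

Debug mode only reports what asyncio itself notices. Exceptions that the program catches and prints, or that `gather(return_exceptions=True)` returns as values, will not show up in this log.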

Author

arjuncec commented Apr 9, 2018

Even after running in debug mode, I couldn't find any error message. Requests were fired for 4000 files, but responses were captured for only 1589 of them. The program stopped without any error message.

grep -o 'Writing the file' ~/Personal/Log.log | wc -l 
    4000

grep -o 'Response status code for the url' ~/Personal/Log.log| wc -l
    1589

@asvetlov
Member

aiohttp always either returns a result or raises an exception but never dies silently.
Maybe you need to analyze the asyncio.gather() results?
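To illustrate the suggestion: with `return_exceptions=True`, `asyncio.gather()` does not raise. It returns each exception as an ordinary item in the result list, so failures stay invisible unless the list is inspected. A minimal sketch of that inspection, assuming Python 3.7+ for `asyncio.run`; `download`, `split_results` and `run_all` are hypothetical names, and `download` is a stand-in for the real request coroutine:

```python
import asyncio


async def download(name):
    # Hypothetical stand-in for a real download; fails for names ending in ".bad".
    await asyncio.sleep(0)
    if name.endswith(".bad"):
        raise ValueError("cannot download " + name)
    return name


def split_results(names, results):
    """Separate successful results from exceptions returned by gather()."""
    succeeded, failed = [], []
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed.append((name, result))
        else:
            succeeded.append(result)
    return succeeded, failed


async def run_all(names):
    # gather() keeps results in the same order as the awaitables passed in.
    results = await asyncio.gather(
        *(download(n) for n in names), return_exceptions=True
    )
    return split_results(names, results)


if __name__ == "__main__":
    ok, failed = asyncio.run(run_all(["a.txt", "b.bad", "c.txt"]))
    print("succeeded:", ok)
    for name, exc in failed:
        print("failed:", name, repr(exc))
```

In the script from the report, both `gather()` calls pass `return_exceptions=True` and neither result list is checked for exceptions, so a check of this kind is one place where a hidden error could surface.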

@asvetlov
Member

Cannot reproduce; I need more info to pin down the problem.
Feel free to open a new issue with a detailed error description.

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there are important points made in this discussion, please include those excerpts in that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
2 participants