Degrading performance over time... #1821
Comments
What version of aiohttp do you use? Also check the open TCP connections: does the number grow?
On Apr 17, 2017, at 12:46 PM, Vishal Goklani wrote:
I wrote a quick AsyncScraper class below:
```python
import logging, datetime
import aiohttp
import asyncio
import uvloop
# asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger.addHandler(logging.StreamHandler())


class AsyncScraper(object):
    headers = {"User-Agent": 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36'}

    def __init__(self, number_workers=100, max_connections=1000, timeout=10):
        self.number_workers = number_workers
        self.max_connections = max_connections
        self.timeout = timeout
        self.work_queue = asyncio.Queue()
        self.cookie_jar = aiohttp.CookieJar(unsafe=True)
        self.session = aiohttp.ClientSession(
            connector=aiohttp.TCPConnector(limit=self.max_connections, verify_ssl=False),
            cookie_jar=self.cookie_jar)

    def __del__(self):
        self.session.close()

    async def get_response(self, url):
        try:
            with aiohttp.Timeout(self.timeout):
                async with self.session.get(url, allow_redirects=True, headers=AsyncScraper.headers) as response:
                    content = await response.text()
                    return {'error': "", 'status': response.status, 'url': url, 'content': content, 'timestamp': str(datetime.datetime.utcnow())}
        except Exception as err:
            return {'error': err, 'status': "", 'url': url, 'content': "", 'timestamp': str(datetime.datetime.utcnow())}

    async def consumer(self, worker_id):
        results = []
        while not self.work_queue.empty():
            queue_url = await self.work_queue.get()
            logger.debug("processing {} from worker_{}".format(queue_url, worker_id))
            response = await self.get_response(queue_url)
            logger.debug("received {} from worker_{}".format(queue_url, worker_id))
            if response['error'] is "":
                results.append(response)
        logger.debug("finished processing {} urls via worker_{}".format(len(results), worker_id))
        return results

    def get_all(self, urls):
        [self.work_queue.put_nowait(url) for url in urls]
        tasks = asyncio.gather(*[self.consumer(worker_id) for worker_id in range(self.number_workers)], return_exceptions=True)
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(tasks)
        return [__ for _ in results for __ in _]
```
and I've run into two main issues:
If I pass in a flat list of URLs, say 100k (via the get_all method), I get flooded with errors:

    2017-04-17 15:50:53,541 - asyncio - ERROR - Fatal error on SSL transport
    protocol: <asyncio.sslproto.SSLProtocol object at 0x10d5439b0>
    transport: <_SelectorSocketTransport closing fd=612 read=idle write=<idle, bufsize=0>>
    Traceback (most recent call last):
      File "/Users/vgoklani/anaconda3/lib/python3.6/asyncio/sslproto.py", line 639, in _process_write_backlog
        ssldata = self._sslpipe.shutdown(self._finalize)
      File "/Users/vgoklani/anaconda3/lib/python3.6/asyncio/sslproto.py", line 151, in shutdown
        raise RuntimeError('shutdown in progress')
    RuntimeError: shutdown in progress

I then batched the URLs in chunks of 1,000 and timed the response between batches. I was able to clearly measure that the performance degrades over time (see below). Moreover, the number of errors increased over time... What am I doing wrong?
iteration 0 done in 16.991s
iteration 1 done in 39.376s
iteration 2 done in 35.656s
iteration 3 done in 19.716s
iteration 4 done in 29.331s
iteration 5 done in 19.708s
iteration 6 done in 19.572s
iteration 7 done in 29.907s
iteration 8 done in 23.379s
iteration 9 done in 21.762s
iteration 10 done in 22.091s
iteration 11 done in 22.940s
iteration 12 done in 31.285s
iteration 13 done in 24.549s
iteration 14 done in 26.297s
iteration 15 done in 23.816s
iteration 16 done in 29.094s
iteration 17 done in 24.885s
iteration 18 done in 26.456s
iteration 19 done in 27.412s
iteration 20 done in 29.969s
iteration 21 done in 28.503s
iteration 22 done in 28.699s
iteration 23 done in 31.570s
iteration 26 done in 31.898s
iteration 27 done in 33.553s
iteration 28 done in 34.022s
iteration 29 done in 33.866s
iteration 30 done in 36.351s
iteration 31 done in 40.060s
iteration 32 done in 35.523s
iteration 33 done in 36.607s
iteration 34 done in 36.325s
iteration 35 done in 38.425s
iteration 36 done in 39.106s
iteration 37 done in 38.972s
iteration 38 done in 39.845s
iteration 39 done in 40.393s
iteration 40 done in 40.734s
iteration 41 done in 47.799s
iteration 42 done in 43.070s
iteration 43 done in 43.365s
iteration 44 done in 42.081s
iteration 45 done in 44.118s
iteration 46 done in 44.955s
iteration 47 done in 45.400s
iteration 48 done in 45.987s
iteration 49 done in 46.041s
iteration 50 done in 45.899s
iteration 51 done in 49.008s
iteration 52 done in 49.544s
iteration 53 done in 55.432s
iteration 54 done in 52.590s
iteration 55 done in 50.185s
iteration 56 done in 52.858s
iteration 57 done in 52.698s
iteration 58 done in 53.048s
iteration 59 done in 54.120s
iteration 60 done in 54.151s
iteration 61 done in 55.465s
iteration 62 done in 56.889s
iteration 63 done in 56.967s
iteration 64 done in 57.690s
iteration 65 done in 57.052s
iteration 66 done in 67.214s
iteration 67 done in 58.457s
iteration 68 done in 60.882s
iteration 69 done in 58.440s
iteration 70 done in 60.755s
iteration 71 done in 58.043s
iteration 72 done in 65.076s
iteration 73 done in 63.371s
iteration 74 done in 62.800s
iteration 75 done in 62.419s
iteration 76 done in 61.376s
iteration 77 done in 63.164s
iteration 78 done in 65.443s
iteration 79 done in 64.616s
iteration 80 done in 69.544s
iteration 81 done in 68.226s
iteration 82 done in 78.050s
iteration 83 done in 67.871s
iteration 84 done in 69.780s
iteration 85 done in 67.812s
iteration 86 done in 68.895s
iteration 87 done in 71.086s
iteration 88 done in 68.809s
iteration 89 done in 70.945s
iteration 90 done in 72.760s
iteration 91 done in 71.773s
iteration 92 done in 72.522s
The time here corresponds to the iteration time to process 1,000 URLs. Please advise. Thanks
I'm using '2.0.5'. How do I check the open TCP connections?
What version of Python do you use? To check connections, use the "netstat" command and take several measurements.
I'm using Python 3.6.0.
Am I making an obvious mistake in the code?
I do not see anything obvious. I just want to work out which part of aiohttp is the source of the problem.
Also, Python 3.6 leaks SSL sockets.
Is there anything specific I can do to help you test? I'm not clear on what you want me to do with netstat.
Does the number of open connections grow?
If on Linux, run this in a console while the process works through the first, second, middle and last chunks of URLs, with PID=pid_of_server, to watch how many connections that process has open; a rough sketch follows below.
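Not the exact command from the comment above, but as a hedged illustration of that kind of measurement: a small Python script using the third-party psutil package (the sample_connections helper and the 5-second interval are made up for the example) can report a process's open TCP connections at intervals.

```python
import collections
import sys
import time

import psutil  # third-party: pip install psutil


def sample_connections(pid, interval=5.0, samples=12):
    """Print the process's open TCP connection counts, grouped by state."""
    proc = psutil.Process(pid)
    for _ in range(samples):
        states = collections.Counter(c.status for c in proc.connections(kind="tcp"))
        print(time.strftime("%H:%M:%S"), "total:", sum(states.values()), dict(states))
        time.sleep(interval)


if __name__ == "__main__":
    # Usage: python sample_connections.py <pid_of_scraper>
    sample_connections(int(sys.argv[1]))
```

If the ESTABLISHED or TIME_WAIT counts keep climbing between samples, connections are not being reused or closed.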
I've just encountered this. 10k async requests using a TCPConnector with a limit of 100 initially start well, but the number of connections seen via netstat soon tails off (using an asyncio.Semaphore instead avoids this). Edit: seen on Ubuntu 16.04 with Python 3.5.2.
Just ran it again. FWIW, it also runs through requests about 4 times faster using asyncio.Semaphore locally.
Highly recommend trying with 3.5.2 to see if it's related to the SSL leak. Also, I use a pattern like this for doing multiple tasks in parallel: https://gist.github.com/thehesiod/7081ab165b9a0d4de2e07d321cc2391d. Spawning too many async tasks makes things very slow, as the loop has to walk through the giant list of tasks.
@thehesiod I have the same issue, running 3.5.2. I ran this snippet overnight; at the start it reported 0.05777549743652344 seconds taken. Today when I checked it, I was getting 1.3857007026672363 seconds taken.
@thehesiod running the testcase I posted against
Mind taking snapshots when fast and slow with pyvmmonitor?
For whatever reason pyvmmonitor won't play ball (X11 issues, missing Python modules, unmentioned missing shared object files). Happy to provide cProfile output if that's helpful, but if you need the pyvmmonitor output it looks like your best bet is running the testcase yourself, I'm afraid. Sorry I can't be of more help 🙁
Of course, as soon as I said that, I realised what was wrong. >_< The testcase now includes fast and slow variants (fast = semaphore, slow = limit param) and the pstat file from running each.
Nice, that should help us, thanks!
Looks like in the slow case, for some reason, ~60% of the time is spent in the connector's handling of its waiters.
@mal have you checked SSL vs non-SSL? It would help narrow down the problem.
If it's SSL only, you may want to try with python/cpython#2270.
Sorry for the delay, I was on vacation last week. I've not tried SSL vs non-SSL, but the testcase I was using is plain old HTTP, not HTTPS. Is there any reason to suspect that performance would be better with the additional overhead of HTTPS?
Ah OK. No, I just thought I'd check, since if it were HTTPS only it could be explained by existing issues.
Guys, could you test against the new aiohttp 2.2.0 release?
Just re-ran the testcase using 2.2.0.
Just re-ran the testcase against 2.2.0.
I'll try to find time, if nobody from @aio-libs/aiohttp-committers picks up the issue first.
It seems that @dionhaefner fixed it in DHI-GRAS/itsybitsy#1; maybe we can get some info here?
Not exactly fixed; I just used @mal's workaround of a Semaphore for rate-limiting rather than the connector's limit argument:

```python
sem = asyncio.Semaphore(max_connections)
connector = aiohttp.TCPConnector(limit=None)
session = aiohttp.ClientSession(connector=connector)

async def get_page(url):
    async with sem:
        async with session.get(url):
            ...
```

This performs well for me.
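For context, a minimal runnable sketch of how that workaround can be wired together end to end; the fetch_many wrapper, the default of 100 permits, and the returned tuples are illustrative choices, not from the comment above:

```python
import asyncio

import aiohttp


async def fetch_many(urls, max_connections=100):
    # Bound concurrency with a semaphore while leaving the connector unlimited,
    # as in the workaround above.
    sem = asyncio.Semaphore(max_connections)
    connector = aiohttp.TCPConnector(limit=None)
    async with aiohttp.ClientSession(connector=connector) as session:

        async def get_page(url):
            async with sem:
                async with session.get(url) as resp:
                    return url, resp.status, await resp.text()

        return await asyncio.gather(*(get_page(u) for u in urls),
                                    return_exceptions=True)


# loop = asyncio.get_event_loop()
# results = loop.run_until_complete(fetch_many(my_urls))
```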
@dteh it's hard to use connections to google.com as a testcase, given that its response times can vary widely; I suggest trying again with a local server.
@mal I'll let @asvetlov comment on the non-semaphore case, but my understanding is that adding a lot of tasks will make everything slow, as the event loop is going to walk through all the tasks on each blocking call. The semaphore / worker-pool approach (https://gist.github.com/thehesiod/7081ab165b9a0d4de2e07d321cc2391d) is the better strategy, so that only as many worker tasks are active as you are able to process.
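As an illustration of that worker-pool strategy (a minimal sketch, not the code from the gist; the worker/fetch_all names, the sentinel scheme, and the default of 100 workers are assumptions for the example):

```python
import asyncio

import aiohttp


async def worker(session, queue, results):
    # Each worker pulls URLs until it sees a sentinel, so at most
    # num_workers requests are ever in flight.
    while True:
        url = await queue.get()
        if url is None:          # sentinel: no more work
            return
        try:
            async with session.get(url) as resp:
                results.append((url, resp.status, await resp.text()))
        except Exception as err:
            results.append((url, None, err))


async def fetch_all(urls, num_workers=100):
    queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(url)
    for _ in range(num_workers):
        queue.put_nowait(None)   # one sentinel per worker
    results = []
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[worker(session, queue, results)
                               for _ in range(num_workers)])
    return results
```

Compared with creating 100k tasks up front, only num_workers tasks ever exist, so the event loop never has to schedule a huge task list.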
The implementation of the limit parameter in the connector seems to be where the time is going.
@thehesiod it's been a while since I looked at this, but it sounds like you're suggesting that having to use a semaphore or queue to get it performant is expected? If that is the case, I'd question what the limit parameter is actually for.
To me it seems that this issue is blocking enough to consider deploying a solution that works but might not be perfect. I know it has been there for a year now and there wasn't much complaint, but imagine people who want to run aiohttp in production; they can't because of an issue like this. @asvetlov I propose we double check and confirm this issue, then apply this manual semaphore mechanism directly to aiohttp on all maintained versions (as it seems to be the best way to solve it). We can still improve/replace it later if needed. What do you think?
As @pfreixes commented, if aiohttp has a bug, it should be fixed.
@mal the limit parameter is there to limit the number of concurrent connections, not in general to apply any type of pressure. Just as a note, I've used aiohttp to make millions of requests every day in a prod environment across multiple services for years and haven't run into this issue, by using things like worker pools and semaphores.
Maybe the issue is one of expectation? The crux of this seems to be "is this a bug, or is another method meant to be used to support this use case?" and no one really seems to be sure. I'd argue that right now it's at best unexpected behaviour that looks like a bug to the user, which results in a response ranging from mild frustration to simply not using the library.
@mal after further reading of your code I hadn't realized how similar you made the two codepaths: in both cases you have the same number of outstanding tasks; in one case they're waiting on a semaphore to unblock them, and in the other on the connector to hand them a free connection slot. So this is rather interesting: we know from the performance trace that the time is going into the connector's handling of those waiters. Apologies for making an assumption about your example. Thanks again for pushing this issue! I think I'll have some time to experiment a bit with this today, but hopefully @asvetlov can pursue this separately. I'm not a contributor but am an interested party :)
So I created a PR which greatly speeds up both cases. What it does is avoid iterating through the list of waiters in non-exceptional cases.
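To make the shape of that change concrete, here is a purely illustrative toy (not the aiohttp PR itself, and much simpler than aiohttp's connector): releasing a slot hands it directly to one pending waiter instead of walking the whole waiter list on every release.

```python
import asyncio
from collections import deque


class SlotLimiter:
    """Toy connection-slot limiter, used only to illustrate the waiter wake-up."""

    def __init__(self, limit):
        self._limit = limit
        self._in_use = 0
        self._waiters = deque()

    async def acquire(self):
        if self._in_use < self._limit:
            self._in_use += 1
            return
        fut = asyncio.get_event_loop().create_future()
        self._waiters.append(fut)
        await fut  # release() transfers the freed slot to us directly

    def release(self):
        # Wake at most one pending waiter (O(1) in the common case) rather
        # than iterating over every waiter on each release.
        while self._waiters:
            fut = self._waiters.popleft()
            if not fut.done():
                fut.set_result(None)  # slot ownership moves to the waiter
                return
        self._in_use -= 1
```

The real fix lives inside aiohttp's connector; this only sketches why skipping the full waiter scan matters when thousands of requests are queued.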
On my machine it went from ~13s to 3s in the slow case.
Anyone want to verify the results from my PR?
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.