Websocket connection is lost on some websites (ConnectionClosed) #62
Update: Still struggling to trace this, but it looks like I can greatly reduce the amount of errors (like, to about 0 if not 0, will confirm in a few hours) either if:
I suspect there is a race condition between some callback that happens on goto timeout and [edit] The "slowness fixes the bug" hypothesis is erroneous. There is something related to the websites being opened that causes this. The reason it seems so "random" is that I use a message queue (RabbitMQ) that reschedules any cancelled task at the same spot it was before. So for a long time: no buggy site, no error. If one buggy site comes to the head of the AMQP queue, then each time it kills a worker, another worker picks it up and dies too. Loop that. Funny, right?
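The requeue loop described above can be simulated in plain Python, without RabbitMQ. This is a sketch under stated assumptions: the URLs and the "buggy site kills the worker" exception are hypothetical stand-ins, not anything from pyppeteer or AMQP:

```python
from collections import deque

# Hypothetical stand-in for a site that crashes whichever worker processes it.
BUGGY = {"http://buggy.example"}

def worker_run(url):
    if url in BUGGY:
        raise RuntimeError("worker killed by buggy site")

def process(queue, max_deliveries=3):
    """Simulate AMQP redelivery: a failed task is requeued at the same
    spot (the head), so the same buggy URL kills worker after worker."""
    deliveries = []
    while queue:
        url = queue[0]
        deliveries.append(url)
        if deliveries.count(url) > max_deliveries:
            break  # a real broker would need a dead-letter policy here
        try:
            worker_run(url)
        except RuntimeError:
            continue  # task stays at the head of the queue: the loop
        queue.popleft()
    return deliveries

q = deque(["http://ok.example", "http://buggy.example", "http://other.example"])
history = process(q)
```

The `max_deliveries` cutoff is only there to terminate the demo; without it (as in the setup described above), the buggy URL is redelivered forever and everything behind it starves.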
Great, thanks for the detailed information and suggestion.
@miyakogi Thanks for the reply. I can confirm for sure that it is not Docker related; I just spent some time installing my code on a machine without containers, and the error happens exactly the same way. I'm trying to create a minimal code file for reproduction (even though it happens at random, it's 100% reproducible if one waits long enough, which can be 10 seconds sometimes). The code to reproduce is not much more than the hello world of pyppeteer, btw.
Related websocket log:
It looks like the "send" with the OP_CLOSE opcode is actually coming from pyppeteer (but again, I'm not sure I understand everything here).
@miyakogi Unlike everything I believed until now, it seems the error happens because of something on the remote site. I can't understand why, but I can reproduce it 100% (whatever the platform) using this website (probably built in the 90s):

```python
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    try:
        await page.goto('http://www.kunstenknipwerk.com/', timeout=10000)
    finally:
        await page.close()

asyncio.get_event_loop().run_until_complete(main())
```

The randomness of the error for me then comes from the fact that I use a queue that never contains the same thing.
Note that the equivalent puppeteer code on the same website works as expected and does not close the websocket connection:

```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  let page = await browser.newPage();
  try {
    await page.goto('http://www.kunstenknipwerk.com/', {timeout: 10000});
  } catch (e) {
    console.error(e);
  } finally {
    await page.close();
  }
  console.log('done');
})();
```
Here are some other examples of urls that produce this bug:
One challenge is that not all URLs reproduce the bug from all origins. I suspect the websites are slightly different depending on the origin (think of an ad server serving an ad that changes on each request...), and although the URLs I pasted here seem to reproduce the bug 100% on my machines, some URLs behave differently and only bug on one machine or a subset of machines...
Thank you for the minimal reproducible code.
After merging your PR (#64), this problem does not reproduce on my machine. |
@miyakogi That's quite unexpected (it should be completely unrelated), but on my local box it looks like it works for the URLs I listed here. Unfortunately, I upgraded a few of my instances to use the dev version of pyppeteer, and I still see the error happening on other pages. I'm going to put in place a simple logging infrastructure for the dubious URLs, which is a bit tricky as I don't know which tab is the culprit when it happens. But my "stupid" strategy (a.k.a. "post somewhere all URLs that are currently processing when the error happens, and try them one by one in a single-tab spider") should help me find URLs that still have this problem.
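The "post all URLs currently processing when the error happens" strategy could be sketched with a simple in-flight tracker. This is a hypothetical helper, not pyppeteer API; the class and URL names are made up for illustration:

```python
class InFlightTracker:
    """Track which URLs are being processed across tabs, so that when the
    shared websocket connection dies we can dump the current suspects."""

    def __init__(self):
        self._urls = set()

    def start(self, url):
        # Call just before page.goto(url) in a real spider.
        self._urls.add(url)

    def done(self, url):
        # Call when the tab finishes (successfully or not).
        self._urls.discard(url)

    def suspects(self):
        # Any of these could be the tab that killed the connection;
        # re-crawl them one by one in a single-tab spider to narrow it down.
        return sorted(self._urls)

tracker = InFlightTracker()
tracker.start("http://a.example")
tracker.start("http://b.example")
tracker.done("http://a.example")
```

On a `ConnectionClosed` error, logging `tracker.suspects()` somewhere durable gives the candidate list to retry in isolation.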
Oh, and thanks for looking into it. I understand it may take some time; I already spent around a full week struggling with this, and it's really hard to trace (but you surely know the internals of pyppeteer better, so it may be easier for you).
So, an update here. Indeed, it looks like the patches "mostly" fix the problem. I still have it on some websites (though not 100% reproducible; the behavior looks a bit different on my local network than on the spiders' networks). The thing is, if an error happening in some handler causes the browser websocket connection to crash, then there is probably something we can do in pyppeteer to catch all errors and at least log the original one. It's basically undebuggable if it just says "oops, connection closed (1006)", but it's a matter of seconds to fix if it says "cannot do X on NoneType". Do you have a good idea of where this code could go? I can work on a patch, but I'm not confident enough with the pyppeteer codebase yet to think of the best place to catch everything. Thanks.
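One generic place for such error logging is a wrapper applied to every handler before it is dispatched, so the real traceback gets logged instead of surfacing only as a closed connection later. This is a sketch of the pattern, not actual pyppeteer code; `on_message` and its payload are hypothetical:

```python
import logging

logger = logging.getLogger("handlers")

def safe_callback(fn):
    """Wrap a handler so any exception is logged with its real traceback
    instead of silently tearing down the dispatch loop."""
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            logger.exception("handler %s failed", fn.__name__)
            return None  # swallow, so other handlers keep running
    return wrapper

@safe_callback
def on_message(msg):
    # Hypothetical handler: raises TypeError if msg is None,
    # which is the kind of bug that should show up in the logs.
    return msg["method"]

result = on_message(None)
```

The same idea applies to coroutine callbacks: wrap the awaited body in try/except and log before (or instead of) letting the exception propagate into the connection machinery.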
I have the same problem. Has this been fixed yet?
As far as I know, it is not fixed. I still have the problem from time to time, even if using the develop branch makes it happen way less often. @miyakogi, any ideas in this regard?
I got this problem too... |
Same problem |
I get the exception at random times when I use multiprocessing and open many pages.
I found some hints. It may not be a websockets problem.
I have an interesting setup. When I run without pipenv, I get no problems. When I run with pipenv, I get connection closed within a few seconds of opening target website. Any way I can assist in finding the culprit? |
It appears pipenv has websockets 7.0 while the normal userspace site-packages has websockets 6.0; it may be because of this difference.
Yeah, downgraded to websockets 6.0 and it works fine now. websockets 7.0 consistently disconnects after some time on all websites.
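For anyone hitting this with websockets 7.0, the workaround described above is to pin the dependency. For example, in a requirements.txt (a Pipfile pin works the same way):

```
# Workaround for miyakogi/pyppeteer#62: websockets 7.0 drops the
# connection to Chromium; pin to 6.0 until pyppeteer handles it.
websockets==6.0
```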
@RunningToTheEdgeOfTheWorld I agree with you. I found that Chrome navigating some unstable URLs takes a lot of time, and the exception appears frequently. Maybe pyppeteer should keep the connection alive by sending ping messages.
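A keep-alive along those lines could be a background task that pings periodically while a slow navigation is in flight. This is a sketch, not pyppeteer internals: it assumes a connection object exposing an async `ping()` method (as python-websockets protocol objects do), and the demo uses a fake connection so it runs without a browser:

```python
import asyncio

async def keepalive(ws, interval, stop):
    """Ping periodically so an idle connection (e.g. during a slow
    page.goto) is not closed for inactivity."""
    while not stop.is_set():
        await ws.ping()
        await asyncio.sleep(interval)

# Minimal demo with a fake connection object:
class FakeWS:
    def __init__(self):
        self.pings = 0

    async def ping(self):
        self.pings += 1

async def demo():
    ws = FakeWS()
    stop = asyncio.Event()
    task = asyncio.ensure_future(keepalive(ws, 0.01, stop))
    await asyncio.sleep(0.05)  # pretend a slow goto is running
    stop.set()
    await task  # the loop exits cleanly once stop is set
    return ws.pings

pings = asyncio.run(demo())
```

In a real spider, the keep-alive task would start when the connection is established and be stopped (via the event) when the connection closes.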
@nurettin Dumb question, how do you do this? |
Duh... I didn't realize websockets was a pip package.
Don't mean to hijack this thread, but I wasn't sure if this was related...
- Need to wait for the page to fully load in order to get the component URL
- Pyppeteer has a problem with the latest websockets==7.0, see miyakogi/pyppeteer#62 (comment)
- Pinned websockets to 6.0 to make it work
Bumping this. It needs to be resolved in master; the current workaround is to downgrade websockets.
- Applied patch from miyakogi/pyppeteer#160 in order to fix miyakogi/pyppeteer#62, which made the websocket connection to Chromium close after ~20s
- Reworked logic to use a producer/consumer pattern
This issue still seems to be a problem on some websites, even after applying the proposed patches for the ...
Has this problem been fixed? I have the same problem, and I use uvicorn, which needs websockets>=8.0...
Still having problems with the newest websockets package. Does somebody have an idea of what goes wrong? It would be nice to open an issue on the websockets package.
No idea what's wrong; I think it has something to do with the ...
I have a script that basically opens a tab, plays with a website, closes the tab. Loop.

On my laptop (OSX), it works just fine. Now when I run it in a container (either with /dev/shm disabled, or with --shm-size=2g), it does work, but at some point the websocket connection is lost and an exception happens in pyppeteer.connection.Connection, in _async_send() (websockets.ConnectionClosed, I think).

[EDIT 04/12] This is not container related; I can have the same error happen on my local laptop. For some reason it happens less there, but it still happens.

Error:

I tried to enable asyncio's debug info and look at the logs, and it seems that at some point, for no reason I can think of, there is indeed an OP_CLOSE frame that goes through the write_frame() coroutine in the ws protocol. I'm a bit lacking in asyncio knowledge and can't really find a way to trace it back to the reason for this close. Then the next _async_send() complains, with an uncaught error.

I tried to modify the above code to catch the connection error around the send() call and reconnect blindly (OK, that was a bit naive, but who knows), but that did not do the trick. Probably because I do it at a random time, without knowing any state.

Note that the browser process does not die, and I can reproduce it more systematically if I wrap the browser.goto call in asyncio.wait_for with a small timeout (small enough that the goto can't actually complete). I guess the latter is a bad idea, and I stopped trying it, but it still happens at completely random times even without it. Low or high goto timeouts, 50 tabs or only 1, 10 seconds after running the script or 30 minutes after, etc., and never on the same URL.

I don't know what kind of information I can provide to help, or what kind of tools I can use to debug and fix this. Asyncio's "hey, here is your empty stack trace, have fun" way of helping me is giving me a hard time.
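For reference, the "catch the error around send() and reconnect blindly" attempt looks roughly like the sketch below. Everything here is hypothetical (the class, the fake connection, the `connect` factory); as noted above, it cannot actually work for pyppeteer, because a fresh connection loses the protocol state (pending callbacks, session ids):

```python
class ReconnectingSender:
    """Retry a send once after re-establishing the connection.
    Naive on purpose: this sketches the attempted workaround,
    not a real fix, since the new connection has no session state."""

    def __init__(self, connect_fn):
        self._connect = connect_fn
        self._conn = connect_fn()

    def send(self, msg):
        try:
            return self._conn.send(msg)
        except ConnectionError:
            self._conn = self._connect()  # blind reconnect
            return self._conn.send(msg)

# Demo with a fake connection that dies after 2 sends:
class FlakyConn:
    def __init__(self):
        self.sent = []

    def send(self, msg):
        if len(self.sent) >= 2:
            raise ConnectionError("websocket closed (1006)")
        self.sent.append(msg)
        return "ok"

conns = []
def connect():
    conn = FlakyConn()
    conns.append(conn)
    return conn

sender = ReconnectingSender(connect)
results = [sender.send(i) for i in range(3)]
```

The retry hides the error from the caller, which is exactly why it fails in practice: messages sent on the new connection reference state the browser side no longer associates with it.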