This repository has been archived by the owner on Dec 11, 2019. It is now read-only.

browser unresponsive when write: broken pipe errors occur while pulling the latest state #14882

Closed
kjozwiak opened this issue Jul 31, 2018 · 8 comments

Comments

@kjozwiak
Member

Description

While I was pulling the latest state using a new profile, Brave suddenly became unresponsive after I tried to switch to a different preferences page. I clicked around a few times and noticed that the entire browser had become unresponsive. I could still open a tab, but it was blank and nothing loaded. I took a look at the terminal and noticed the following:

unresponsive
unresponsive
unresponsive
unresponsive

After a few seconds, the terminal was spammed with A LOT of the following errors:

ERROR[07-31|15:12:27.976] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.977] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.978] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.978] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.979] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.980] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.980] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:27.997] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:28.002] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:28.003] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe
ERROR[07-31|15:12:28.007] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe

Once the errors finished spamming the terminal, Brave became responsive/usable again and started pulling the latest blocks:

INFO [07-31|15:12:29.272] Imported new block headers count=2048 elapsed=2.526s    number=3575999 hash=983476…203135 ignored=0
INFO [07-31|15:12:31.974] Imported new block headers count=2048 elapsed=2.658s    number=3578047 hash=e02b8a…0800ca ignored=0
INFO [07-31|15:12:34.430] Imported new block headers count=2048 elapsed=2.416s    number=3580095 hash=52e428…97cfee ignored=0
INFO [07-31|15:12:36.811] Imported new block headers count=2048 elapsed=2.337s    number=3582143 hash=b6e631…79fe1a ignored=0

Steps to Reproduce

  1. using the ETH Wallet PR, start Brave using a new profile
  2. enable ETH Wallet using about:preferences#ethwallet and restart the browser

Sometimes when you're pulling the latest state, you'll run into this issue.

Actual result:

The entire browser becomes unresponsive while errors such as ERROR[07-31|15:12:28.007] write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe are being spammed in the terminal due to issues with geth.

Expected result:

Even if geth fails to connect or pull the latest state, it should never cause the entire browser to become completely unresponsive.

Reproduces how often:

I've seen it happen a few times, but the reproduction rate isn't high. However, as mentioned below under the Additional Information section, @LaurenWags also ran into the same issue.

Brave Version

about:brave info:

Brave: 0.25.0 
V8: 6.7.288.46 
rev: b85dfa16ae78413d47b0ef76fd2e5971b2a5f44b 
Muon: 7.1.6 
OS Release: 17.7.0 
Update Channel:  
OS Architecture: x64 
OS Platform: macOS 
Node.js: 7.9.0 
Brave Sync: v1.4.2 
libchromiumcontent: 67.0.3396.103

Reproducible on current live release:

Currently not reproducible on the live release as this feature hasn't been released.

Additional Information

@LaurenWags ran into this as well when we were debugging her connectivity issues. Once Brave connected to the nodes and started pulling the latest state, the browser became unresponsive for about 10s before the terminal was spammed with several write tcp 127.0.0.1:8546->127.0.0.1:52489: write: broken pipe errors.

@LaurenWags
Member

LaurenWags commented Aug 1, 2018

Currently running commit 50840f4. My browser started to show unresponsive in the terminal, but recovered (no mention of the broken pipe). During this time, all open tabs went white and any new tab I opened was also white. Brave did recover after about a minute, during which time the geth process's CPU usage skyrocketed.

2-3 minutes later, browser went unresponsive again (still no mention of broken pipes), and all tabs are white again. Brave has not recovered from this state (been about 20 minutes).

When I quit Brave, I got a ton of messages about the broken pipe - same as in the description.

@Slava
Contributor

Slava commented Aug 2, 2018

I suspect the browser might become unresponsive because the Eth-Wallet app is equally chatty on errors when this happens. I haven't been able to repro this state myself, though.

@Slava
Contributor

Slava commented Aug 2, 2018

@flamsmark the current theory is that geth is starving the OS of some resource (maybe file descriptors?). I looked it up: geth does not seem to have any option to limit the number of connections, but we can limit the number of peers to something smaller.
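
For illustration, a minimal sketch of capping the peer count at launch. geth does accept a --maxpeers option, but everything else here is an assumption: gethCommand is a hypothetical helper, the value passed to it is arbitrary, and the actual launcher in Brave is JavaScript rather than Go.

```go
package main

import (
	"os/exec"
	"strconv"
)

// gethCommand is a hypothetical helper that builds (but does not start)
// a geth invocation with a reduced peer limit. Fewer peers means fewer
// sockets, which is the resource the current theory suspects is exhausted.
func gethCommand(maxPeers int) *exec.Cmd {
	return exec.Command("geth", "--maxpeers", strconv.Itoa(maxPeers))
}
```

Calling gethCommand(8) yields a command equivalent to running geth --maxpeers 8. This only lowers the load on the system; it does not address how the error itself is handled.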

Slava added a commit to Slava/browser-laptop that referenced this issue Aug 2, 2018
@tildelowengrimm tildelowengrimm added the priority/P2 Crashes. Loss of data. Severe memory leak. label Aug 2, 2018
@tildelowengrimm tildelowengrimm assigned Slava and mrose17 and unassigned Slava Aug 2, 2018
@Slava Slava assigned Slava and ryanml Aug 6, 2018
ryanml pushed a commit to Slava/browser-laptop that referenced this issue Aug 7, 2018
ryanml pushed a commit to Slava/browser-laptop that referenced this issue Aug 11, 2018
ryanml pushed a commit to Slava/browser-laptop that referenced this issue Aug 13, 2018
ryanml pushed a commit to Slava/browser-laptop that referenced this issue Aug 13, 2018
@BrendanEich
Member

Reducing peers may be good in general, even helpful in ameliorating this bug, but an EPIPE from write(2) means the write-caller should stop looping, close the fd for the write end of the pipe, and arrange for higher-level recovery. Sounds like that is not happening. In what code does the write that's logged as getting EPIPE live? Can someone debug and get a stack and source coordinates?

@GeetaSarvadnya
Collaborator

Verified on Windows x64 with
• 0.23.100 5e197a1
• Muon 8.0.8
• libchromiumcontent 68.0.3440.84

@tildelowengrimm

@mrose17 We've probably reduced the probability/severity of this issue, but I'd prefer to actually solve it at the root cause.

@evq
Member

evq commented Aug 14, 2018

fwiw I see unresponsive and the browser freezing up even when geth is not running:

GETH: spawned
Failed to configure static nodes peers ENOENT: no such file or directory, open '/home/user/.config/brave-development/ethereum/ropsten/geth/static-nodes.json'
GETH exit: Code: 2 | Signal: null
GETH close: Code: 2 | Signal: null

a few moments later...

unresponsive

I checked that geth had indeed exited

@mrose17
Member

mrose17 commented Aug 15, 2018

I believe that #15029 should resolve most, if not all, of these cases.
