Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instability, restarts and errors in logs #34

Open
Simon-L opened this issue Jun 25, 2018 · 4 comments
Open

Instability, restarts and errors in logs #34

Simon-L opened this issue Jun 25, 2018 · 4 comments

Comments

@Simon-L
Copy link

Simon-L commented Jun 25, 2018

I've tried to track this and document as best as possible.

As of now, cloning this repo and running the docker-compose commands on a fresh DigitalOcean server results in unwanted behaviour.

This is the log from the first start, the error shown here appears a few minutes after it has started.

npm info it worked if it ends with ok
npm info using npm@5.6.0
npm info using node@v8.10.0
npm info lifecycle easy-ssb-pub@4.0.0~prestart: easy-ssb-pub@4.0.0
npm info lifecycle easy-ssb-pub@4.0.0~start: easy-ssb-pub@4.0.0

easy-ssb-pub@4.0.0 start /usr/src/app
tsc node dist/index --host $HOST

server error, from net:::ffff:198.1...414~shs:
Error: expected hello
    at abort (/usr/src/app/node_modules/secret-handshake/protocol.js:80:45)
    at Object.cb (/usr/src/app/node_modules/secret-handshake/protocol.js:90:26)
    at drain (/usr/src/app/node_modules/pull-reader/index.js:42:23)
    at /usr/src/app/node_modules/pull-reader/index.js:59:18
    at /usr/src/app/node_modules/pull-reader/index.js:20:7
    at /usr/src/app/node_modules/pull-stream/throughs/through.js:19:9
    at drain (/usr/src/app/node_modules/stream-to-pull-stream/index.js:141:18)
    at Socket.canonymous\ (/usr/src/app/node_modules/stream-to-pull-stream/index.js:158:5)
    at emitNone (events.js:111:20)
    at Socket.emit (events.js:208:7)

Various other errors are reported:

Error replicating with @o2ddy...=.ed25519:
   Error: parent stream is closing
    at PacketStream.stream (/usr/src/app/node_modules/packet-stream/index.js:46:28)
    at Object.ws.remoteCall (/usr/src/app/node_modules/muxrpc/stream.js:114:17)
    at /usr/src/app/node_modules/muxrpc/index.js:45:15
    at remoteCall (/usr/src/app/node_modules/muxrpc/api.js:26:19)
    at EventEmitter.obj.(anonymous function) [as createHistoryStream] (/usr/src/app/node_modules/muxrpc/api.js:40:20)
    at replicate (/usr/src/app/node_modules/scuttlebot/plugins/replicate/legacy.js:247:13)
    at EventEmitter.fallback (/usr/src/app/node_modules/scuttlebot/plugins/replicate/legacy.js:307:7)
    at Object.onceWrapper (events.js:315:30)
    at emitOne (events.js:116:13)
    at EventEmitter.emit (events.js:211:7)

Or (with timestamps):

2018-06-25T00:56:08.922909057Z Error replicating with @uMff...=.ed25519:
2018-06-25T00:56:08.923025645Z    Error: parent stream is closing
2018-06-25T00:56:08.933272975Z     at PacketStream.stream (/usr/src/app/node_modules/packet-stream/index.js:46:28)
2018-06-25T00:56:08.933313523Z     at Object.ws.remoteCall (/usr/src/app/node_modules/muxrpc/stream.js:114:17)
2018-06-25T00:56:08.933322540Z     at /usr/src/app/node_modules/muxrpc/index.js:45:15
2018-06-25T00:56:08.933328284Z     at remoteCall (/usr/src/app/node_modules/muxrpc/api.js:26:19)
2018-06-25T00:56:08.933334166Z     at EventEmitter.obj.(anonymous function) [as createHistoryStream] (/usr/src/app/node_modules/muxrpc/api.js:40:20)
2018-06-25T00:56:08.933339838Z     at replicate (/usr/src/app/node_modules/scuttlebot/plugins/replicate/legacy.js:247:13)
2018-06-25T00:56:08.933345290Z     at EventEmitter.fallback (/usr/src/app/node_modules/scuttlebot/plugins/replicate/legacy.js:307:7)
2018-06-25T00:56:08.933351243Z     at Object.onceWrapper (events.js:315:30)
2018-06-25T00:56:08.933356772Z     at emitOne (events.js:116:13)
2018-06-25T00:56:08.933361978Z     at EventEmitter.emit (events.js:211:7)

These logs were obtained using docker logs, the first time I ran it after running the docker-compose command for the first time the command stopped on these lines:

Killed
npm info lifecycle easy-ssb-pub@4.0.0~start: Failed to exec start script
npm ERR! code ELIFECYCLE
npm ERR! errno 137
npm ERR! easy-ssb-pub@4.0.0 start: `tsc  node dist/index --host $HOST`
npm ERR! Exit status 137
npm ERR! 
npm ERR! Failed at the easy-ssb-pub@4.0.0 start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2018-06-24T23_46_17_506Z-debug.log
npm info it worked if it ends with ok
npm info using npm@5.6.0
npm info using node@v8.10.0
npm info lifecycle easy-ssb-pub@4.0.0~prestart: easy-ssb-pub@4.0.0
npm info lifecycle easy-ssb-pub@4.0.0~start: easy-ssb-pub@4.0.0

Looks like it's killed either by the docker daemon or by npm inside the container, it will then just reboot.

I've made sure to run these tests on a fresh install. I have personnally run the same docker image on my server and, added to those reported here, had several other issues probably related to #27.

I have taken note of the maintenance and status warning in the README, for the time being I think this repo works for many people and figured I should share my investigation!

@Simon-L
Copy link
Author

Simon-L commented Jun 26, 2018

I have made some more experiments with this package.

Indeed it seems the software is crashing and the container restarts repeatedly, sometimes after running for 3 minutes, sometimes after 20 minutes.

I am also noticing it doesn't always crash after the errors reported in the previous message, it might happen seemingly randomly.
The first line is always:

Killed

which makes me think it is hanging at some point and also puts me in an uncomfortable situation where I have absolutely no idea where to look in the code!

I'm using docker-compose but I can confirm these instability issues also happen on bare metal Debian.

@deSagaz
Copy link

deSagaz commented Jun 30, 2018

I'm experiencing similar issues at ssb.learningsocieties.org, which is also running at DigitalOcean on a standalone droplet.

Every once in a while I check whether the server is still up in practice, with varying results. The statistics paint a good picture:

screenshot_2018-06-30 digitalocean - ssb

Sometimes the server restarts and resumes correct behaviour, but sometimes the bandwidth drops to 0 and CPU/memory usage spike for extended periods of time. The logged errors are similar.

I don't really have time to troubleshoot it properly, sadly... For the time being I'll attempt to keep the pub up and running as best I can.

@deSagaz
Copy link

deSagaz commented Jul 1, 2018

I've moved over to dinosaur's image last night with a separate webserver docker image. I also upgraded to 2GB RAM, just to be sure. The statistics look a lot healthier.

screenshot_2018-07-01 digitalocean - ssb-pub

Not sure why that is exactly; perhaps the healer container makes the difference?

For now I've generated an invite that allows 1000 additional people to join, which should practically do the same as the previous auto-generation. Plus, once the server gets that size it'd be time to lock down anyhow...

@staltz
Copy link
Owner

staltz commented Jul 1, 2018

I've had problems with dinosaur's image as well as easy-ssb-pub's image, but I noticed it's probably my 3-hop social graph is huge for the server. Configuring hops as 1 helped (with dinosaur's image), the server is pretty stable now. I could try hops=2 as well.

Just 2 cents if anyone is looking to improve stability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants