Slow performance syncing with xdposchain v2.2.4 #162
Comments
@nikki-quant , have you checked connection from the container itself? |
Thanks for your response. Do you mean the connection to stats.xinfin.network? If so, yes, I get a response from both netcat and curl. Confirming that the stats server corresponds to the IP address in the logs:
Netcat uses TCP unless we pass the flag
Hitting that endpoint with curl, we receive a HTML document:
The node's security group has outbound access open on all ports but some restrictions on inbound connections. AWS security groups allow a response to an outgoing connection, and since TCP keeps the connection open rather than the server initiating a new connection, I would not expect this to be an issue with network security. |
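The netcat and curl checks described above can also be scripted, which makes it easier to run the same test on the host and inside the container. The following is a minimal sketch, not part of the XDC tooling: the function names `resolve_ipv4` and `can_connect` are hypothetical, and it only tests name resolution and the TCP handshake, not whatever protocol the stats client speaks afterwards.

```python
import socket


def resolve_ipv4(hostname: str) -> set:
    """Return the set of IPv4 addresses that `hostname` resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (ip, port).
    return {info[4][0] for info in infos}


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Attempt a TCP handshake to host:port.

    Returns True if the connection is established, False on refusal,
    timeout, unreachable network or a name-resolution failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `resolve_ipv4("stats.xinfin.network")` can be compared against the IP address shown in the node logs, and `can_connect("stats.xinfin.network", 3000)` run from inside the container (port 3000 is the one quoted elsewhere in this thread). A `True` result only shows that the handshake succeeds at that moment; it does not rule out intermittent resets.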
Yes, I'm running in docker. I receive the same response as on the underlying host.
I'm using a recent copy of this repo:
I have updated the docker-compose file to use the image |
Have you enabled a firewall? |
Looks like you need to allow connections between 172.18.0.2 and 45.82.64.150 |
172.18.0.2 is an address on your server's Docker subnet, and it needs access to 45.82.64.150:3000 |
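As a quick sanity check on the two addresses mentioned here, the standard-library `ipaddress` module can confirm which one is a private address and which one is public. This is only an illustrative sketch; the helper name `is_private` is hypothetical, and the note about Docker is a general observation rather than something verified on this server.

```python
import ipaddress

# RFC 1918 private block that contains 172.18.0.2. Docker bridge
# networks commonly allocate subnets from this range.
RFC1918_172 = ipaddress.ip_network("172.16.0.0/12")


def is_private(addr: str) -> bool:
    """Return True if `addr` is in a private (non-routable) range."""
    return ipaddress.ip_address(addr).is_private


container_ip = ipaddress.ip_address("172.18.0.2")
stats_ip = ipaddress.ip_address("45.82.64.150")

print(container_ip in RFC1918_172)   # container address is in 172.16.0.0/12
print(is_private("45.82.64.150"))    # stats server address is public
```

Because the container address is private, outbound traffic to 45.82.64.150:3000 is normally translated by the host before it leaves the machine, so any firewall or security group rule has to permit that outbound flow from the host's point of view.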
@pro100skm as the log shows, when I exec onto the Docker container, I can connect to Could you explain why you think there is a firewall issue currently? From what I can see it is successfully accessing that endpoint. |
Yeah, now I see it. Let's wait for a response from the core team devs. |
Hi, this seems very slow to me too. Could you try restarting to see if you have the same issue? |
the same after a while |
the same |
Restarting the process did not help, and I could not see any evidence that the node was resource constrained with top, lsof, vmstat or similar tools, but our node eventually began syncing somewhat quicker (1k blocks/minute) and caught up without us making changes. I'm still not sure whether this was due to a lack of peers on the newer version or some other factor. No changes to the underlying infrastructure that I made while debugging seemed to have an effect. This gives me some concern about continuing to maintain a node, since it's not clear to me how to effectively debug and resolve issues with the XDC software stack. |
Dear XinFin team,
We run an XDC Mainnet node in-house, and around 3 weeks ago we provisioned a new v2.2.4 node on an AWS EC2 r5a.xlarge instance. We had used the same instance type previously and found the performance good enough for our purposes, syncing 1.5 million blocks in 15 minutes.
On this occasion our new node is syncing very slowly, at an average of 300 blocks per minute. It has ~17 peers, and does not seem CPU, disk or memory constrained according to top, vmstat or CloudWatch metrics.
In terms of application logs, the only errors I see are related to the stats endpoint, which seems to be incorrect:
I can reach the stats server with netcat or curl, so I'm unsure why the application is getting TCP errors:
Does the node name passed as $INSTANCE_NAME need to be unique?
Are there any configuration options we can change in the node settings, or recommendations that you would make for system configuration to increase performance?
Is there any documentation on troubleshooting this kind of issue out there we should take a look at?
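To put the figures quoted in this issue side by side, the short calculation below compares the earlier sync rate (1.5 million blocks in 15 minutes), the slow rate (300 blocks per minute) and the later rate mentioned in the comments (1k blocks per minute). It is plain arithmetic on the numbers from the thread; the function names are hypothetical and nothing here queries a node.

```python
def blocks_per_minute(blocks: int, minutes: float) -> float:
    """Average sync rate over a period."""
    return blocks / minutes


def minutes_to_sync(blocks: int, rate_per_minute: float) -> float:
    """Time needed to sync `blocks` at a constant rate."""
    return blocks / rate_per_minute


REFERENCE_BLOCKS = 1_500_000                                   # from the issue text
previous_rate = blocks_per_minute(REFERENCE_BLOCKS, 15)        # earlier node
slow_rate = 300.0                                              # new v2.2.4 node
recovered_rate = 1_000.0                                       # rate reported later

slowdown = previous_rate / slow_rate

print(f"previous rate: {previous_rate:.0f} blocks/min")
print(f"slowdown factor: {slowdown:.0f}x")
print(f"hours for 1.5M blocks at slow rate: "
      f"{minutes_to_sync(REFERENCE_BLOCKS, slow_rate) / 60:.1f}")
print(f"hours for 1.5M blocks at recovered rate: "
      f"{minutes_to_sync(REFERENCE_BLOCKS, recovered_rate) / 60:.1f}")
```

On these numbers the earlier node averaged 100,000 blocks per minute, so the slow node is roughly 333 times slower: the same 1.5 million blocks would take about 83 hours at 300 blocks per minute, or 25 hours at 1k blocks per minute.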