You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have 3 etcd nodes, and clients operate etcd through http.
The operations of the client are only http requests, and the request types include GET, PUT, and long polling for Watch scenarios. Actually, the client is APISIX, and APISIX internally uses lua to operate etcd through HTTP protocol.
However, when there are many connections on the etcd node, operating the etcd node through http will be particularly slow.
Show the number of connections on 3 nodes respectively:
Then, on each etcd node, query the same key by curl:
1、Request the etcd server of 10-132-15-158 node, which has the most 263 connections, the result is that the http request takes the longest (7 seconds).
[root@knode10-132-15-158 operation]# time curl -X POST -d '{"key": "L2FwaXNpeC91cHN0cmVhbXMvbXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MA=="}' https://10.132.15.158:2379/v3/kv/range --cacert /etc/etcd/etcdSSL/ca.pem --cert /etc/etcd/etcdSSL/clientssl/apisix/apisix.pem --key /etc/etcd/etcdSSL/clientssl/apisix/apisix-key.pem --key-type PEM --cert-type PEM -k
{"header":{"cluster_id":"5357725075044320405","member_id":"16420066870069275712","revision":"2695209","raft_term":"7"},"kvs":[{"key":"L2FwaXNpeC91cHN0cmVhbXMvbXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MA==","create_revision":"2318448","mod_revision":"2694680","version":"142","value":"eyJ0eXBlIjoicm91bmRyb2JpbiIsImxhYmVscyI6eyJlbnYtdHlwZSI6InByb2R1Y3QiLCJhcHBjb2RlIjoiYmFpem95b2F1dGgyYWRhcHRlci1kYXRhIiwicGxhdGZvcm0iOiJhb3MiLCJ2ZXJzaW9uIjoiMzk0MTYyIn0sImlkIjoibXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MCIsInVwZGF0ZV90aW1lIjoxNjU2NTgyMDUyLCJjcmVhdGVfdGltZSI6MTY1NjMyNjQyOCwibmFtZSI6Im1zLWFvcy1iYWl6b3lvYXV0aDJhZGFwdGVyLWRhdGEtcHJvZHVjdC1ncC0zOTQxNjItODAiLCJoYXNoX29uIjoidmFycyIsInBhc3NfaG9zdCI6InBhc3MiLCJkZXNjIjoidXBzdHJlYW0gZm9yIGFwcGNvZGU6IGJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YSBhbmQgdmVyc2lvbjogMzk0MTYyIiwibm9kZXMiOnsiMTAuMTMxLjg2LjEwMDo4MCI6MTAsIjEwLjEzMS44Ny4yMjg6ODAiOjEwfSwic2NoZW1lIjoiaHR0cCJ9"}],"count":"1"}
real 0m7.132s
user 0m0.061s
sys 0m0.063s
2、Request the etcd server of 10-132-15-159 node, which has the 9 connections, the result is that the http request takes the longest (0.097 seconds).
[root@knode10-132-15-159 operation]# time curl -X POST -d '{"key": "L2FwaXNpeC91cHN0cmVhbXMvbXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MA=="}' https://10.132.15.159:2379/v3/kv/range --cacert /etc/etcd/etcdSSL/ca.pem --cert /etc/etcd/etcdSSL/clientssl/apisix/apisix.pem --key /etc/etcd/etcdSSL/clientssl/apisix/apisix-key.pem --key-type PEM --cert-type PEM
{"header":{"cluster_id":"5357725075044320405","member_id":"1021713008895288175","revision":"2695484","raft_term":"7"},"kvs":[{"key":"L2FwaXNpeC91cHN0cmVhbXMvbXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MA==","create_revision":"2318448","mod_revision":"2694680","version":"142","value":"eyJ0eXBlIjoicm91bmRyb2JpbiIsImxhYmVscyI6eyJlbnYtdHlwZSI6InByb2R1Y3QiLCJhcHBjb2RlIjoiYmFpem95b2F1dGgyYWRhcHRlci1kYXRhIiwicGxhdGZvcm0iOiJhb3MiLCJ2ZXJzaW9uIjoiMzk0MTYyIn0sImlkIjoibXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MCIsInVwZGF0ZV90aW1lIjoxNjU2NTgyMDUyLCJjcmVhdGVfdGltZSI6MTY1NjMyNjQyOCwibmFtZSI6Im1zLWFvcy1iYWl6b3lvYXV0aDJhZGFwdGVyLWRhdGEtcHJvZHVjdC1ncC0zOTQxNjItODAiLCJoYXNoX29uIjoidmFycyIsInBhc3NfaG9zdCI6InBhc3MiLCJkZXNjIjoidXBzdHJlYW0gZm9yIGFwcGNvZGU6IGJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YSBhbmQgdmVyc2lvbjogMzk0MTYyIiwibm9kZXMiOnsiMTAuMTMxLjg2LjEwMDo4MCI6MTAsIjEwLjEzMS44Ny4yMjg6ODAiOjEwfSwic2NoZW1lIjoiaHR0cCJ9"}],"count":"1"}
real 0m0.097s
user 0m0.053s
sys 0m0.037s
3、Request the etcd server of 10-132-15-160 node, which has the 90 connections, the result is that the http request takes the longest (0.092 seconds).
[root@knode10-132-15-160 operation]# time curl -X POST -d '{"key": "L2FwaXNpeC91cHN0cmVhbXMvbXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MA=="}' https://10.132.15.160:2379/v3/kv/range --cacert /etc/etcd/etcdSSL/ca.pem --cert /etc/etcd/etcdSSL/clientssl/apisix/apisix.pem --key /etc/etcd/etcdSSL/clientssl/apisix/apisix-key.pem --key-type PEM --cert-type PEM
{"header":{"cluster_id":"5357725075044320405","member_id":"2577531103840814189","revision":"2695532","raft_term":"7"},"kvs":[{"key":"L2FwaXNpeC91cHN0cmVhbXMvbXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MA==","create_revision":"2318448","mod_revision":"2694680","version":"142","value":"eyJ0eXBlIjoicm91bmRyb2JpbiIsImxhYmVscyI6eyJlbnYtdHlwZSI6InByb2R1Y3QiLCJhcHBjb2RlIjoiYmFpem95b2F1dGgyYWRhcHRlci1kYXRhIiwicGxhdGZvcm0iOiJhb3MiLCJ2ZXJzaW9uIjoiMzk0MTYyIn0sImlkIjoibXMtYW9zLWJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YS1wcm9kdWN0LWdwLTM5NDE2Mi04MCIsInVwZGF0ZV90aW1lIjoxNjU2NTgyMDUyLCJjcmVhdGVfdGltZSI6MTY1NjMyNjQyOCwibmFtZSI6Im1zLWFvcy1iYWl6b3lvYXV0aDJhZGFwdGVyLWRhdGEtcHJvZHVjdC1ncC0zOTQxNjItODAiLCJoYXNoX29uIjoidmFycyIsInBhc3NfaG9zdCI6InBhc3MiLCJkZXNjIjoidXBzdHJlYW0gZm9yIGFwcGNvZGU6IGJhaXpveW9hdXRoMmFkYXB0ZXItZGF0YSBhbmQgdmVyc2lvbjogMzk0MTYyIiwibm9kZXMiOnsiMTAuMTMxLjg2LjEwMDo4MCI6MTAsIjEwLjEzMS44Ny4yMjg6ODAiOjEwfSwic2NoZW1lIjoiaHR0cCJ9"}],"count":"1"}
real 0m0.092s
user 0m0.053s
sys 0m0.034s
It can be explained above that the etcd node with the largest number of connections also takes longer to respond to http requests.
Now I'll move the connections from knode10-132-15-158 to knode10-132-15-160 and try again(actually, just let APISIX connect to node knode10-132-15-160 ).
In my usage scenario, I recompile and deploy your repo, and after about 1 hour of testing, this problem is confirmed to be resolved. thank you very much for your help.
I would like to ask, will the 3.5.5 version include this change? When will version 3.5.5 be released?
What happened?
I have 3 etcd nodes, and clients operate etcd through http.
The operations of the client are only http requests, and the request types include GET, PUT, and long polling for Watch scenarios. Actually, the client is APISIX, and APISIX internally uses
lua
to operate etcd through HTTP protocol.However, when there are many connections on the etcd node, operating the etcd node through http will be particularly slow.
Show the number of connections on 3 nodes respectively:
Then, on each etcd node, query the same key by curl:
1、Request the etcd server of 10-132-15-158 node, which has the most 263 connections, the result is that the http request takes the longest (7 seconds).
2、Request the etcd server of 10-132-15-159 node, which has the 9 connections, the result is that the http request takes the longest (0.097 seconds).
3、Request the etcd server of 10-132-15-160 node, which has the 90 connections, the result is that the http request takes the longest (0.092 seconds).
It can be explained above that the etcd node with the largest number of connections also takes longer to respond to http requests.
Now I'll move the connections from knode10-132-15-158 to knode10-132-15-160 and try again(actually, just let APISIX connect to node knode10-132-15-160 ).
Request the etcd server of node 10-132-15-158, the response cost
0.159s
Request the etcd server of node 10-132-15-160, the response cost
10.335s
I would like to ask, what is the source of this problem? what should be done to fix this problem?
What did you expect to happen?
none
How can we reproduce it (as minimally and precisely as possible)?
I'm not sure about simulating hundreds of http long polling connections with some tool.
Anything else we need to know?
No response
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
ETCD_ARGS="--cert-file=/etc/etcd/etcdSSL/servers/10.132.15.158/etcdserver-10.132.15.158.pem
--key-file=/etc/etcd/etcdSSL/servers/10.132.15.158/etcdserver-10.132.15.158-key.pem
--peer-cert-file=/etc/etcd/etcdSSL/peers/10.132.15.158/etcdpeer-10.132.15.158.pem
--peer-key-file=/etc/etcd/etcdSSL/peers/10.132.15.158/etcdpeer-10.132.15.158-key.pem
--trusted-ca-file=/etc/etcd/etcdSSL/ca.pem
--peer-trusted-ca-file=/etc/etcd/etcdSSL/ca.pem
--name etcd1
--enable-grpc-gateway
--data-dir /DATA1/etcd
--client-cert-auth
--peer-client-cert-auth
--experimental-initial-corrupt-check
--experimental-corrupt-check-time 6h
--initial-advertise-peer-urls https://10.132.15.158:2380
--listen-peer-urls https://10.132.15.158:2380
--listen-client-urls https://10.132.15.158:2379,https://127.0.0.1:2379,https://10.132.15.158:4001
--advertise-client-urls https://10.132.15.158:2379,https://10.132.15.158:4001
--initial-cluster-token etcd-apisix-product
--initial-cluster etcd1=https://10.132.15.158:2380,etcd2=https://10.132.15.159:2380,etcd3=https://10.132.15.160:2380
--initial-cluster-state new
--heartbeat-interval 2000
--election-timeout 10000
--snapshot-count 10000
--auto-compaction-mode periodic
--auto-compaction-retention 30m
--max-request-bytes 33554432
--quota-backend-bytes 34359738368
--logger zap
--log-level debug
--backend-bbolt-freelist-type=map"
Etcd debug information (please run commands blow, feel free to obfuscate the IP address or FQDN in the output)
Relevant log output
No response
The text was updated successfully, but these errors were encountered: