Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent Revisions Across Members ( v3.3.3 ) #10594

Closed
davissp14 opened this issue Mar 28, 2019 · 17 comments
Closed

Inconsistent Revisions Across Members ( v3.3.3 ) #10594

davissp14 opened this issue Mar 28, 2019 · 17 comments

Comments

@davissp14
Copy link
Contributor

davissp14 commented Mar 28, 2019

Etcd Version: 3.3.3
OS: Linux 14.04.1 Ubuntu

Health

http://10.213.214.4:2379 is healthy: successfully committed proposal: took = 743.509µs
http://10.213.214.2:2379 is healthy: successfully committed proposal: took = 1.00709ms
http://10.213.214.3:2379 is healthy: successfully committed proposal: took = 1.011886ms

Table output

+--------------------------+------------------+---------+---------+-----------+-----------+------------+
|         ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
| http://10.213.214.2:2379 | 372c0ce563084c78 |   3.3.3 |   95 MB |     false |        18 |   12165951 |
| http://10.213.214.3:2379 | 8546c864475e6358 |   3.3.3 |  136 MB |     false |        18 |   12165952 |
| http://10.213.214.4:2379 | e030b17253d751e9 |   3.3.3 |  175 MB |      true |        18 |   12165953 |
+--------------------------+------------------+---------+---------+-----------+-----------+------------+

JSON Ouput

[ 
   {
      "Endpoint":"http://10.213.214.2:2379",
      "Status": {
         "header {
                 "cluster_id":16002117407141323128,"
                 "member_id":3975566750409837688,
                 "revision":1377021,
                 "raft_term":18
           },
          "version":"3.3.3",
          "dbSize":95182848,
          "leader":16154606967968977385,
          "raftIndex":12165645,
          "raftTerm":18
      }
 },
{
  "Endpoint":"http://10.213.214.3:2379",
  "Status":{
     "header":{
     "cluster_id":16002117407141323128,
     "member_id":9603583588433814360,
     "revision":1801804,
     "raft_term":18
  },
  "version":"3.3.3",
  "dbSize":136417280,
  "leader":16154606967968977385, 
  "raftIndex":12165646,
  "raftTerm":18
  }
},
{
  "Endpoint":"http://10.213.214.4:2379",
  "Status":{
     "header"{
        "cluster_id":16002117407141323128,
        "member_id":16154606967968977385,
        "revision":891479,
        "raft_term":18
     },
     "version":"3.3.3",
     "dbSize":174526464,
     "leader":16154606967968977385,
     "raftIndex":12165647,
     "raftTerm":18
   }
}
]

As you can see the revisions across members are vastly different.

Each of these nodes are configured with --auto-compaction-retention 1.

Any thoughts on what's going on here?

@hexfusion
Copy link
Contributor

Any thoughts on what's going on here?

@davissp14 Startup logs would be good for verification of flags. The DB size range is very different. Have you been running defrag on some members but not all?

Each of these nodes is configured with --auto-compaction-retention 1.

There were a few changes around compaction in 3.3.2 -> 3.3.3 a theory could be you have ran the cluster with a mixture on 3.3.2 and 3.3.3 members? Checkout the various changes https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v332-2018-03-08

Specifically, this bug looks suspect Previously, etcd --auto-compaction-mode revision --auto-compaction-retention 1 was translated to revision retention 3600000000000. via #9339

@davissp14
Copy link
Contributor Author

davissp14 commented Mar 28, 2019

Have you been running defrag on some members but not all?

Defrags are running daily per-node and I have confirmed this in the logs. It's the revisions that I am most concerned about though. Our users are seeing inconsistent key value read results across members.

@jingyih
Copy link
Contributor

jingyih commented Mar 29, 2019

It is concerning that there is such difference between revision on different members. Can you check metrics backend_commit_duration_seconds on each node?

Our users are seeing inconsistent key value read results across members.

Can you provide more details on this?

In etcd, difference in revision does not mean inconsistency in key value result. Member with lower revision cannot serve latest read request (assuming you are using linearizable read, which is the default, see [1]) until its backend catches up.

[1]

// serializable sets the range request to use serializable member-local reads.
// Range requests are linearizable by default; linearizable requests have higher
// latency and lower throughput than serializable requests but reflect the current
// consensus of the cluster. For better performance, in exchange for possible stale reads,
// a serializable range request is served locally without needing to reach consensus
// with other nodes in the cluster.
bool serializable = 7;

@davissp14
Copy link
Contributor Author

Node: 10.213.214.2

# TYPE etcd_disk_backend_commit_duration_seconds histogram
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 134334
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 134502
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 134518
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 134520
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 134521
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 134521
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 134521
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 134521
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 134522
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 134666
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 134666
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 134666
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 134666
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 134666
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 134666
etcd_disk_backend_commit_duration_seconds_sum 109.0337400130012
etcd_disk_backend_commit_duration_seconds_count 134666

Node: 10.213.214.3

# TYPE etcd_disk_backend_commit_duration_seconds histogram
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 190694
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 190923
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 190934
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 190939
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 190939
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 190943
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 190946
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 190948
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 190948
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 191168
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 191169
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 191169
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 191169
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 191169
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 191169
etcd_disk_backend_commit_duration_seconds_sum 161.19067459100071
etcd_disk_backend_commit_duration_seconds_count 191169
Node: 10.213.214.4

# TYPE etcd_disk_backend_commit_duration_seconds histogram
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 194137
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 194347
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 194370
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 194370
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 194370
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 194370
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 194370
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 194371
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 194371
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 194610
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 194610
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 194610
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 194610
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 194610
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 194610
etcd_disk_backend_commit_duration_seconds_sum 149.5392842140006
etcd_disk_backend_commit_duration_seconds_count 194610

@hexfusion
Copy link
Contributor

@davissp14 it would help if we could see some logs around compaction for these members so we can understand timing. Also as I asked before it would be nice to verify the startup logs and or flags for each member so that we could attempt to reproduce. That is unless you have a way to reproduce this already?

3.3.3 is fairly old I am curious because of the changes that were made in 3.3.2 and 3.3.3 WRT to compaction if this would correct itself with a newer version of etcd. Could you test this theory in a dev ENV?

The last question is where did these etcd binaries come from? I hate to assume that these are release assets SHA from etcd --version would be useful.

@anthonyalberto
Copy link

Hello, I'm helping @davissp14 get the logs that were needed here.

So far we don't know how to reproduce the issue, so trying an upgraded etcd in a dev env is going to be difficult. We'll have to reach out to the user and see if he agrees to upgrade his instance.

With that said, I think I got everything else that was needed, see below

Compaction logs

Only one node shows auto-compaction logs, runs every hour on the clock

Node 10.213.214.4:

...
2019-04-02T01:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 01:56:15.617046 N | compactor: Starting auto-compaction at revision 1875367 (retention: 1h0m0s)
2019-04-02T01:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 01:56:15.617866 N | compactor: Finished auto-compaction at revision 1875367
...
2019-04-02T02:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 02:56:15.619448 N | compactor: Starting auto-compaction at revision 1875367 (retention: 1h0m0s)
2019-04-02T02:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 02:56:15.620245 N | compactor: Finished auto-compaction at revision 1875367
...
2019-04-02T03:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 03:56:15.621966 N | compactor: Starting auto-compaction at revision 1875367 (retention: 1h0m0s)
2019-04-02T03:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 03:56:15.622892 N | compactor: Finished auto-compaction at revision 1875367
...
2019-04-02T04:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 04:56:15.624828 N | compactor: Starting auto-compaction at revision 1875367 (retention: 1h0m0s)
2019-04-02T04:56:15.000+00:00 cm-tooling-par01-2 etcd51: etcd.20                 | 2019-04-02 04:56:15.625675 N | compactor: Finished auto-compaction at revision 1875367

Defrag logs

Node 10.213.214.2:

| 2019-04-02 15:18:00.308830 N | etcdserver/api/v3rpc: starting to defragment the storage backend...
| 2019-04-02 15:18:00.413821 N | etcdserver/api/v3rpc: finished defragmenting the storage backend

Node 10.213.214.3:

| 2019-04-01 23:20:00.298297 N | etcdserver/api/v3rpc: starting to defragment the storage backend...
| 2019-04-01 23:20:03.704664 N | etcdserver/api/v3rpc: finished defragmenting the storage backend

Node 10.213.214.4:

| 2019-04-02 07:50:00.308176 N | etcdserver/api/v3rpc: starting to defragment the storage backend...
| 2019-04-02 07:50:00.315358 N | etcdserver/api/v3rpc: finished defragmenting the storage backend

Etcd version

Node 10.213.214.2:

etcd Version: 3.3.3
Git SHA: e348b1aed
Go Version: go1.9.5
Go OS/Arch: linux/amd64

Node 10.213.214.3:

etcd Version: 3.3.3
Git SHA: e348b1aed
Go Version: go1.9.5
Go OS/Arch: linux/amd64

Node 10.213.214.4:

etcd Version: 3.3.3
Git SHA: e348b1aed
Go Version: go1.9.5
Go OS/Arch: linux/amd64

Startup logs

Node 10.213.214.2:

| 2019-04-02 17:47:28.661134 I | etcdmain: setting maximum number of CPUs to 32, total number of available CPUs is 32
| 2019-04-02 17:47:28.661167 N | etcdmain: the server is already initialized as member before, starting as etcd member...
| 2019-04-02 17:47:28.661204 I | embed: listening for peers on http://10.213.214.2:2380
| 2019-04-02 17:47:28.661228 I | embed: listening for client requests on 10.213.214.2:2379
| 2019-04-02 17:47:28.661245 I | embed: listening for client requests on 127.0.0.1:2379
| 2019-04-02 17:47:28.682679 I | etcdserver: recovered store from snapshot at index 14521930
| 2019-04-02 17:47:28.682854 I | mvcc: restore compact to 1693737
| 2019-04-02 17:47:28.703260 I | etcdserver: name = etcd60.redacted
| 2019-04-02 17:47:28.703284 I | etcdserver: data dir = /data/etcd
| 2019-04-02 17:47:28.703291 I | etcdserver: member dir = /data/etcd/member
| 2019-04-02 17:47:28.703295 I | etcdserver: heartbeat = 500ms
| 2019-04-02 17:47:28.703298 I | etcdserver: election = 5000ms
| 2019-04-02 17:47:28.703302 I | etcdserver: snapshot count = 10000
| 2019-04-02 17:47:28.703317 I | etcdserver: advertise client URLs = http://10.213.214.2:2379
| 2019-04-02 17:47:28.851181 I | etcdserver: restarting member 372c0ce563084c78 in cluster de12f0ffbac0e178 at commit index 14529230
| 2019-04-02 17:47:28.853229 I | raft: 372c0ce563084c78 became follower at term 18
| 2019-04-02 17:47:28.853276 I | raft: newRaft 372c0ce563084c78 [peers: [372c0ce563084c78,8546c864475e6358,e030b17253d751e9], term: 18, commit: 14529230, applied: 14521930, lastindex: 14529230, lastterm: 18]
| 2019-04-02 17:47:28.853542 I | etcdserver/api: enabled capabilities for version 3.3
| 2019-04-02 17:47:28.853626 I | etcdserver/membership: added member 372c0ce563084c78 [http://10.213.214.2:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:47:28.853655 I | etcdserver/membership: added member 8546c864475e6358 [http://10.213.214.3:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:47:28.853676 I | etcdserver/membership: added member e030b17253d751e9 [http://10.213.214.4:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:47:28.853703 I | etcdserver/membership: set the cluster version to 3.3 from store
| 2019-04-02 17:47:28.854199 I | mvcc: restore compact to 1693737
| 2019-04-02 17:47:28.873183 W | auth: simple token is not cryptographically signed
| 2019-04-02 17:47:28.873458 I | rafthttp: starting peer 8546c864475e6358...
| 2019-04-02 17:47:28.873488 I | rafthttp: started HTTP pipelining with peer 8546c864475e6358
| 2019-04-02 17:47:28.874089 I | rafthttp: started streaming with peer 8546c864475e6358 (writer)
| 2019-04-02 17:47:28.874130 I | rafthttp: started streaming with peer 8546c864475e6358 (writer)
| 2019-04-02 17:47:28.876134 I | rafthttp: started peer 8546c864475e6358
| 2019-04-02 17:47:28.876155 I | rafthttp: added peer 8546c864475e6358
| 2019-04-02 17:47:28.876167 I | rafthttp: started streaming with peer 8546c864475e6358 (stream MsgApp v2 reader)
| 2019-04-02 17:47:28.876188 I | rafthttp: started streaming with peer 8546c864475e6358 (stream Message reader)
| 2019-04-02 17:47:28.876250 I | rafthttp: starting peer e030b17253d751e9...
| 2019-04-02 17:47:28.876269 I | rafthttp: started HTTP pipelining with peer e030b17253d751e9
| 2019-04-02 17:47:28.876541 I | rafthttp: started streaming with peer e030b17253d751e9 (writer)
| 2019-04-02 17:47:28.877093 I | rafthttp: started streaming with peer e030b17253d751e9 (writer)
| 2019-04-02 17:47:28.878007 I | rafthttp: peer 8546c864475e6358 became active
| 2019-04-02 17:47:28.878480 I | rafthttp: started peer e030b17253d751e9
| 2019-04-02 17:47:28.878507 I | rafthttp: added peer e030b17253d751e9
| 2019-04-02 17:47:28.878523 I | rafthttp: started streaming with peer e030b17253d751e9 (stream Message reader)
| 2019-04-02 17:47:28.878540 I | etcdserver: starting server... [version: 3.3.3, cluster version: 3.3]
| 2019-04-02 17:47:28.878550 I | rafthttp: started streaming with peer e030b17253d751e9 (stream MsgApp v2 reader)
| 2019-04-02 17:47:28.878628 I | rafthttp: established a TCP streaming connection with peer 8546c864475e6358 (stream MsgApp v2 reader)
| 2019-04-02 17:47:28.878843 I | rafthttp: established a TCP streaming connection with peer 8546c864475e6358 (stream Message reader)
| 2019-04-02 17:47:28.879387 I | rafthttp: peer e030b17253d751e9 became active
| 2019-04-02 17:47:28.879404 I | rafthttp: established a TCP streaming connection with peer e030b17253d751e9 (stream MsgApp v2 reader)
| 2019-04-02 17:47:28.879466 I | rafthttp: established a TCP streaming connection with peer e030b17253d751e9 (stream Message reader)
| 2019-04-02 17:47:28.880831 I | raft: raft.node: 372c0ce563084c78 elected leader e030b17253d751e9 at term 18
| 2019-04-02 17:47:28.914128 I | etcdserver: published {Name:etcd60.redacted ClientURLs:[http://10.213.214.2:2379]} to cluster de12f0ffbac0e178
| 2019-04-02 17:47:28.914162 I | embed: ready to serve client requests
| 2019-04-02 17:47:28.914200 I | embed: ready to serve client requests

Node 10.213.214.3:

| 2019-04-02 17:50:44.548812 I | etcdmain: setting maximum number of CPUs to 32, total number of available CPUs is 32
| 2019-04-02 17:50:44.548854 N | etcdmain: the server is already initialized as member before, starting as etcd member...
| 2019-04-02 17:50:44.548899 I | embed: listening for peers on http://10.213.214.3:2380
| 2019-04-02 17:50:44.548928 I | embed: listening for client requests on 10.213.214.3:2379
| 2019-04-02 17:50:44.548948 I | embed: listening for client requests on 127.0.0.1:2379
| 2019-04-02 17:50:45.575978 I | etcdserver: recovered store from snapshot at index 14526863
| 2019-04-02 17:50:45.576226 I | mvcc: restore compact to 1875367
| 2019-04-02 17:50:46.337067 I | etcdserver: name = etcd49.redacted
| 2019-04-02 17:50:46.337092 I | etcdserver: data dir = /data/etcd
| 2019-04-02 17:50:46.337099 I | etcdserver: member dir = /data/etcd/member
| 2019-04-02 17:50:46.337104 I | etcdserver: heartbeat = 500ms
| 2019-04-02 17:50:46.337107 I | etcdserver: election = 5000ms
| 2019-04-02 17:50:46.337111 I | etcdserver: snapshot count = 10000
| 2019-04-02 17:50:46.337126 I | etcdserver: advertise client URLs = http://10.213.214.3:2379
| 2019-04-02 17:50:46.434586 I | etcdserver: restarting member 8546c864475e6358 in cluster de12f0ffbac0e178 at commit index 14529252
| 2019-04-02 17:50:46.435505 I | raft: 8546c864475e6358 became follower at term 18
| 2019-04-02 17:50:46.435542 I | raft: newRaft 8546c864475e6358 [peers: [372c0ce563084c78,8546c864475e6358,e030b17253d751e9], term: 18, commit: 14529252, applied: 14526863, lastindex: 14529252, lastterm: 18]
| 2019-04-02 17:50:46.435761 I | etcdserver/api: enabled capabilities for version 3.3
| 2019-04-02 17:50:46.435913 I | etcdserver/membership: added member 372c0ce563084c78 [http://10.213.214.2:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:50:46.435928 I | etcdserver/membership: added member 8546c864475e6358 [http://10.213.214.3:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:50:46.435933 I | etcdserver/membership: added member e030b17253d751e9 [http://10.213.214.4:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:50:46.435943 I | etcdserver/membership: set the cluster version to 3.3 from store
| 2019-04-02 17:50:46.436243 I | mvcc: restore compact to 1875367
| 2019-04-02 17:50:47.167981 W | auth: simple token is not cryptographically signed
| 2019-04-02 17:50:47.168316 I | rafthttp: starting peer 372c0ce563084c78...
| 2019-04-02 17:50:47.168344 I | rafthttp: started HTTP pipelining with peer 372c0ce563084c78
| 2019-04-02 17:50:47.168778 I | rafthttp: started streaming with peer 372c0ce563084c78 (writer)
| 2019-04-02 17:50:47.169046 I | rafthttp: started streaming with peer 372c0ce563084c78 (writer)
| 2019-04-02 17:50:47.169519 I | rafthttp: started peer 372c0ce563084c78
| 2019-04-02 17:50:47.169534 I | rafthttp: added peer 372c0ce563084c78
| 2019-04-02 17:50:47.169543 I | rafthttp: starting peer e030b17253d751e9...
| 2019-04-02 17:50:47.169556 I | rafthttp: started HTTP pipelining with peer e030b17253d751e9
| 2019-04-02 17:50:47.169571 I | rafthttp: started streaming with peer 372c0ce563084c78 (stream MsgApp v2 reader)
| 2019-04-02 17:50:47.169672 I | rafthttp: started streaming with peer 372c0ce563084c78 (stream Message reader)
| 2019-04-02 17:50:47.169817 I | rafthttp: started streaming with peer e030b17253d751e9 (writer)
| 2019-04-02 17:50:47.169961 I | rafthttp: started streaming with peer e030b17253d751e9 (writer)
| 2019-04-02 17:50:47.170212 I | rafthttp: started peer e030b17253d751e9
| 2019-04-02 17:50:47.170225 I | rafthttp: added peer e030b17253d751e9
| 2019-04-02 17:50:47.170235 I | rafthttp: started streaming with peer e030b17253d751e9 (stream Message reader)
| 2019-04-02 17:50:47.170244 I | etcdserver: starting server... [version: 3.3.3, cluster version: 3.3]
| 2019-04-02 17:50:47.170323 I | rafthttp: started streaming with peer e030b17253d751e9 (stream MsgApp v2 reader)
| 2019-04-02 17:50:47.170493 I | rafthttp: peer 372c0ce563084c78 became active
| 2019-04-02 17:50:47.170511 I | rafthttp: established a TCP streaming connection with peer 372c0ce563084c78 (stream Message reader)
| 2019-04-02 17:50:47.170592 I | rafthttp: established a TCP streaming connection with peer 372c0ce563084c78 (stream MsgApp v2 reader)
| 2019-04-02 17:50:47.170906 I | rafthttp: peer e030b17253d751e9 became active
| 2019-04-02 17:50:47.170918 I | rafthttp: established a TCP streaming connection with peer e030b17253d751e9 (stream Message reader)
| 2019-04-02 17:50:47.170945 I | rafthttp: established a TCP streaming connection with peer e030b17253d751e9 (stream MsgApp v2 reader)
| 2019-04-02 17:50:47.172404 I | rafthttp: established a TCP streaming connection with peer 372c0ce563084c78 (stream Message writer)
| 2019-04-02 17:50:47.172462 I | rafthttp: established a TCP streaming connection with peer e030b17253d751e9 (stream Message writer)
| 2019-04-02 17:50:47.172477 I | rafthttp: established a TCP streaming connection with peer e030b17253d751e9 (stream MsgApp v2 writer)
| 2019-04-02 17:50:47.172586 I | rafthttp: established a TCP streaming connection with peer 372c0ce563084c78 (stream MsgApp v2 writer)
| 2019-04-02 17:50:47.209140 I | raft: raft.node: 8546c864475e6358 elected leader e030b17253d751e9 at term 18
| 2019-04-02 17:50:47.210515 I | embed: ready to serve client requests
| 2019-04-02 17:50:47.210601 I | etcdserver: published {Name:etcd49.redacted ClientURLs:[http://10.213.214.3:2379]} to cluster de12f0ffbac0e178
| 2019-04-02 17:50:47.210620 I | embed: ready to serve client requests

Node 10.213.214.4:

| 2019-04-02 17:53:51.986415 I | etcdmain: setting maximum number of CPUs to 32, total number of available CPUs is 32
| 2019-04-02 17:53:51.986449 N | etcdmain: the server is already initialized as member before, starting as etcd member...
| 2019-04-02 17:53:51.986489 I | embed: listening for peers on http://10.213.214.4:2380
| 2019-04-02 17:53:51.986512 I | embed: listening for client requests on 10.213.214.4:2379
| 2019-04-02 17:53:51.986529 I | embed: listening for client requests on 127.0.0.1:2379
| 2019-04-02 17:53:51.988858 I | etcdserver: recovered store from snapshot at index 14521952
| 2019-04-02 17:53:51.989073 I | mvcc: restore compact to 1875367
| 2019-04-02 17:53:51.993655 I | etcdserver: name = etcd51.redacted
| 2019-04-02 17:53:51.993680 I | etcdserver: data dir = /data/etcd
| 2019-04-02 17:53:51.993687 I | etcdserver: member dir = /data/etcd/member
| 2019-04-02 17:53:51.993691 I | etcdserver: heartbeat = 500ms
| 2019-04-02 17:53:51.993694 I | etcdserver: election = 5000ms
| 2019-04-02 17:53:51.993698 I | etcdserver: snapshot count = 10000
| 2019-04-02 17:53:51.993713 I | etcdserver: advertise client URLs = http://10.213.214.4:2379
| 2019-04-02 17:53:52.229087 I | etcdserver: restarting member e030b17253d751e9 in cluster de12f0ffbac0e178 at commit index 14529281
| 2019-04-02 17:53:52.231372 I | raft: e030b17253d751e9 became follower at term 19
| 2019-04-02 17:53:52.231408 I | raft: newRaft e030b17253d751e9 [peers: [372c0ce563084c78,8546c864475e6358,e030b17253d751e9], term: 19, commit: 14529281, applied: 14521952, lastindex: 14529281, lastterm: 19]
| 2019-04-02 17:53:52.231639 I | etcdserver/api: enabled capabilities for version 3.3
| 2019-04-02 17:53:52.231665 I | etcdserver/membership: added member 372c0ce563084c78 [http://10.213.214.2:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:53:52.231681 I | etcdserver/membership: added member 8546c864475e6358 [http://10.213.214.3:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:53:52.231693 I | etcdserver/membership: added member e030b17253d751e9 [http://10.213.214.4:2380] to cluster de12f0ffbac0e178 from store
| 2019-04-02 17:53:52.231708 I | etcdserver/membership: set the cluster version to 3.3 from store
| 2019-04-02 17:53:52.232084 I | mvcc: restore compact to 1875367
| 2019-04-02 17:53:52.235771 W | auth: simple token is not cryptographically signed
| 2019-04-02 17:53:52.236057 I | rafthttp: starting peer 372c0ce563084c78...
| 2019-04-02 17:53:52.236085 I | rafthttp: started HTTP pipelining with peer 372c0ce563084c78
| 2019-04-02 17:53:52.237232 I | rafthttp: started streaming with peer 372c0ce563084c78 (writer)
| 2019-04-02 17:53:52.238733 I | rafthttp: started streaming with peer 372c0ce563084c78 (writer)
| 2019-04-02 17:53:52.239710 I | rafthttp: started peer 372c0ce563084c78
| 2019-04-02 17:53:52.239731 I | rafthttp: added peer 372c0ce563084c78
| 2019-04-02 17:53:52.239746 I | rafthttp: starting peer 8546c864475e6358...
| 2019-04-02 17:53:52.239763 I | rafthttp: started HTTP pipelining with peer 8546c864475e6358
| 2019-04-02 17:53:52.239767 I | rafthttp: started streaming with peer 372c0ce563084c78 (stream MsgApp v2 reader)
| 2019-04-02 17:53:52.240869 I | rafthttp: started streaming with peer 372c0ce563084c78 (stream Message reader)
| 2019-04-02 17:53:52.240977 I | rafthttp: started streaming with peer 8546c864475e6358 (writer)
| 2019-04-02 17:53:52.242261 I | rafthttp: started streaming with peer 8546c864475e6358 (writer)
| 2019-04-02 17:53:52.242682 I | rafthttp: peer 372c0ce563084c78 became active
| 2019-04-02 17:53:52.242699 I | rafthttp: established a TCP streaming connection with peer 372c0ce563084c78 (stream Message reader)
| 2019-04-02 17:53:52.242716 I | rafthttp: established a TCP streaming connection with peer 372c0ce563084c78 (stream MsgApp v2 reader)
| 2019-04-02 17:53:52.243822 I | rafthttp: started peer 8546c864475e6358
| 2019-04-02 17:53:52.243845 I | rafthttp: added peer 8546c864475e6358
| 2019-04-02 17:53:52.243858 I | rafthttp: started streaming with peer 8546c864475e6358 (stream Message reader)
| 2019-04-02 17:53:52.243866 I | etcdserver: starting server... [version: 3.3.3, cluster version: 3.3]
| 2019-04-02 17:53:52.243891 I | rafthttp: started streaming with peer 8546c864475e6358 (stream MsgApp v2 reader)
| 2019-04-02 17:53:52.245313 I | rafthttp: peer 8546c864475e6358 became active
| 2019-04-02 17:53:52.245328 I | rafthttp: established a TCP streaming connection with peer 8546c864475e6358 (stream Message reader)
| 2019-04-02 17:53:52.245342 I | rafthttp: established a TCP streaming connection with peer 8546c864475e6358 (stream MsgApp v2 reader)
| 2019-04-02 17:53:52.294131 I | etcdserver: e030b17253d751e9 initialzed peer connection; fast-forwarding 8 ticks (election ticks 10) with 2 active peer(s)
| 2019-04-02 17:53:52.353461 I | raft: raft.node: e030b17253d751e9 elected leader 372c0ce563084c78 at term 19
| 2019-04-02 17:53:52.355744 I | etcdserver: published {Name:etcd51.redacted ClientURLs:[http://10.213.214.4:2379]} to cluster de12f0ffbac0e178
| 2019-04-02 17:53:52.355759 I | embed: ready to serve client requests
| 2019-04-02 17:53:52.355809 I | embed: ready to serve client requests

Flags

Node 10.213.214.2:

etcd --initial-cluster-state new --initial-cluster-token <redacted> --name etcd60.redacted --data-dir /data/etcd --enable-v2=false --heartbeat-interval 500 --election-timeout 5000 --snapshot-count 10000 --auto-compaction-retention 1 --max-snapshots 3 --experimental-enable-v2v3=True --max-wals 3 --quota-backend-bytes 8589934592 --listen-peer-urls http://10.213.214.2:2380 --advertise-client-urls http://10.213.214.2:2379 --initial-advertise-peer-urls http://10.213.214.2:2380 --listen-client-urls http://10.213.214.2:2379,http://127.0.0.1:2379 --initial-cluster etcd51.redacted=http://10.213.214.4:2380,etcd49.redacted=http://10.213.214.3:2380,etcd60.redacted=http://10.213.214.2:2380

Node 10.213.214.3:

etcd --initial-cluster-state new --initial-cluster-token <redacted> --name etcd49.redacted --data-dir /data/etcd --enable-v2=false --heartbeat-interval 500 --election-timeout 5000 --snapshot-count 10000 --auto-compaction-retention 1 --max-snapshots 3 --experimental-enable-v2v3=True --max-wals 3 --quota-backend-bytes 8589934592 --listen-peer-urls http://10.213.214.3:2380 --advertise-client-urls http://10.213.214.3:2379 --initial-advertise-peer-urls http://10.213.214.3:2380 --listen-client-urls http://10.213.214.3:2379,http://127.0.0.1:2379 --initial-cluster etcd51.redacted=http://10.213.214.4:2380,etcd49.redacted=http://10.213.214.3:2380,etcd60.redacted=http://10.213.214.2:2380

Node 10.213.214.4:

etcd --initial-cluster-state new --initial-cluster-token <redacted> --name etcd51.redacted --data-dir /data/etcd --enable-v2=false --heartbeat-interval 500 --election-timeout 5000 --snapshot-count 10000 --auto-compaction-retention=1h --max-snapshots 3 --experimental-enable-v2v3=True --max-wals 3 --quota-backend-bytes 8589934592 --listen-peer-urls http://10.213.214.4:2380 --advertise-client-urls http://10.213.214.4:2379 --initial-advertise-peer-urls http://10.213.214.4:2380 --listen-client-urls http://10.213.214.4:2379,http://127.0.0.1:2379 --initial-cluster etcd51.redacted=http://10.213.214.4:2380,etcd49.redacted=http://10.213.214.3:2380,etcd60.redacted=http://10.213.214.2:2380

@hexfusion
Copy link
Contributor

@anthonyalberto thanks a lot for the logs.

Only one node shows auto-compaction logs, runs every hour on the clock

Curious that only one of the configs uses the 1hr vs 1. Can we try making these all 1hr and see if that clears things up? I have not dug into the code yet but seems like a smoking gun to me.

Node 10.213.214.2:
--auto-compaction-retention 1

Node 10.213.214.3:
--auto-compaction-retention 1

Node 10.213.214.4:
--auto-compaction-retention=1h

@davissp14
Copy link
Contributor Author

davissp14 commented Apr 3, 2019

@hexfusion That was my fault, as I did not communicate this to @anthonyalberto. That discrepancy was due to my earlier testing. All nodes were consistently set to --auto-compaction-retention 1. I tried using 1h on all nodes, but it didn't clear the issue.

@hexfusion
Copy link
Contributor

from the logs above --auto-compaction-retention 1 does not work as expected. Should we compact on revision N for all members etcdctl compact N defrag and change to 1h so that moving forward the compaction is actually happening as expected? I will think ion this for a bit but it seems reasonable.

@anthonyalberto
Copy link

Just to add more info, I don't see any change in behaviour after switching all 3 nodes to --auto-compaction-retention=1h. Still the same node logging auto-compaction, the other two don't

@davissp14
Copy link
Contributor Author

@hexfusion Compaction should only be running on the leader node, yeah?

@hexfusion
Copy link
Contributor

--experimental-enable-v2v3=True

Yeah your right, does this cluster also have v2 data?

@hexfusion
Copy link
Contributor

I wonder if this flag is somehow causing the issue. Is this still required?

@davissp14
Copy link
Contributor Author

davissp14 commented Apr 3, 2019

Does this cluster also have v2 data?

Nah, we have been enabling that specifically for v2 API emulation. We were supporting the v2 storage backend for a while and this was enabled to help ease the migration to v3.

I wonder if this flag is somehow causing the issue. Is this still required?

I would be surprised if this is causing the issue as it's enabled across all of our Etcd clusters. Hard to say if it's still required or not, it would depend on how the end-user is accessing their data.

@njhill
Copy link
Contributor

njhill commented Apr 3, 2019

There are some similarities with #9630, just in case that could provide any other clues (in that case the backing store was inconsistent at time of upgrade, here there had also been an in-place upgrade from 3.3.2 -> 3.3.3).

@tangcong
Copy link
Contributor

tangcong commented Mar 4, 2020

@davissp14 do you enable auth? if you enable auth, it is possible to encounter data inconsistency in all etcd3 version when you restart etcd(#11651).

@stale
Copy link

stale bot commented Jun 2, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

6 participants