Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range failed to find revision pair #10713

Closed
nolouch opened this issue May 9, 2019 · 14 comments
Closed

Range failed to find revision pair #10713

nolouch opened this issue May 9, 2019 · 14 comments

Comments

@nolouch
Copy link
Contributor

nolouch commented May 9, 2019

Version: etcd master + commit:77d4b742cdbf
The stack:

[2019/05/06 14:07:16.998 +00:00] [FATAL] [kvstore_txn.go:148] ["range failed to find revision pair"] [revision-main=0] [revision-sub=0] 
[stack="go.etcd.io/etcd/mvcc.(*storeTxnRead).rangeKeys
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/mvcc/kvstore_txn.go:148
go.etcd.io/etcd/mvcc.(*storeTxnRead).Range
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/mvcc/kvstore_txn.go:50
go.etcd.io/etcd/mvcc.(*metricsTxnWrite).Range
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/mvcc/metrics_txn.go:36
go.etcd.io/etcd/etcdserver.(*applierV3backend).Range
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/apply.go:275
go.etcd.io/etcd/etcdserver.(*EtcdServer).Range.func3
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/v3_server.go:104
go.etcd.io/etcd/etcdserver.(*EtcdServer).doSerialize
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/v3_server.go:554
go.etcd.io/etcd/etcdserver.(*EtcdServer).Range
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/v3_server.go:105
go.etcd.io/etcd/etcdserver/api/v3rpc.(*kvServer).Range
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/api/v3rpc/key.go:52
go.etcd.io/etcd/etcdserver/etcdserverpb._KV_Range_Handler.func1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/etcdserverpb/rpc.pb.go:3473
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:31
github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/server_metrics.go:107
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:34
go.etcd.io/etcd/etcdserver/api/v3rpc.newUnaryInterceptor.func1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/api/v3rpc/interceptor.go:60
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:34
go.etcd.io/etcd/etcdserver/api/v3rpc.newLogUnaryInterceptor.func1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/api/v3rpc/interceptor.go:67
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:39
go.etcd.io/etcd/etcdserver/etcdserverpb._KV_Range_Handler
/home/jenkins/workspace/build_pd_master/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/etcdserverpb/rpc.pb.go:3475
google.golang.org/grpc.(*Server).processUnaryRPC
/home/jenkins/workspace/build_pd_master/go/pkg/mod/google.golang.org/grpc@v1.14.0/server.go:1026
google.golang.org/grpc.(*Server).handleStream
/home/jenkins/workspace/build_pd_master/go/pkg/mod/google.golang.org/grpc@v1.14.0/server.go:1252
google.golang.org/grpc.(*Server).serveStreams.func1.1
/home/jenkins/workspace/build_pd_master/go/pkg/mod/google.golang.org/grpc@v1.14.0/server.go:699"]
2019/05/06 14:07:17 daemon.go:225: [warning] daemon pd exit with error exit status 1

What situation will this error occur, is this normal?

@jingyih
Copy link
Contributor

jingyih commented May 9, 2019

[2019/05/06 14:07:16.998 +00:00] [FATAL] [kvstore_txn.go:148] ["range failed to find revision pair"] [revision-main=0] [revision-sub=0]

Is this reproducible? Can you enable --debug flag so we can trace which request ended up looking for [revision-main=0] [revision-sub=0]?
The backend store starts with main revision 1 (see [1]). It will be increased if there is mutating request, but should never decrease.

[1]

currentRev: 1,

@spzala
Copy link
Member

spzala commented May 9, 2019

[2019/05/06 14:07:16.998 +00:00] [FATAL] [kvstore_txn.go:148] ["range failed to find revision pair"] [revision-main=0] [revision-sub=0]

Is this reproducible? Can you enable --debug flag so we can trace which request ended up looking for [revision-main=0] [revision-sub=0]?
The backend store starts with main revision 1 (see [1]). It will be increased if there is mutating request, but should never decrease.

[1]

currentRev: 1,

Thanks @jingyih I also tried reproduce it but I couldn't so far with normal txn operations.

@shafreeck
Copy link
Contributor

shafreeck commented Dec 12, 2019

backtrace for the same error.

[2019/12/12 02:10:12.305 +08:00] [FATAL] [kvstore_txn.go:148] ["range failed to find revision pair"] [revision-main=101406223] [revision-sub=0]
go.etcd.io/etcd/mvcc.(*storeTxnRead).rangeKeys
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/mvcc/kvstore_txn.go:148
go.etcd.io/etcd/mvcc.(*storeTxnRead).Range
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/mvcc/kvstore_txn.go:50
go.etcd.io/etcd/mvcc.(*metricsTxnWrite).Range
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/mvcc/metrics_txn.go:36
go.etcd.io/etcd/etcdserver.(*applierV3backend).Range
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/apply.go:275
go.etcd.io/etcd/etcdserver.(*EtcdServer).Range.func3
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/v3_server.go:104
go.etcd.io/etcd/etcdserver.(*EtcdServer).doSerialize
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/v3_server.go:554
go.etcd.io/etcd/etcdserver.(*EtcdServer).Range
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/v3_server.go:105
go.etcd.io/etcd/etcdserver/api/v3rpc.(*kvServer).Range
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/api/v3rpc/key.go:52
go.etcd.io/etcd/etcdserver/etcdserverpb._KV_Range_Handler.func1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/etcdserverpb/rpc.pb.go:3473
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:31
github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/server_metrics.go:107
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:34
go.etcd.io/etcd/etcdserver/api/v3rpc.newUnaryInterceptor.func1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/api/v3rpc/interceptor.go:60
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:34
go.etcd.io/etcd/etcdserver/api/v3rpc.newLogUnaryInterceptor.func1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/api/v3rpc/interceptor.go:67
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:39
go.etcd.io/etcd/etcdserver/etcdserverpb._KV_Range_Handler
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/go.etcd.io/etcd@v0.0.0-20190320044326-77d4b742cdbf/etcdserver/etcdserverpb/rpc.pb.go:3475
google.golang.org/grpc.(*Server).processUnaryRPC
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/google.golang.org/grpc@v1.14.0/server.go:1026
google.golang.org/grpc.(*Server).handleStream
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/google.golang.org/grpc@v1.14.0/server.go:1252
google.golang.org/grpc.(*Server).serveStreams.func1.1
        /home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/google.golang.org/grpc@v1.14.0/server.go:699

@xiang90
Copy link
Contributor

xiang90 commented Dec 14, 2019

what is the version of etcd?

@nolouch
Copy link
Contributor Author

nolouch commented Dec 15, 2019

Hi @xiang90 , the commit is 77d4b7, recently we meet it twice.

etcd Version: 3.3.0+git
Git SHA: 77d4b742c
Go Version: go1.13.4

@shafreeck
Copy link
Contributor

@xiang90 In which case will this happen? It is really a terrible problem when the etcd is used in critical services.

@shafreeck
Copy link
Contributor

shafreeck commented Dec 18, 2019

I am wondering if this PR(#10772) fixed the problem? @xiang90 @jingyih Do you guys have any ideas about this?

@jingyih
Copy link
Contributor

jingyih commented Dec 18, 2019

I am wondering if this PR(#10772) fixed the problem? @xiang90 @jingyih Do you guys have any ideas about this?

Commits reverted in #10772 were never released. They were reverted before 3.4 release.

@shafreeck
Copy link
Contributor

@jingyih We used the version: etcd master + commit:77d4b742cdbf, so the bug is included.

@xiang90
Copy link
Contributor

xiang90 commented Dec 20, 2019

@shafreeck possibly. so please switch a released version of etcd.

@shafreeck
Copy link
Contributor

@xiang90 Thanks, I am going to reproduce this bug first.

@shafreeck
Copy link
Contributor

shafreeck commented Dec 25, 2019

Problem reproduced when issuing PUT/DELETE/RANGE requests concurrently, DELETE maybe not necessary.

# 1. Install fperf for etcd benchmark
go get github.com/fperf/etcd/bin/fperf

# 2. Run etcd with compaction enabled
bin/etcd --auto-compaction-mode=revision --auto-compaction-retention=1000

# 3. Run these commands concurrently
fperf -server http://127.0.0.1:2379 -tick 1s -connection 256 etcd -key-size 2 put
fperf -server http://127.0.0.1:2379 -tick 1s -connection 256 etcd -key-size 2 range
fperf -server http://127.0.0.1:2379 -tick 1s -connection 256 etcd -key-size 2 delete

etcd fatal error caused.

2019-12-24 11:13:41.580505 I | etcdserver: start to snapshot (applied: 1405428622, lastsnap: 1405328544)
2019-12-24 11:13:41.591700 I | etcdserver: saved snapshot at index 1405428622
2019-12-24 11:13:41.592093 I | etcdserver: compacted raft log at 1405423622
2019-12-24 11:13:46.715227 I | etcdserver: start to snapshot (applied: 1405528670, lastsnap: 1405428622)
2019-12-24 11:13:46.735011 I | etcdserver: saved snapshot at index 1405528670
2019-12-24 11:13:46.735204 I | etcdserver: compacted raft log at 1405523670
2019-12-24 11:13:47.773566 C | mvcc: range cannot find rev (0,0)

@shafreeck
Copy link
Contributor

@xiang90 @jingyih
Although #10772 fixed the problem, it didn't describe the bug correctly. There are two problems in the Clone version.

  1. Clone is not thread-safe, it is Lock not RLock that should be used.
  2. keyIndex is not thread-safe, race condition occurs even if the Clone is locked.

@xiang90
Copy link
Contributor

xiang90 commented Jan 2, 2020

cloning since the root cause is found and already fixed

@xiang90 xiang90 closed this as completed Jan 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants