database file does not match with snapshot #7834
This was referenced Apr 28, 2017
Reproducing the steps with commit c407e09 appears to cause the infra2 node to hang, or to stop serving any requests, instead of returning |
This commit 0054e7e causes |
gyuho
added a commit
to gyuho/etcd
that referenced
this issue
May 1, 2017
If 'StartEtcd' returns before starting the gRPC server (e.g. mismatched snapshot, misconfiguration), receiving from grpcServerC blocks forever. This patch closes the channel so that Close does not block on grpcServerC and proceeds to the next stop operations. This was masking the issues in etcd-io#7834.
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
fanminshi
added a commit
to fanminshi/etcd
that referenced
this issue
May 2, 2017
Previously, apply() did not set consistIndex for entries of type EntryConfChange. This caused a misalignment between consistIndex and the applied index: an EntryConfChange entry advanced the applied index but not consistIndex. Suppose that addMember() is called and the leader reflects that change.
1. The applied index and consistIndex are now misaligned.
2. A new follower node joins.
3. The leader sends the snapshot to the follower; the snapshot metadata index is the applied index.
4. The follower saves the snapshot and the database (which includes consistIndex) from the leader.
5. On restart, the follower loads the snapshot and the database.
6. The follower compares the snapshot metadata index (same as the applied index) with the database consistIndex, finds that they do not match, and panics.
FIXES etcd-io#7834
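The sequence above can be sketched as a small simulation in Go. This is a simplified model, not the etcd implementation: the `state` and `entry` types, the `setOnConfChange` switch, and `checkOnRestart` are hypothetical, and the error text is only modelled on the issue title. It shows how an index that is skipped for one entry type ends up disagreeing with the snapshot index on restart.

```go
package main

import "fmt"

type entryType int

const (
	entryNormal entryType = iota
	entryConfChange
)

type entry struct {
	index uint64
	typ   entryType
}

// state is a simplified, hypothetical model of the two indexes discussed above.
type state struct {
	appliedIndex uint64 // becomes the snapshot metadata index
	consistIndex uint64 // persisted inside the database file
}

// apply advances appliedIndex for every entry. consistIndex is advanced for
// conf-change entries only when setOnConfChange is true (the fixed behaviour).
func (s *state) apply(ents []entry, setOnConfChange bool) {
	for _, e := range ents {
		if e.typ == entryNormal || setOnConfChange {
			s.consistIndex = e.index
		}
		s.appliedIndex = e.index
	}
}

// checkOnRestart mimics the follower's startup comparison between the
// snapshot metadata index and the consistIndex stored in the database.
func checkOnRestart(snapshotIndex, dbConsistIndex uint64) error {
	if snapshotIndex != dbConsistIndex {
		return fmt.Errorf("database file (index %d) does not match with snapshot (index %d)",
			dbConsistIndex, snapshotIndex)
	}
	return nil
}

func main() {
	// Two normal writes followed by a membership change such as addMember().
	ents := []entry{{1, entryNormal}, {2, entryNormal}, {3, entryConfChange}}
	for _, fixed := range []bool{false, true} {
		leader := &state{}
		leader.apply(ents, fixed)
		err := checkOnRestart(leader.appliedIndex, leader.consistIndex)
		fmt.Printf("setOnConfChange=%v applied=%d consist=%d err=%v\n",
			fixed, leader.appliedIndex, leader.consistIndex, err)
	}
}
```

With `setOnConfChange` false, the model leaves consistIndex at 2 while the applied index reaches 3, so the restart check fails. With it true, both indexes are 3 and the check passes.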
gyuho
pushed a commit
that referenced
this issue
May 3, 2017
Previously, apply() did not set consistIndex for entries of type EntryConfChange. This caused a misalignment between consistIndex and the applied index: an EntryConfChange entry advanced the applied index but not consistIndex. Suppose that addMember() is called and the leader reflects that change.
1. The applied index and consistIndex are now misaligned.
2. A new follower node joins.
3. The leader sends the snapshot to the follower; the snapshot metadata index is the applied index.
4. The follower saves the snapshot and the database (which includes consistIndex) from the leader.
5. On restart, the follower loads the snapshot and the database.
6. The follower compares the snapshot metadata index (same as the applied index) with the database consistIndex, finds that they do not match, and panics.
FIXES #7834
Overview
If I create a 1 node cluster from a snapshot-restored data-dir and add a 2nd member, then restarting the 2nd member will cause it to exit after an error like so:

Steps to reproduce the issue:
Prepare a simple snapshot snapshot.db from a single member etcd cluster:
Start a temporary cluster on localhost:
Put some data in the cluster and create a snapshot:
Kill the existing cluster once we have the snapshot.
Now restore the data directory infra1.etcd for the 1st member of our new cluster from the snapshot.db file:
Start the 1st member infra1 of the new cluster using the restored data-dir infra1.etcd:
Add a 2nd member infra2 to the cluster:
Start the 2nd member:
Kill and restart the etcd server for the 2nd member infra2:
The 2nd member should exit with the following logs:
The snap directories for both data directories look like so: