
Why does the new etcd leader restart the TTL counting when the leader switches? #13294

Closed
jiapeish opened this issue Aug 16, 2021 · 11 comments

jiapeish commented Aug 16, 2021

Issue Overview
I noticed that etcd restarts the TTL counting when the leader is switched.

Steps to reproduce

  1. Start an etcd client and put a key with a lease, e.g. TTL = 10s;
  2. When the key's remaining TTL reaches 4s, restart the etcd leader node;
  3. The key's remaining TTL goes back to 10s.

Why doesn't the TTL continue counting down after the new leader is elected? And how can we modify the code to fix this?
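For anyone trying to reproduce this, a rough etcdctl session would look something like the following; the lease ID and outputs are illustrative only, and the leader member has to be restarted between the two timetolive calls:

$ etcdctl lease grant 10
lease 694d77aa9e38260f granted with TTL(10s)
$ etcdctl put foo bar --lease=694d77aa9e38260f
OK
$ etcdctl lease timetolive 694d77aa9e38260f
lease 694d77aa9e38260f granted with TTL(10s), remaining(4s)
# restart the current leader member, wait for a new leader, then check again:
$ etcdctl lease timetolive 694d77aa9e38260f
lease 694d77aa9e38260f granted with TTL(10s), remaining(10s)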

jiapeish changed the title from "etcd new leader will restart the TTL counting when the leader switches" to "Why does the new etcd leader restart the TTL counting when the leader switches?" Aug 16, 2021
serathius (Member) commented Aug 16, 2021

This is a problem inherent to distributed systems: you cannot trust the clock, because each instance has a different local time. A workaround for this is to use a time difference (TTL) instead of a deadline. etcd persists the TTL and counts it down from the last leader change. If the leader changes, the TTL is preserved, but the deadline changes. Thus the TTL for leases is not exact, but rather the minimal time the lease will be present.

There is an experimental feature called lease checkpointing that should help with this (it checkpoints the remaining TTL every 5 minutes). A proper solution would require etcd members to agree on time, which would require more work. Contributions are welcome!
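For reference, the feature is switched on with the --experimental-enable-lease-checkpoint flag; flag names and defaults can differ between etcd releases, so check the documentation for your version:

$ etcd --experimental-enable-lease-checkpoint=true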

jiapeish (Author) commented:

> This is a problem inherent to distributed systems: you cannot trust the clock, because each instance has a different local time. A workaround for this is to use a time difference (TTL) instead of a deadline. etcd persists the TTL and counts it down from the last leader change. If the leader changes, the TTL is preserved, but the deadline changes. Thus the TTL for leases is not exact, but rather the minimal time the lease will be present.
>
> There is an experimental feature called lease checkpointing that should help with this (it checkpoints the remaining TTL every 5 minutes). A proper solution would require etcd members to agree on time, which would require more work. Contributions are welcome!

Thank you @serathius, we'll turn the checkpointing feature on and test it.

As you mentioned, the TTL is checkpointed every 5 minutes, but I didn't find a loop/timer that does this checkpointing periodically. How does etcd control the checkpoint period? Does it mean that the etcd leader node performs this checkpoint and sends the remaining TTL to the follower nodes every 5 minutes, so that the followers can continue counting down?

ardaguclu (Contributor) commented:

@jiapeish, this pushes checkpoint requests onto the heap once per checkpointInterval:

func (le *lessor) scheduleCheckpointIfNeeded(lease *Lease) {

(checkpointInterval is configurable, with a default value of 5 minutes, and is used there.)

On the consumer side, checkpointScheduledLeases() runs forever in the lessor's run loop. Basically, this line

remainingTTL := int64(math.Ceil(l.expiry.Sub(now).Seconds()))

updates remainingTTL.

If you are wondering why this does not work when experimental-enable-lease-checkpoint is not enabled: it is because

func (l *Lease) RemainingTTL() int64 {

always returns ttl instead of remainingTTL, since there is no scheduler to set that value.
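To illustrate why the countdown restarts without checkpointing, here is a small self-contained sketch of that behaviour. It is a simplification for illustration only, not the real lessor code, and the promote method here only loosely mirrors what the actual leader promotion does:

package main

import (
	"fmt"
	"time"
)

// Lease is a simplified model of an etcd lease, for illustration only.
type Lease struct {
	ttl          int64 // granted TTL in seconds, persisted
	remainingTTL int64 // only ever set when a checkpoint has been applied
	expiry       time.Time
}

// RemainingTTL mirrors the behaviour described above: without a
// checkpoint, remainingTTL stays 0 and the full ttl is returned.
func (l *Lease) RemainingTTL() int64 {
	if l.remainingTTL > 0 {
		return l.remainingTTL
	}
	return l.ttl
}

// promote models what a new leader does: it recomputes the expiry from
// RemainingTTL(), so without checkpointing the countdown starts over.
func (l *Lease) promote() {
	l.expiry = time.Now().Add(time.Duration(l.RemainingTTL()) * time.Second)
}

func main() {
	l := &Lease{ttl: 10}
	l.promote()
	fmt.Println("without checkpointing, the new leader sees:", l.RemainingTTL(), "s") // 10

	l.remainingTTL = 4 // as if a checkpoint had been applied before the leader change
	l.promote()
	fmt.Println("with a checkpoint applied, the new leader sees:", l.RemainingTTL(), "s") // 4
}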

@serathius I have one question. While I was investigating this issue, I noticed the condition

if remainingTTL >= l.ttl {

Is there any possibility that it can be true?


busgo commented Aug 22, 2021

> > This is a problem inherent to distributed systems: you cannot trust the clock, because each instance has a different local time. A workaround for this is to use a time difference (TTL) instead of a deadline. etcd persists the TTL and counts it down from the last leader change. If the leader changes, the TTL is preserved, but the deadline changes. Thus the TTL for leases is not exact, but rather the minimal time the lease will be present.
> >
> > There is an experimental feature called lease checkpointing that should help with this (it checkpoints the remaining TTL every 5 minutes). A proper solution would require etcd members to agree on time, which would require more work. Contributions are welcome!
>
> Thank you @serathius, we'll turn the checkpointing feature on and test it.
>
> As you mentioned, the TTL is checkpointed every 5 minutes, but I didn't find a loop/timer that does this checkpointing periodically. How does etcd control the checkpoint period? Does it mean that the etcd leader node performs this checkpoint and sends the remaining TTL to the follower nodes every 5 minutes, so that the followers can continue counting down?

You can check whether there is a leader at regular intervals. Code like this:

// electLoop periodically checks (and, if needed, claims) leadership,
// once per electionTTL.
func (n *PinkNode) electLoop() {
	ticker := time.Tick(time.Second * time.Duration(n.electionTTL))
	log.Printf("the pink node instance %s starts the elect loop....", n.id)
	for range ticker {
		if n.electionState == ElectionReadyState {
			n.electionState = ElectionDoingState
			log.Printf("the pink node instance %s starts trying to elect....", n.id)
			n.tryElect()
		}
	}
}

// tryElect looks up the current leader and campaigns if there is none.
func (n *PinkNode) tryElect() {
	defer func() {
		n.electionState = ElectionReadyState
	}()

	ctx, cancel := context.WithTimeout(context.TODO(), time.Second*3)
	defer cancel() // release the timeout's resources when done

	// If a leader already exists, just report our role.
	id, err := n.etcdCli.Leader(ctx, n.electionPath)
	if err == nil {
		log.Printf("the pink node instance %s sees leader %s", n.id, id)
		if id == n.id {
			n.NotifyState(protocol.Leader)
		} else {
			n.NotifyState(protocol.Follower)
		}
		return
	}

	n.NotifyState(protocol.Follower)
	log.Printf("the pink node instance %s failed to find a leader: %+v", n.id, err)
	if !errors.Is(err, concurrency.ErrElectionNoLeader) {
		log.Printf("the pink node %s failed to get the leader: %+v", n.id, err)
		return
	}

	// No leader yet: campaign for leadership ourselves.
	log.Printf("the pink node instance %s starts campaigning for leader", n.id)
	err = n.etcdCli.Campaign(ctx, n.id, n.electionPath, n.electionTTL)
	if err == nil {
		log.Printf("the pink node instance %s campaigned for leader successfully", n.id)
		n.NotifyState(protocol.Leader)
		return
	}
}

jiapeish (Author) commented:

It's nice of you to explain this so clearly, @ardaguclu. It looks like I need to read through the lessor code...

jiapeish (Author) commented:

Hey, thank you @busgo, I've just found the checkpoint feature and maybe it will help.


stale bot commented Nov 20, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Nov 20, 2021
serathius (Member) commented:

Looks like lease checkpointing was not correctly implemented; proposed fix in #13491.

stale bot removed the stale label Nov 22, 2021
jiapeish (Author) commented Jan 10, 2022

> Looks like lease checkpointing was not correctly implemented; proposed fix in #13491.

This fix is really important to me; I was wondering why lease checkpointing was not working after the leader changed...

serathius (Member) commented Jan 10, 2022

> > Looks like lease checkpointing was not correctly implemented; proposed fix in #13491.
>
> This fix is really important to me; I was wondering why lease checkpointing was not working after the leader changed...

Please read the top comment in #13491

serathius (Member) commented:

Closing, as the issue was resolved and the fix will be released in v3.5.2.
