Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bluetooth: controller: Fix ticker catch up on ISR latency #23815

Merged

Conversation

cvinayak
Copy link
Contributor

Fix ticker to avoid catch up of periodic timeouts in case of
large ISR latencies like in case of flash erase scenarios.

Signed-off-by: Vinayak Kariappa Chettimada vich@nordicsemi.no

@cvinayak cvinayak added bug The issue is a bug, or the PR is fixing a bug area: Bluetooth RFC Request For Comments: want input from the community Regression Something, which was working, does not anymore backport v2.2-branch labels Mar 26, 2020
@cvinayak cvinayak requested a review from mtpr-ot March 26, 2020 13:14
@cvinayak cvinayak requested review from kruithofa and dagbja March 26, 2020 13:17
ticks_to_expire +=
ticker_remainder_inc(ticker);
lazy++;
}
Copy link
Collaborator

@mtpr-ot mtpr-ot Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is a safeguard for when lagging behind, and catching up? The only thing I worry about is that this will violate the "must_expire" contract, even if it most likely will not occur for platforms using must_expire.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will fix this to consider the "must_expire" ticker instances.

Copy link
Collaborator

@mtpr-ot mtpr-ot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I see it, this change will make ticker_job take over the lazy/latency handling, which is fine and more efficient, but then the lazy handling in ticker_worker should be removed (a few lines).

@cvinayak
Copy link
Contributor Author

As I see it, this change will make ticker_job take over the lazy/latency handling, which is fine and more efficient, but then the lazy handling in ticker_worker should be removed (a few lines).

Thanks for pointing out, will address this.

@cvinayak cvinayak force-pushed the github_ticker_no_catchup_fix branch from e10dfc6 to 8186b77 Compare March 27, 2020 09:39
@cvinayak
Copy link
Contributor Author

@mtpr-ot I have reworked, some more fixes, and they are in additional separate commits in this PR. Please review.

@mtpr-ot
Copy link
Collaborator

mtpr-ot commented Mar 27, 2020

@mtpr-ot I have reworked, some more fixes, and they are in additional separate commits in this PR. Please review.

Working on it - I think I need to test it, I'm a bit unsure about impact of the latest change. Feedback later today
My ticker unittests fails with the change, so I need to analyze that.

@mtpr-ot
Copy link
Collaborator

mtpr-ot commented Mar 27, 2020

Testing with this PR shows problems honoring critical priority events (priority = -128), getting shallow "must_expire" timeouts where they should go in the air. Will continue to investigate.

if (ticker->lazy_current != 0U &&
!TICKER_RESCHEDULE_PENDING(ticker)) {
ticks_latency -= ticks_to_expire;
#endif /* !CONFIG_BT_TICKER_COMPATIBILITY_MODE */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep the logic that if lazy_current != 0 and no pending reschedules, then clear the slot_previous info and nothing else. This is needed because must_expire still needs special handling for must_expired nodes that don't go in the air, and catch a lazy count. By checking for this in addition, I think we get the desired functionality for both "worlds". At least my tests pass with this:

#if !defined(CONFIG_BT_TICKER_COMPATIBILITY_MODE)
		ticks_latency -= ticks_to_expire;

		if (ticker->must_expire && ticker->lazy_current != 0U &&
		    !TICKER_RESCHEDULE_PENDING(ticker)) {
			instance->ticker_id_slot_previous = TICKER_NULL;
			instance->ticks_slot_previous = 0U;
		} else
#endif /* !CONFIG_BT_TICKER_COMPATIBILITY_MODE */
		{

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A skipped ticker instance shall not clear the ticks_slot_previous, wouldn't that also apply to a "must expire" tickers? We should have a audio call at the next opportunity to review this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I had a sneaking suspicion after writing that, and went back to the code this morning. I will take a second look, and let's talk about this Monday morning, if you're available.

Copy link
Collaborator

@mtpr-ot mtpr-ot Mar 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I figured it out. The problem is in the updating of slot_previous to the expired ticker id, regardsless of must_expire:

	/* If a reschedule is set pending, we will need to keep
	 * the slot_previous information
	 */
	if ((ticker->ticks_slot != 0U) &&
	    !(ticker->must_expire && ticker->lazy_current != 0U) && // <-- added
	    (((ticker->req - ticker->ack) & 0xff) == 2U) &&
	    !TICKER_RESCHEDULE_PENDING(ticker)) {
		instance->ticker_id_slot_previous =
			id_expired;
		instance->ticks_slot_previous =
			ticker->ticks_slot;
	}

This will overwrite the info from the blocking node, and use the ticks_slot from the must_expire (shallow) expiry instead, which is incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested and added your check, PR updated.

Copy link
Collaborator

@mtpr-ot mtpr-ot Mar 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great. Doing final re-test here. However, unfortunately you need to apply conditional compile with !CONFIG_BT_TICKER_COMPATIBILITY_MODE around that added line. Sorry for not seeing that when suggesting the change :-/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my fault, I was aware, but forgot to add the IS_ENABLE(...). Will handle and push updates soon.

Fix ticker to avoid catch up of periodic timeouts in case of
large ISR latencies like in case of flash erase scenarios.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Removed lazy handling from ticker_worker as it is now taken
care in ticker_job.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Reset ticker state in ticker_job for ticker instances that
have been skipped in the ticker_worker.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Fix ticks_slot_previous calculation in the ticker_job when
tickers are skipped.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
@cvinayak cvinayak force-pushed the github_ticker_no_catchup_fix branch from 8186b77 to 0783634 Compare March 30, 2020 06:35
@cvinayak cvinayak removed the RFC Request For Comments: want input from the community label Mar 30, 2020
@pdunaj pdunaj requested review from MarekPieta and pdunaj March 30, 2020 07:28
If must_expire is set for a ticker instance, then ticker
expiry shall still perform catchup.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
Ticker is now fixed to avoid catch up of periodic timeout under
large ISR latencies.

Revert commit a749e28 ("Bluetooth: controller: split:
nRF: Use ticker compat mode as default").

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
@cvinayak cvinayak force-pushed the github_ticker_no_catchup_fix branch from 0783634 to 32a77ef Compare March 30, 2020 16:01
Copy link
Collaborator

@mtpr-ot mtpr-ot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thoh-ot
Copy link
Collaborator

thoh-ot commented Mar 31, 2020

Please create a backport pr for this.

@carlescufi carlescufi merged commit fd343ee into zephyrproject-rtos:master Apr 1, 2020
cvinayak added a commit to cvinayak/zephyr that referenced this pull request Apr 14, 2020
Fix ticker to avoid catch up of periodic timeouts in case of
large ISR latencies like in case of flash erase scenarios.

Relates to zephyrproject-rtos#23815.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
jhedberg pushed a commit that referenced this pull request Apr 20, 2020
Fix ticker to avoid catch up of periodic timeouts in case of
large ISR latencies like in case of flash erase scenarios.

Relates to #23815.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
pkral78 pushed a commit to cloudfieldcz/zephyr that referenced this pull request Aug 4, 2020
Fix ticker to avoid catch up of periodic timeouts in case of
large ISR latencies like in case of flash erase scenarios.

Relates to zephyrproject-rtos#23815.

Signed-off-by: Vinayak Kariappa Chettimada <vich@nordicsemi.no>
@cvinayak cvinayak deleted the github_ticker_no_catchup_fix branch March 1, 2021 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Bluetooth bug The issue is a bug, or the PR is fixing a bug Regression Something, which was working, does not anymore
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants