Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys: onoff: redesign to meet changed needs #23898

Merged
merged 1 commit into from
Apr 22, 2020

Conversation

pabigot
Copy link
Collaborator

@pabigot pabigot commented Mar 29, 2020

The previous architecture proved unable to support user expectations, so the API has been rebuilt from first principles. Backward compatibility cannot be maintained for this change.

Key changes include:

  • Formerly the service-provided transition functions were allowed to sleep, and the manager took care to not invoke them from ISR context, instead returning an error if unable to initiate a transition. In the new architecture transition functions are required to work regardless of calling context: it is the service's responsibility to guarantee the transition will proceed even if it needs to be transferred to a thread. This eliminates state machine complexities related to calling context.
  • Constants identifying the visible state of the manager are exposed to clients through both notification callbacks and a new monitor API that allows clients to be notified of all state changes. Formerly the release operation was async, and would be delayed for the last release to ensure a client would exist to be notified of any failures. It is now synchronous.
  • Formerly the cancel operation would fail on the last client associated with a transition. The cancel operation is now synchronous. A helper function is provided to safely synchronously release a request regardless of whether it has completed or is in progress, satisfying the use case underlying Add cancel function to onoff service #22974.
  • The user-data parameter to asynchronous notification callbacks has been removed as user data can be retrieved from the CONTAINER_OF the client data.

This branch is based on #23601 and replaces #23229.

Draft until #23601 is merged and the patch that removes diagnostic code has been squashed into this commit.

@zephyrbot zephyrbot added area: API Changes to public APIs area: Tests Issues related to a particular existing or missing test area: Documentation labels Mar 29, 2020
@zephyrbot
Copy link
Collaborator

zephyrbot commented Mar 29, 2020

All checks are passing now.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@pabigot pabigot requested a review from nordic-krch March 29, 2020 17:11
be started the function will result in an error.
error state to off. Transitions must be invokable from both thread
and interrupt context.
* The request and reset operations are asynchronous using sys_notify.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using sys_notify. Could you make this a link to notify documentation?

Copy link
Contributor

@nordic-krch nordic-krch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just few minor comments. Implementation looks good.

* the request was processed, if successful.
* @retval -EIO if service has recorded an an error.
* @retval -EINVAL if the parameters are invalid.
* @retval -EAGAIN if the reference count would overflow.
* @retval -EWOULDBLOCK if the function was invoked from non-thread
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's no longer valid.

@@ -456,77 +335,47 @@ static inline void onoff_client_init_callback(struct onoff_client *cli,
* transition to on can sleep, the transition cannot be started and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this paragraph is no longer valid, right?

* @retval -EWOULDBLOCK if the function was invoked from non-thread
* context and successful initiation could result in an attempt to
* make the calling thread sleep.
*/
int onoff_request(struct onoff_service *srv,
int onoff_request(struct onoff_manager *mgr,
struct onoff_client *cli);

/**
* @brief Release a reserved use of an on-off service.
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's worth adding a comment here that releasing may trigger stop operation on service and that result of that operation will be passed to the monitor.

* transition there will always be at least one client that will be
* notified when the operation completes.
* Cancelling is supported only for onoff_request() and onoff_reset()
* operations, and is a synchronous operation. Be aware that any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add explicit information that ownership of the client structure returns to the client when function returns.

u32_t state;
int res;
};
static struct transition_record xions[32];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you somewhere comment on what xion stands for? transitions, monitors?

* @retval non-negative the observed state (ONOFF_STATE_ON) of the
* machine at the time of the release, if the release succeeds.
* @retval -EIO if service has recorded an an error.
* @retval -ENOTSUP if the machine is not in a state that permits release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo -EALREADY would be better. Similar to cancelation: -EALREADY if cli was not a record of an uncompleted notification

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be misleading. It's not that this specific client has already released (where -EALREADY would be appropriate), it's that the machine is in a state where no client can be released: off, transitioning to off, transitioning to on, resetting. In all those cases invoking this is a true error, not a redundant operation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All other comments have been addressed.

The previous architecture proved unable to support user expectations,
so the API has been rebuilt from first principles.  Backward
compatibility cannot be maintained for this change.

Key changes include:

* Formerly the service-provided transition functions were allowed to
  sleep, and the manager took care to not invoke them from ISR
  context, instead returning an error if unable to initiate a
  transition.  In the new architecture transition functions are
  required to work regardless of calling context: it is the service's
  responsibility to guarantee the transition will proceed even if it
  needs to be transferred to a thread.  This eliminates state machine
  complexities related to calling context.
* Constants identifying the visible state of the manager are exposed
  to clients through both notification callbacks and a new monitor API
  that allows clients to be notified of all state changes.
* Formerly the release operation was async, and would be delayed for the
  last release to ensure a client would exist to be notified of any
  failures.  It is now synchronous.
* Formerly the cancel operation would fail on the last client associated
  with a transition.  The cancel operation is now synchronous.
* A helper function is provided to safely synchronously release a
  request regardless of whether it has completed or is in progress,
  satisfying the use case underlying zephyrproject-rtos#22974.
* The user-data parameter to asynchronous notification callbacks has
  been removed as user data can be retrieved from the CONTAINER_OF
  the client data.

Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
@pabigot pabigot force-pushed the nordic/20200319b branch from 751225e to fae646a Compare April 6, 2020 14:50
@pabigot pabigot marked this pull request as ready for review April 6, 2020 16:27
@carlescufi carlescufi requested review from tbursztyka and jukkar April 6, 2020 16:33
@carlescufi
Copy link
Member

@tbursztyka and @jukkar if you have time to review this it would be appreciated.

__ASSERT_NO_MSG((state == ONOFF_STATE_TO_ON)
|| (state == ONOFF_STATE_TO_OFF)
|| (state == ONOFF_STATE_RESETTING));
rv = state;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about cleaning client state here. Otherwise sys_notify_fetch_result will return -EAGAIN.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will. Why is that wrong? In an earlier review you asked that cancel not record a completion because the natural value for it -ECANCELED might conflict with a legitimate result. So the client data is left in an imcomplete state, for which no result is available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, lets leave it. I was thinking about a way to determine who is currently owning the client and fetch result could be used if state was cleaned here and in cancel_or_release.

int rv = onoff_cancel(mgr, cli);

if (rv == -EALREADY) {
rv = onoff_release(mgr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

client state shall be cleaned here. I see that we are missing a call in sys_notify for that. Something like sys_notify_reset.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above: there is no notion of cleaning client state in this infrastructure. The client is responsible for initializing it before hand-off to the manager, and for interpreting whatever the manager set when it was returned. The only thing the manager does is set completion state, if the operation completes before the client data is returned.

@@ -40,7 +129,8 @@ static int validate_args(const struct onoff_manager *mgr,
int rv = sys_notify_validate(&cli->notify);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While browsing the onoff.c, I noticed that the code does not follow the coding guideline but introduces variables in the middle of the blocks. Would it be possible to fix these?

Copy link
Collaborator Author

@pabigot pabigot Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be obtuse; can you point me to the relevant guideline?

Mid-block declarations have been legal in C for twenty years, in various discussions I remember consensus that C99 was assumed to be the minimum supported version. Though I don't think anybody documented that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course compiler supports this just fine. We are following linux coding style in zephyr.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything about this in https://www.kernel.org/doc/html/latest/process/coding-style.html

There's good reason to define the variable at the point where you can assign it an initial value, which is what I've done. So while it would be possible to "fix" these, I don't plan to do so. You may block the PR if you feel strongly enough.

Personally I wish we could get some traction on #21392 so we could stop having these arguments and instead have an objective standard for acceptability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already rely on too many C99 (and GNU99) features to deny its existence and effectiveness, like some cave-residing individuals do.

I also believe mid-function variable declaration, when it makes sense, is helpful and should even be preferred if it results in a tidier/easier-to-follow code.

In fact, mid-function beginning-of-scope variable declarations are widely preferred in the Linux kernel, although that is arguably different to what is being done here.

@carlescufi carlescufi merged commit 14e2ca4 into zephyrproject-rtos:master Apr 22, 2020
@pabigot pabigot deleted the nordic/20200319b branch April 27, 2020 13:12
@wentongwu
Copy link
Contributor

@pabigot how to use this infrastructure? for i2c example, there are app thread, slave, master, do we need a manager for slave because it provides service for app and need another manager for master because it provides service for slave?

@pabigot
Copy link
Collaborator Author

pabigot commented May 12, 2020

The expectation is that there is a single manager for the device, and anything that wants to use the device puts in a reservation that's keeps the device until the dependency has been released.

It's unclear whether this structure would be externally visible; it may be that reservations are handled with a power management API, or by a queued operation manager #24781 that ensures the device doesn't shut down as long as there's pending work, even if other pending work is currently using the device.

@wentongwu
Copy link
Contributor

@pabigot so the dependency will be built when anyone request service with onoff_request, not from DT. And the PR #24781 is a good idea and implementation, so that means every device driver whose traffic is asynchronous way will follow this model?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: Documentation area: Tests Issues related to a particular existing or missing test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants