Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Device: generalizing and improving control features (sync, lock and timeout) #24511

Closed

Conversation

tbursztyka
Copy link
Collaborator

@tbursztyka tbursztyka commented Apr 20, 2020

Part of #22941

(This RFC is now presenting ideas, and is currently not the final implementation which will evolve through comments until it reaches a consensus)

Currently synchronizing calls on interrupt based devices is a feature implemented by each and every drivers. This becomes worse about concurrent access mitigation, where many drivers do not implement anything.

This RFC tries then to provide an approach on generalizing both synchronization and concurrent access mitigation to all devices. It also adds a new feature: setting a timeout on calls. (currently working only on interrupt based devices, adding it for polling based ones is entirely possible but to simplify this RFC was put aside)

All - but the sync part - is configurable through Kconfig:

  • concurrent access mitigation can be disabled (on targets where devices are not accessed by +2 threads for instance)
  • timeouts are by default disabled

Important notice: Many driver are by default always working synchronously and thus would not make any usage of the sync part of this RFC. I did not take this into account for now. It would be entirely possible to save memory by having a parameter on the device instance macros (thus why I did not try to play with this yet, due to the amount of macro call to be fixed) to tell about that requirement, and thus save a sensible amount of memory since struct k_sem is not tiny.
We could do the same about the lock on concurrent access, if relevant. Having a NULL pointer instead of struct k_sem pointer would be trivial to manage.

[Note] This RFC if built on top of #23407

Idea and use cases

Synchronization

All device driver synchronous API call that are implemented on interrupt basis in the driver would use is:

  • api_call()
    -> configure and start the interrupt based work
    -> k_sem_take(sync_semaphore) <--- it would wait until the interrupt based work is finished and runs a k_sem_give(sync_semaphore)

A large amount of device drivers already use that scheme. But they are currently implementing such synchronization on their own. Idea is to generalize it. (See the important notice above, however)

Concurrent access

All device driver API calls, no exception (whether these calls are synchronous or not).
This is a basic yet effective way to protect the device driver resource.
(See the important notice above also)

Timeout

Currently, only a few device driver API provide a timeout and only on certain call. But this will be insufficient if we seek for robustness. A tiny hw issue could just lock the whole device and there would be no way for recovering it.

Idea is to simply add such generic timeout on every call. It could be using a default setting as well as a dedicated one.
Note however that the current RFC is simply setting such timeout on the sync semaphore. It would be easy to run it separately though.

@zephyrbot
Copy link
Collaborator

zephyrbot commented Apr 20, 2020

Some checks failed. Please fix and resubmit.

checkpatch issues

-:23: WARNING:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#23: FILE: include/device.h:98:
+#define DEVICE_CONTEXT_DEFINE(dev_name, level, prio)			\
+	static Z_DECL_ALIGN(struct device_context)			\
+		_CONCAT(__device_context_, dev_name) __used		\
+	__attribute__(							\
+		(__section__(".device_context_" #level STRINGIFY(prio)))) = \
+		DEVICE_CONTEXT_INITIALIZE(_CONCAT(__device_context_,	\
+						  dev_name))		\
+

-:143: ERROR:MULTISTATEMENT_MACRO_USE_DO_WHILE: Macros with multiple statements should be enclosed in a do - while loop
#143: FILE: include/linker/linker-defs.h:91:
+#define DEVICE_CONTEXT_SECTIONS()				\
+		__device_context_start = .;			\
+		CREATE_OBJ_LEVEL(device_context, PRE_KERNEL_1)	\
+		CREATE_OBJ_LEVEL(device_context, PRE_KERNEL_2)	\
+		CREATE_OBJ_LEVEL(device_context, POST_KERNEL)	\
+		CREATE_OBJ_LEVEL(device_context, APPLICATION)	\
+		CREATE_OBJ_LEVEL(device_context, SMP)		\
+		__device_context_end = .;

-:143: WARNING:TRAILING_SEMICOLON: macros should not use a trailing semicolon
#143: FILE: include/linker/linker-defs.h:91:
+#define DEVICE_CONTEXT_SECTIONS()				\
+		__device_context_start = .;			\
+		CREATE_OBJ_LEVEL(device_context, PRE_KERNEL_1)	\
+		CREATE_OBJ_LEVEL(device_context, PRE_KERNEL_2)	\
+		CREATE_OBJ_LEVEL(device_context, POST_KERNEL)	\
+		CREATE_OBJ_LEVEL(device_context, APPLICATION)	\
+		CREATE_OBJ_LEVEL(device_context, SMP)		\
+		__device_context_end = .;

-:284: ERROR:CODE_INDENT: code indent should use tabs where possible
#284: FILE: kernel/device.c:225:
+ ^I}$

-:284: WARNING:SPACE_BEFORE_TAB: please, no space before tabs
#284: FILE: kernel/device.c:225:
+ ^I}$

-:284: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#284: FILE: kernel/device.c:225:
+ ^I}$

- total: 2 errors, 4 warnings, 659 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

NOTE: Whitespace errors detected.
      You may wish to use scripts/cleanpatch or scripts/cleanfile

Your patch has style problems, please review.

NOTE: Ignored message types: AVOID_EXTERNS BRACES CONFIG_EXPERIMENTAL CONST_STRUCT DATE_TIME FILE_PATH_CHANGES MINMAX NETWORKING_BLOCK_COMMENT_STYLE PRINTK_WITHOUT_KERN_LEVEL SPDX_LICENSE_TAG SPLIT_STRING VOLATILE

NOTE: If any of the errors are false positives, please report
      them to the maintainers.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@joerchan
Copy link
Contributor

joerchan commented Jul 17, 2020

Not sure why Github decided to remove so many people, I only wanted to remove myself.

Because editing the list is equivalent to setting a new list manually and, in this case, there is a max number of reviewers that can be submitted (which does not exist when list is automatically generated by github), so the entries exceeding this max number are removed. I've been bitten by this number of times as well..

@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: Device Management area: Drivers area: Kernel area: Tests Issues related to a particular existing or missing test Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants