ZSAI: Zephyr Storage Access Interface #64732
@de-nordic, It seems more appropriate to create a layer above the existing types of devices to utilize them in a standardized way. This would just be building on proven code and does not involve any risk.
I do not think we should go down the road of adding ioctl-like calls to Zephyr drivers. Linux made this mistake many years ago; let's not repeat it. IOCTLs seem to make for a generic API, but in reality - since you can pass just about anything - they really make for a driver-specific API, which makes it very difficult to write driver-agnostic application code. I also share the concern raised by @Laczen above about trying to embrace so many different technologies under one API. I see lots of potential issues with very little gain. You simply cannot treat all of these technologies the same when it comes to wear leveling, erasure, write time, etc. I still believe, as voiced often enough, that implementing technology-dedicated backends to the various subsystems using storage is a much better solution. That still gives us compile-time early-fail opportunities.
Retention is a use case; the others, yes, I would like to replace with a single base API that has common write, read and IOCTL calls. The IOCTL would be used to implement characteristics that are not covered by the common calls.
To add an additional layer you first have to multiplex all the APIs into that layer, and then provide the user a way to demultiplex, because the fact that devices have different characteristics will not change.
Instead you can just multiplex all devices into a single API that distinguishes them by characteristics (features):
Note also that when your code needs to support more than one device (for example you are placing FAT on external ReRAM and internal SoC FLASH), such code is smaller and cleaner when decisions are based on characteristics rather than technology. Additional layers or APIs do not help in reducing flash size or when XIP is in use. Note that when adding an additional layer you will have that layer plus every API required by the devices in the design, but when you have a single API, that is all you have.
We will go the HAL way then, because people will try to use whatever the chip supports. Also, if we do not allow exposing extra features of chips, what reason would vendors have to support Zephyr? Vendors sell chips by their extra features too.
Yes, and it is not possible to write driver-agnostic code when you have selected a specific chip, based on its extra features, while designing your embedded hardware. We can only try to invest in making the common code, the common subsystems, support as many different features as possible.
Neither of our APIs is aware of wear leveling or addresses it, because this is a higher-level function, done in software, unless you have a NAND controller with such a feature. And again, wear leveling does not depend on technology: you may have higher or lower endurance flash devices, or your endurance may depend on the implicit write block size (AT25M02). We have nearly no support for device timings, and that information is not needed by the user, as in most cases the timing has to be exchanged between the device driver and the controller; API functions will return with a timeout anyway. For example, if a page erase on a flash device takes ~10ms, that information should be known by the peripheral that will send the command and wait for the response. Wear balancing does not depend on the number of write/erase cycles but rather on how well your software is able to move structures around so as not to repeat writes/erases of a single area within the device: with badly written software you can kill a 10k-cycle or a 1.2M-cycle device sooner than expected.
We can have run-time init checks on subsystems that depend on specific characteristics, and those can fail on start too. It becomes harder to support more than one type of device at once in various subsystems if you have to switch between APIs, which may also happen at run-time.
Or it needs an implementation of an additional callback object that will redirect calls properly... which is either another layer, or what I am proposing now anyway. Same with DFU, Stream Flash, you name it. And these things are within Zephyr, so if we need to make DFU work with different technologies we would have to make it work with all of them at once, because aside from SoC FLASH (or MRAM, FRAM, EEPROM) the user may connect external RRAM (or MRAM, FRAM, FLASH, EEPROM) and DFU something here and something there. Yes, we do not have a proper wear-balancing file system, but that is required for data recording and retrieval, and we already have to let users know that, for example, FATFS is not so good for internal SoC flash with 4k pages, but it will be used anyway.
This would mean reworking all drivers and does not add any real benefit. For retention I meant retained_memory.
This is an implementation detail. It is perfectly possible to create a property on the "additional" layer that contains information like "erase-needed" and use the same construct as proposed in the code above (other properties like
Regarding the size influence: only in very specific cases will the proposal reduce size. Regarding XIP: if there is a requirement for no writing/erasing, there is no need to define a flash device; the flash driver should only be used to set up XIP, and any read can be done using a memcpy. A replacement of
and provide read/write and (optional) erase routines, would be a lot simpler to implement and maintain, and would provide all the functionality (and more, as it provides the region definitions) of the proposed ZSAI method.
Replacing the Flash API has wide-ranging implications and should be discussed in the Architecture WG. @carlescufi will you please add it to the agenda?
It's added for today @MaureenHelm
This seems to assume a one-operation-at-a-time blocking API; won't this limit the ability to batch up changes to storage? E.g. a sequence of reads and writes could potentially be batched to the device in some manner, with DMA in many cases, perhaps saving round trips and context switches. This may not have been much of an issue with, say, EEPROMs or flash, where round-trip latencies can be dwarfed by the storage medium, but with something like FRAM, where the storage medium is very fast and low latency, waiting on a context swap and the round trip of a SPI/I2C transfer seems like quite an overhead to be considered. From the Fujitsu FRAM pdf https://www.fujitsu.com/uk/Images/C29-FRAM-Factsheet-170802-0959.pdf a screenshot of the timing table is pretty telling.
Flash support is already there, with all the abstractions to specific layers like disk access, DFU, NVS. Replacing it with another API that covers everything is too much; I am very sceptical that this task will ever be finished. I also do not see the benefit of including SRAM; this can easily be done at a higher level, e.g. the RAM disk driver is quite compact (and I actually think an RFMRAM disk will not be much larger). Including EEPROM has even less benefit IMHO; it is a more constrained device than flash and SRAM together. My suggestion is to reduce the RFC to [RFM]RAM devices and go with the RFMRAM driver API, and add specific layer drivers/abstractions/backends where needed, e.g. a disk access driver or an NVS/DFU backend that works on top of the RFMRAM driver API. This way you are also not blocked by any deprecations; you can start with more practical use cases like DFU/NVS and add things like a disk driver later.
I think that NVMe and NAND could be pulled in here, but these are rather block devices and they differ in access from the other devices here. The devices here have byte-aligned access, while NVMe and NAND do not. You have to address blocks in them, so they are more suitable for the Disk subsystem in that matter.
There are actually three possible characteristics here: supports, requires and auto-erase. Note also that erase is required when flipping a bit from the "erase state" to the opposite, and some devices allow that once per write block size (those with ECC, always), some twice, or have some very high limit.
Yes agree. Same with NAND.
We have no universal storage. The only universal part we have is storing and executing raw code.
Exactly. What I am trying to propose here is a uniform API to access devices at the lowest level, with the most common functions, read and write, exposed directly and everything else via ioctl, plus a set of functions/macros that allow the functionalities we provide with Zephyr to work with the characteristics that are most common across devices, and allow us to extend support to things we can make a common API for in the future (for example, support for r/w locks).
That is a problem with how we advertise Zephyr. IMHO Zephyr is that little thing in the corner of my ROM that allows me to have Bluetooth and basic storage access without having to implement these, but I still have to take care of the details. I need to understand what I am doing, and Zephyr will not solve that for me. Zephyr only offers you the basic things, so you do not have to write your own driver for the MX25, but you can still kill the device if you do not understand it, and it will still drain power when added to the design. If you want, for example, that MX25 turned off by an external FET controlled via GPIO when not used by FATFS, you are on your own, because however good our PM support is, you still need to connect the dots.
The differences between technologies are described in the table. Does it matter whether the technology is EEPROM or flash, if it requires erase? If you have SPI-connected SRAM with a battery hanging off it all the time, does it differ from EEPROM?
We have not been able to do that so far. We have VFS that provides uniform access to file systems, but so far we have not been able to add to the tree file systems that could be used efficiently across all device characteristics.
Yes, the task is hard. But adding another layer is not a solution.
Constrained in what? It is usually slower, true, but size depends on how much you want to spend; of course it does not reach FLASH NOR levels, but you can still get a 256k x 8 device, which is already able to handle a FAT FS. SPI RAM devices can be used as external buffers for data between devices, or for storing frequently updated data in logging devices before that data gets filtered and dumped to permanent storage or exported some other way.
Can do that too. Sooner or later the other APIs will have to be swallowed by that anyway, as what you will have is:
For some time though you will have in code something like this:
The DMA is served by a SoC peripheral rather than the device mentioned here. Most users will not strictly write things byte by byte; they will just pass a buffer to the API, and that is passed to the driver. The driver should recognize how much data it can stuff into a single operation; in the case of this device there is no limit as far as I can tell, but it will be held down by the DMA peripheral anyway.
I'm talking about having a batching API for doing reads/writes. I'm not sure where that was lost in translation. spi_transceive and i2c_transfer do this today already. This API seems to assume simplistic read(buf, len) and write(buf, len) type APIs, and this is, I think, going to be pretty inefficient compared to, say, an API that provides transaction semantics over a list/array of operations, e.g. int storage_transaction(reads_writes_erases_ops);
Yes, I am addressing the simplest case, where a read/write of a single buffer is needed. And I am addressing the storage device characteristics, not controller features, although nothing stops us from extending the API - where addressing the characteristics of different controllers may be harder than storage technologies. spi_transceive and i2c_transfer are bus functions and are not characteristic of the device you mentioned. Drivers call these, and currently they also do that in the simplest possible way. We could probably do
Introduction
The RFC proposes a common access interface for different storage devices that moves the focus from device technology, like FLASH or EEPROM, to device characteristics that are common or different across technologies, with no requirements on the connecting bus.
The main goal is to replace the Flash API and allow Zephyr and user code to run and operate on devices based on different technologies without the need to switch between APIs, especially since some subsystems may need to support multiple devices at run-time, and these devices may require different paths to be taken.
Problem description
Currently we have separate APIs for Flash and EEPROM devices, and we are not ready for devices like FRAM, MRAM, RRAM or even SRAM that can also serve as storage, whether connected over SPI/I2C or directly to the SoC.
Proposed change
The intention is to allow using a single API to access devices and to provide enough information to users (subsystems or applications) so they can decide in code how to approach the features or limitations of a device depending on its characteristics rather than its technology.
For example, when writing code that will be able to store data on EEPROM or FLASH, instead of focusing on a technology-specific API and then on characteristics of devices, like write block size, the user can focus directly on the characteristics of a device:
Allowing the user to work with device characteristics, rather than with separate APIs per technology, results in more linear and cleaner code.
On the other side: if user code does not provide support for devices with given characteristics, the API should provide means to fail at compile time or at initialization when an incompatible device is provided.
Detailed RFC
The ZSAI API provides three basic functions: read, write and ioctl.
The base operations, read and write, serve the common set of capabilities, assuming that most devices allow read and/or write. The IOCTL allows access to other features of devices, where features are given system-wide identifiers for the more common operations and vendor identifiers for operations provided by specific vendors.
The API provides Kconfigs that allow writers of drivers and applications to share information about driver capabilities:
Note that there are opposite definitions because the system may have multiple devices with different capabilities, and code can be optimized for various combinations of options.
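A Kconfig pattern along these lines (symbol names invented here for illustration; the RFC's actual symbols may differ) would let a driver select both the positive and the negative capability, so common code can tell apart "all devices need erase", "none do", and a mix:

```kconfig
# Hypothetical helper symbols, selected by drivers, never set by users.
config ZSAI_DEVICE_HAS_ERASE
	bool
	help
	  Selected by a driver whose device requires/offers erase.

config ZSAI_DEVICE_HAS_NO_ERASE
	bool
	help
	  Selected by a driver whose device needs no erase. When both
	  symbols are set, the build contains a mix of device types and
	  common code must branch at run-time.
```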
And macros that allow getting information at compile/run-time:
There are also utility IOCTL functions that simplify some work and allow users to directly call IOCTL:
Aux functions that make life easier:
Note that all of the above functions will return -ENOTSUP if not supported.
There is also a set of macros that allow driver writers to properly set up drivers, for example for constructing the basic device configuration, which is used for storing information on some base device capabilities.
TODOs:
Dependencies
There will be a lot of changes required to code that currently uses the flash API, as the main goal is to replace this API.
This will include the Flash Map API, stream flash and so on, wherever Flash API calls are used. The DTS will not be touched that much, as most code that references flash devices directly accesses device information in the DTS by nodes and labels.
All of the Flash API drivers will need to be modified to use ZSAI rather than the flash API.
There is already an Ambiq device that is MRAM-based and could use the subsystem.
Concerns and Unresolved Questions
This is probably a controversial change. Questions will be collected here from the comments.
Alternatives
Slowly descending into madness caused by the increasing number of #ifdefs required to support different storage technologies that are already out there but not in Zephyr.
Additional information
Table of characteristics:
(0) The Microchip 25AA512 is an example of a device that is an EEPROM and supports, although does not require, an erase command prior to write - erase is done implicitly (3.3 Write Cycle)
(1) No known device with such characteristics at this moment.
(2) Flash devices usually support single-byte writes as long as the write happens at write-block-size alignment and all other bytes of the write block are written with the erase value (pattern).
(3) There is a device-dependent cap on how many writes may be performed per write-block-size before bits start to flicker.
(4) If a device has ECC paired with the write-block-size then writing by byte is not possible. For example, the Infineon S25FL512S allows single-byte writes (byte walking) when ECC is disabled.
(5) The AT25M02 EEPROM always writes by four bytes, so even if a command is sent to update a single byte, four will be written and the ECC will be updated; this means that the life of the device is governed by the number of ECC updates possible for each four-byte write block.
(6) Some devices, like the 25AA512 or AT25M02, may support writing data in bulk up to N bytes, where N represents the page size. Such writes are only possible within page boundaries and wrap around, which means that if you try to write N bytes starting at offset 1 of the page, at the end of the operation offset 0 will have the value you sent as byte N.
(7) The MB85AS4MT ReRAM from Fujitsu has a page program operation that allows writing 256 bytes at once, but it will wrap around if the address is not page aligned.
(8) The Ambiq SoC device, already supported by Zephyr, has a word alignment requirement for writes.
(9) Everspin MR2xH40 devices have no alignment requirement and can be written, from edge to edge, in a single SPI command; the devices seem to wrap around and keep on writing data from the beginning when the address space is exhausted.
(10) The Fujitsu MB85RS1MT and Cypress (Infineon) FM25V20A have no alignment requirement and can write continuously as long as bytes come, same as (9).
(11) The Cypress CY14V101QS family. These do not require erase and support automatic writes of SRAM to the NV FLASH backend.
Links to device data:
25AA512
AT25M02
25AA040A
S25FL512S
FM24V05
MR2xH40
FM25V20A
MB85RS1MT
MB85AS4MT
23LCV1024
Data for SoC embedded memories is not provided here, but their characteristics can be read from the SoC drivers within the Zephyr tree.
On technologies:
MRAM https://semiwiki.com/semiconductor-manufacturers/tsmc/283868-tsmc-32mb-embedded-stt-mram-at-isscc2020/
The Ambiq device uses something called STT-MRAM or TSMC 22ULL eMRAM, which is MRAM based on Flash to allow it to keep state at higher temperatures; the doc is here https://ieeexplore.ieee.org/document/8993469 but I have not shelled out the $ to get it. From the Ambiq driver I see that erase is actually done by filling (writing) the flash with 0xff.