ZSAI: Zephyr Storage Access Interface #64732

Open
de-nordic opened this issue Nov 2, 2023 · 15 comments
Assignees: de-nordic
Labels: area: EEPROM, area: Flash, RFC (Request For Comments: want input from the community)

@de-nordic
Collaborator

de-nordic commented Nov 2, 2023

Introduction

This RFC proposes a common access interface for different storage devices that moves the focus from device technology, like FLASH or EEPROM, to device characteristics that are common or differ across technologies, regardless of the connecting bus.
The main goal is to replace the Flash API and allow Zephyr and user code to operate on devices based on different technologies without the need to switch between APIs, especially since some subsystems may need to support multiple devices at run-time, and these devices may require different code paths.

Problem description

Currently we have separate APIs for Flash and EEPROM devices, and we are not ready for devices like FRAM, MRAM, RRAM or even SRAM that can also serve as storage, whether connected over SPI/I2C or directly to the SoC.

Proposed change

The intention is to provide a single API to access devices and give users (subsystems or applications) enough information to decide in code how to approach features or limitations of a device depending on its characteristics rather than its technology.
For example, when writing code that will be able to store data on EEPROM or FLASH, instead of focusing on a technology-related API and then on device characteristics, like write block size, the user can directly focus on the characteristics of a device:

  • Is erase needed?
  • How much data and what alignment is needed?

Allowing the user to work with device characteristics, rather than with separate APIs per technology, results in more linear and cleaner code.
On the other side: if user code does not provide support for a device with given characteristics, the API should provide means to fail at compile time or at initialization when an incompatible device is provided.

Detailed RFC

The ZSAI API provides three basic functions: read, write and ioctl.
The base operations, read and write, serve the common set of capabilities, assuming that most devices will allow read and/or write. The IOCTL would allow access to other features of devices, where more common operations will be given system-wide identifiers and vendor-provided operations will use vendor identifiers.

/* Read 'size' bytes at 'offset' of the device into 'data' */
int zsai_read(const struct device *dev, void *data, off_t offset,
			size_t size);

/* Write 'size' bytes from 'data' at 'offset' of the device */
int zsai_write(const struct device *dev, const void *data,
			 off_t offset, size_t size);

/* For erasing device */
int zsai_erase_range(const struct device *dev, const struct zsai_range *range);

/* Device/operation specific control; 'id' selects the operation */
int zsai_ioctl(const struct device *dev, uint32_t id,
			      const uintptr_t in, uintptr_t in_out);
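
As a usage sketch only, under the signatures above (the device handle would come from the usual device model, e.g. DEVICE_DT_GET(); offsets and sizes here are arbitrary), application code could look roughly like this:

#include <stdint.h>
#include <zephyr/device.h>

/* Sketch: copy 64 bytes from one offset to another on the same device.
 * Offsets are arbitrary; error handling is minimal on purpose.
 */
static int copy_block(const struct device *dev)
{
	uint8_t buf[64];
	int rc;

	rc = zsai_read(dev, buf, 0x1000, sizeof(buf));
	if (rc < 0) {
		return rc;
	}

	return zsai_write(dev, buf, 0x2000, sizeof(buf));
}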

The API provides Kconfigs that allow writers of drivers and applications to share information about driver capabilities:

CONFIG_ZSAI_DEVICE_HAS_ERASE  /* To be set by devices that support an erase command but do not necessarily require it */
CONFIG_ZSAI_DEVICE_HAS_NO_ERASE /* To be set by devices that do not require erase */
CONFIG_ZSAI_DEVICE_REQUIRES_ERASE /* To be set by devices that require erase */
CONFIG_ZSAI_DEVICE_HAS_UNIFORM_PAGES /* To be set by devices with uniform page size */
CONFIG_ZSAI_DEVICE_HAS_NON_UNIFORM_PAGES /* To be set by devices with non-uniform page size */
CONFIG_ZSAI_DEVICE_HAS_PAGE_HOLES /* For devices with erase and a non-linear address space */

Note that there are opposite definitions because the system may have multiple devices with different capabilities and code can be optimized for various combinations of options.
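
As an illustration (a sketch only, using the ZSAI_ERASE_REQUIRED() macro and zsai_erase() helper introduced below), a subsystem could let the compiler drop the erase path entirely when no enabled driver declares that it requires erase:

#include <stddef.h>
#include <zephyr/device.h>

/* Sketch: the whole erase branch compiles out when no enabled driver
 * selects CONFIG_ZSAI_DEVICE_REQUIRES_ERASE.
 */
static int prepare_region(const struct device *dev, size_t start, size_t size)
{
#if defined(CONFIG_ZSAI_DEVICE_REQUIRES_ERASE)
	if (ZSAI_ERASE_REQUIRED(dev)) {
		return zsai_erase(dev, start, size);
	}
#endif
	return 0;
}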

And macros that allow getting information at compile time or run-time:

ZSAI_ERASE_REQUIRED(dev) /* True if device requires erase */
ZSAI_ERASE_SUPPORTED(dev) /* True if device supports erase in hardware; it may not require it though */
ZSAI_ERASE_VALUE(dev) /* Returns the 32-bit value that represents the erase pattern */
ZSAI_WRITE_BLOCK_SIZE(dev) /* Write block size of the device */
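
A sketch of what characteristic-driven code built on these macros could look like (the 16-byte bound on the write block size and the single-block restriction are assumptions made only to keep the example short):

#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <zephyr/device.h>

#define EXAMPLE_MAX_WBS 16 /* assumed upper bound, for the example only */

/* Sketch: pad a short write up to the write block size, filling the tail
 * with the erase pattern so the rest of the block stays in erased state.
 */
static int write_padded(const struct device *dev, const void *data,
			off_t off, size_t len)
{
	size_t wbs = ZSAI_WRITE_BLOCK_SIZE(dev);
	uint8_t block[EXAMPLE_MAX_WBS];

	if (wbs <= 1 || (len % wbs) == 0) {
		return zsai_write(dev, data, off, len);
	}

	/* Only handles writes that fit in a single block, to keep it short. */
	if (len > sizeof(block) || wbs > sizeof(block)) {
		return -EINVAL;
	}

	memset(block, (uint8_t)ZSAI_ERASE_VALUE(dev), wbs);
	memcpy(block, data, len);
	return zsai_write(dev, block, off, wbs);
}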

There are also utility IOCTL functions that simplify some work (users can also call IOCTL directly):

/* For getting page information on devices requiring erase */
int zsai_get_page_info(const struct device *dev, off_t offset, struct zsai_ioctl_range *pi);
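
A sketch of combining it with zsai_erase_range() above, assuming the range type filled by zsai_get_page_info() is accepted by zsai_erase_range() (the RFC text shows both zsai_range and zsai_ioctl_range):

#include <sys/types.h>
#include <zephyr/device.h>

/* Sketch: erase the page that contains 'off' on a device requiring erase. */
static int erase_page_at(const struct device *dev, off_t off)
{
	struct zsai_ioctl_range page;
	int rc;

	rc = zsai_get_page_info(dev, off, &page);
	if (rc < 0) {
		return rc;
	}

	/* Assumes the two range types are compatible, see note above. */
	return zsai_erase_range(dev, (const struct zsai_range *)&page);
}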

Aux functions that make life easy:

/* For erasing device */
int zsai_erase(const struct device *dev, size_t start, size_t size);

/* For filling device with some pattern */
int zsai_fill(const struct device *dev, uint8_t pattern, size_t start, size_t size);

/* Erase or fill with pattern - for code that requires device to be in some state to work */
int zsai_erase_or_fill(const struct device *dev, uint8_t pattern, size_t start, size_t size);

Note that all of the above functions will return -ENOTSUP if not supported.
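
For example, code that needs a region in a known state can rely on that convention (roughly what zsai_erase_or_fill is described as doing); a minimal sketch:

#include <errno.h>
#include <stddef.h>
#include <zephyr/device.h>

/* Sketch: try a hardware erase first; if the device has no erase at all,
 * fall back to filling the range with the chosen pattern.
 */
static int reset_region(const struct device *dev, size_t start, size_t size)
{
	int rc = zsai_erase(dev, start, size);

	if (rc == -ENOTSUP) {
		rc = zsai_fill(dev, 0xff, start, size);
	}

	return rc;
}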

There is also a set of macros that allow driver writers to properly set up drivers, for example for constructing the basic device configuration, which is used to store information on some base device capabilities.

TODOs:

  • All functions taking an offset should take it as a pointer, and when an operation fails the offset of the failure should be returned via that pointer.
  • erase-value should be replaced with erase-pattern (sounds better?).

Dependencies

There will be a lot of changes required to code that currently uses the Flash API, as the main goal is to replace this API.
This will include the flash map API, stream flash and so on, wherever Flash API calls are used. The DTS will not be touched that much, as most code that references flash devices accesses device information in DTS directly by nodes and labels.
All of the Flash API drivers will need to be modified to implement ZSAI rather than the Flash driver API.

There is already an Ambiq device that is MRAM based and could use the subsystem.

Concerns and Unresolved Questions

This is probably a controversial change. Questions will be collected here from the comments.

Alternatives

Slowly descending into madness caused by the increasing number of #ifdefs required to support different storage technologies that are already out there but not in Zephyr.

Additional information

Table of characteristics:

| Device technology | Requires erase | Supports erase | Single byte writes | Write block alignment | Write protections |
|---|---|---|---|---|---|
| FLASH | Y | Y | Possible (2)(3)(4) | Usually required (2) | Possible, also via dedicated pin |
| EEPROM | N | May (0) | Yes, but depends on device (5) | May be hidden (5)(6) | Possible, also via dedicated pin |
| R(e)RAM | N | Unknown (1) | Yes, but depends on device | No (7), possibly (1) | Possible, also via dedicated pin |
| F(e)RAM | N | Unknown (1) | Yes, but depends on device | No (10), possibly (1) | Possible, also via dedicated pin |
| MRAM | N | Unknown (1) | Yes, but depends on device | Yes (8), No (9), possibly (1) | Possible, also via dedicated pin |
| SRAM | N | No | Yes | No | Within SoC possible, externally does not seem so |
| nvSRAM (11) | N | No | Yes | No | The NV part yes |

(0) The Microchip 25AA512 is an example of an EEPROM device that supports, although does not require, an erase command prior to write - it is done implicitly (3.3 Write Cycle).
(1) No known device with such characteristics at this moment.
(2) Flash devices usually support single byte writes as long as the write happens at write-block-size alignment and all other bytes of the write block are written with the erase-value (pattern).
(3) There is a device-dependent cap on how many writes may be performed per write block before bits start to flicker.
(4) If a device has ECC paired with the write-block-size then writing by byte is not possible. For example the Infineon S25FL512S allows single byte writes (byte walking) when ECC is disabled.
(5) The AT25M02 EEPROM always writes in four-byte units, so even if a command is sent to update a single byte, four will be written and the ECC will be updated; this means that the life of the device is dictated by the number of ECC updates possible for each four-byte write block.
(6) Some devices, like the 25AA512 or AT25M02, may support writing data in bulk up to N bytes, where N is the page size. Such writes are only possible within page boundaries and wrap around, which means that if you try to write N bytes starting at offset 1 of a page, at the end of the operation offset 0 will hold the value you sent as byte N.
(7) The MB85AS4MT ReRAM from Fujitsu has a page program operation that allows writing 256 bytes at once, but it will wrap around if the address is not page aligned.
(8) The Ambiq SoC device, already supported by Zephyr, has a word alignment requirement for writes.
(9) Everspin MR2xH40 devices have no alignment requirement and can be written, from edge to edge, in a single SPI command; the device seems to wrap around and keep writing data from the beginning when the address space is exhausted.
(10) The Fujitsu MB85RS1MT and Cypress (Infineon) FM25V20A have no alignment requirement and can write continuously as long as bytes come, same as (9).
(11) The Cypress CY14V101QS family. These do not require erase and support automatic writes of the SRAM contents to the NV FLASH backend.

Links to device data:
25AA512
AT25M02
25AA040A
S25FL512S
FM24V05
MR2xH40
FM25V20A
MB85RS1MT
MB85AS4MT
23LCV1024

Data for SoC embedded memories is not provided here, but their characteristics can be read from the SoC drivers within the Zephyr tree.

On technologies:
MRAM https://semiwiki.com/semiconductor-manufacturers/tsmc/283868-tsmc-32mb-embedded-stt-mram-at-isscc2020/
The Ambiq device uses something that is called STT-MRAM or TSMC 22ULL eMRAM, which is MRAM based on Flash to allow it to keep state at higher temperatures; the doc is here https://ieeexplore.ieee.org/document/8993469 but I have not chipped in the $ to get it. From the Ambiq driver I see that erase is actually done by filling (writing) the flash with 0xff.

@de-nordic de-nordic added the RFC Request For Comments: want input from the community label Nov 2, 2023
@de-nordic de-nordic self-assigned this Nov 2, 2023
@Laczen
Collaborator

Laczen commented Nov 2, 2023

@de-nordic,
your proposal clearly states that it is intended to replace the flash API. From your description it seems that this API is intended to replace not only flash but also eeprom, fram, mram, rram, bbram, retention, ... Are you sure that all of them can be covered by the proposed API and whether it is opportune to do that (e.g. flash already has ioctl for protection on some devices and the ioctl calls for bbram will be completely different).

It seems more appropriate to create a layer above the existing types of devices to utilize them in a standardized way. This would just be building on proven code and does not involve any risk.

@henrikbrixandersen
Member

I do not think we should go down the road of adding ioctl-like calls to Zephyr drivers. Linux made this mistake many years ago, let's not repeat that.

IOCTLs seem to make for a generic API, but in reality - since you can pass just about anything - it really makes for a driver specific API, which makes it very difficult to write driver-agnostic application code.

I also share the concern raised by @Laczen above in trying to embrace so many different technologies under one API. I see lots of potential issues with very little gain. You simply cannot treat all of these technologies the same when it comes to wear leveling, erasure, write time, etc.

I still, as voiced often enough, believe implementing technology-dedicated backends to the various subsystems using storage to be a much better solution. That still gives us compile-time early fail opportunities.

@de-nordic
Collaborator Author

de-nordic commented Nov 6, 2023

@de-nordic, your proposal clearly states that it is intended to replace the flash API. From your description it seems that this API is intended to replace not only flash but also eeprom, fram, mram, rram, bbram, retention, ... Are you sure that all of them can be covered by the proposed API and whether it is opportune to do that (e.g. flash already has ioctl for protection on some devices and the ioctl calls for bbram will be completely different).

Retention is a use case; the others, yes, I would like to replace with a single base API that has common write, read and IOCTL. The IOCTL would be used to implement characteristics that are not covered by the common calls.
The reason I want to divide the IOCTL ID range into Zephyr and vendor space is that once we see how different vendor operations are implemented, we can sit down and create a common Zephyr ID that covers devices, basing our design on information gathered from the implementation and usage of vendor-specific codes.
We are already doing this in the Flash API, where we have vendor-specific codes for STM32 readout protection and one common code for device reset.
Users that need a certain characteristic function of a device will call it one way or another; our goal is to do it via a common interface and drop the implementation into the driver, rather than via HAL functions or additional syscalls created for specific SoCs/devices (which already happens: https://github.com/zephyrproject-rtos/zephyr/blob/main/drivers/flash/nrf_qspi_nor.c#L1469-L1480).
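
To make the split concrete, a sketch of one possible encoding - entirely illustrative, all names and values below are made up and not part of this RFC:

/* Sketch of one possible ID-space split: the top bit marks vendor space. */
#define ZSAI_IOCTL_VENDOR_FLAG   0x80000000U
#define ZSAI_IOCTL_IS_VENDOR(id) (((id) & ZSAI_IOCTL_VENDOR_FLAG) != 0U)

#define ZSAI_IOCTL_GET_PAGE_INFO 0x00000001U                        /* common Zephyr ID (made up) */
#define ZSAI_IOCTL_VND_STM32_RDP (ZSAI_IOCTL_VENDOR_FLAG | 0x0001U) /* vendor-specific (made up) */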

It seems more appropriate to create a layer above the existing types of devices to utilize them in a standardized way. This would just be building on proven code and does not involve any risk.

To add an additional layer you have to first multiplex all the APIs into that layer, and then provide the user a way to demultiplex, because the fact that devices have different characteristics will not change.
So basically you drop everything under one API, then provide a function to check what device is underneath, and the user ends up doing:

    if (device_kind_is(dev) == TECH_A) {
        // ops required by TECH_A
        ....
    } else if (device_kind_is(dev) == TECH_B) {
        // extra ops for TECH_B
    } else {
    }

    /* Now other stuff */
    if (device_kind_is(dev) == TECH_B && write_block_size(dev)) {
        ...
    }

Instead you can just multiplex all devices into an API that distinguishes them by characteristics (features):

    if (device_requires_erase(dev)) {
        ...
    }

    if (write_block_size(dev) > 1) {
       ...
    }

Note also that when your code needs to support more than one device, for example you are placing FAT on external ReRAM and internal SoC FLASH, such code is smaller and cleaner when decisions are based on characteristics rather than technology.

Additional layers or APIs do not help in cutting down on flash size or when XIP is in use. Note that when adding an additional layer you will have that layer plus every API required by the devices in the design, but when you have a single API, that is all you have.

@de-nordic
Collaborator Author

I do not think we should go down the road of adding ioctl-like calls to Zephyr drivers. Linux made this mistake many years ago, let's not repeat that.

We will go the HAL way then, because people will try to use whatever the chip supports. Also, if we do not allow exposing extra features of chips, what reason would vendors have to support Zephyr? Vendors sell chips by their extra features too.
The IOCTL ID space can be divided into Zephyr and vendor space, which the Flash API already does, and we can use it to observe what features are provided and how they are used, so that we can later standardize new Zephyr IOCTL IDs that provide a common interface to these features.
OpenGL worked with extensions quite well.

IOCTLs seem to make for a generic API, but in reality - since you can pass just about anything - it really makes for a driver specific API, which makes it very difficult to write driver-agnostic application code.

Yes, and it is not possible to write driver-agnostic code when you have selected a specific chip, based on its extra features, while designing your embedded hardware. We can only try to invest in making the common code, the common subsystems, support as many different features as possible.

I also share the concern raised by @Laczen above in trying to embrace so many different technologies under one API. I see lots of potential issues with very little gain. You simply cannot treat all of these technologies the same when it comes to wear leveling, erasure, write time, etc.

Neither of our APIs is aware of wear leveling or addresses it, because this is a higher-level function done in software, unless you have a NAND controller with such a feature. And again, wear leveling does not depend on technology: you may have higher or lower endurance flash devices, or your endurance may depend on an implicit write block size (AT25M02).

We have nearly no support for device timings and that information is not needed by the user, as, in most cases, the timing has to be exchanged between the device driver and the controller; API functions will return with a timeout anyway.

For example, if a page erase on a flash device takes ~10 ms, that information should be known by the peripheral that sends the command and waits for the response.

Wear balancing does not depend on the number of write/erase cycles but rather on how your software is able to move structures around to avoid repeating writes/erases of a single area within the device: with badly written software you can kill a 10k-cycle or a 1.2M-cycle device sooner than expected.

I still, as voiced often enough, believe implementing technology-dedicated backends to the various subsystems using storage to be a much better solution. That still gives us compile-time early fail opportunities.

We can also have run-time init checks, which fail at start, on subsystems that depend on specific characteristics.

It makes it harder to support more than one type of device at once in various subsystems if you have to switch between APIs, which may also have to happen at run-time.
For example, the FAT driver does not require erase, Flash does, but if you place FAT on internal SoC flash and an external EEPROM device, your code now needs to do:

if (device_is(dev) == FLASH) {
   flash_erase(dev);
}
...

if (device_is(dev) == EEPROM) {
   eeprom_write(...);
} else if (device_is(dev) == FLASH) {
   flash_write(...);
}

Or it needs an implementation of an additional callback object that will redirect calls properly... which is either another layer, or what I am proposing now anyway.

Same with DFU, Stream Flash, you name it. And these things are within Zephyr, so if we need to make DFU work with different technologies we would have to make it work with all of them at once, because aside from SoC FLASH (or MRAM, FRAM, EEPROM) the user may connect external RRAM (or MRAM, FRAM, FLASH, EEPROM) and DFU something here and something there.

Yes, we do not have a proper wear-balancing file system, but that is required for data recording and retrieval, and we already have to let users know that, for example, FATFS is not so good for internal SoC flash with 4k pages, but it will be used anyway.

@Laczen
Collaborator

Laczen commented Nov 7, 2023

@de-nordic, your proposal clearly states that it is intended to replace the flash API. From your description it seems that this API is intended to replace not only flash but also eeprom, fram, mram, rram, bbram, retention, ... Are you sure that all of them can be covered by the proposed API and whether it is opportune to do that (e.g. flash already has ioctl for protection on some devices and the ioctl calls for bbram will be completely different).

Retention is use case, the others yes I would like to replace with single base API that will have common write, read and IOCTL.

This would mean reworking all drivers and does not add any real benefit. For the retention I meant retained_memory.

It seems more appropriate to create a layer above the existing types of devices to utilize them in a standardized way. This would just be building on proven code and does not involve any risk.

To add additional layer you have to first multiplex all the APIs into that additional layer, and then provide user a way to demultiplex, because the fact that devices have different characteristics will not change. So basically you drop everything under one API then provide function to check what device is underneath and user ends up in doing:

    if (device_kind_is(dev) == TECH_A) {
        // ops required by TECH_A
        ....
    } else if (device_kind_is(dev) == TECH_B) {
        // extra ops for TECH_B
    } else {
    }

    /* Now other stuff */
    if (device_kind_is(dev) == TECH_B && write_block_size(dev)) {
        ...
    }

Instead you can just multiplex all devices into API that then will distinguish them by characteristics (features):

    if (device_requires_erase(dev)) {
        ...
    }

    if (write_block_size(dev) > 1) {
       ...
    }

Note also that when your code needs to support more than one device, for example you are placing FAT on external ReRAM and internal SoC FLASH, then such code is smaller and cleaner when decisions are taken basing on characteristics rather than technology.

This is an implementation detail. It is perfectly possible to create a property on the "additional" layer that contains information like "erase-needed" and use the same construct as proposed in the code above (other properties like erase-block-size and write-block-size could also be added, even in one 32-bit property value: an 8-bit power of 2 for erase-block-size, an 8-bit power of 2 for write-block-size, and 16 bits for flags like "erase-needed", "erase-value-1/erase-value-0", "allowed-overwrite-with-0", ...).
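
A sketch of such a packed property; the field layout and names below are made up for illustration only:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <zephyr/sys/util.h>

/* Two power-of-two block sizes plus 16 flag bits in one 32-bit value. */
#define SA_PROP_EBS_SHIFT   0U  /* log2(erase-block-size), 8 bits */
#define SA_PROP_WBS_SHIFT   8U  /* log2(write-block-size), 8 bits */
#define SA_PROP_FLAGS_SHIFT 16U /* 16 flag bits */

#define SA_FLAG_ERASE_NEEDED    BIT(0)
#define SA_FLAG_ERASE_VALUE_1   BIT(1)
#define SA_FLAG_OVERWRITE_ZEROS BIT(2)

static inline uint32_t sa_prop_pack(uint8_t ebs_log2, uint8_t wbs_log2,
				    uint16_t flags)
{
	return ((uint32_t)ebs_log2 << SA_PROP_EBS_SHIFT) |
	       ((uint32_t)wbs_log2 << SA_PROP_WBS_SHIFT) |
	       ((uint32_t)flags << SA_PROP_FLAGS_SHIFT);
}

static inline size_t sa_prop_erase_block_size(uint32_t prop)
{
	return (size_t)1U << ((prop >> SA_PROP_EBS_SHIFT) & 0xffU);
}

static inline bool sa_prop_erase_needed(uint32_t prop)
{
	return (prop & ((uint32_t)SA_FLAG_ERASE_NEEDED << SA_PROP_FLAGS_SHIFT)) != 0U;
}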

Additional layers or APIs do not help in cutting on flash size or when XIP is in use. Note that when adding additional layer you will have that layer and every API that is required by devices in the design, but when you have single API that is all you have.

Regarding the size influence: only in very specific cases will the proposal reduce size. Regarding XIP: if there is no need for any writing/erasing, there is no need to define a flash device; the flash driver should only be used to set up XIP, and any read can be done using a memcpy.
An additional layer will be needed anyhow to allow defining areas on the device.

A replacement of flash_map by a zephyr_storage_area that could be defined in dts as:

settings_area {
        compatible = "zephyr,storage_area";
        size = ....;
        write-block-size = ....; (optional)
        erase-block-size = ...; (optional)
        read-only = ...; (optional)
        backend = &eeprom0;
        backend-type = "EEPROM";
        backend-offset = ...;
};

and provide read/write and (optional) erase routines, would be a lot simpler to implement and maintain, and would provide all the functionality (and more, as it provides the region definitions) of the proposed ZSAI method.
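
For illustration, the access side of such a layer might not need more than this; the names and types below are hypothetical and nothing like this exists in the tree yet:

#include <stddef.h>
#include <stdint.h>
#include <zephyr/device.h>

/* Hypothetical storage_area access API, sketching the shape of the proposal. */
struct storage_area {
	const struct device *backend;
	size_t backend_offset;
	size_t size;
	uint32_t props; /* packed write/erase-block sizes and flags */
};

int storage_area_read(const struct storage_area *sa, size_t off,
		      void *data, size_t len);
int storage_area_write(const struct storage_area *sa, size_t off,
		       const void *data, size_t len);
/* Optional: only meaningful when the backend has/requires erase. */
int storage_area_erase(const struct storage_area *sa, size_t off, size_t len);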

@de-nordic de-nordic added the dev-review To be discussed in dev-review meeting label Nov 8, 2023
@MaureenHelm
Member

The main goal is to replace the Flash API and allow Zephyr and user code to run and operate on devices based on different technologies without need to switch between APIs

Replacing the Flash API has wide-ranging implications and should be discussed in the Architecture WG. @carlescufi will you please add to the agenda?

@MaureenHelm MaureenHelm removed the dev-review To be discussed in dev-review meeting label Nov 8, 2023
@carlescufi
Member

The main goal is to replace the Flash API and allow Zephyr and user code to run and operate on devices based on different technologies without need to switch between APIs

Replacing the Flash API has wide-ranging implications and should be discussed in the Architecture WG. @carlescufi will you please add to the agenda?

It's added for today @MaureenHelm

@teburd
Collaborator

teburd commented Nov 14, 2023

This seems to assume a one-operation-at-a-time blocking API; won't this limit the ability to batch up changes to storage? E.g. a sequence of reads and writes could potentially be batched to the device in some manner, with DMA in many cases, perhaps saving round trips and context switches.

This may not have been much of an issue with, say, EEPROMs or flash, where round-trip latencies can be dwarfed by the storage medium, but with something like FRAM, where the storage medium is very fast and low latency, waiting on a context swap and the round trip of a SPI/I2C transfer seems like quite an overhead to be considered.

From the fujitsu fram pdf https://www.fujitsu.com/uk/Images/C29-FRAM-Factsheet-170802-0959.pdf a screenshot of the timing table is pretty telling.

[Screenshot of the Fujitsu FRAM timing table]

@carlescufi
Member

Architecture WG

  • @de-nordic explains how the different types of storage technology have different requirements, goes through the table in the description
  • @jhedberg mentions the addition of NVME to the tree and whether this could be part of the ZSAI API as well. @de-nordic answers that he will take a look and investigate that
  • @bjarki-trackunit mentions that only flash requires erase, so it may make sense to have 2 APIs: flash and non-erase devices. @de-nordic answers that this would require conditional compilation all over the place
  • @jfischer-no mentions that NVME is actually a disk driver, and uses the disk driver API
  • @danieldegrasse asks what kind of storage systems are planned to use with the new API. @de-nordic responds that it depends on the nature of each technology, but raw access is necessary for things like downloading an image into flash or swapping images with a bootloader
  • @de-nordic mentions that the disk access layer can work with anything, but this layer's purpose is completely different: it allows for low-level raw access to the non-volatile memory
  • @henrikbrixandersen has a concern about whether this is the right thing to do. If we introduce a layer like this it may give the false impression that Zephyr has figured it out, but in reality it will lead to degraded performance
  • @nashif has concerns about a new driver class that encapsulates the differences among NV technologies
  • @danieldegrasse is concerned that the storage layers above the flash driver (littlefs, nvs, fcb) need to adapt to the differences between flash and MRAM/RRAM before we can propose an API that encapsulates those together

@jfischer-no
Collaborator

There will be a lot of changes required to code that is currently using flash API, as the main goal is to replace this API.

Flash support is already there, with all the abstractions to specific layers like disk access, DFU, NVS. Replacing it with another API that covers everything is too much; I am very sceptical that this task will ever be finished.

I also do not see the benefit of including SRAM; this can easily be done at a higher level, e.g. the RAM disk driver is quite compact (and I actually think an RFMRAM disk will not be much larger). Including EEPROM has even less benefit IMHO; it is a more constrained device than flash and SRAM together.

My suggestion is to reduce the RFC to [RFM]RAM devices and go with the RFMRAM driver API and add specific layer drivers/abstractions/backends where needed, e.g. disk access driver or NVS/DFU backend that works on top of the RFMRAM driver API. This way you are also not blocked by any deprecations, you can start with more practical use cases like DFU/NVS and add things like disk driver later.

@de-nordic
Collaborator Author

Architecture WG

* @de-nordic explains how the different types of storage technology have different requirements, goes through the table in the description

* @jhedberg mentions the addition of NVME to the tree and whether this could be part of the ZSAI API as well. @de-nordic answers that he will take a look and investigate that

I think that NVME and NAND could be pulled in here, but these are rather block devices and they differ in access from the other devices here. The devices here have byte-aligned access, while NVME and NAND do not. You have to address blocks in them, so in that regard they are more suitable for the Disk subsystem.
There is also a low chance of making them a source of code for execution, because you could only pull whole blocks; this would keep the bus, whatever it would be between the SoC and the device, occupied for longer, and with the rather small instruction caches in most devices, transferring whole blocks would cause a lot of cache invalidation and swaps. Note that even on an x86 device the cache works with 64-byte lines.

* @bjarki-trackunit mentions that only flash requires erase, so it may make sense to have 2 APIs: flash and non-erase devices. @de-nordic answers that this would require conditional compilation all over the place

There are actually three possible characteristics here: supports, requires and auto-erase. Note also that erase is required before flipping a bit from the "erase state" to the opposite, and some devices allow that only once per write block (those with ECC always), some twice, and some have a very high limit.
An example of a device that supports erase and will auto-erase is the 25AA512, an EEPROM device. This device has an erase command and can erase a page or the whole chip, but you can also just write and it will erase the range you write to - only the offsets you write to.
Flash requires erase only to swap 0 to 1, and some flash devices can have that happen multiple times over a range of bytes, others only once. This depends on multiple factors and cannot be reflected by any API, but it is a characteristic that, when understood by the engineer, may be exploited with a specific device in mind.

* @jfischer-no mentions that NVME is actually a disk driver, and uses the [disk driver API](https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/storage/disk_access.h)

Yes agree. Same with NAND.

* @danieldegrasse asks what kind of storage systems are planned to use with the new API. @de-nordic responds that it depends on the nature of each technology, but raw access is necessary for things like downloading an image into flash or swapping images with a bootloader

We have no universal storage. The only universal part we have is storing and executing raw code.
The problem with the storage we have, like NVS, ext2, FAT FS or LittleFS, is that all of them have been written with a specific device in mind - that is, with some subset of characteristics that has been given a name.
We are currently able to place any of them on flash, although ext2 and FAT FS do not work nicely with flash that has 4k pages. These file systems can work with 4k sectors, which probably reflects the most common page size, but in the case of FAT FS this used to work badly with "bigger" OSes, which were used to 512 B sectors.
We do not have a universal file system, we never had one; we have bent what we had to work with what we support.

* @de-nordic mentions that the disk access layer can work with anything, but this layer's purpose is completely different: it allows for low-level raw access to the non-volatile memory

Exactly. What I am trying to propose here is a uniform API to access devices at the lowest level, with the most common functions, read and write, exposed directly and everything else via ioctl, plus a set of functions/macros that allow the functionality we provide with Zephyr to work with the characteristics that are most common across devices, and that allow us to extend support to things we can make a common API for in the future (for example support for r/w locks).

* @henrikbrixandersen has a concern about whether this is the right thing to do. If we introduce a layer like this it may give the false impression that Zephyr has figured it out, but in reality it will lead to degraded performance

That is a problem with how we advertise Zephyr. IMHO Zephyr is that little thing in the corner of my ROM that allows me to have Bluetooth and basic storage access without having to implement these, but I still have to take care of the details. I need to understand what I am doing, and that Zephyr will not solve. Zephyr only offers you the basic things, so you do not have to write your own driver for an MX25, but you can still kill the device if you do not understand it, and it will still drain power when added to the design. If you want, for example, that MX25 turned off by an external FET controlled via GPIO when not used by FATFS, you are on your own, because however good the PM support we have is, you still need to connect the dots.

* @nashif has concerns about a new driver class that encapsulates the differences among NV technologies

The differences between technologies are described in the table. Does it matter whether the technology is EEPROM or flash, if it requires erase? If you have SPI-connected SRAM with a battery hanging off it all the time, does it differ from EEPROM?
If your device has a write-block-size requirement, is it EEPROM or flash?
In all these cases what we offer is a way to access the device and do basic operations; it is on the designer to know how these operations affect the device selected for the design. We will not solve that with Zephyr.
Even desktop OSes haven't figured that out; that is why you would use different software to tune a WD or an IBM HDD, as they have different custom commands for that purpose.

* @danieldegrasse is concerned that the storage layers above the flash driver (littlefs, nvs, fcb) need to adapt to the differences between flash and MRAM/RRAM before we can propose an API that encapsulates those together

We have not been able to do that so far. We have a VFS that provides uniform access to file systems, but so far we have not been able to add to the tree file systems that could be used efficiently across all device characteristics.
We have LittleFS and NVS that provide wear balancing but are not fine with devices without erase. We have FAT FS that works great with devices without erase and OK with those that have it (depending on page size), but fails at wear balancing.
So far I think people understand the limitations of these and are able to choose properly.

@de-nordic
Collaborator Author

There will be a lot of changes required to code that is currently using flash API, as the main goal is to replace this API.

Flash support is already there, with all the abstractions to specific layers like disk access, DFU, NVS. Replacing it with another API that covers everything is too much, I am very sceptical that this task will ever be finished.

Yes, the task is hard. But adding another layer is not a solution.

I also do not see the benefit of including SRAM, this can easily be done at a higher level, e.g. RAM disk driver is quite compact (and I actually think RFMRAM disk will not be much larger). Including EEPROM has even less benefit IMHO, it is a more constrained device than flash and SRAM together.

Constrained in what? It is usually slower, true, but size depends on how much you want to spend; of course it does not reach NOR FLASH levels, but you can still get a 256k x 8 device, which is already able to handle FAT FS.

SPI RAM devices can be used as external buffers for data between devices, or for storing frequently updated data in logging devices before that data gets filtered and dumped to permanent storage or exfiltrated some other way.
They do not have to be mapped into the device address space, and often this is not possible, as mapping provides only read access.

My suggestion is to reduce the RFC to [RFM]RAM devices and go with the RFMRAM driver API and add specific layer drivers/abstractions/backends where needed, e.g. disk access driver or NVS/DFU backend that works on top of the RFMRAM driver API. This way you are also not blocked by any deprecations, you can start with more practical use cases like DFU/NVS and add things like disk driver later.

Can do that too. Sooner or later, the other APIs will have to be swallowed by that anyway, as what you will have is:

  1. FLASH: no erase (and some common characteristics with others)
  2. RFMRAM: no-erase/erase (and some common characteristics with others)
  3. EEPROM: no-erase (and some common characteristics with others, and how does the 25AA512 fit here, huh?)

For some time though you will have in code something like this:

    if (device_is_flash_api_device(dev)) {
        // code for flash devices
    } else if (device_is_rfmram_api_device(dev)) {
        if (rfmram_device_requires_erase(dev) || rfmram_device_supports_erase(dev)) {
            // the erase code
        } else {
            // no erase code
        }
    }

@de-nordic
Collaborator Author

This seems to assume a one operation at a time blocking API, won't this limit the ability to batch up changes to storage? E.g. a sequence of reads and writes could potentially be batched to the device in some manner with DMA in many cases perhaps saving round trips and context switches

This may not have been much of an issue with say eeproms or flash where round trip latencies can be dwarfed by the storage medium, but with something like fram where the storage medium is very fast and low latency, waiting on context swap and round trip of a spi/i2c transfer seems like quite an overhead to be considered.

From the fujitsu fram pdf https://www.fujitsu.com/uk/Images/C29-FRAM-Factsheet-170802-0959.pdf a screenshot of the timing table is pretty telling.

[Screenshot of the Fujitsu FRAM timing table]

The DMA is served by an SoC peripheral rather than by the device mentioned here. Most users will not strictly write things byte by byte; they will just pass a buffer to the API and that is passed to the driver. The driver should recognize how much data it can stuff into a single operation; in the case of this device there is no limit as far as I can tell, but it will be held down by the DMA peripheral anyway.
Figuring out how to organize the transfer is something between the device driver, which knows the device characteristics, and the driver's ability to figure out what the bus will let it do.
If you want to address these directly you have to know how to utilize the DMA with the device you are using; this is something that no API can help you with. There is also a limit on what you can do in a system that is supposed to be multithreaded, because even if you figure out how to write the FRAM device quickly in a single instruction, you may take the SPI bus for so long that other stuff will time out.

@teburd
Collaborator

teburd commented Nov 21, 2023

The DMA is served by SoC peripheral rather than the device mentioned here. Most of users will not strictly write thins by byte, will just pass buffer to API and that is passed to driver. Driver here should recognize here how much data it can stuff in single operation, in case of this device there is no limit as far as I can tell, but will be held down by the DMA peripheral anyway. Figuring out how to organize transfer is something between the device driver, that knows the device characteristics, and ability of the driver to figure out what the bus can let it do. If you want to address these directly you have to know how to utilize the DMA with the device you are using, this is something the API, no API, can help you with. There is also limit on what you can do in system that is supposed to be multithreaded, because if you even figure out how to write the FRAM device quickly in a single instruction you may take the SPI bus for so long that other stuff will time out.

I'm talking about having a batching API for doing reads/writes. I'm not sure where that was lost in translation. spi_transceive and i2c_transfer do this today already. This API seems to assume simplistic read(buf, len) and write(buf, len) type APIs, and this is, I think, going to be pretty inefficient compared to, say, an API that provides transaction semantics over a list/array of operations, e.g. int storage_transaction(reads_writes_erases_ops);
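
To make the idea concrete, a rough sketch of what such a transaction API could look like; all names and the layout are made up, loosely mirroring how spi_buf_set groups transfers:

#include <stddef.h>
#include <sys/types.h>
#include <zephyr/device.h>

/* Hypothetical batched-operation API, illustrating the transaction idea. */
enum storage_op_kind {
	STORAGE_OP_READ,
	STORAGE_OP_WRITE,
	STORAGE_OP_ERASE,
};

struct storage_op {
	enum storage_op_kind kind;
	off_t offset;
	void *data;   /* NULL for erase */
	size_t len;
};

/* Executes all operations in order, letting the driver coalesce bus
 * transfers and DMA; returns 0 on success or a negative error code.
 */
int storage_transaction(const struct device *dev,
			const struct storage_op *ops, size_t num_ops);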

@de-nordic
Collaborator Author

The DMA is served by SoC peripheral rather than the device mentioned here. Most of users will not strictly write thins by byte, will just pass buffer to API and that is passed to driver. Driver here should recognize here how much data it can stuff in single operation, in case of this device there is no limit as far as I can tell, but will be held down by the DMA peripheral anyway. Figuring out how to organize transfer is something between the device driver, that knows the device characteristics, and ability of the driver to figure out what the bus can let it do. If you want to address these directly you have to know how to utilize the DMA with the device you are using, this is something the API, no API, can help you with. There is also limit on what you can do in system that is supposed to be multithreaded, because if you even figure out how to write the FRAM device quickly in a single instruction you may take the SPI bus for so long that other stuff will time out.

I'm talking about having a batching API for doing reads/writes. I'm not sure where that was lost in translation. spi_transceive and i2c_transfer do this today already. This API seems to assume simplistic read(buf, len) and write(buf, len) type APIs and this is I think a going to be pretty inefficient compared to say an API that provides transaction semantics over a list/array of operations, e.g. int storage_transaction(reads_writes_erases_ops);

Yes, I am addressing the simplest case, when a read/write of a single buffer is needed. And I am addressing the storage device characteristics, not controller features, although nothing stops us from extending the API - though addressing the characteristics of different controllers may be harder than addressing storage technologies.

spi_transceive and i2c_transfer are bus functions and are not characteristic of the device you have mentioned. Drivers call these, and currently they also do that in the simplest possible way.

We could probably do read_batch and write_batch, but that is a heavier change than unifying the API across devices.
