Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

periph/gpio: support for extension API #12333

Closed
wants to merge 11 commits into from

Conversation

gschorcht
Copy link
Contributor

@gschorcht gschorcht commented Sep 30, 2019

Contribution description

Introduction

This PR is a little rework and the completion of PRs #9860 and #9958 to implement the GPIO extension API according to the proposal in issue #9690.

After a long discussion in issue #9690, all approaches proposed by @kaspar030 (thanks again for his great ideas) in #9690 (comment) were tried out. However, it seems that the approach initially proposed by @ZetaR60 seems to be the best tradeoff of required changes, additional code/data size, performance decrease and unnoticed side effects.

Approaches in Detail

The first approach proposed in #9690 (comment) uses a structured GPIO type like the following

typedef struct {
    const gpio_ext_dev_t *dev;  /**< extension device, 0 in case of CPU pin */
    union {
        gpio_cpu_t cpu_pin;     /**< pin number in case of CPU GPIO pin */
        gpio_ext_t ext_pin;     /**< pin number in case of GPIO extension device */
    } pin;
} gpio_t;

and would allow a completely invisible integration of extension devices by using the GPIO_PIN macro for CPU GPIOs as well as GPIO extension pins. That's really great. But, it required to change 275 existing files. Furthermore, a number of unsolved problems in assignment and comparison of structured pin definitions remained. To solve these assignment and comparison problems, all CPU specific and drivers would have to be changed to use the structured gpio_t consequently what seems to be very risky. And finally, the code/data size was more than with the implementation of this PR, e.g., for ATmega2560:

                    text    data   bss     dec
this PR             6762    690    933    8385
structured gpio_t   6734   1046    933    8713

Although the second approach proposed in #9690 (comment) that uses the structured GPIO type too, but uses a separate GPIO_EXT_PIN macro for GPIO extension pins could solve most assignment and comparison problems, there are still 275 files to be changed (without device drivers). Even though the code size/data could be reduced, e.g., for ATmega2560

                      text    data   bss     dec
this PR               6762    690    933    8385
structured gpio_t 1   6734   1046    933    8713
structured gpio_t 2   6482   1666    933    8081

the approach has still a high risk of side effects caused by the changes.

Approach in this PR

The approach implemented in this PR uses the existing gpio_t value type as follows.

 MSB                                                                          LSB
+-----+---+------------------------------------------+---------------------------+
|     | 1 |            device number                 |          pin number       |
+-----+---+------------------------------------------+---------------------------+
        ^
        | |<-- (GPIO_EXT_BIT - GPIO_EXT_PIN_BITS) -->|<-- (GPIO_EXT_PIN_BITS) -->|
        |
  GPIO_EXT_BIT

A single bit in gpio_t value is used (by default the MSB) to divide between CPU GPIO pins and GPIO extension pins. This bit is used for redirecting a call gpio_* to the extension API. The only change in CPU specific implementations is that the gpio_* functions were renamed to gpio_cpu_*. For the use of GPIO extension pins, macro GPIO_EXT_PIN is introduced. GPIO_PIN macro is used for CPU GPIO pins as before.

This approach makes the introduction of the extension API invisible for existing code. Therefore, only 24 files had to be changed and there is no risk of side effects.

The additional code/data sizes required if GPIO extension devices are used (module extend_gpio enabled) for all architectures are (measured with tests/periph_gpio):

CPU Board w/o with diff
atmega1284p mega-xplained 15089 15829 740
atmega2560 arduino-mega2560 15917 16691 774
cc2538 cc2538dk 19288 19864 576
cc26x0 cc2650stk 18248 18796 548
efm32 stk3600 22656 23284 628
esp32 esp32-wroom-32 84341 85021 680
esp8266 esp8266-esp-12x 82995 83551 556
slwstk6220a slwstk6220a 18024 18652 628
fe310 hifive1 23293 24925 1632
kinetis mulle 27368 28204 836
lm4f120 ek-lm4f120xl 19388 19920 532
lpc1768 seeeduino_arch-pro 18788 19468 680
lpc2387 msba2 118844 119604 760
mips_pic32mx pic32-clicker 87720 88328 608
mips_pic32mz pic32-wifire 123556 124040 484
msp430fxyz wsn430-v1_3b 13262 13988 726
nrf51 nrf51dk 18328 18884 556
nrf52 nrf52dk 18300 18848 548
sam3 arduino-due 18344 18904 560
samd21 arduino-mkr1000 18852 19464 612
samd5x same54-xpro 19484 20112 628
saml1x saml10-xpro 18436 19048 612
saml21 saml21-xpro 18648 19304 656
stm32f0 nucleo-f030r8 18656 19288 632
stm32f1 iotlab-m3 18992 19656 664
stm32f2 nucleo-f207zg 18960 19608 648
stm32f3 nucleo-f302r8 18904 19536 632
stm32f4 nucleo-f410rb 19064 19712 648
stm32f7 nucleo-f746zg 19128 19776 648
stm32l0 nucleo-l031k6 18448 19044 596
stm32l1 nucleo-l152re 18668 19316 648
stm32l4 nucleo-l476rg 18928 19560 632

To check for collisions of GPIO extension pin values with CPU GPIO pin values, the PR also includes a compile time test tests/periph_gpio_coll.

Testing procedure

Compile time test tests/periph_gpio_coll has to succeed.
tests/extend_gpio has to succeed.

Issues/PRs references

Rework of PR #9860 and PR #9958
Implements issue #9690 for GPIOs

@gschorcht gschorcht added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Impact: major The PR changes a significant part of the code base. It should be reviewed carefully labels Sep 30, 2019
Low level `gpio_*` functions that are realized by CPUs are renamed to `gpio_cpu_*`. The declaraions of GPIO API functions `gpio_*` are replaced by inline functions. If the `extend_gpio` module is not enabled, these functions simply forward the calls to `gpio_cpu_*` functions. If module `extend_gpio` is enabled, these functions are overridden by the `extend_gpio` module.
@gschorcht gschorcht added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Sep 30, 2019
Low level `gpio_*` functions that are realized by CPUs are renamed to `gpio_cpu_*`.
ATmega 2560 has more than 8 port. The `GPIO_PIN` definition uses 4 bits for the pin number of a port and 4 bits for the port number.With a `gpio_t` size of 8 bit, there is no space for extesion API bit. Therefore, the size of gpio_t is changed to 16 bit.
Compile time tests for collisions of `gpio_t` values as generated by the CPU and `gpio_t` values used by the GPIO extension API.
Defines an example GPIO extension device configuration using software extension devices for testing that GPIO extension API is working.
@gschorcht gschorcht added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Sep 30, 2019
@gschorcht
Copy link
Contributor Author

gschorcht commented Oct 13, 2019

@kaspar030 Next try 😉

return;
}
#ifdef MODULE_PERIPH_GPIO
gpio_toggle(pin);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be gpio_cpu_toggle(pin);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups, of course. I thought I fixed it already after a static test failed.

@maribu
Copy link
Member

maribu commented Oct 14, 2019

Have you checked what the impact on RAM/ROM would be, if the lower level interface wouldn't be split up into the extension and periph case, but would use the struct for every implementation? Having the same lower level API could make some things cleaner and simpler, so maybe the ROM overhead of having an additional struct in ROM would be compensated by the simplification it yields?

However, calling gpio_cpu_<foo>() instead of having the indirect call will yield some performance benefit in the no LTO case. (But I'd would be that the difference in runtime of an indirect function call to a direct function call is insignificant, while the difference of a LTO off and LTO on will reduce the runtime in orders of magnitude. So anyone really caring about performance will have to use LTO anyway.)

For network drivers we already have a single lower level API with the struct in netdev_driver_t (which is also used for peripheral netdevs like in nRF52), and this approach seems to be much cleaner to me. So if the same approach could be used for GPIOs, I2C, SPI, EEPROM, etc. with very little overhead in the non-LTO and no overhead in the LTO case, this would be a great improvement in my opinion.

@gschorcht
Copy link
Contributor Author

Have you checked what the impact on RAM/ROM would be, if the lower level interface wouldn't be split up into the extension and periph case, but would use the struct for every implementation? Having the same lower level API could make some things cleaner and simpler, so maybe the ROM overhead of having an additional struct in ROM would be compensated by the simplification it yields?

In principle, the first approach structured gpio_t 1 uses almost a direct call if module extend_gpio is enabled.

    if (pin.dev) {
        return pin.dev->driver->init(pin.dev->dev, pin.pin.ext_pin, mode);
    }
#if MODULE_PERIPH_GPIO
    return gpio_cpu_init(pin.pin.cpu_pin, mode);
#endif
    return -1;

I tested this only for Atmega 2560 since I had too many open problems with the other architectures. Even it produces a bit less code size, it requires much more RAM. Not really clear why.

Using pin.dev->driver->* also for low level drivers was @kaspar030's third proposed approach. Although it would be more clear and consistent, it seems to be almost impossible to change it in that way. In difference to netdev drivers, GPIOs have no limited scope. They are used in peripheral interface drivers, in existing applications and of cause in all kind of drivers. We have a lot of comparison of scalars like == GPIO_UNDEF or == SPI_CS, a lot of assignments in static paramer definitions and a lot of places where the scalar value of gpio_t is directly used, .e.g., to assign it to registers. We would have to change too much and the risk seems to be high if not unpredictable.

Therefore @ZetaR60 proposed the approach to redirect the call in case of extension GPIOs. It is the approach with minimum changes and minimum risk since everything keeps working as. The price is that the performance slightly reduces and the code size increases at bit.

@maribu
Copy link
Member

maribu commented Oct 14, 2019

In principle, the first approach structured gpio_t 1 uses almost a direct call if module extend_gpio is enabled.

Calling via function pointer is an indirect call, calling via symbol name is a direct call. An indirect has several disadvantages in terms of performance over a direct one:

  • The address of the function to call is in general not a compile time (or better link time) constant, so machine code generated to call the address is in general a bit longer
  • In general the compiler cannot know which function is later called, which prevents a bunch of optimizations that otherwise could be done (e.g. if the implementation is available (e.g. part of the same compilation unit or LTO is enabled): inline the function, be aware that some registers that the caller should backed up are not actually used and do not need to be backed up, ...)
  • Performance critical stuff like correct branch prediction, preventing pipeline stalls etc. is much easier if address of the function to call is constant, rather than computed
  • ...

But: On embedded platforms the performance penalty should be much less as far as I know, as e.g. advances stuff like dynamic branch prediction is not widely used. (E.g. the Cortex M7 has dynamic branch prediction, but the Cortex M3, M0, M0+ do not have.)

I tested this only for Atmega 2560 since I had too many open problems with the other architectures. Even it produces a bit less code size, it requires much more RAM. Not really clear why.

I bet this is again because the ATmegas are a Harvard architecture platform rather than using a von-Neumann architecture. (E.g. on von-Neumann platforms like ARM every variable that is marked as const is put in ROM. But on ATmega variables are always stored in RAM, unless some magic PROGMEM macro is set. But as those values than have a different address spaces that regular variables, special care is required to access them.)

@gschorcht
Copy link
Contributor Author

Calling via function pointer is an indirect call, calling via symbol name is a direct call.

Oh yeah, of course, you are right, my fault. Sorry, I was a bit under time pressure when I tried to answer. So you were asking what the impact on RAM/ROM would be if we would always use

    return pin.dev->driver->init(pin.dev->dev, pin.pin.ext_pin, mode);

Right?

This is something I did not test because the changes that would be necessary are really huge and risky.

@maribu
Copy link
Member

maribu commented Oct 14, 2019

This is something I did not test because the changes that would be necessary are really huge and risky.

OK, I would like to give it a try to see whether this would work out, or whether this would impose to much overhead. I hope I can find the time for that soon. (But that is out of scope of this PR and should not delay this PR.)

Copy link
Member

@maribu maribu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Some comments inline. Generally, this should allow extending the GPIO API fully compatible with no performance overhead if the extension module is not loaded. In case of compile time constant gpio_t values, the compiler should optimize out the check on whether it is provided via gpio_cpu_<foo>() or an GPIO extender even without LTO enabled.

The only disadvantage I can see is that the ABI has changed. I have absolutely no idea if anyone needs ABI compatibility, or if this is a non-issue. In case it is desired to have ABI compatibility: In the case that the extension API is not used it could be achieved by telling the linker that symbol gpio_<foo> is an alias for symbol gpio_cpu_<foo>. In case that the extension API is used, backward ABI compatibility is more involved. (And one could argue that the ability to use GPIO extenders with the GPIO API is new, so code compiled prior to the addition of this API and wanting ABI compatibility never worked GPIO extenders, so there is no regression).

#define SPI0_CS0_GPIO GPIO15 /**< HSPI / SPI_DEV(0) CS default pin, only used when cs
parameter in spi_acquire is GPIO_UNDEF */
#define SPI0_CS0_GPIO GPIO15 /**< HSPI / SPI_DEV(0) CS default pin, only used when cs
parameter in spi_acquire is GPIO_UNDEF */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated. (If you split this up into a separate PR, I'll ACK and merge that right away.)

@@ -35,19 +35,19 @@ extern "C" {
* @{
*/
#define HAVE_GPIO_T
typedef uint8_t gpio_t;
typedef uint16_t gpio_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to have a performance penalty, as the ATmegas are 8bit. I think the defines in gpio_ext_h already are using sizeof(gpio_t), so would it be possible to keep that uint8_t for 8 bit platforms? (If the cost is that only one GPIO extender could be used there, I bet that users of the ATmegas would take that trade-off. And if not, they could use CFLAGS += -include <header_with_custom_gpio_t_definition.h> -DHAVE_GPIO_T to use more GPIO extenders.)

But if this increases maintenance effort and/or lines of code more than a bit, I'd say using 16bit on ATmesags is fine. (The Arduino community are also using int (=16 bit) for digitalRead() and digitalWrite() - likely due to not paying attention when defining the API. And their user base seems to be not upset with that.)

Copy link
Contributor Author

@gschorcht gschorcht Oct 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know. But ATmega 2560 defines 10 ports and GPIO_CPU_PIN(x, y) is defined by ((x << 4) | y). That is, with uint8_t there is no space left for the extension flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe then 8 bit unless ATmega2560 is used?

Copy link
Contributor Author

@gschorcht gschorcht Oct 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be possible in that way.

cpu/atmega_common/periph/gpio.c Show resolved Hide resolved
* @}
*/

#if MODULE_EXTEND_GPIO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be #ifdef instead of #if?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since enabled modules lead to defintions like

#define MODULE_EXTEND_GPIO 1

in <app>/bin/<board>/riotbuild/riotbuild.h, the simple #if works as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more worried about the case the module is not used, in which MODULE_EXTEND_GPIO is not defined

Comment on lines 24 to 33
#if MODULE_EXTEND_GPIO

#include "periph/gpio.h"
#include "extend/gpio.h"
#include "gpio_ext_conf.h"

#define ENABLE_DEBUG (0)
#include "debug.h"

#endif /* MODULE_EXTEND_GPIO */
Copy link
Member

@maribu maribu Oct 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked the #includeed headers yet, but is this seems to be empty. Is this file needed? (Update: It is needed.)

If I recall correctly, the C standard forbids an empty complication unit. So you should add typedef int dont_be_pedantic; at least in the case MODULE_EXTEND_GPIO is not defined to have the file confirming with the standard. (Even though both GCC and clang don't care about empty compilation units, I believe there is no harm in being strictly compliant with the standard.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not remember, but there must have been a reason why I added this empty .c file. Maybe it had something to do with the order in which gpio_ext_conf.h was included. I will check it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the file could be removed.

drivers/extend/gpio_ext_notsup.c Outdated Show resolved Hide resolved
drivers/extend/include/gpio_ext_conf.h Show resolved Hide resolved
drivers/include/extend/gpio.h Show resolved Hide resolved
@gschorcht
Copy link
Contributor Author

The only disadvantage I can see is that the ABI has changed.

ABI?

@gschorcht
Copy link
Contributor Author

gschorcht commented Nov 7, 2019

gpio_driver_t {
    void set(gpio_port_t port, gpio_mask_t pins);
    void clear(gpio_port_t port, gpio_mask_t pins);
    gpio_mask_t read(gpio_port_t port);
};

@maribu There is a further problem with this approach. The goal was to have the same low level interface for all devices to be able to use GPIO extensions in the same way as MCU GPIOs.

But using the same mask for all devices will not be possible, for example, if a GPIO extender offers 16 pins per port but the MCU only 8 bit and vise versa. Also GPIO extenders of different size would be a problem.

@gschorcht
Copy link
Contributor Author

@maribu I went through all gpio.c implementations to check whether it would be possible at all to implement 'masked' gpio low level functions.

  1. gpio_clear and gpio_set would be possible with pin masks for all platforms.

  2. Some platforms implement gpio_read as

    if (dir_reg & (1 << pin))
        return in_reg & (1 << pin)
    else
        return out_reg & (1 << pin)

    The realization with a pin mask would require two register accesses.

    return ((dir_reg & in_reg) | (~dir_reg & out_reg)) & pin_mask

    This will probably lead to a performance decrease for single pins.

  3. Most platforms implement gpio_write as

    if (value)
        set_reg & (1 << pin)
    else
        clr_reg & (1 << pin)

    Realization with a pin mask would require again two register accesses.

    set_reg & pin_mask & value_mask
    clr_reg & pin_mask & ~value_mask
  4. Most platforms implement gpio_toggle as

    if (gpio_read(pin))
        gpio_clear(pin)
    else
        gpio_set(pin)

    This would probably have to be implemented as

    tmp = gpio_read(pin_mask)
    gpio_clear(tmp & pin_mask)
    gpio_set(~tmp & pin_mask)
  5. Some platforms would also allow to realize gpio_init, gpio_irq_enable and gpio_irq_disable with pin masks.

@maribu What do think about the following questions:

  1. Which low-level functions should be exposed with a pin mask at all?
  2. Would it make sense to change the existing low level functions for pin masks or should we rather define gpio_set_mask and gpio_clear_mask as additional functions? It seems to be more consistent to have gpio_* functions for single pins and gpio_*_mask functions for multiple pins.
  3. What should be the size of the mask? IMO, it should correspond to the port width.
  4. How should we deal with different port widths of different devices.

@gschorcht
Copy link
Contributor Author

Ping @maribu Did you have some time to take a look on my comment #12333 (comment)?

@gschorcht
Copy link
Contributor Author

gschorcht commented Nov 15, 2019

@kaspar030 @maribu At the moment I have some time left and I'm willing to invest it in the extension API. Therefore, I would like to know how to proceed and how to proceed with the suggestion of the masked GPIO API at a low level.

I am very interested in an extension API and have spent a lot of time in this PR. Therefore again the question, whether this PR wouldn't make sense as a short-term solution.

@gschorcht
Copy link
Contributor Author

There is obviously no interest on GPIO extensions 😟

@gschorcht gschorcht closed this Nov 17, 2019
@gschorcht gschorcht deleted the periph/gpio/ext_api branch November 17, 2019 18:07
@maribu
Copy link
Member

maribu commented Nov 17, 2019

There is obviously no interest on GPIO extensions 😟

I disagree. But sadly I currently do not have an answer to what the best approach would be. This is a non-trivial task, and sadly I currently have a bunch of other deadlines to fulfill.

@maribu
Copy link
Member

maribu commented Nov 17, 2019

2\. Some platforms implement `gpio_read` as
    ```c
    if (dir_reg & (1 << pin))
        return in_reg & (1 << pin)
    else
        return out_reg & (1 << pin)
    ```

The reason here is that gpio_read() is allowed to be called on pins configured as output. If the API would have a precondition that the pin must be in input mode when calling gpio_read(), a single access would be enough.

3\. Most platforms implement `gpio_write` as
    ```c
    if (value)
        set_reg & (1 << pin)
    else
        clr_reg & (1 << pin)
    ```

For peripheral ports this should always be a single access as far as I know. E.g. on SAMD this is implements gpio_write() like you said. But there are 4 registers available to manipulate the GPIOs of a port: One to write the given values of the pins, one to clear the given pins, one to set the given pins, and one to toggle the given pins. The first register could be chosen to implement the access in a single write.

4\. Most platforms implement `gpio_toggle` as
    ```c
    if (gpio_read(pin))
        gpio_clear(pin)
    else
        gpio_set(pin)
    ```

This is again an implementation choice. The toggle port register would allow doing this in a single access. (I think the motivation was to sacrifice performance to improve readability of the code.)

1. Which low-level functions should be exposed with a pin mask at all?

I'd say the following interface would be pretty generic:

typedef struct {
    gpio_mask_t direction; /**< A 1 bit sets the corresponding bit to output, a zero to input */
    union {
        gpio_mask_t pull_resistors; /**< A 1 bit of a corresponding output pin is used to enable pull resistors (instead of floating) */
        gpio_mask_t open_drain; /**< A 1 bit of a corresponding input pin is used to enable open drain (instead of push-pull) */
    };
    /**
     * @brief For output pins: The pull direction. For open-drain input pins: If pull ups are used
     */
    gpio_mask_t pull_up;
} gpio_config_t;

typedef struct {
    void gpio_port_write(gpio_port_t port, gpio_mask_t values);
    void gpio_port_set(gpio_port_t port, gpio_mask_t pins_to_set);
    void gpio_port_clear(gpio_port_t port, gpio_mask_t pins_to_clear);
    gpio_mask_t gpio_port_read(gpio_port_t port);
    void gpio_port_set_config(gpio_port_t port, const gpio_config_t *config);
    void gpio_port_get_config(gpio_port_t port, gpio_config_t *config);
} gpio_driver_t;

@gschorcht
Copy link
Contributor Author

Thanks a lot for your comments 😄

I'm sorry for pushing too much, I was probably a bit too impatient 😎 The reason was that in the last 2 weeks I had a lot of time to work on it, but in the next few weeks I will not have time for it.

In addition, I have proposed an approach with this PR based on the work of @ZetaR60 a year ago. I gave a detailed explanation of why I decided to suggested this approach again. Obviously we could not find an agreement again. After that, I spent a lot of time on various performance tests to compare your and @kaspar030's suggestions.

So I just wanted to know what approach I should implement in order to spend the work in the right direction and not to waste time again.

And finally, I just wanted to know whether this PR could have been a short-term solution at minimum of changes with a minimum of risk and at acceptable performance decrease if GPIO extension API is enabled at all.

@gschorcht
Copy link
Contributor Author

The reason here is that gpio_read() is allowed to be called on pins configured as output.

I believe that his feature is essential, for example for I2C bit-banging implementations where passive HIGH is realized as OD output with pull-ups. The protocol reads bus lines after setting it to passive HIGH to check whether someone else is driving it actively to LOW to recognize arbitration losses and optionally clock stretching.

@gschorcht
Copy link
Contributor Author

Concerning gpio_write and gpio_toggle, I will check data sheets for these implementations. At least ESPs do not support toggling.

@gschorcht
Copy link
Contributor Author

1. Which low-level functions should be exposed with a pin mask at all?

I'd say the following interface would be pretty generic:

typedef struct {
    gpio_mask_t direction; /**< A 1 bit sets the corresponding bit to output, a zero to input */
    union {
        gpio_mask_t pull_resistors; /**< A 1 bit of a corresponding output pin is used to enable pull resistors (instead of floating) */
        gpio_mask_t open_drain; /**< A 1 bit of a corresponding input pin is used to enable open drain (instead of push-pull) */
    };
    /**
     * @brief For output pins: The pull direction. For open-drain input pins: If pull ups are used
     */
    gpio_mask_t pull_up;
} gpio_config_t;

typedef struct {
    void gpio_port_write(gpio_port_t port, gpio_mask_t values);
    void gpio_port_set(gpio_port_t port, gpio_mask_t pins_to_set);
    void gpio_port_clear(gpio_port_t port, gpio_mask_t pins_to_clear);
    gpio_mask_t gpio_port_read(gpio_port_t port);
    void gpio_port_set_config(gpio_port_t port, const gpio_config_t *config);
    void gpio_port_get_config(gpio_port_t port, gpio_config_t *config);
} gpio_driver_t;

Such a change of the low-level API might have consequences that we can't see at the moment. We have to be sure that all aspects and concerns are taken into account. So I wanted to ask you if you could imagine to open an RFC issue with your proposal and to invite the stakeholders to a discussion.

@maribu
Copy link
Member

maribu commented Nov 18, 2019

So I wanted to ask you if you could imagine to open an RFC issue with your proposal and to invite the stakeholders to a discussion.

Maybe even better would be a prototype implementation for two platforms as a proof of concept. That would also allow to verify that the idea is actually doable.

@gschorcht
Copy link
Contributor Author

So I wanted to ask you if you could imagine to open an RFC issue with your proposal and to invite the stakeholders to a discussion.

Maybe even better would be a prototype implementation for two platforms as a proof of concept. That would also allow to verify that the idea is actually doable.

Agreed. Who should do that. On one hand, you know at the best what your ideas were when you suggested the masked low level API and gpio_config_t configuration type. On the other hand, your proposal requires some infrastructure around, e.g., the gpio_port_t to get it working which I already have. Furthermore, you have probably no time to implement such a prototype in short term. In short, I could that with your support, if I would completely understand your ideas behind the gpio_port_t.

@gschorcht
Copy link
Contributor Author

gschorcht commented Nov 21, 2019

@maribu Are

typedef struct {
   ...
} gpio_config_t;

and

typedef struct {
   ...
    void gpio_port_set_config(gpio_port_t port, const gpio_config_t *config);
} gpio_driver_t;

thought to be a replacement of the low level gpio_init function or in addition to it?

BTW, I have started with the first small changes.

@maribu
Copy link
Member

maribu commented Nov 21, 2019

thought to be a replacement of the low level gpio_init function or in addition to it?

That would basically be the HAL on which gpio_init() could be implemented in a fully hardware independent fashion. But some (advanced and not that common) use cases could surely use them as well.

@maribu
Copy link
Member

maribu commented Nov 21, 2019

The reason here is that gpio_read() is allowed to be called on pins configured as output.

I believe that his feature is essential, for example for I2C bit-banging implementations where passive HIGH is realized as OD output with pull-ups. The protocol reads bus lines after setting it to passive HIGH to check whether someone else is driving it actively to LOW to recognize arbitration losses and optionally clock stretching.

Hmm. The implementations I was puzzled about have checked the input or output register, depending on the direction. My knowledge is limited in regard to different semantics of the memory mapped GPIO registers out there. But if separate input and output registers are provided, I would be very surprised to ready anything back from of the output register but the exact value that was written to it.

In the scenario with OD output and pull-ups, I expect the following behavior: If one would set the bit x to one in the output register, pin x would go passive high. And reading back the output register would always result in bit x being one, regardless of physical state. But bit x in the input register would be zero if someone externally keeps the line low, but one if the state would reach high.

If those assumptions are correct, that implementation would actually prevent detecting clock stretching, rather than allowing it. Just always checking the input register on gpio_read() would in that case solve the issue. But what state would the bit in the input register have, if the output is push-pull? Ideally, the hardware handles this gracefully by just returning whatever was written to the corresponding bit in the output register. But to me reading from an push-pull output pin was something strange to do, so I never even thought about this.

@gschorcht
Copy link
Contributor Author

Thanks a lot for your very comprehensive answer.

In the scenario with OD output and pull-ups, I expect the following behavior: If one would set the bit x to one in the output register, pin x would go passive high. And reading back the output register would always result in bit x being one, regardless of physical state. But bit x in the input register would be zero if someone externally keeps the line low, but one if the state would reach high.

Completely agreed.

Unfortunately, I'm not familiar enough with other platforms. I can only tell how it works on ESP32. To be able to read from a pin, the signal has to be routed through a IO matrix to the input register. Otherwise, you would just read the initial value from this input register. That is, if the port is set to GPIO_OD*, the input would also need to be activated to to read back the pin. But this type of port sounds more like GPIO_IN_OD* than GPIO_OD*.

Unfortunately, the documentation of the API is not very clear. It just says "a pin can be configured as input or output", which means either input or output but not both. For gpio_read it says "Get the current value of the given pin". In case of an output, it should be clear. This is probably the reason why most implementations read back the output register. But what is meant by "value of the given pin" in case of open-drain, the signal at the pin or the value written to the pin? I wished the documentation of the GPIO API would be more clear.

BTW, with #11950 I provided a fix for ESP32 GPIOs (at least I thought it is a fix) to return the last written value for output ports on gpio_read and not the value from the input register.

@gschorcht
Copy link
Contributor Author

gschorcht commented Nov 25, 2019

@maribu Next question. What should the pin mask be or how should we handle it if you have ports with different widths? For example, the AVR MCU has a port witdth of 8 bits while the GPIO extender has a port width of 16 bits or vise versa, the MCU has a port width of 32 bits and the GPIO extender with only 16 bits.

Do you have any idea how to deal with it? The only solution I see would be to use always 32 bits as pin mask and the according port uses only the part it supports.

@maribu
Copy link
Member

maribu commented Nov 25, 2019

To me, the pin mask would be some unsigned integer of a width that fits the requirements of all backends compiled in. If on AVR only the periph GPIOs are used, and uint8_t should be fine.

I would use the build system to resolve the requirements of each implementation. Let's say we would abuse USEMODULE and pseudomodules for that for now. (It feels wrong to use modules for this, but let's go with that for now.) E.g. the AVR implementation would add USEMODULE += gpio_mask_8bit, a 16 bit port extender would pull in USEMODULE += gpio_mask_16bit etc. The header would then look like this:

#if defined(MODULE_GPIO_MASK_32BIT)
typedef uint32_t gpio_mask_t;
#elif defined(MODULE_GPIO_MASK_16BIT)
typedef uint16_t gpio_mask_t;
else
typedef uint8_t gpio_mask_t;
#endif

Additionally it will be need to be defined if the first pin should be the least significant bit in the mask, or the most significant bit. On AVR and STM32 the least significant bit in the register corresponds with the first pin. Ideally, every other hardware also uses lsb for the first pin. But in any case: As the width of the mask is not guaranteed to match the number of pins (it can be bigger), it will just be easier to always use the least significant bit as the first pin. For that reason I would just define it that way and let the driver take care of it, if it is different. (Sounds bad, but e.g. on ARM there is the RBIT instruction that reverses the bit order in a word. Depending on the platform, this can thus by a single instruction overhead.)

@gschorcht
Copy link
Contributor Author

@maribu I have an experimental implementation of following masked low-level GPIO functions for STM32

gpio_mask_t gpio_cpu_read(void *dev);
void gpio_cpu_set(void *dev, gpio_mask_t pins);
void gpio_cpu_clear(void *dev, gpio_mask_t pins);
void gpio_cpu_toggle(void *dev, gpio_mask_t pins);
void gpio_cpu_write(void *dev, gpio_mask_t values);

The benchmarks of high-level functions of this implementation are:

Implementation set [us] clear [us] toggle [us] read [us] write [us]
master 0.166 0.187 0.416 0.187 0.208
direct call of gpio_cpu_* 0.145 0.156 0.177 0.177 0.270
indirect call using gpio_driver_get 0.208 0.229 0.239 0.208 0.364

The following questions have arisen in relation to these functions:

  1. Would it makes sense to add a mask parameter to the write function? Writing all pins of a port might lead to problems if several pins are used for other purposes. Using a mask and the output register wouldn't make any difference. But using a mask and the set and reset registers would make a difference.
  2. Would it makes sense to add a mask parameter to the read function?

Further questions that are related to the other functions are:

  1. Make a port-wide configuration and masked gpio_init versions really sense?

    1. The pins of a port are normally not all used as GPIOs. Very often a number of pins are used for alternative functions. Writing a port-wide configuration could lead to unwanted configurations, strange effects and possibly hardware conflicts.
    2. The initialization of a pin is pretty special on some platforms and have to be done pin by pin. For example, on STM32 the configuration is not just a simple bit mapping.
  2. Make port-wide setting, enabling and disabling interrupt functions sense? Some platforms do not allow to configure every GPIO as interrupt source. Rather, they allow a number of interrupt sources that can be connected to individual GPIOs. A pin-wise one-to-one mapping of interrupt sources and GPIOs is therefore usually not possible, which requires a complex configuration.

Therefore, masked versions of these functions would have to iterate bit-wise over the mask after the high level functions produced the mask. This seems to be contradictory.

@maribu
Copy link
Member

maribu commented Nov 30, 2019

Sounds like that the port-wide configuration function indeed makes implementations harder. (And for platforms that would allow to implement this easily, the additional work to provide a pin specific gpio_init() instead is less than providing a gpio_port_init() on top of platforms like STM32.) So I agree that this wasn't a good idea.

  1. Would it makes sense to add a mask parameter to the read function?

That indeed seems to be something that could be nice to have for a user. But that could be just a static inline wrapper on top of gpio_read(), as I assume that the whole port is read in any case anyway. So to me that should not be part of the API a GPIO driver backend has to implement, but is instead provided as syntactic sugar via the common API (similar to the pin read/write/set/clear functions).

  1. Would it makes sense to add a mask parameter to the write function?

That is a good question. This seems to be something that could end up very complex to implement. Maybe it makes sense to declare the port write function as an optional feature and only provide it, if it can be implemented in a safe way. (So that only pins that are explicitly configured as some flavor of output to change their state, and pins used for SPI/I2C/GPIO input/... are guaranteed to be unaffected by an call to the port's write() function.) With that, a mask is not needed for safe usage of that function.

But that write() function for the port seems not to be particularly useful for the average use case. E.g. I think that the pin write function will be implemented on top of set() and clear() in any way. Using the ports write() function would only work by reading the port's state first, updating the bit matching the pin to write to, and then write the new state back. That seems to be more involved than just using set() or clear() instead.

So my gut feeling is that a port write() function and one that additionally takes a mask should be something that is likely best provided as an optional feature, and avoided for use in the GPIO pin API on top of the port API.

@gschorcht
Copy link
Contributor Author

gschorcht commented Dec 3, 2019

@kaspar030 I guess the following is what you tried to suggest, right?

/**
 * @brief   GPIO device type
 *
 * A GPIO device is a hardware component that provides a number of GPIO
 * pins, e.g. a GPIO extender. It is defined by a device descriptor that
 * contains the state and parameters of the device, as well as an associated
 * driver for using the device.
 *
 * @note The GPIO device type isn't used for MCU GPIO ports.
 */
typedef struct {
    void *dev;                      /**< device descriptor */
    const gpio_driver_t *driver;    /**< associated device driver */
} gpio_dev_t;

/**
 * @brief   GPIO port type
 * 
 * A GPIO port allows the access to a certain number of GPIO pins. It is either
 *
 * - a register address in the case of MCU GPIO ports or
 * - a pointer to a device of type `gpio_dev_t` which provides a number
 *   of GPIO pins, e.g. a GPIO extension device.
 */
typedef union gpio_port {
    const gpio_reg_t reg;   /**< register address of a MCU GPIO port */
    const gpio_dev_t* dev;  /**< pointer to a device that provides the GPIO port */
} gpio_port_t;

/**
 * @brief   GPIO pin type definition
 *
 * A GPIO pin is defined by a port that provides the access to the pin and
 * the pin number at this port.
 */
typedef struct {
    const gpio_port_t *port;   /**< pointer to the port */
    gpio_pin_t pin;            /**< pin number */
} gpio_t;

/**
 * @brief    Get the driver of a GPIO port
 */
static inline const gpio_driver_t *gpio_driver_get(const gpio_port_t *port)
{
#if MODULE_EXTEND_GPIO
    if ((port->reg & GPIO_CPU_REG_MASK) == GPIO_CPU_REG_GPIO) {
        return &gpio_cpu_driver;
    }
    else {
        return port->dev->driver;
    }
#else
    (void)port;
    return &gpio_cpu_driver;
#endif
}

...
static inline int gpio_init(gpio_t gpio, gpio_mode_t mode)
{
    const gpio_driver_t *driver = gpio_driver_get(gpio.port);
    return driver->init(gpio.port, gpio.pin, mode);
}

@maribu
Copy link
Member

maribu commented Dec 3, 2019

The gpio_driver_get() can be simplified using IS_USED() :-)

static inline const gpio_driver_t *gpio_driver_get(const gpio_port_t *port)
{
    if (!IS_USED(MODULE_EXTEND_GPIO) ||
        (port->reg & GPIO_CPU_REG_MASK) == GPIO_CPU_REG_GPIO)) 
        {
        return &gpio_cpu_driver;
    }
    else {
        return port->dev->driver;
    }
}

@kaspar030
Copy link
Contributor

I guess the following is what you tried to suggest, right?

exactly!

@gschorcht
Copy link
Contributor Author

The gpio_driver_get() can be simplified using IS_USED() :-)

static inline const gpio_driver_t *gpio_driver_get(const gpio_port_t *port)
{
    if (!IS_USED(MODULE_EXTEND_GPIO) ||
        (port->reg & GPIO_CPU_REG_MASK) == GPIO_CPU_REG_GPIO)) 
        {
        return &gpio_cpu_driver;
    }
    else {
        return port->dev->driver;
    }
}

Yes, the question is whether it's more readable 😉

@gschorcht
Copy link
Contributor Author

Would it makes sense to add a mask parameter to the write function?

@maribu I checked all gpio_write implementations again. Most of them directly use the set and clear registers or the entire output register with a pin mask created by the pin for writing. All others (4 out of 19) use gpio_set and gpio_clear, which in turn use the set and clear registers. The implementation of gpio_write only with the parameter values should therefore at least behave as before. Thus, it shouldn't be a problem to implement it as suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Impact: major The PR changes a significant part of the code base. It should be reviewed carefully Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants