-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scripts: add Zephyr inline code generation with Python #6762
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6762 +/- ##
=======================================
Coverage 52.15% 52.15%
=======================================
Files 212 212
Lines 25916 25916
Branches 5582 5582
=======================================
Hits 13517 13517
Misses 10149 10149
Partials 2250 2250 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disapproving to block this from being merged until I have had time to review.
@@ -0,0 +1,381 @@ | |||
.. | |||
Copyright (c) 2004-2015 Ned Batchelder | |||
SPDX-License-Identifier: MIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nashif Will we need to account for this use of MIT licensed material?
|
||
/* This is my C file. */ | ||
... | ||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... will this conflict with doxygen processing (that also uses the /**
comment convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not conflict with doxygen. In this example it is by purpose to include the python code snippet into doxygen output. The markers for code @code{.codegen}, .. are chosen in a way to support documentation by doxygen. /** is a marker for doxygen only and can be replaced by /* to leave the code snippet out of doxygen output..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'm curious how this will play with the document generating process, and if we'll need to add a codegen pass over source files before presenting them to doxygen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how this will play with the document generating process
Currently none of the files is parsed by doxygen because the driver directory is not included in the doxygen documentation. make htmldocs works without any errors.
if we'll need to add a codegen pass over source
There is no need for an extra pass. Inline code is within comments and clearly marked with the doxygen special commands @code/@Endcode. Doxygen does not know the {.codegen} type. It ignores the unknown code type specification (see https://www.stack.nl/~dimitri/doxygen/manual/commands.html#cmdcode) and "just show the output as-is".
As an improvement {.codegen} could be made known to doxygen to be treated as python code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying!
doc/subsystems/codegen/codegen.rst
Outdated
:local: | ||
:backlinks: top | ||
|
||
Description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should have a problem statement first. What problem is codegen addressing and why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added problem statement
doc/subsystems/codegen/codegen.rst
Outdated
Raises an exception with msg as the text. No traceback is included, so | ||
that non-Python programmers using your code generators won’t be scared. | ||
|
||
codegen.inFile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for presentation consistency with the function documentation above you should use the attribute directive for these attributes:
.. attribute:: codegen.infile
This seems really promising, code is clear and well documented |
b0cb303
to
46f9325
Compare
doc/subsystems/codegen/codegen.rst
Outdated
Inline Code Generation | ||
###################### | ||
|
||
Inline code generation allows you to build code (basically any text) using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like this:
For some repetitive or parameterized coding tasks, it's convenient to
use a code generating tool to build C code fragments, instead of writing (or editing)
that source code by hand. Such a tool can also access CMake build parameters
to generate project source code tailored and tuned to a specific project
configuration without you editing the source files directly.
The Zephyr project supports a code generating tool that processes embedded
Python "snippets" inlined in your C source files, creating source code that
and can be used, for example, to generate source code that creates and fills
data structures, adapts programming logic, creates configuration-specific code
fragments, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
46f9325
to
6303127
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not desirable to have code generators inline with C code.
I believe it will cause a myriad of problems, but I only know of two:
The build system slows down because it can not distinguish between an edit of
the code generator or the C source code and is forced to re-run the code-generator
unnecessarily.
Neither compilers nor static analyzer are able to work with the intermixed file
and must work on the generated file instead. This causes several problems,
for instance: the compiler errors use line numbers that do not reflect
the line numbers of the file that the programmer is working on.
Sorry this is not true. Any code geneneration depends on the generation commands whether they are inlined or not. The build system always has to do something when you change the generation commands.
Sorry this is not true either. Static analyzer can work with the intermixed file as the generator code is embedded into comments which are not analyzed. But you should not do this to have the analyzer get the full picture. Even if you do not generate code with inline commands you would have to feed the generated code to the static analyzer to enable it to have the full view. Compilers always must work with the generated file - otherwise you would not generate them. This is independent whether you embed the generation commands in the source code comments or have some other generation command files. In the end you have to feed the generated code to the compiler.
Yes it uses the line numbers of the generated file which is available to the programmer. So no more guessing which shitty part of the code generation commands is the culprit. If your programmer can not map these two files you may have more problems than just line numbers. My conclusion from your reviewZephyr is unable to fulfill the promise to be configured with device tree data. And there is no solution in sight. |
Sorry for being terse, allow me to explain in detail. When the
In this graph, editing the C source code does not trigger a rebuild of |
@SebastianBoe I have to disagree with your comment. We really need something like this to solve device instance generation in relation with device tree. If there are any issue with the built system, let's fix the build system. |
The inline code generation will most probably only be used for drivers. During application development you have to generate the driver code one time. There is no difference in the calls to the generator compared to not inlined generation commands. During a specific driver development this is different. But then you will want the generated code to be regenerated to be shure that it is in sync with your changes. So also here is no real benefit for your scenario. |
Right, I thought I had identified a case where a static analyzer was able to analyze a C source file that used #include's but when comments were used it could not. But I can not reproduce this. So never mind the point about static analyzers. |
Right. The difference between #including and inlining a code generator is that a compile error in the non-generated part will not reflect the source when inlining, but it will when #including because the compiler knows that an #include does not affect the line number. This is one case (of many i suspect) where you will get better feedback from the compiler if you separate the code generation from the source code with the preprocessor than if you intermix them. |
Looks nice but still I'm not quite comfortable to have python snippets in C code, for many reasons, one being just "I don't like it" (I recognize this not quite elaborated but let us be honest and recognize this will be also a quite frequent remark we'll have to face, so let's be ahead of it) @b0661 , from your understanding of codegen, do you think this could be feasible? |
This depends on what you regard your primary artefact. If you take the generated code file as the primary artefact you get accurate feedback of the compiler with no distraction by include hirarchies. You have a clear view of the generated code and of the generation commands. It might be a problem to work benches that want to map the reported error lines to the source code of the original file. In the error listing the generated file name is listed which can or can not be distinguished from the original file. You can give hints to to the compiler about the line numbering in the generated file. I did not dive into that because I do not use a work bench. |
I believe this is an accurate assesment of todays situation. About the alternatives, the original proposal of using the preprocessor, or at the very least using a supported source language (C, C preprocessor, C++), is not affected by these issues. Also, retaining the codegen infrastructure pretty much as-is, but enforcing some guidelines for how to use it (no inlining) would also not be affected by these issues. The guidelines could perhaps look something like this: Don't do anything with the preprocessor that can be done in C. Generated code, whether generated by CPP, python, or other, should limit itself to just Generators should limit themselves to generating declarative data, and avoid generating |
I doubt it. The use-case that drivers have is shared by many. Essentially, |
Sounds like we may need to capture something here in our coding standards guidelines |
Codegen is already a layer on top of pure python abstracting away a lot of the complex interaction with the device tree info. It also provides access to CMake variables, ... Please have a look at the doumentation that is included in the PR. For example you can write:
and code generation will stop if there is no SPI node of type 'st,stm32-spi-fifo' activated in the device tree. Codegen can be expanded easily to support even more complex generation tasks by providing a simple command for them. It is Python. |
Cog (the basis of codegen) was developed by a guy who wanted something that is easier to use than C++ macros and more flexible. In the end you can achieve the same goal by different solutions. What I learned from the discussion about the C macro library originally used in #5799 is that the tool must fit to the users/ developers knowledge/ experience. I currently see not much C++ but a lot of Python. |
This is a perfect summary of the experience I made when using codegen to implement the drivers in #5799. In the end it is a variant of "keep it as simple as possible". |
@b0661 : Well, there're 241 pull requests open, all interesting to everyone. Some changes may take longer than the other to ripe and be ready for the mainline. For one, @erwango had a presentation about this at Linaro Connect, and it definitely looked interesting (though not without the issues to consider). |
@pfalcon, 'interest' is more about the activities in PR #9876 which is a prerequisite to codegen and was originally developed by me. @erwango and @galak used it as a playground for Linaro/ Intel interest/ideas and constantly denied my comments for focusing on what is important to me/ codegen. As I´m the original author of the EDTS database and DTS extraction to it I was expecting to at least be informed about breaking changes. This was not the case - there were such "important" changes as to prepend get_ to all getter functions without notice and in the end to remove functions that are essential to codegen usage also without notice or discussion. As I´m the only real world user of codegen it costs a lot of effort to remedy all these toying around changes and to make DTS extraction and codegen work for my project again. This is especially frustrating as a lot of additions I originally had in the code were removed on request of @erwango to not add something new which would question inclusion in mainline. Obviously there is a difference between additions by Linaro/ Intel and mine. There were some comments in the mean time from @erwango and @galak. It seems to change now. I will see. |
@b0661: I'm sorry to hear about possible uncoordinated changes. At the same time, the changes proposed are quite substantial and have many implications, so I wouldn't be surprised if some reduction of scope would be requested (meaning that you'd need to rework additional bits you need on top of a partial solution, which I agree is a pain, but is a part of the compromise maintainable for everyone). What I can say that even less brave ideas of mine were turned down/shunned, so, what to do, just continuing with ones which fit. So, I hope some compromise will be found, without people pulling to solve those issues, it's all bleak. |
The Extended Device Tree Specification database collates device tree (dts) information with information taken from the device tree bindings. The EDTS database may be loaded from a json file, stored to a json file or extracted from the DTS files and the bindings yaml files. The commit integrates database development done in zephyrproject-rtos#9876 which was based on zephyrproject-rtos#6762. Major differences/ improvements to zephyrproject-rtos#9876 are: - the database now has an own extraction function that can be used instead of e.g. extract_dts_includes. The extraction function follows the design of the extract_dts_includes script and the additions that were done in zephyrproject-rtos#9876. It is restructured and several globals are now classes and objects. All functionality of extract_dts_includes related to the generation of defines is not part of the database extract function. It's sole purpose is to fill the database directly from the compiled DTS file. - the database got itś own directory 'edtsdb' to structure all files related to the database. - The EDTSDevice class from zephyrproject-rtos#9876 was enhanced to allow devices to access the database they are taken from. Mayor compatibility issues to zephyrproject-rtos#9876. - the consumer, provider API and the internal structure of the databse is copied from zephyrproject-rtos#9876. API should be fully compatible. - the EDTSDevice class is copied from zephyrproject-rtos#9876. The device API should be fully compatible except for the unique id feature. Device name can be used instead. Signed off from zephyrproject-rtos#9876 added to attribute for the changes done there and copied. Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org> Signed-off-by: Kumar Gala <kumar.gala@linaro.org> Signed-off-by: Bobby Noelte <b0661n0e17e@gmail.com>
The Extended Device Tree Specification database collates device tree (dts) information with information taken from the device tree bindings. The EDTS database may be loaded from a json file, stored to a json file or extracted from the DTS files and the bindings yaml files. The database is integrated into cogeno as a module. The commit integrates database development done in zephyrproject-rtos#9876 which was based on zephyrproject-rtos#6762. Major differences/ improvements to zephyrproject-rtos#9876 are: - the database now has an own extraction function that can be used instead of e.g. extract_dts_includes. The extraction function follows the design of the extract_dts_includes script and the additions that were done in zephyrproject-rtos#9876. It is restructured and several globals are now classes and objects. All functionality of extract_dts_includes related to the generation of defines is not part of the database extract function. It's sole purpose is to fill the database directly from the compiled DTS file. - the database got itś own directory 'edtsdb' to structure all files related to the database. - The EDTSDevice class from zephyrproject-rtos#9876 was enhanced to allow devices to access the database they are taken from. Mayor compatibility issues to zephyrproject-rtos#9876. - The consumer, provider API and the internal structure of the database is copied from zephyrproject-rtos#9876. - API should be fully compatible. - Extraction of children was replaced as it broke the concept of the devices struct as a list of devices. The functions device.get_children() and device.get_parent() may be used to acess the parent<->child relation. - The EDTSDevice class is copied from zephyrproject-rtos#9876. - The device API should be compatible except for - the constructor which now needs the EDTS database and - the unique id feature. To ge an unique id the device.get_name() function can be used instead. Signed off from zephyrproject-rtos#9876 added to attribute for the changes done there and copied. Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org> Signed-off-by: Kumar Gala <kumar.gala@linaro.org> Signed-off-by: Bobby Noelte <b0661n0e17e@gmail.com>
The Extended Device Tree Specification database collates device tree (dts) information with information taken from the device tree bindings. The EDTS database may be loaded from a json file, stored to a json file or extracted from the DTS files and the bindings yaml files. The database is integrated into cogeno as a module. The commit integrates database development done in zephyrproject-rtos#9876 which was based on zephyrproject-rtos#6762. Major differences/ improvements to zephyrproject-rtos#9876 are: - the database now has an own extraction function that can be used instead of e.g. extract_dts_includes. The extraction function follows the design of the extract_dts_includes script and the additions that were done in zephyrproject-rtos#9876. It is restructured and several globals are now classes and objects. All functionality of extract_dts_includes related to the generation of defines is not part of the database extract function. It's sole purpose is to fill the database directly from the compiled DTS file. - the database got itś own directory 'edtsdb' to structure all files related to the database. - The EDTSDevice class from zephyrproject-rtos#9876 was enhanced to allow devices to access the database they are taken from. Mayor compatibility issues to zephyrproject-rtos#9876. - The consumer, provider API and the internal structure of the database is copied from zephyrproject-rtos#9876. - API should be fully compatible. - Extraction of children was replaced as it broke the concept of the devices struct as a list of devices. The functions device.get_children() and device.get_parent() may be used to acess the parent<->child relation. - The EDTSDevice class is copied from zephyrproject-rtos#9876. - The device API should be compatible except for - the constructor which now needs the EDTS database and - the unique id feature. To ge an unique id the device.get_name() function can be used instead. Signed off from zephyrproject-rtos#9876 added to attribute for the changes done there and copied. Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org> Signed-off-by: Kumar Gala <kumar.gala@linaro.org> Signed-off-by: Bobby Noelte <b0661n0e17e@gmail.com>
The Extended Device Tree Specification database collates device tree (dts) information with information taken from the device tree bindings. The EDTS database may be loaded from a json file, stored to a json file or extracted from the DTS files and the bindings yaml files. The database is integrated into cogeno as a module. The commit integrates database development done in zephyrproject-rtos#9876 which was based on zephyrproject-rtos#6762. Major differences/ improvements to zephyrproject-rtos#9876 are: - the database now has an own extraction function that can be used instead of e.g. extract_dts_includes. The extraction function follows the design of the extract_dts_includes script and the additions that were done in zephyrproject-rtos#9876. It is restructured and several globals are now classes and objects. All functionality of extract_dts_includes related to the generation of defines is not part of the database extract function. It's sole purpose is to fill the database directly from the compiled DTS file. - the database got itś own directory 'edtsdb' to structure all files related to the database. - The EDTSDevice class from zephyrproject-rtos#9876 was enhanced to allow devices to access the database they are taken from. Mayor compatibility issues to zephyrproject-rtos#9876. - The consumer, provider API and the internal structure of the database is copied from zephyrproject-rtos#9876. - API should be fully compatible. - Extraction of children was replaced as it broke the concept of the devices struct as a list of devices. The functions device.get_children() and device.get_parent() may be used to acess the parent<->child relation. - The EDTSDevice class is copied from zephyrproject-rtos#9876. - The device API should be compatible except for - the constructor which now needs the EDTS database and - the unique id feature. To ge an unique id the device.get_name() function can be used instead. Signed off from zephyrproject-rtos#9876 added to attribute for the changes done there and copied. Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org> Signed-off-by: Kumar Gala <kumar.gala@linaro.org> Signed-off-by: Bobby Noelte <b0661n0e17e@gmail.com>
Overview
Extend Zephyr build system by inline code generation with Python snippets in source files.
The PR is a cut-out from PR #5799 for review but can be applied independently.
Motivation for or Use Case
The prime motivation is to be able to create pin controller drivers that can configure the pins on startup by the information from the device tree in a standardized way. Therefor there must be means to easily generate structured data from device tree information at build time.
Device tree information is extracted to generated_dts_boards.[h,conf]. Inline code generation with Python snippets is used to get this information and generate structured data to be compiled.
Design Details
Python snippets that are inlined in a source file are use as generators. The Zephyr tool to scan the source file for the Python snippets and process them is Codegen. Codegen is itself written in Python. The Codegen class uses code from Cog and extends Cog by Zephyr specific generator functions. Cog is distributed under the MIT license which should be in line with the Zephyr license policy.
The processing of source files is controlled by two CMake extension functions: zephyr_sources_codegen(..) and zephyr_sources_codegen_ifdef(..). During CMake configuration the source files are processed by Codegen and the generated source files are written to the CMake binary directory. The generated source files are added to the Zephyr sources.
The inlined Python snippets can contain any Python code, they are regular Python scripts. All Python snippets in a source file and all Python snippets of included template files are treated as a python script with a common set of global Python variables. Global data created in one snippet can be used in another snippet that is processed later on. This feature is e.g. used to customize included template files.
An inlined Python snippet can always access the codegen module. The codegen module encapsulates and provides all the functions to retrieve information (options, device tree properties, CMake variables, config properties) and to put out the generated code.
Alternatives
Instead of code generation by a python tool the C preprocessor and a macro library may be used. This was the first approach in PR #5799. Due to the complexity of the chosen macro library CHAOS_PP and the general policy to use Python as a tooling language there are objections against this approach.
Test Strategy
A working proof of concept for STM32F0x pinctrl, gpio and spi drivers is available in PR #5799. PR#5799 also includes the unit test of the pinctrl driver template which also uses inline code generation.
Signed-off-by: Bobby Noelte b0661n0e17e@gmail.com