Draft module system #798

Closed
wants to merge 2 commits into from
133 changes: 129 additions & 4 deletions p4-16/spec/P4-16-spec.mdk
@@ -1180,7 +1180,123 @@ produced by different programmers at different times:
protocol header definitions), and
- the P4 programs that specify the behavior of each programmable block.

### P4 core library
The use of the `#include` preprocessor directive is discouraged;
instead the users should employ the newer module `import` directive
described in section [#sec-modules].

## Modules { #sec-modules }

P4 offers a very simple module system. Modules enable programmers to
decompose a program into multiple source files and libraries.

In the compilation process the preprocessing step is executed first.
After preprocessing the modules are loaded. Before a module is loaded
a *new* preprocessor is applied to its source file (no preprocessor
definitions are inherited from previous preprocessor invocations). If
a module is imported multiple times it is preprocessed anew each time.
Modules offer a functionality similar to `#include` directives:
Contributor

@ChrisDodd ChrisDodd Dec 11, 2019

What if module A imports B and module B imports A? Is that allowed?

What if multiple modules A, B, and C all import X? Is it permissible to parse X once and import it into all of them, or do we have to reparse each time (getting different values for macros like __TIME__)?

Contributor

What if a module instantiates some extern, and the module is then imported into multiple other modules? Does that result in a separate instance for every import, or just one instance that is shared?

Contributor Author

Technically the spec does not promise any CPP macros except the ones defined by the user without any arguments, so __TIME__ is not guaranteed to work even with just the preprocessor. It would be nice to get rid of the preprocessor - then we could answer these questions much easier - there are no macros in modules, only in the preprocessor.

But I kept the preprocessor for three reasons:

  • backwards compatibility
  • more expressivity in the preprocessor, some of which may be useful
  • we don't have a replacement for #ifdef/#else

The module system is not really designed for performance, so I don't personally mind parsing X three times. We should produce the most natural result the users would expect. For circular inclusions this could probably be an error message.

This could indeed be a problem if two libraries each have a private module named "a", which they both import. With the import scheme proposed in this draft the second library would break the first one. This suggests that import pathnames should be searched starting from the "current importing directory" rather than the global import path.

Contributor Author

Yes, the extern is a trickier question. This only is important for top-level externs really.
The extern will not be visible from the outside, so none of its methods can ever be called by user code. I am tempted to say that in this case the behavior is platform-dependent. I don't know if there are good uses for this capability.


* Both enable a program to be decomposed into multiple source files.

* Both need, to some degree, to understand the filesystem layout of
modules. The standard C preprocessor does this through the `-I`
flag, which specifies the directories where included files are
searched. We recommend the `-L` flag for the analogous role of
specifying filesystem paths to be searched by the P4 compiler for
`import`-ed modules. In general, the compiler maintains an ordered
list of module library paths, and every `import` statement searches
for the specified file under each library path in order, loading the
first file that matches. For example, invoking the P4 compiler with
the flags `-L /lib -L /vendor/modules` will search for each module
first in the directory `/lib` and then in the directory
`/vendor/modules`.
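
The lookup order described above can be sketched as follows. This is a minimal Python model under stated assumptions, not the compiler's implementation: `resolve_import` is a hypothetical helper, and a temporary directory tree stands in for `/lib` and `/vendor/modules`.

```python
import tempfile
from pathlib import Path


def resolve_import(module_path, library_paths):
    """Return the first file matching module_path, trying library_paths in order."""
    # The draft states that import paths cannot start with a slash.
    if module_path.startswith("/"):
        raise ValueError("import paths cannot start with a slash")
    for lib in library_paths:
        candidate = Path(lib) / module_path
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"module not found: {module_path}")


# Throwaway layout standing in for `-L /lib -L /vendor/modules` (hypothetical files).
_root = Path(tempfile.mkdtemp())
lib_dir = _root / "lib"
vendor_dir = _root / "vendor" / "modules"
lib_dir.mkdir(parents=True)
vendor_dir.mkdir(parents=True)
(lib_dir / "common.p4").write_text("// from lib\n")
(vendor_dir / "common.p4").write_text("// from vendor\n")
(vendor_dir / "extra.p4").write_text("// vendor only\n")

search_order = [lib_dir, vendor_dir]
found_common = resolve_import("common.p4", search_order)  # present in both, lib wins
found_extra = resolve_import("extra.p4", search_order)    # only under vendor
```

Because the first match wins, a file in an earlier library path shadows a file with the same relative path in a later one.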

There are also some important differences between modules and
`#included` files:

* `#include` directives produce flat files. In contrast, P4 modules
can optionally create new namespaces, allowing P4 programs to make
use of multiple libraries without concern about name clashes.

* `#include` directives do not understand anything about P4 syntax
and semantics; they are purely string-processing operations. In
particular, the included files do not have to be syntactically
correct P4 programs or program fragments.

* `#include` directives function transitively: a file included from
another included file has the exact same visibility as the top-level
constructs, and preprocessor macro definitions are visible across
all transitive inclusions. In contrast, transitively included
modules are completely invisible at the top-level, and each module is
preprocessed by a fresh preprocessor instance.

The following keywords are used to construct modules in P4:

* `import`, `into`: import definitions from a module
* `private`: mark a declaration in a module as non-importable

### Importing modules { #sec-import }

The definitions in a module can be imported into the global namespace
with the following statement, executed at compile-time:

`import module/path/file.p4`

~ Begin P4Grammar
importStatement: IMPORT filePath
| IMPORT filePath INTO namespacePath
;

filePath: FILENAME
| DIRECTORYNAME SLASH filePath
;

namespacePath: IDENTIFIER
| namespacePath DOT IDENTIFIER
;
~ End P4Grammar

`import` statements may appear only at the top level. The effect
of the `import` statement is to include all the definitions in the
imported module in the place of the `import` statement. The `import`
keyword is followed by an optional directory path, with components
separated by slashes, and by a file name. Paths cannot start with a
slash.

Alternatively, the definitions in a module can be imported in a
separate namespace with the following construct:

`import module/path/file.p4 into namespace`

The effect of this statement is similar to the simple import; in
addition, the names of all declarations that are imported will be
prefixed with the `namespace.` prefix (including a trailing dot).

If a module is imported into a namespace then all its declarations are
available prefixed with the respective namespace.
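
As a rough illustration of the prefixing rule, the following Python sketch models a module as a list of declaration names. The function `import_module` and the declaration names are hypothetical and serve only to show how `into` changes the names seen by the importer.

```python
def import_module(declarations, namespace=None):
    """Return the names under which a module's declarations become visible."""
    if namespace is None:
        # Plain `import file.p4`: names land in the global namespace unchanged.
        return list(declarations)
    # `import file.p4 into namespace`: every name gets the `namespace.` prefix.
    return [f"{namespace}.{name}" for name in declarations]


module_decls = ["ipv4_t", "compute_checksum"]  # hypothetical declarations

plain = import_module(module_decls)
prefixed = import_module(module_decls, namespace="net")
# The grammar allows dotted namespace paths such as `vendor.net`.
nested = import_module(module_decls, namespace="vendor.net")
```

Importing the same module into two different namespaces yields two disjoint sets of names, which is how the scheme avoids clashes between libraries.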

~ Begin P4Grammar
pathPrefix: DOT
| IDENTIFIER DOT
| pathPrefix IDENTIFIER DOT
;
~ End P4Grammar

### Private declarations

A declaration in a module can be prefixed with the `private` keyword.
This will cause the declaration not to be visible in the importing
programs; however, the declaration is still accessible in the imported
code.
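
One way to picture the effect of `private` is the Python sketch below. It is only a model of the visibility rule stated above; the `Declaration` class and the sample names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    name: str
    private: bool = False


def visible_inside(module):
    """Names usable by code within the module itself: everything."""
    return {d.name for d in module}


def visible_to_importer(module):
    """Names an importing program can see: private declarations are dropped."""
    return {d.name for d in module if not d.private}


# Hypothetical module with one public and one private declaration.
checksum_module = [
    Declaration("compute_checksum"),
    Declaration("fold_carry", private=True),
]

inside = visible_inside(checksum_module)
outside = visible_to_importer(checksum_module)
```

Here `compute_checksum` can still call `fold_carry` inside the module, while an importer only ever sees `compute_checksum`.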
Contributor

Maybe everything should be private by default -- add a public or export keyword and get rid of private

Contributor Author

I thought about that; the downside is that people may need to modify libraries they have written so far using #include, adding export to each declaration. So I thought that public by default is simpler. But for a clean design I would prefer private by default.


### Transitive imports

If the top-level P4 program imports module `a` which in turn imports
module `b`, none of the declarations in module `b` are visible at the
top-level; they are only visible in the `a` module itself. The search
path for modules imported by `a` is the same as the search path of
modules imported by the top-level.
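
The non-transitive visibility rule can be modelled with a short Python sketch. The module table and the helper `visible_names` are hypothetical; the sketch ignores `private` and namespaces and only shows that declarations travel exactly one import level.

```python
# Hypothetical module graph: top imports a, a imports b.
MODULES = {
    "top": {"declares": ["main"], "imports": ["a"]},
    "a": {"declares": ["a_parser"], "imports": ["b"]},
    "b": {"declares": ["b_helper"], "imports": []},
}


def visible_names(module, modules):
    """Own declarations plus those of directly imported modules only."""
    info = modules[module]
    names = set(info["declares"])
    for imported in info["imports"]:
        # Only the imported module's own declarations are added;
        # whatever it imported in turn is not re-exported.
        names.update(modules[imported]["declares"])
    return names


top_view = visible_names("top", MODULES)
a_view = visible_names("a", MODULES)
```

In this model `b_helper` is usable inside `a` but never appears in the top-level program's view.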

## P4 core library

The P4 language specification defines a core library that includes
several common programming constructs. A
@@ -1192,6 +1308,13 @@ library. Including the core library is done with
# include <core.p4>
~ End P4Example

Alternatively, the core library can be imported through the module system:

~ Begin P4Example
import core.p4
~ End P4Example


## Lexical constructs { #sec-lexical }

All P4 keywords use only ASCII characters. All P4 identifiers must use
@@ -1351,6 +1474,7 @@ A P4 program is a list of declarations:
p4program
: /* empty */
| p4program declaration
| p4program PRIVATE declaration
| p4program ';' /* empty declaration */
;

@@ -1365,6 +1489,7 @@ declaration
| errorDeclaration
| matchKindDeclaration
| functionDeclaration
| importStatement
;
~ End P4Grammar

@@ -1443,7 +1568,7 @@ following expressions are legal l-values:
~ Begin P4Grammar
prefixedNonTypeName
: nonTypeName
| dotPrefix nonTypeName
| pathPrefix nonTypeName
;

lvalue
@@ -2114,7 +2239,7 @@ namedType

prefixedType
: TYPE_IDENTIFIER
| dotPrefix TYPE_IDENTIFIER
| pathPrefix TYPE_IDENTIFIER
;

typeName
@@ -2868,7 +2993,7 @@ expression
| FALSE
| STRING_LITERAL
| nonTypeName
| dotPrefix nonTypeName
| pathPrefix nonTypeName
| expression '[' expression ']'
| expression '[' expression ':' expression ']'
| '{' expressionList '}'