-
Notifications
You must be signed in to change notification settings - Fork 445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preprocessing does not handle #define correctly #646
Comments
That's quite bizarre considering we just invoke the C preprocessor. If you invoke the preprocessor manually and look at the output, maybe that will help track down the issue. |
Oh well, I lied, it seems neither is working.
The |
The preprocessor removes all defines. It looks like the problem is that the P4 output code does not output the define, so the resulting code that relies on it in psa.h won't work properly the second time through. The P4 output code special-cases stuff in p4include, outputting an |
As a workaround, as long as the header with the #define and the definitions is a 'system' file (in the p4include_path), it should be recognized and replaced with an |
Another solution is to change the way tests are done for these files. |
Alternatively, we can add a compiler flag so that the compiler does not generate #include statements anymore, it just dumps the internal P4 representation. Maybe that's the right solution. |
Scoping out the work with a little deeper dive because I think this is where I am leaning toward. Adding a new compiler flag is less than one day of work. |
Top4 already has a boolean flag to avoid includes. |
@mbudiu-vmw First you said "we can add a compiler flag so that the compiler does not generate #include statements". Subsequently, you said above that toP4 already includes a boolean flag to avoid includes - is this an internal flag using p4includePath? I am trying to understand what externally-visible flag exists in p4c for the user to enable the following code changes? I see the "-v" flag that dumps IR - is this what you meant? If none exists, we need to add a new flag.
|
Currently there's no specific flag -- ifSystemFile checks to see if the file comes from a directory in the p4includePath (for system headers). I think there's a way of changing/clearing that (so no files would be identified as system files), however that seems undesirable -- we WANT to omit system header files from the dumped code as they are lots of (uninteresting) cruft that doesn't depend on the source file. The problem in this bug seems to be that |
That said, adding a flag to disable the generation of #include directives in the dumped code would be trivial -- just need tests of that flag in these two toP4::preorder methods. |
Another issue is we have midend pass, e.g, elimNewType, that transforms both user program AND system header files. For example, elimNewType replaces newtype in psa.p4 with a typedef. If we omit system header file from the dumped code, we will ends up with transformed user program + not transformed system header file, which makes the resulting dumped code incorrect. Currently, any dumped code after elimNewType pass in the midend is incorrect and cannot be compiled, which is pretty much all passes in midend, because elimNewType is at the very beginning of the midend. |
Does the original problem disappear if we use the latest psa.p4 #include file checked into p4include/psa.p4 as is, without doing any #define's that affect any #ifdef statements inside of psa.p4? I believe the latest version should do what we want for the BMv2 back end, at least, without having to have any code before the EDIT: You can probably disregard my question here. If I understand correctly from the rest of the discussion, a problem is that p4c's mid-end is replacing The C preprocessor doesn't let you do this kind of "preserve the #include directives" after running, for pretty good reason, yes? I know it is a lot of cruft to wade through, but it does produce #line directives with file names and line numbers, that can let you search for where in the post-preprocessed source that the lines are from original source file foo.p4 instead of bar.p4. |
As @hanw says,the problem is that during compilation the compiler transforms the code included, so when dumping it should dump the transformed code. If you include the original code it doesn't typecheck anymore. The toP4 pass already has a flag to skip the includes; we just need to expose it from the command-line. Then, when compiling PSA programs, we will not dump includes in the intermediate passes. Hopefully that will be enough. We don't need to do this for v1model, because none of the transformations affect it much (e.g., it does not have any |
If you don't replace things with #includes, it makes the resulting dumped code mostly useless -- it could only be recompiled for the exact same target and is not useful for any other target with the same architecture. So If we want to be able to use the compiler as a P4_14->16 translator in any useful way, we must preserve it. Note that when we replace things from the system header files with other definitions, those other definitions will not be in the system header file, so will still be dumped. So to make the resulting dumped code recompilable, we need to make sure that if any definitions from the original system file remain, other definitions that have been replaced will not conflict with their replacements (eg, the replacement process should involve some renaming to avoid name conflicts). Even then, it will still only work with the identical target (and likely the identical compiler version). |
It depends when you dump the code. If you dump the code early in the front-end most likely it will still work with includes. Currently the tests dump the code after the mid-end, and there the code already has target-dependent features. You can dump code even in the back-end, and this you certainly don't expect to be target-independent. We are just exposing a knob to the user to be able to do it both ways. The way we decide whether something is part of the system header is based on line numbers, which are often preserved when things are changed. So that's why some changed objects do not show up in the dumps. Moreover, sometimes we may change the structure of a type but not its name; if we dump both versions of the type we will have conflicts in the generated code. The only promise we made in the compiler is that when using the --pp flag you get a compilable p4-16 program. We never promised that the result of a dump in the middle will be a correct legal program which is target-independent. |
Perhaps there could be a compiler pass that "really expands #include's", meaning that #include statements will no longer appear in the intermediate dumped representations from that point and later, and this pass should be run somewhere before a later compiler pass would break if it has not yet been done? e.g. ReallyExpandIncludes pass before elimNewType pass? |
#includes never appear in the IR anywhere. We're just reconstructing them based on the source code locations in the code when dumping p4 code. So there's nothing for ReallyExpandIncludes to do. For the most part, dumping midend (or backend) code is for debugging the compiler, so we don't care too much if the result is compilable. Having it be compact and understandable is more important, so using #includes rather than dumping lots of cruft is still preferable (IMO) |
Yes, includes are removed very early by our preprocessor. We just use some heuristics to reverse-engineer the includes and put them back in the dumped code. One good reason is that the included code is in many tests very large, and it would blow up the size of the intermediate files, and also make it harder to read them for humans. |
So change how elimNewType pass works? Remove it, because it is incompatible with re-inserting #include of psa.p4? Something else? |
No, we just do not dump includes, we dump the whole code. |
do nothing? I don't actually see any problem here. |
Backing up a second, is the only problem here that the intermediate dumped P4_16 source code gives errors if you attempt to feed it as input to the compiler? If that is the only problem, then yeah, it seems like doing nothing is an option. |
The python testing scrips compile the intermediate code. So we have to change these scripts for psa. |
Mihai stole my thought. Yes, this is what I was suggesting but got interrupted. Just for the 4-5 PSA 1.1 P4 programs, we change the python scripts. The two steps included below for another test are the ones to skip for PSA 1.1 P4 programs. Thanks all for replying.
|
This code change changes the test for 4 files and retains the test for others. The file list is created in run-p4-sample.py. Maybe, one would like to provide this list from elsewhere to this script or is the change ok? We can make the same change for any other backend such as bmv2.
|
Maybe base it on the includes in the .p4 file rather than having a fixed list? See the getArch code here that does that already... |
Excellent suggestion, Chris! Changed the code and testing passed with psa and non-psa P4 program.
|
I wrote a test case based on psa.p4 and ran into a preprocessing issue.
Initially, the test case defines the unspecified typedefs in
psa.p4
at the top of the test program.(
psa.p4
is at https://github.com/p4lang/p4c/blob/master/p4include/psa.p4)Running the test case gives the following error.
However, if I define the typedefs in a separate file, say
psa_config.p4
and include bothpsa_config.p4
andpsa.p4
in the test case. The error does not happen.The text was updated successfully, but these errors were encountered: