Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

P4_14 to P4_16 checksum conversion gives wrong comparison in verifyChecksum() control block #775

Closed
jafingerhut opened this issue Jul 13, 2017 · 20 comments
Labels
bug This behavior is unintended and should be fixed. fixed This topic is considered to be fixed.

Comments

@jafingerhut
Copy link
Contributor

jafingerhut commented Jul 13, 2017

If you run the command below to auto-convert the P4_14 program in p4c/testdata/p4_14_samples/checksum.p4 to a supposedly equivalent P4_16 program:

p4test --p4v 14 --pp checksum-p4-16.p4 checksum.p4

The output P4_16 program has this in the verifyChecksum() control block:

control verifyChecksum(in headers hdr, inout metadata meta) {
    Checksum16() ipv4_checksum;
    apply {
        if (hdr.ipv4.ihl == 4w5 && hdr.ipv4.hdrChecksum == ipv4_checksum.get({ hdr.ipv4.version, hdr.ipv4.ihl, hdr.ipv4.diffserv, hdr.ipv4.totalLen, hdr.ipv4.identification, hdr.ipv4.flags, hdr.ipv4.fragOffset, hdr.ipv4.ttl, hdr.ipv4.protocol, hdr.ipv4.srcAddr, hdr.ipv4.dstAddr })) 
            mark_to_drop();
    }
}

Unless I am very confused (possible!) I think the comparison between hdr.ipv4.hdrChecksum and the return value of ipv4_checksum.get() should be not-equals (!=), not equals (==).

The code in the auto-generated computeChecksum control block appears to me to be doing the correct thing.

@mihaibudiu mihaibudiu added the bug This behavior is unintended and should be fixed. label Jul 20, 2017
@jnfoster
Copy link
Contributor

+1

(Just observed this bug in a different converted program.)

@mihaibudiu
Copy link
Contributor

We are having lots of problems with the checksum. I am thinking to change v1model.p4 to handle checksums differently, and more in line with bmv2 capabilities.

extern void checksum_update<T>(in T fields, out bit<16> checksum);
extern void checksum_verify(in T fields);

(Maybe the second method should also have a bit<16> checksum argument, haven't decided yet.)
Then:

  • in the verify control you are only allowed to have calls to checksum_verify.
  • in the update control you are only allowed to have calls to checksum_update.

The existing Checksum16 extern would be removed.

This will make code generation for bmv2 much more robust, and also will enable us to give much better error messages.

Note that neither control can drop packets.
@jafingerhut @cc10512 @ChrisDodd @jnfoster @antoninbas : I would appreciate comments.

@ChrisDodd
Copy link
Contributor

Seems reaonable to me -- allowing general code in checksum blocks seems problematic. @sethfowler any opinions?

@jafingerhut
Copy link
Contributor Author

Are the problems you have found primarily related to auto-converting P4_14 to equivalent P4_16 programs?

How strong is the desire to preserve automatic conversion from P4_14 to P4_16, on a scale from 0 to 10? (I understand different people may have different answers here.)

@sethfowler
Copy link
Contributor

Mihai, the interface that you propose makes sense to me. (Presumably checksum_verify is signaling success or failure via an error.)

I'd say that from my perspective, preserving P4-14 support is critical, and since that support entails conversion to P4-16 (or something very like it) internally, we may as well let you serialize the result as a P4-16 program.

@mihaibudiu
Copy link
Contributor

Today on bmv2 verify just logs a message on failure. Nothing prevents an extern function from communicating deeply with the architecture to cause a packet drop after the current block is finished.

Yes, P4-14 support is very important.

@antoninbas
Copy link
Member

Thanks @mbudiu-vmw for proposing this.
This seems fine to me, but I'd like the ability to check for checksum verification errors in the match-action pipeline. Maybe this could be done by passing some metadata to the verify control and modifying the proposed checksum_verify extern to:

// if the checksum is not correct, sets error_flag to 1
// if the checksum is correct, does not write to error_flag
extern void checksum_verify<T>(in T fields, inout bit<1> error_flag);

More generally, if we are going to modify v1model.p4, I'd really like to be able to access parser errors in the match-action pipeline, as I suggested in this issue: #808

Finally, P4_14 support is also very important to me.

@mihaibudiu
Copy link
Contributor

@antoninbas : metadata should be used as an argument only if the P4 program is expected to do something with the metadata. Otherwise it can be just "hidden metadata" that the architecture sets and uses. For a user the question will be: what argument do I supply to the error_flag bit and how/where can I use it after verification? There is no such argument in P4-14 programs, so when translating a P4-14 program I don't know what to pass to this method.

Having a metadata bit exposed will allow for future expansion of the v1model, to allow stages that run after verification to inspect the error bit. The question is whether we plan to continue expanding v1model, or we plan to invest just in PSA going forward. I have always considered v1model as trying to emulate the capabilities of P4-14 pipeline, thus as a fixed model, which does not need to change ever (except for bug-fixes).

I do plan to work on #808, but I am tackling bugs first; unfortunately there are lots of them still open.

@jnfoster
Copy link
Contributor

jnfoster commented Aug 17, 2017

Going back to the original primitive for just a moment, I'd like to write the IP Checksum like this:

control MyVerifyChecksum(in headers hdr, inout metadata meta) {
    Checksum16() ipv4_checksum;    	   
    apply {
        ipv4_t ipv4 = hdr.ipv4;
	ipv4.hdrChecksum = 16w0;
        if (hdr.ipv4.hdrChecksum != ipv4_checksum.get(ipv4)) 
            mark_to_drop();
    }
}

rather than this

control MyVerifyChecksum(in headers hdr, inout metadata meta) {
    Checksum16() ipv4_checksum;    	   
    apply {
        if (hdr.ipv4.hdrChecksum != ipv4_checksum.get({hdr.ipv4.version, hdr.ipv4.ihl, hdr.ipv4.diffserv, hdr.ipv4.totalLen, hdr.ipv4.identification, hdr.ipv4.flags, hdr.ipv4.fragOffset, hdr.ipv4.ttl, hdr.ipv4.protocol, 16w0, hdr.ipv4.srcAddr, hdr.ipv4.dstAddr})) 
            mark_to_drop();
    }
}

or equivalently this

control MyVerifyChecksum(in headers hdr, inout metadata meta) {
    Checksum16() ipv4_checksum;    	   
    apply {
        if (hdr.ipv4.hdrChecksum != ipv4_checksum.get({hdr.ipv4.version, hdr.ipv4.ihl, hdr.ipv4.diffserv, hdr.ipv4.totalLen, hdr.ipv4.identification, hdr.ipv4.flags, hdr.ipv4.fragOffset, hdr.ipv4.ttl, hdr.ipv4.protocol, hdr.ipv4.srcAddr, hdr.ipv4.dstAddr})) 
            mark_to_drop();
    }
}

for reasons that should be obvious. If we adopt the proposed primitive, I think I'd like the same ability to use variables and assignment to avoid having to explicitly list out each field in the IP header.

Popping up a level, I'm missing the context for why we need to be so draconian. Bmv2 is a software switch, implemented in C++ so we can make it do whatever we want. Even on a hardware target, I'd really like to be able to use variables, simple control flow, etc. to express checksumming functionality (which is just a function from header + metadata bits to header + metadata bits + some extra metadata). The compiler should be able to take care of slotting the right values into the fixed-function checksum unit and assigning to the extra metadata as appropriate. To me that's in the spirit of P4_16.

If the group wants to go with this in the short-term, I prefer @antoninbas's refinement since a follow-on control can decide what to do.

@mbudiu-vmw for better or worse, I think the cat may be out of the bag in the sense that v1model is being used more broadly than as a target for P4_14 programs.

@sethfowler
Copy link
Contributor

I feel like we should just bite the bullet, accept that we need access to parser errors in the match-action pipeline, and implement #808.

@jnfoster, I feel like this approach is most natural:

control MyVerifyChecksum(in headers hdr, inout metadata meta) {
    Checksum16() checksum;    	   
    apply {
        if (checksum.get(hdr.ipv4) != 0) 
            mark_to_drop();
    }
}

@jnfoster
Copy link
Contributor

That would be nice, but doesn't it mean that checksum would need to somehow magically know which field is the checksum so it can ignore it in the calculation? If so, then I fear it's really an IP checksum extern and hence can't have a generic type and can't be used to implement all P4_14 programs...

@sethfowler
Copy link
Contributor

Usually checksums work by ensuring that the sum over the entire input, including the checksum, is zero. (Sometimes another value is used, but not usually.) In this particular case, the sum over the entire IPv4 packet header including the checksum will be zero if the checksum is valid.

@jnfoster
Copy link
Contributor

jnfoster commented Aug 17, 2017

Mea culpa (was looking at an example derived from the current translated code). For verify checksum, yes. For compute checksum, the issue would arise.

@antoninbas
Copy link
Member

@mbudiu-vmw the idea is indeed to expand v1model so it can be used to write more complete P4_16 programs (while retaining its ability to support any P4_14 program), as the PSA toolchain may take some time to catch up. After thinking about it some more, I think my preferred solution is to extend v1model's standard metadata with a one bit checksum_error flag. This flag will be set to 1 by the architecture if any of the checksum_verify extern calls in the verify control detects a checksum error, and can be consumed by the programmer in ingress. When translating from P4_14, this flag will not be referenced in the translated ingress. Would that be acceptable to you?

@jnfoster bmv2 is indeed a software switch and is able to support your code snippet. simple_switch, however, is not able to in its current form as it only supports 2 control flows, ingress and egress. For this specific issue, it could be fairly easy to update simple_switch to support 2 extra control flow, verify and compute, in a backward-compatible fashion. There would also be some work needed to support the Checksum16 extern natively in a satisfactory way. I am not opposed to the implementation of these changes, although I'd much rather do them directly in bmv2-PSA and not in bmv2-simple_switch.
Also, if there is no expectation of restricted syntax, why even have a verify and compute control flow to start with?

@mihaibudiu
Copy link
Contributor

Yes, handling the error bit as intrinsic metadata is simplest. I will work on a pr with this approach.

@cc10512
Copy link
Contributor

cc10512 commented Aug 17, 2017

+1 for @antoninbas proposal. Currently the PSA psa_parser_ingress_input_metadata_t defines a parser_error field of type ParserError_t which may include a CHECKSUM_VERIFY_ERROR. As far as I can see, the v1model intrinsic metadata and setting parser_error are equivalent. And we also agreed in the P4 arch WG that packets rejected by the parser should make it into the ingress with some reasonable way of identifying the error cause (https://github.com/p4lang/p4-spec/wiki/PSA-meeting-minutes:-Jul-24,-2017#reject-semantics-httpsgithubcomp4langp4-specissues339)

@jafingerhut P4_14 to P4_16 conversion is crucial. Even though we are encouraging everyone to write P4_16, there will be a long period of transition for legacy code. It will be best if this code runs on the p4c toolchain. And we do not want to duplicate the code in the compiler to handle two language versions, so we will continue to translate.

@jnfoster
Copy link
Contributor

jnfoster commented Aug 18, 2017 via email

@jafingerhut
Copy link
Contributor Author

jafingerhut commented Aug 19, 2017

@antoninbas wrote in an earlier comment: "my preferred solution is to extend v1model's standard metadata with a one bit checksum_error flag. This flag will be set to 1 by the architecture if any of the checksum_verify extern calls in the verify control detects a checksum error, and can be consumed by the programmer in ingress. When translating from P4_14, this flag will not be referenced in the translated ingress."

Is the intended behavior in P4_14 to drop packets that fail any of the defined checksums for incoming packets?

If so, how will you implement that behavior in a P4_16 program that has been auto-translated from P4_14, if "this flag will not be referenced in the translated ingress"? Will the flag be referenced somewhere else in the translated P4_16 program?

Also, is there some short sample P4_16 program that uses this approach yet, for reading and understanding purposes?

Also, anyone willing to make modifications to the latest draft of the PSA document with any changes needed to match this proposed behavior? Or are these proposed changes for v1model only proposed for v1model, and PSA is still free to implement checksums in a different way?

@mihaibudiu
Copy link
Contributor

The flag is simply not used when translating P4-14 programs, but if one writes P4-16 against the v1model then the flag can be used.

This is all about v1model, the PSA does not need to follow this approach. Note that v1model does not support incremental checksum updates, whereas the goal for PSA is to have that support. Also note that the current API is still not sufficient for handling TCP checksums. We may need to add an additional extern function "checksum_verify_including_payload".

Please take a look at #858, in particular the v1model.p4 file is the most important. The final API I implemented also has an argument for conditional execution.

@mihaibudiu mihaibudiu added the fixed This topic is considered to be fixed. label Aug 22, 2017
@mihaibudiu
Copy link
Contributor

with the new APIs this issue should be gone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This behavior is unintended and should be fixed. fixed This topic is considered to be fixed.
Projects
None yet
Development

No branches or pull requests

7 participants