-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PSA] Restrictions on use of Checksum extern? #385
Comments
Also, I am guessing that it is not practical to try to implement a checksum other than one like the IPv4 header checksum using the Checksum extern? That one is commutative and associative in how it is calculated, over 16-bit "words". Things like CRCs are not commutative, so implementing update() of say 20 bytes of data, then trying to remove() on some 4-byte piece in the middle, then update() to add back in a different 4-byte piece, would not be possible unless you knew exactly where that 4-byte piece occurred in the middle of the 20-byte original. Even if you augmented the remove() and update() methods to indicate such a position in the data, I'm not sure it is practical to implement such changes to a CRC with such an API. |
I agree with @jafingerhut, having explicit incremental checksum updates is a really nice idea but they are only applicable to checksum algorithms. The penalty for recomputing UDP/TCP checksums is so high, that having a specific incremental checksum extern is worth while. In P4-14 days it was a lot easier to detect when an incremental checksum update could be done and it was feasible to have the backend handle the updates. I don't think this that easy in P4-16. |
I expect this will be a difficult design decision. Some architectures may support incremental checksum computation, while other may not. Perhaps the right choice is to offer both and to require that at least one of them is supported.
|
BTW: there is nothing in P4-16 about checksum units; you can design a checksum unit which looks very much like the P4-14 version; the actual API is up to the architecture. |
This amounts to forking the architecture into PSA-incremental and
PSA-nonincremental. If we're serious about issues like portability,
compliance, etc. In my opinion, we should strive to do this only if we are
really unable to resolve the issue.
…On Wed, Aug 16, 2017 at 11:58 AM, Mihai Budiu ***@***.***> wrote:
I expect this will be a difficult design decision. Some architectures may
support incremental checksum computation, while other may not. Perhaps the
right choice is to offer both and to require that at least one of them is
supported.
#if INCREMENTAL
extern IncrementalUnit { ... }
#else
extern ChecksumUnit { .. }
#endif
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#385 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABwi0rAEnY39itcO8Yba5pUjiK4LX1fQks5sYxGIgaJpZM4O44G_>
.
|
If your programmable platform has only support for the "wrong" kind of checksum unit, then it won't ever be able to implement PSA. |
I understand. |
Something to keep in mind about whatever checksum extern we end up with in PSA: It would be nice if the checksum extern API made it reasoanble to implement 'fixing up' a TCP header checksum for packets experiencing changes to their header fields due to NAT. A reasonable common way to do this, I believe, is to assume that the TCP header checksum is correct, and 'subtract out' the effects of the old TCP pseudo-header (containing IP source and destination addresses), and then 'add in' the effects of the new TCP pseudo-header. The remove() extern method in the latest PSA draft was created with this use case in mind, I would bet. |
It seems like there are three main issues related to checksums.
The current PSA source in master includes the following declarations: control ComputeChecksum<H, M>(inout H hdr, inout M user_meta);
package PSA_Switch<IH, IM, EH, EM>(IngressParser<IH, IM> ip,
Ingress<IH, IM> ig,
ComputeChecksum<IH, IM> ic,
Deparser<IH> id,
EgressParser<EH, EM> ep,
Egress<EH, EM> eg,
ComputeChecksum<EH, EM> ec,
Deparser<EH> ed); Note that in issue #360, we already decided to eliminate the first However, it's natural to wonder whether the
The V1Model architecture recently proposed to adopt a new extern function (here specialized to 16-bits): /**
Verifies the 16-bit checksum of the supplied data.
T must be a list expression where all the fields are bit-fields or varbits
and where the total dynamic length of the fields is a multiple of 16 bits.
If this method detects that a checksum of the data is not correct it
sets an internal error flag.
@param condition If 'false' the verification always succeeds.
@param data Data whose checksum is verified.
@param checksum Expected checksum of the data.
*/
extern void verify_checksum<T>(in bool condition, in T data, in bit<16> checksum);
/**
Computes the 16-bit checksum of the supplied data.
@param condition If 'false' the checksum is not changed
@param data Data whose checksum is computed.
@param checksum Checksum of the data.
T must be a list expression where all the fields are bit-fields or varbits
and where the total dynamic length of the fields is a multiple of 16 bits.
*/
extern void update_checksum<T>(in bool condition, in T data, inout bit<16> checksum); There are a few issues with this definition we might consider in defining the PSA extern:
As a strawman, suppose that PSA provided the following function: extern O checksum<O,T>(in T data, HashAlgorithm hash); Rationale:
For example, I could imagine writing the following code in the parser: state parse_ipv4 {
extract(hdrs.ipv4);
verify(hdr.ipv4.checksum == checksum({ hdr.ipv4.src, ... }, HashAlgorithm.crc16 ), error.ChecksumError);
transition ...
} Note that if I didn't want to bail out of the parser, instead of using In the same vein, I could imagine writing the following code in the egress control (or update checksum control if we keep that): if (hdr.ipv4.isValid())
hdr.ipv4.checksum = checksum({hdr.ipv4.src, ...}, HashAlgorithm.crc16); I realize this proposal slightly complicates compilation, but I would argue not in a particularly harmful way. A compiler for the V1Model functions already has to keep track of boolean conditions and references to state in parsed headers and metadata. It should be straightforward to rewrite code that uses (only)
Here's one proposal for a extern object: extern checksum<O> {
checksum(HashAlgorithm hash);
void clear();
void add<T>(in T data);
void remove<T>(in T data);
O get();
bool verify(); // equivalent to get() == 0
} Comments? @jafingerhut @cc10512 @mbudiu-vmw @hanw @antoninbas |
In the latest PSA draft, there are two ComputeChecksum blocks not because one is after parser and other before deparser, but because one is after ingress, and before ingress deparser (currently a separate control block in the PSA draft), and the other is after egress, and before egress deparser. There is currently no 'verifyChecksum' control block in the latest PSA draft, in favor of performing checksum verification of received packets in the parser. |
From some of the discussion on this p4lang/p4c issue p4lang/p4c#775 I had the impression that verify_checksum() and update_checksum() were introduced primarily to ease automatic conversion from P4_14 to P4_16 programs. If that is so, perhaps those extern functions could be preferred for use by that automatic conversion process, but the checksum extern object at the end of @jnfoster's comment above [1], with both add() and remove() methods, might be preferable for use when writing P4_16 PSA programs by hand? [1] #385 (comment) |
That's my fault. I think I missed the meeting where that was discussed in detail and in my conversations this week, there was some confusion. I've updated my comment to reflect that. |
That's right.
It's fine with me. However, I'd be curious to hear whether we want to require this functionality -- e.g., are there targets that might provide an extern function for checksumming but not an incremental unit that could implement the object? |
@jafingerhut is right: the changes about *_checksum functions which were made recently were entirely confined to v1model and were made to more accurately reflect restrictions of the BMv2 simple_switch implementation of these functions. These restrictions are directly inherited from the P4-14 spec, so we cannot work around them. If your target is very restricted in what it can do, it may be actually very difficult to write compiler transformations to massage all feasible programs into the restricted forms allowed by the target. I expect that the checksum unit will be a perfect example. You should consider what is needed to do this for an exotic target such as Tofino. The discussion is whether PSA has to support incremental checksums. I believe it should. I don't know exactly what should happen when someone implements support for PSA on a target device which actually does not have incremental checksums. |
As an addendum, without thinking too carefully about resource implications for any particular target, I think I would be in favor of a solution that eliminated the Suppose that we are on a target where all checksumming must happen after the ingress/egress pipelines, but the programmer has written a program that computes a checksum in the middle of one of the ingress/egress pipelines. It follows that the compiler will either have to implement the checksum computation using ALUs or other resources -- which might be expensive but should be feasible -- or it will have to relocate the checksum computation to the end of the block.
This story is relatively simple for a batch-mode checksum function. It's more complicated for an object, which has state that might persist across calls. But, waving my hands slightly, it should be possible for the compiler to extract that state from the call site and reconstruct it later to emulate the semantics of the original program. |
@mbudiu-vmw
Right. Note that the vagaries of the P4_14 conversion need to have any effect on PSA. We are starting from a clean slate and can design an elegant solution.
I think that for the common case, the transformation will be not too difficult to implement and will lead to the same efficient implementations as one would write with a |
I believe that the parts of this issue related to the Internet checksum for IPv4, TCP, and UDP headers have been fully addressed by the addition of the InternetChecksum extern to the PSA, which supports incremental checksums and 'calculated from scratch' checksums, at least for headers (by design, it does not support performing checkums over packet payloads). Unless there is some other aspect of this issue that remains, I think it can be closed now. |
Indeed, we have considered most aspects. |
The current draft of the Checksum extern has clear(), update(), remove(), and get() methods proposed.
Some externs explicitly maintain state across processing of different packets. There doesn't appear to be any desirable reason why the Checksum extern should. Is it considered that every time a new packet begins processing, this extern has automatically had clear() called on it already? Or is it required that for each new packet you first call clear(), because if you do not, it might have internal checksum state left over from a previous packet?
If we restrict our attention to the 16-bit one's complement sum used for IPv4 header checksums for a moment, the primary, and maybe only (?), use case for this extern, a few more questions:
Is it considered an error to call update() or remove() with a collection of data that is anything other than a multiple of 16 bits?
Is it expected that if a list of fields is given, that their order is significant, because the extern effectively takes the list of fields, concatenates them, breaks it up into consecutive multiples of 16 bits long, and does the one's complement sum on these 16-bit pieces?
If all of that is true, then an example showing how to correctly use this to first calculate an IPv4 header checksum, then modify only 1 byte, e.g. the TTL, then use remove() followed by update() to recalculate the new checksum with minimal work, would be good, including pointing out that the remove() and update() must both contain the "2-byte aligned" group of 16 bits in the header that contains the TTL field.
The text was updated successfully, but these errors were encountered: