For future NICs, the best approach would be to tell the HW to prepend metadata directly to the packet buffer, right before the start of the frame.
However, the vast majority of current NICs cannot do that.
For now, we have 4 visible options:
1. Use if-else ladders, as is done in e.g. the Flow Dissector code.
   This may work well when we need just 2-3 fields such as the RSS hash or csum, but every new field or layout will require extending the code;
2. use structs to describe the desired layout and then perform memcpy in a for-loop.
   This is a far more flexible way, but it can be overkill for cases with 2-3 u32 fields;
3. use a tiny XDP prog to generate the metadata and then run the actual prog.
   This might be both scalable and fast, but it involves prog linking etc. and can also be overkill for lots of cases;
4. just copy the full descriptor to the metadata area.
   The fastest way, but it lacks flexibility and can get worse when the descriptor size is 40+ bytes.
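
To make the struct-described layout option more concrete, here is a minimal userspace sketch of the idea. Everything in it is an assumption made for illustration: `struct demo_rx_desc`, `struct demo_meta_field`, `DEMO_FIELD` and `demo_build_meta()` are hypothetical names, and the descriptor layout does not match any real NIC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RX descriptor; real descriptors differ per device. */
struct demo_rx_desc {
	uint32_t rss_hash;
	uint16_t csum;
	uint16_t vlan_tci;
	uint64_t timestamp;
};
static_assert(sizeof(struct demo_rx_desc) == 16, "unexpected descriptor padding");

/* One layout entry: copy `len` bytes from the descriptor to the metadata. */
struct demo_meta_field {
	uint16_t src_off;	/* offset inside the descriptor */
	uint16_t dst_off;	/* offset inside the metadata area */
	uint16_t len;		/* number of bytes to copy */
};

#define DEMO_FIELD(member, dst)					\
	{ offsetof(struct demo_rx_desc, member), (dst),		\
	  sizeof(((struct demo_rx_desc *)0)->member) }

/* Example layout: RSS hash at offset 0, timestamp at offset 8. */
static const struct demo_meta_field demo_layout[] = {
	DEMO_FIELD(rss_hash, 0),
	DEMO_FIELD(timestamp, 8),
};

/* Returns the metadata size in bytes, or -1 if a field does not fit. */
static int demo_build_meta(void *meta, size_t meta_len,
			   const struct demo_rx_desc *desc,
			   const struct demo_meta_field *layout, size_t n)
{
	size_t i, end = 0;

	for (i = 0; i < n; i++) {
		size_t field_end = (size_t)layout[i].dst_off + layout[i].len;

		if (field_end > meta_len)
			return -1;
		memcpy((uint8_t *)meta + layout[i].dst_off,
		       (const uint8_t *)desc + layout[i].src_off,
		       layout[i].len);
		if (field_end > end)
			end = field_end;
	}
	return (int)end;
}
```

Adding a field or changing the layout only touches the table, not the copy loop. The price is one variable-length memcpy per field, which is where the overhead for 2-3 u32 fields would come from.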
All of these can be tried, benchmarked and picked according to the results, but some design discussion would definitely help.
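
As a possible baseline for such a comparison, the if-else ladder approach could look roughly like the sketch below. It is a userspace illustration under stated assumptions: `struct demo_desc`, the `DEMO_WANT_*` flags and `demo_ladder_meta()` are hypothetical and not taken from any driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor, for illustration only. */
struct demo_desc {
	uint32_t rss_hash;
	uint16_t csum;
	uint16_t status;
};
static_assert(sizeof(struct demo_desc) == 8, "unexpected descriptor padding");

/* Hypothetical request flags: which fields the consumer wants. */
#define DEMO_WANT_HASH	(1u << 0)
#define DEMO_WANT_CSUM	(1u << 1)

/* Returns the number of metadata bytes written, or 0 if they do not fit. */
static size_t demo_ladder_meta(uint8_t *meta, size_t meta_len,
			       const struct demo_desc *desc, uint32_t want)
{
	size_t off = 0;

	if (want & DEMO_WANT_HASH) {
		if (off + sizeof(desc->rss_hash) > meta_len)
			return 0;
		memcpy(meta + off, &desc->rss_hash, sizeof(desc->rss_hash));
		off += sizeof(desc->rss_hash);
	}
	if (want & DEMO_WANT_CSUM) {
		if (off + sizeof(desc->csum) > meta_len)
			return 0;
		memcpy(meta + off, &desc->csum, sizeof(desc->csum));
		off += sizeof(desc->csum);
	}
	/* Every new field or layout needs another branch here. */
	return off;
}
```

The fixed-size copies are cheap for the compiler to inline, but the function grows with each supported field, which is the scaling problem mentioned for this option.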