Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for "UNDERLAY_SET_DSCP" and "UNDERLAY_SET_DSCPV6" tables #3145

Merged
merged 15 commits into from
May 30, 2024

Conversation

siqbal1986
Copy link
Contributor

@siqbal1986 siqbal1986 commented May 13, 2024

What I did
This PR adds the logical table UNDERLAY_SET_DSCP and UNDERLAY_SET_DSCPV6.
This feature allows SONIC to match an ingressing L4 packet and change the DSCP field of the outer header when the packet is egressing.
These tables are only created on Cisco-8000, Mlnx and VS platforms.
Why I did it
The match set for these tables are following. The action for both of these tables is SET_DSCP
UNDERLAY_SET_DSCP

  • SRC_IP
  • DST_IP
  • IP_PROTOCOL
  • L4_SRC_PORT
  • L4_DST_PORT
  • TCP_FLAGS
  • DSCP

UNDERLAY_SET_DSCPV6

  • SRC_IPV6
  • DST_IPV6
  • IPV6_NEXT_HEADER
  • L4_SRC_PORT
  • L4_DST_PORT
  • DSCP
    Note: These tables are not created but only their names are used.

Since we require matching based on L4 parameters, all the mentioned match attributes are necessary. Merging these into a single table would result in a larger TCAM footprint which may be impact existing ACL usage scenarios.

These table names transalte into MARK_META and MARK_METAV6 and EGR_SET_DSCP tables.
The translation is such that an attempt to create UNDERLAY_SET_DSCP or UNDERLAY_SET_DSCPV6 results into creation of
MARK_META & EGR_SET_DSCP or MARK_METAV6 & EGR_SET_DSCP .
The EGR_SET_DSCP table is shared and created only once. if both UNDERLAY_SET_DSCP and UNDERLAY_SET_DSCPV6 are created then only one instance of EGR_SET_DSCP is created to save tcam resource. The EGR_SET_DSCP is created in the Egress stage.

These internal MARK_META/V6 and EGR_SET_DSCP tables use SAI_ACL_ENTRY_ATTR_ACTION_SET_ACL_META_DATA and SAI_ACL_ENTRY_ATTR_FIELD_ACL_USER_META to set and match the metaData field.
For Example:
if the intent is to match an ingressing packet with fields
"DST_IP": "20.0.0.1/32",
"SRC_IP": "10.0.0.0/32",
"DSCP": "1"
"SRC_PORT" : "10"
"DST_PORT" : "20"
and set its outer DSCP to 12 after encapsulation,
This would be done so by MARK_META entry in ingress tcam matching based on above mentioned critera and setting a metaData value e.g. "1"
Another EGR_SET_DSCP entry in egress stage would be created with match criteria of metaData=1 and action of SET_DSCP=12.

Each entry in the EGR_SET_DSCP table is refcounted and shared among all the rules interested in same DSCP value.
There are 7 metadata values available [1...7]. This is currently hardcoded but this can be enhanced based on capibility check.

A Metadata Manager class governs the allcoation and freeing of each metadata value and they are shared based on DSCP value.
How I verified it
Tests added for the tables and Metadata Manager. 9 Test cases have been added which verfiy the table creation, deletion, EGR_SET_DSCP referencing, Entry creation/deletion, metadata value allcation/ release and exhausion scenarios.

Details if related
tested on mlnx 2700 with following rule
{ "ACL_RULE": {
"OVERLAY_MARK_META_TEST|RULE0": {
"PRIORITY": "999",
"DSCP_ACTION": "40",
"SRC_IP": "1.1.1.1/32"
},
"OVERLAY_MARK_META_TEST|RULE1": {
"PRIORITY": "998",
"DSCP_ACTION": "42",
"SRC_IP": "2.2.2.2/32"
}
} and a vxlan tunnel
dscp value change

…s. This feature allows

SONIC to match an ingressing packet and change the DSCP field of the outer header when the
packet is egressing.
Tests added for the tables and Metadata Manager.
@siqbal1986 siqbal1986 requested a review from prsunny as a code owner May 13, 2024 05:48
@siqbal1986 siqbal1986 changed the title Added support for "UNDERLAY_SET_DSCP" and "UNDERLAY_SET_DSCPV6" tables [Draft] Added support for "UNDERLAY_SET_DSCP" and "UNDERLAY_SET_DSCPV6" tables May 13, 2024
},
{
// MARK_METAV6
TABLE_TYPE_MARK_META_V6,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to combine the table TABLE_TYPE_MARK_META and TABLE_TYPE_MARK_META_V6?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can do it but that would result increating a bigger table which would be wasteful. Currently only broadcom supports the merged table as per my understanding. We need to have all the match attributes which are necessary for Ipv4/v6 packet matching.

.withMatch(make_shared<AclTableMatch>(SAI_ACL_TABLE_ATTR_FIELD_L4_SRC_PORT))
.withMatch(make_shared<AclTableMatch>(SAI_ACL_TABLE_ATTR_FIELD_L4_DST_PORT))
.withMatch(make_shared<AclTableMatch>(SAI_ACL_TABLE_ATTR_FIELD_TCP_FLAGS))
.withMatch(make_shared<AclTableMatch>(SAI_ACL_TABLE_ATTR_FIELD_DSCP))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please only keep the necessary matching field to save TCAM. And that should be better if you can combine V4 and V6 into one table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as stated above we need to match ipV4/V6 packets. i dont think we can skip on any of these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the fields to be used for matching? If it's for packet capturing, do we need to align with the fields defined in MIRROR/MIRRORV6?

else
{
//create a dummy table with no ports. The updateAclTable will remove the ports
// which were associated with the egrSetDscpTable we just added because the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose for this dummy table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only create one instance of EGR_SET_DSCP, the bind ports it needs to support have to be a set of MARK_META and Mark_METAV6. if e.g. MARK_META hs bind ports [Eth1, Eth2] and MARK_METAV6 has [Eth3,Eth4] then EGR_SET_DSCP needs to bind on Eth1, Eth2, Eth3, Eth4. if one of the table is removed we need to remove its ports form EGR_SET_DSCP. This EGR_SET_DSCP dummy table object is created if one instance of the MARK_META table is removed. we take its bind ports and take those out of the actual EGR_SET_DSCP table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the port merge happened?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siqbal1986 Can you check this comment?

if (tableTypeName == TABLE_TYPE_UNDERLAY_SET_DSCP || tableTypeName == TABLE_TYPE_UNDERLAY_SET_DSCPV6)
{
// This table translates into 2 tables.
//TABLE_TYPE_MARK_META/V6 on ingress side and TABLE_TYPE_EGR_SET_DSCP on Egress side.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there capability check for creating the new tables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently no. but i inted to put that in in the next PR as an enhancement.

@@ -4447,7 +4740,7 @@ void AclOrch::doAclTableTask(Consumer &consumer)
// Add mandatory ACL action if not present
newTable.addMandatoryActions();
// validate and create/update ACL Table
if (bAllAttributesOk && newTable.validate())
if (egrSetDscpStatus && bAllAttributesOk && newTable.validate())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why check egrSetDscpStatus here? It will break the current code.

Copy link
Contributor Author

@siqbal1986 siqbal1986 May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the egrSetDscpStatus is by default TRUE. it can only be false if we attempt to create EGR_SET_DSCP table and fail. In such a case creating the MARK_META/V6 table is futile and should fail.
The logic goes like this.
egrSetDscpStatus = true
if UNDERLAY_SET_DSCP
then create EGR_SET_DSCP. failure should set egrSetDscpStatus =false
if egrSetDscpStatusis false then dont create MARK_META/v6
if egrSetDscpStatusis true then create MARK_META/v6 or other table for which the request came in.

bAllAttributesOk = false;
}
}
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the alignment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is github issue. i dont see the alignment problem in other editors.


// validate and create ACL rule
if (bAllAttributesOk && newRule->validate())
if (egrDscpRuleStatus && bAllAttributesOk && newRule->validate())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why check egrDscpRuleStatus here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same logic as before.
egrDscpRuleStatus = true
if table type is UNDERLAY_SET_DSCP
then create EGR_SET_DSCP rule . failure should set egrDscpRuleStatus =false
if egrDscpRuleStatus is false then dont create other rule.
if egrDscpRuleStatus is true then create MARK_META/v6 rule or other table rule for which the request came in.

m_egrDscpRuleMetadata.erase(key);
}
}
if (egrDscpRuleStatus && removeAclRule(table_id, rule_id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, why egrDscpRuleStatus is checked here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if removal of EGR_SET_DSCP rule fails then we dont attempt to remove the rule of MARK_META/v6.
both rules must be added and removed together.

@prsunny
Copy link
Collaborator

prsunny commented May 15, 2024

@siqbal1986 , please move to draft PR if the title/description etc are not ready. This PR is not giving reviewers the full details.
There is build and coverage issues. Please fix them as well and move to 'ready'.

@@ -120,6 +120,7 @@ jobs:
sudo /sbin/ip link add Vrf1 type vrf table 1001 || { echo 'vrf command failed' ; exit 1; }
sudo /sbin/ip link del Vrf1 type vrf table 1001
pushd tests
mkdir -p logs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this change? Please remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed already.
the tests were passing on my buidl enviorment but failing in the pipeline. This was due to some difference in SAI searialize API. I have removed these extra log generation parameters.

@siqbal1986 siqbal1986 force-pushed the overlay_dscp_change_acl branch from cff3702 to b7757e9 Compare May 16, 2024 00:33
@siqbal1986 siqbal1986 changed the title [Draft] Added support for "UNDERLAY_SET_DSCP" and "UNDERLAY_SET_DSCPV6" tables Added support for "UNDERLAY_SET_DSCP" and "UNDERLAY_SET_DSCPV6" tables May 16, 2024
Refactored code to segrigate the new changes from existing functionality.
Improved comments.
Added check so that these new tables are only created on cisco-800, Mlnx
 and VS platforms.
@siqbal1986
Copy link
Contributor Author

@bingwang-ms i have done the following.
Refactored code to segrigate the new changes from existing functionality.
Improved comments.
Added check so that these new tables are only created on cisco-800, Mlnx
and VS platforms.

{
ACL_STAGE_EGRESS,
{
SAI_ACL_ACTION_TYPE_PACKET_ACTION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the purpose if to set DSCP value of the outgoing packet, why SAI_ACL_ACTION_TYPE_PACKET_ACTION is required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a mistake. I have fixed it with the right actions. I dont really understand the SAI_ACL_ACTION_TYPE parameter. This is is set on table level but each entry comes with own actions. Does this specify action set for the table.

if (matchData.data.u8 < METADATA_VALUE_START || matchData.data.u8 > METADATA_VALUE_END)
{
SWSS_LOG_ERROR("Invalid MATCH_METADATA configuration: %s, expected value between 1-127", attr_value.c_str());
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message doesn't align with the value of METADATA_VALUE_END.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

else if (acl->isUsingEgrSetDscp(table) || table == EGR_SET_DSCP_TABLE_ID)
{
return make_shared<AclRuleUnderlaySetDhcp>(acl, rule, table, m_metadataMgr);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AclRuleUnderlaySetDhcp should be a typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

return false;
}
cachedMetadata = metadata;
attr_name = ACTION_META_DATA;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the input attr_name is changed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table configured is UNDERLAY_SET_DSCP/V6 with ACTION SET_DSCP. When a new entry comes in it has ACTION_SET_DSCP with a dscp value. Here the action is changed to SET_METADATA and the allocated metadata value is set with this action for this entry. The dscp value is stored in the cachedDscpValue which is an internal variable of the AclRuleUnderlaySetDscp structure. Before this entry is installed in addAclRule, a EGR_SET_DSCP rule is created and the cachedDscpValue is set as its action while the cachedMetadata allocated here is used as its match criteria.

// otherwise we allocate a new metadata. This metadata is then set an the action for the Rule of this table. We also cache the SET_DSCP
// value and the allocated metadata in a the rule structure itself so that when we go to addRule we can use these to add the
// egr_set_dscp rule
if (attr_name == ACTION_DSCP && (type == TABLE_TYPE_MARK_META || type == TABLE_TYPE_MARK_META_V6))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What ports are the Egress table bound to? Is it always the same with the Ingress table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Egress table is bound to a set of ports which contains all the ports associated with tables of type UNDERLAY_SET_DSCP and UNDERLAY_SET_DSCPV6. There is a test test_OverlayBothv4v6TableCreationDeletion and test_OverlayBothv4v6TableSameintfCreationDeletion test these scenarios.
e.g. if one table has ports [eth0,eth1,eth2] and other has [eth2,eth3,eth4]
The egress table would be bound to [eth0,eth1,eth2,eth3,eth4].
If one of the table is removed only the ports associated with the table are removed except those which are used by other others. In the above mentioned case removal of first table would remove only eth0 and eth1. eth2 would be retained because second table is still bound to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The port binding may not be good enough for the prod scenario.
For example, the packet that we are interested is ingressed from eth0 and eth1 and egressed from eth10. So the ingress table should be bound to eth0 and eth1, and egress table should be bound to eth10 only. In current design, the egress table is always bound to eth1, eth2 and eth10. The concern is the DSCP of unexpected packet is changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order for the Egr_Set_dscp to work similar to everflow, it needs to be bound to all upstream or downsteam ports. There are 2 ways to do that.

  1. by default statically set EGR_SET_DCP table to be bound to all ports in the Orchagent code.
  2. While configuring, Associate the UNDERLAY_SET_DSCP/v6 with both upstream and downstream ports as it is done in prod scenarios with everflow.
    While option 1 is more idiot-proof, it requires midofication in portMgr to expose list of active/Inactive ports along with their type.( since we dont need this on Mgmt or cpu ports, the filtering on the port type would be required.) This makes the whole implementation hackey. Another downside is we are consuming table membership even when one may not need to and the user of this table has no choice to select the port association for this table in the future.
    I spent some time exploring the first option and concluded that while its is possible, it make the whole solution very specific to one use case.
    The dowside of option 2 is that user has to account for and be aware the fact that the traffic may egress on ports not asociated with UNDERLAY_SET_DSCP/v6. But for the prod scenario, we would configure it with both upstream and downstream ports just as we do with everflow.

if (m_MetadataRef[metadata] == 0)
{
for (auto iter = m_dscpMetadata.begin(); iter != m_dscpMetadata.end(); ++iter)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove an item from map in a for loop when the iter is being referenced can lead to crash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, ill add more tests for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the fix and. added a test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is an easier way to do this. iter = m_dscpMetadata.erase(dscpToErase)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure what you mean here. the metadataMgr has a map of dscp -> metada m_dscpMetadata and refcount of metadata. when metadata refcount becomes 0, the logic finds goves over the metadataMgr to find the dscp which as this metadata asosciated. and finally removes it.
how is iter = m_dscpMetadata.erase(dscpToErase) easier. could you please explain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below is an example.

for (auto it = m.cbegin(); it != m.cend() /* not hoisted */; /* no increment */)
{
  if (must_delete)
  {
    m.erase(it++);    // or "it = m.erase(it)" since C++11
  }
  else
  {
    ++it;
  }
}

@siqbal1986
Copy link
Contributor Author

the current pipeline failures in SWSS are due to CRM tests which are failing in master.

@prsunny
Copy link
Collaborator

prsunny commented May 30, 2024

@siqbal1986 , please add sonic-mgmt tests and link it here when ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants