New p4orch development changes #3066

mint570 · 2024-02-29T04:47:05Z

New p4orch development changes.

This is the first PR for upstreaming the recent P4Orch changes (it has been a while since the last upstream).
The main changes include:

Add new status code SWSS_RC_NOT_EXECUTED
Enable redis pipeline in response publisher for P4Orch (performance improvement)
Enable a background thread for DB writes in response publisher for P4Orch (performance improvement)
P4Orch writes responses to APPL DB instead of APPL STATE DB (this prepares for the zmq change; APPL DB will be used as APPL STATE DB in P4Orch for performance improvement)

More details on the P4Orch response path changes (the changes apply only to P4Orch; other orchs have no behavior changes):

Enable P4Orch to use redis pipeline in response publisher.
The redis pipeline feature in response publisher was added by the SONiC community to improve performance. We enabled P4Orch to use the pipeline feature, which improves performance in batched request processing. This change has no behavior changes.
Enable background thread in response publisher to write to DB for P4Orch.
This change moves the APPL STATE DB write into a background thread, which improves performance. This change has no behavior changes. The original change was done for APPL STATE DB, since the response path wrote into the APPL STATE DB. With the next change, APPL STATE DB is deprecated and replaced with APPL DB. So the overall change is that the P4Orch response path will write into APPL DB in a background thread.
P4Orch writes to APPL DB instead of APPL STATE DB in response path.
The APPL STATE DB is supposed to hold the entries that were successfully programmed in the ASIC, while the APPL DB holds the intent. However, for P4 tables, P4RT needs to clean up APPL DB to match APPL STATE DB and keep them in sync when a request fails. This is done for warmboot, as we don't want to program the failed intent during warmboot. So for P4RT, the APPL STATE DB is redundant information. This change replaces APPL STATE DB with APPL DB. In a later PR, we will also upstream the change for P4Orch to enable zmq. This will halve the DB write operations. This change requires the later P4RT change to be in sync.
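
To illustrate why pipelining helps with batched requests, here is a minimal sketch. It assumes a hypothetical `SketchPipeline` class and a caller-supplied `send_batch` callback that stands in for one round trip to the server; neither is the swss-common API, and the real response publisher may differ in every detail.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Redis pipeline (not the swss-common class):
// commands are buffered locally and handed over in one batch on flush().
class SketchPipeline
{
  public:
    using SendBatchFn = std::function<void(const std::vector<std::string> &)>;

    // send_batch models a single round trip that carries many commands.
    explicit SketchPipeline(SendBatchFn send_batch) : m_send_batch(std::move(send_batch))
    {
    }

    // Buffer one command; nothing is sent yet.
    void push(std::string command)
    {
        m_pending.push_back(std::move(command));
    }

    // Send everything buffered so far as one batch, then clear the buffer.
    void flush()
    {
        if (m_pending.empty())
        {
            return;
        }
        m_send_batch(m_pending);
        m_pending.clear();
    }

    std::size_t pending() const
    {
        return m_pending.size();
    }

  private:
    SendBatchFn m_send_batch;
    std::vector<std::string> m_pending;
};
```

With this shape, N responses in one batch cost one call to `send_batch` instead of N, which is the effect the description attributes to the pipeline feature.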

mint570 · 2024-02-29T04:55:47Z

This PR requires sonic-net/sonic-swss-common#828

prsunny · 2024-04-15T16:51:56Z

orchagent/response_publisher.cpp

@@ -64,26 +64,52 @@ void RecordResponse(const std::string &response_channel, const std::string &key,

 } // namespace

-ResponsePublisher::ResponsePublisher(bool buffered)


Can you please provide the details of response_publisher in the description? What was the behavior before and what's the new change? Is it breaking any previous implementation?

Updated the PR description.

prsunny · 2024-04-15T16:53:12Z

Why change the return path to APP_DB instead of APPL_STATE_DB?

mint570 · 2024-04-15T17:11:00Z

Why change the return path to APP_DB instead of APPL_STATE_DB?

I updated the PR description.
This is to prepare for the zmq change that we are going to upstream. P4Orch will only update APPL DB (in a background thread) on the response path. APPL STATE DB is redundant for P4Orch (as we need to keep the two in sync for warmboot).

prsunny · 2024-04-22T17:21:13Z

@qiluo-msft , could you please review the response publish section? As per description there is no change to other orch.

mint570 · 2024-04-24T21:36:34Z

Got vs test failure on fabric port test.

mint570 · 2024-04-25T17:19:34Z

Got vs test failure on fabric port test.

Looks like this is not related to my change. I tried a test PR with no changes, and it still failed the vs test.

prsunny · 2024-04-26T17:57:34Z

@mint570 , there has been a VS issue in buildimage and is fixed now. I'll trigger all PRs to rerun

qiluo-msft · 2024-05-02T01:23:17Z

@bocon13, could you please review the response publish section?

qiluo-msft · 2024-05-02T01:23:42Z

orchagent/response_publisher.h

+    // Thread to write to DB.
+    std::unique_ptr<std::thread> m_update_thread;
+    std::queue<entry> m_queue;
+    mutable std::mutex m_lock;


mutable

why mutable?

That is a common practice for mutex objects.

The keyword "mutable" means that the variable can be changed in a const method.
If we have a method that reads a variable protected by a mutex, it is natural to declare the method as const since it only performs a read. But the mutex object needs to be mutable for the read method to acquire the lock.

In this case, we might not really need it to be mutable. But we should follow the common practice.

we might not really need it to be mutable -> let's just remove it.

Thanks for the explanation! (not a blocking issue)
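
For readers following the `mutable` discussion, a minimal sketch of the pattern being described. The class name `PendingQueue` and its methods are hypothetical and are not taken from `response_publisher.h`; the point is only that a `const` reader still has to lock the mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical class, for illustration only.
class PendingQueue
{
  public:
    void add(int value)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_queue.push(value);
    }

    // A read-only accessor, so it is declared const. Locking modifies the
    // mutex, which compiles here only because m_lock is declared mutable.
    std::size_t size() const
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_queue.size();
    }

  private:
    std::queue<int> m_queue;
    mutable std::mutex m_lock;
};
```

If the class has no `const` method that takes the lock, `mutable` has no effect, which is the case raised in the review.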

qiluo-msft · 2024-05-02T01:25:16Z

orchagent/response_publisher.h

@@ -57,8 +61,29 @@ class ResponsePublisher : public ResponsePublisherInterface
    void setBuffered(bool buffered);

  private:
+    struct entry


entry

Where can I find HLD? #Closed

This is just internal implementation details. We don't have HLD for this.

The overall design is to put the DB update operation into a different thread for the response publisher.

In the detailed implementation here, we use a FIFO queue to store the DB update events. The main thread will queue up the event into the queue, and the "DB update thread" will read from the queue and process the DB update.
The "entry" struct here is the "DB update event".

Let me know if you need more details on this.
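
The design described above can be sketched roughly as follows. This is not the code in the PR: `Entry`, `SketchDbUpdater`, the `shutdown` field, and the `WriteFn` callback are assumptions made for illustration, and only the `flush` flag is visible in the reviewed diff.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical "DB update event".
struct Entry
{
    std::string key;
    std::string value;
    bool flush = false;    // ask the worker to flush after this event
    bool shutdown = false; // assumption: sentinel that stops the worker
};

class SketchDbUpdater
{
  public:
    using WriteFn = std::function<void(const Entry &)>;

    explicit SketchDbUpdater(WriteFn write)
        : m_write(std::move(write)), m_thread(&SketchDbUpdater::run, this)
    {
    }

    // Queue a shutdown event behind all pending events, then wait for the worker.
    ~SketchDbUpdater()
    {
        Entry stop;
        stop.shutdown = true;
        enqueue(std::move(stop));
        m_thread.join();
    }

    // Called by the main thread: push the event and wake the worker.
    void enqueue(Entry e)
    {
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_queue.push(std::move(e));
        }
        m_signal.notify_one();
    }

  private:
    // Runs on the "DB update thread": pop events in FIFO order and write them.
    void run()
    {
        for (;;)
        {
            Entry e;
            {
                std::unique_lock<std::mutex> lock(m_lock);
                m_signal.wait(lock, [this] { return !m_queue.empty(); });
                e = std::move(m_queue.front());
                m_queue.pop();
            }
            if (e.shutdown)
            {
                return;
            }
            m_write(e); // the write happens outside the lock
        }
    }

    WriteFn m_write;
    std::queue<Entry> m_queue;
    std::mutex m_lock;
    std::condition_variable m_signal;
    std::thread m_thread; // declared last so the members above exist before it starts
};
```

Because the queue is FIFO and the shutdown event is queued last, every event enqueued before destruction is written before the worker exits.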

qiluo-msft · 2024-05-02T01:27:31Z

orchagent/response_publisher.cpp

+        }
+        if (e.flush)
+        {
+            m_pipe->flush();


m_pipe

You should not share DBConnector between threads. It is not thread-safe. #Closed

This pipe object is only used in the "DB update thread". It is not used in the main thread besides the constructor.

The DB write operations are completely handled by the "DB update thread" if thread mode is enabled.
(If thread mode is disabled, the main thread will update the DB. There is no "DB update thread" in that case.)
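
One way to make the "only one thread touches the pipe" claim checkable is an ownership guard. The sketch below is hypothetical and is not part of this PR or of swss-common: `ConfinedWriter` binds to the first thread that writes through it and rejects writes from any other thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical guard around a non-thread-safe resource.
class ConfinedWriter
{
  public:
    // Returns false when called from a thread other than the first writer.
    bool write(const std::string &record)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        const std::thread::id caller = std::this_thread::get_id();
        if (m_owner == std::thread::id())
        {
            m_owner = caller; // the first use binds the owning thread
        }
        if (m_owner != caller)
        {
            return false;
        }
        m_records.push_back(record);
        return true;
    }

    std::size_t count() const
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_records.size();
    }

  private:
    mutable std::mutex m_lock; // protects the guard's own state only
    std::thread::id m_owner;   // default-constructed id means "not bound yet"
    std::vector<std::string> m_records;
};
```

Constructing the object on the main thread is still allowed, which matches the usage described in the reply: the constructor runs on the main thread and all writes run on the update thread.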

prsunny · 2024-05-15T15:50:50Z

tests/p4rt/test_p4rt_acl.py

@@ -241,26 +235,6 @@ def test_AclRulesAddUpdateDelPass(self, dvs, testlog):
        assert status == True
        util.verify_attr(fvs, attr_list)

-        # query application state database for ACL tables


Since it's deleted, where is the check for APPL_STATE_DB entries?

P4Orch will no longer write to APPL_STATE_DB, so there are no APPL_STATE_DB entries for the P4RT table.
(P4Orch writes to APPL_DB instead, which changes nothing at the moment, as APPL_DB is already written when orchagent pops the request entries.)

But P4Orch will still send the response to the Redis channel, which is checked and verified in util.verify_response().
The test also checks the APPL_DB entries. For now that doesn't mean much. But in the future, when we upstream the ZMQ changes, P4Orch will not write to APPL_DB in pop; it will write to APPL_DB in the response path instead. At that point the P4RT table in APPL_DB will represent the state (not the intent).

qiluo-msft · 2024-05-15T17:15:45Z

LGTM.

mint570 requested a review from prsunny as a code owner February 29, 2024 04:47

mint570 force-pushed the new_p4orch_upstream branch from 090bf88 to 048cc63 Compare March 6, 2024 03:11

mint570 force-pushed the new_p4orch_upstream branch from 048cc63 to 75ac844 Compare March 19, 2024 21:21

mint570 mentioned this pull request Mar 20, 2024

Clang format change. #3080

Merged

mint570 force-pushed the new_p4orch_upstream branch from 75ac844 to ca95fe9 Compare April 1, 2024 22:54

mint570 force-pushed the new_p4orch_upstream branch 2 times, most recently from f263fcf to df4e004 Compare April 9, 2024 00:42

prsunny reviewed Apr 15, 2024

prsunny requested a review from qiluo-msft April 22, 2024 17:21

mint570 force-pushed the new_p4orch_upstream branch 2 times, most recently from 8066625 to 03f0722 Compare April 24, 2024 17:40

mint570 force-pushed the new_p4orch_upstream branch from 03f0722 to 4e81549 Compare April 26, 2024 16:32

mint570 force-pushed the new_p4orch_upstream branch 3 times, most recently from 9c01751 to fa0efe5 Compare April 30, 2024 23:32

qiluo-msft reviewed May 2, 2024

qiluo-msft reviewed May 2, 2024

qiluo-msft reviewed May 2, 2024

mint570 force-pushed the new_p4orch_upstream branch 2 times, most recently from f3af0da to 9155503 Compare May 2, 2024 20:47

mint570 force-pushed the new_p4orch_upstream branch 3 times, most recently from fb27335 to 883fd32 Compare May 13, 2024 16:38

mint570 added 2 commits May 14, 2024 16:42

New p4orch development for performance.

e432f7f

Change-Id: I867ce9d1d4a641b35493fb81d60813083d4404f9

Address review comments

a43e01a

Change-Id: Ib028780e726a9480e6049514bc776ccf27b84496

mint570 force-pushed the new_p4orch_upstream branch from 883fd32 to a43e01a Compare May 14, 2024 23:42

prsunny reviewed May 15, 2024

prsunny approved these changes May 15, 2024

prsunny merged commit c36333c into sonic-net:master May 15, 2024
17 checks passed

mint570 deleted the new_p4orch_upstream branch May 15, 2024 21:30

mint570 mentioned this pull request May 18, 2024

SWSS Upstream of P4Orch changes sonic-net/SONiC#1614

Open
