Update workflows on release branch (ROCm#408)
* Prep work for branch renaming (ROCm#389)

* Add `amd-staging` and `amd-mainline` to workflow.
* Update branch names in documentation
* Change `dev` to `amd-staging` and `main` to `amd-mainline` in documentation.
* Remove references to 2.x from workflows.
* Convert the link to LICENSE to a relative path in CONTRIBUTING.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove `dev` and `main` branch from workflows. (ROCm#404)

* Remove `dev` and `main` branch from workflows.

Update links in documentation.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* `amd-staging` -> `amd-mainline` in docs

Signed-off-by: Peter Jun Park <peter.park@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Signed-off-by: Peter Jun Park <peter.park@amd.com>
Co-authored-by: Peter Jun Park <peter.park@amd.com>

* Run Workflows on Release Branches

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Signed-off-by: Peter Jun Park <peter.park@amd.com>
Co-authored-by: Peter Jun Park <peter.park@amd.com>
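
The net effect of these edits is easiest to see as a single trigger block. Below is a minimal sketch of the pattern most CI workflows converge on after this commit; the workflow name and placeholder job are hypothetical, and some workflows (`docs.yml`, `mi-rhel9.yml`, `tarball.yml`) keep narrower branch lists, as the diffs below show.

```yaml
# Hypothetical consolidated workflow illustrating the renamed branch triggers.
# The real workflows in this commit differ in names, jobs, and exact branch lists.
name: example

on:
  push:
    branches: [ amd-mainline, amd-staging, release/** ]
  pull_request:
    branches: [ amd-mainline, amd-staging, release/** ]
  # Allows manual execution from the Actions tab
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # same action version used in tarball.yml
```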
dgaliffiAMD and peterjunpark authored Aug 13, 2024
1 parent be9f4ef commit 527264b
Showing 17 changed files with 47 additions and 50 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/docs.yml
@@ -2,7 +2,7 @@ name: Documentation

on:
push:
branches: ["main"]
branches: [ amd-mainline ]
paths:
- 'docs/archive/docs-2.x/**'
- 'docs/archive/docs-1.x/**'
@@ -36,7 +36,7 @@ jobs:
- name: Build 1.x docs
run: |
cd docs/archive/docs-1.x
make html
make html
- name: Build 2.x docs
run: |
cd docs/archive/docs-2.x
4 changes: 2 additions & 2 deletions .github/workflows/formatting.yml
@@ -3,9 +3,9 @@ name: Formatting

on:
push:
branches: [ main, dev, 2.x ]
branches: [ amd-mainline, amd-staging, release/** ]
pull_request:
branches: [ main, dev, 2.x ]
branches: [ amd-mainline, amd-staging, release/** ]

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
5 changes: 2 additions & 3 deletions .github/workflows/mi-rhel9.yml
@@ -2,11 +2,10 @@ name: mi-rhel9

on:
push:
branches:
- 'main'
branches: [ amd-mainline, release/** ]

# Allows manual execution
workflow_dispatch:
workflow_dispatch:

permissions:
contents: read
2 changes: 1 addition & 1 deletion .github/workflows/packaging.yml
@@ -8,7 +8,7 @@ on:
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
distbuild:
runs-on: ubuntu-latest
4 changes: 2 additions & 2 deletions .github/workflows/rhel-8.yml
@@ -5,9 +5,9 @@ name: RHEL 8
# Controls when the workflow will run
on:
push:
branches: [ main, dev ]
branches: [ amd-mainline, amd-staging, release/** ]
pull_request:
branches: [ main, dev ]
branches: [ amd-mainline, amd-staging, release/** ]

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
12 changes: 5 additions & 7 deletions .github/workflows/tarball.yml
@@ -2,15 +2,13 @@ name: tarball

on:
push:
branches:
- main
- 2.x
branches: [ amd-mainline, release/** ]
pull_request:

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
distbuild:
runs-on: ubuntu-latest
@@ -25,7 +23,7 @@ jobs:
echo "sha=${{github.event.pull_request.head.sha}}" >> $GITHUB_OUTPUT
else
echo "sha=$GITHUB_SHA" >> $GITHUB_OUTPUT
fi
fi
- name: Checkout code
uses: actions/checkout@v4
with:
@@ -101,9 +99,9 @@ jobs:
run: sudo apt-get install -y lmod
- name: Access omniperf using modulefile
run: |
. /etc/profile.d/lmod.sh
. /etc/profile.d/lmod.sh
module use $INSTALL_DIR/omniperf/share/omniperf/modulefiles
module load omniperf
module list
omniperf --version
4 changes: 2 additions & 2 deletions .github/workflows/ubuntu-jammy.yml
@@ -4,9 +4,9 @@ name: Ubuntu 22.04

on:
push:
branches: [ main, dev ]
branches: [ amd-mainline, amd-staging, release/** ]
pull_request:
branches: [ main, dev ]
branches: [ amd-mainline, amd-staging, release/** ]

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -1,12 +1,12 @@
## How to fork from us

To keep our development fast and conflict free, we recommend you to [fork](https://github.com/ROCm/omniperf/fork) our repository and start your work from our `dev` branch in your private repository.
To keep our development fast and conflict free, we recommend you to [fork](https://github.com/ROCm/omniperf/fork) our repository and start your work from our `amd-staging` branch in your private repository.

Afterwards, git clone your repository to your local machine. But that is not it! To keep track of the original develop repository, add it as another remote.

```
git remote add mainline https://github.com/ROCm/omniperf.git
git checkout dev
git checkout amd-staging
```

As always in git, start a new branch with
@@ -31,9 +31,9 @@ and apply your changes there.

- Ensure the PR description clearly describes the problem and solution. If there is an existing GitHub issue open describing this bug, please include it in the description so we can close it.

- Ensure the PR is based on the `dev` branch of the Omniperf GitHub repository.
- Ensure the PR is based on the `amd-staging` branch of the Omniperf GitHub repository.

- Omniperf requires new commits to include a "Signed-off-by" token in the commit message (typically enabled via the `git commit -s` option), indicating your agreement to the projects's [Developer's Certificate of Origin](https://developercertificate.org/) and compatability with the project [LICENSE](https://github.com/ROCm/omniperf/blob/main/LICENSE):
- Omniperf requires new commits to include a "Signed-off-by" token in the commit message (typically enabled via the `git commit -s` option), indicating your agreement to the projects's [Developer's Certificate of Origin](https://developercertificate.org/) and compatability with the project [LICENSE](LICENSE):


> (a) The contribution was created in whole or in part by me and I
6 changes: 3 additions & 3 deletions README.md
@@ -29,10 +29,10 @@ contribution process.
Omniperf follows a
[main-dev](https://nvie.com/posts/a-successful-git-branching-model/)
branching model. As a result, our latest stable release is shipped
from the `main` branch, while new features are developed in our
`dev` branch.
from the `amd-mainline` branch, while new features are developed in our
`amd-staging` branch.

Users may checkout `dev` to preview upcoming features.
Users may checkout `amd-staging` to preview upcoming features.

## How to Cite

2 changes: 1 addition & 1 deletion docs/archive/docs-1.x/analysis.md
@@ -171,7 +171,7 @@ $ omniperf analyze -p workloads/vcopy/mi200/ --list-metrics gfx90a
├─────────┼─────────────────────────────┤
...
```
2. Choose your own customized subset of metrics with `-b` (a.k.a. `--metric`), or build your own config following [config_template](https://github.com/ROCm/omniperf/blob/main/src/omniperf_analyze/configs/panel_config_template.yaml). Below shows how to generate a report containing only metric 2 (a.k.a. System Speed-of-Light).
2. Choose your own customized subset of metrics with `-b` (a.k.a. `--metric`), or build your own config following [config_template](https://github.com/ROCm/omniperf/blob/amd-mainline/src/omniperf_analyze/configs/panel_config_template.yaml). Below shows how to generate a report containing only metric 2 (a.k.a. System Speed-of-Light).
```shell-session
$ omniperf analyze -p workloads/vcopy/mi200/ -b 2
--------
2 changes: 1 addition & 1 deletion docs/archive/docs-2.x/analysis.md
@@ -181,7 +181,7 @@ Analysis mode = cli
2.1.30 -> L1I Fetch Latency
...
```
3. Choose your own customized subset of metrics with `-b` (a.k.a. `--block`), or build your own config following [config_template](https://github.com/ROCm/omniperf/blob/main/src/omniperf_analyze/configs/panel_config_template.yaml). Below shows how to generate a report containing only metric 2 (a.k.a. System Speed-of-Light).
3. Choose your own customized subset of metrics with `-b` (a.k.a. `--block`), or build your own config following [config_template](https://github.com/ROCm/omniperf/blob/amd-mainline/src/omniperf_analyze/configs/panel_config_template.yaml). Below shows how to generate a report containing only metric 2 (a.k.a. System Speed-of-Light).
```shell-session
$ omniperf analyze -p workloads/vcopy/MI200/ -b 2
--------
16 changes: 8 additions & 8 deletions docs/archive/docs-2.x/performance_model.md
@@ -2178,7 +2178,7 @@ A good discussion of coarse and fine grained memory allocations and what type of
(VALU_inst_mix_example)=
## VALU Arithmetic Instruction Mix

For this example, we consider the [instruction mix sample](https://github.com/ROCm/omniperf/blob/dev/sample/instmix.hip) distributed as a part of Omniperf.
For this example, we consider the [instruction mix sample](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/instmix.hip) distributed as a part of Omniperf.

```{note}
This example is expected to work on all CDNA accelerators, however the results in this section were collected on an [MI2XX](2xxnote) accelerator
@@ -2269,7 +2269,7 @@ shows that we have exactly one of each type of VALU arithmetic instruction, by c
(Fabric_transactions_example)=
## Infinity-Fabric(tm) transactions

For this example, we consider the [Infinity Fabric(tm) sample](https://github.com/ROCm/omniperf/blob/dev/sample/fabric.hip) distributed as a part of Omniperf.
For this example, we consider the [Infinity Fabric(tm) sample](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/fabric.hip) distributed as a part of Omniperf.
This code launches a simple read-only kernel, e.g.:

```c++
@@ -2826,7 +2826,7 @@ On an AMD [MI2XX](2xxnote) accelerator, for FP32 values this will generate a `gl
(flatmembench)=
### Global / Generic (FLAT)

For this example, we consider the [vector-memory sample](https://github.com/ROCm/omniperf/blob/dev/sample/vmem.hip) distributed as a part of Omniperf.
For this example, we consider the [vector-memory sample](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/vmem.hip) distributed as a part of Omniperf.
This code launches many different versions of a simple read/write/atomic-only kernels targeting various address spaces, e.g. below is our simple `global_write` kernel:

```c++
@@ -2976,7 +2976,7 @@ The assembly in these experiments were generated for an [MI2XX](2xxnote) acceler
Next, we examine a generic write.
As discussed [previously](Flat_design), our `generic_write` kernel uses an address space cast to _force_ the compiler to choose our desired address space, regardless of other optimizations that may be possible.

We also note that the `filter` parameter passed in as a kernel argument (see [example](https://github.com/ROCm/omniperf/blob/dev/sample/vmem.hip), or [design note](Flat_design)) is set to zero on the host, such that we always write to the 'local' (LDS) memory allocation `lds`.
We also note that the `filter` parameter passed in as a kernel argument (see [example](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/vmem.hip), or [design note](Flat_design)) is set to zero on the host, such that we always write to the 'local' (LDS) memory allocation `lds`.

Examining this kernel in the VMEM Instruction Mix table yields:

@@ -3339,7 +3339,7 @@ Next we examine the use of 'Spill/Scratch' memory.
On current CDNA accelerators such as the [MI2XX](2xxnote), this is implemented using the [private](mspace) memory space, which maps to ['scratch' memory](https://llvm.org/docs/AMDGPUUsage.html#amdgpu-address-spaces) in AMDGPU hardware terminology.
This type of memory can be accessed via different instructions depending on the specific architecture targeted. However, current CDNA accelerators such as the [MI2XX](2xxnote) use so called `buffer` instructions to access private memory in a simple (and typically) coalesced manner. See [Sec. 9.1, 'Vector Memory Buffer Instructions' of the CDNA2 ISA guide](https://www.amd.com/system/files/TechDocs/instinct-mi200-cdna2-instruction-set-architecture.pdf) for further reading on this instruction type.

We develop a [simple kernel](https://github.com/ROCm/omniperf/blob/dev/sample/stack.hip) that uses stack memory:
We develop a [simple kernel](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/stack.hip) that uses stack memory:
```c++
#include <hip/hip_runtime.h>
__global__ void knl(int* out, int filter) {
@@ -3404,7 +3404,7 @@ Here we see a single write to the stack (10.3.6), which corresponds to an L1-L2
(IPC_example)=
## Instructions-per-cycle and Utilizations example

For this section, we use the instructions-per-cycle (IPC) [example](https://github.com/ROCm/omniperf/blob/dev/sample/ipc.hip) included with Omniperf.
For this section, we use the instructions-per-cycle (IPC) [example](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/ipc.hip) included with Omniperf.

This example is compiled using `c++17` support:

@@ -3824,7 +3824,7 @@ Finally, we note that our branch utilization (11.2.5) has increased slightly fro

## LDS Examples

For this example, we consider the [LDS sample](https://github.com/ROCm/omniperf/blob/dev/sample/lds.hip) distributed as a part of Omniperf.
For this example, we consider the [LDS sample](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/lds.hip) distributed as a part of Omniperf.
This code contains two kernels to explore how both [LDS](lds) bandwidth and bank conflicts are calculated in Omniperf.

This example was compiled and run on an MI250 accelerator using ROCm v5.6.0, and Omniperf v2.0.0.
@@ -4037,7 +4037,7 @@ The bank conflict rate linearly increases with the number of work-items within a
## Occupancy Limiters Example


In this [example](https://github.com/ROCm/omniperf/blob/dev/sample/occupancy.hip), we will investigate the use of the resource allocation panel in the [Workgroup Manager](SPI)'s metrics section to determine occupancy limiters.
In this [example](https://github.com/ROCm/omniperf/blob/amd-mainline/sample/occupancy.hip), we will investigate the use of the resource allocation panel in the [Workgroup Manager](SPI)'s metrics section to determine occupancy limiters.
This code contains several kernels to explore how both various kernel resources impact achieved occupancy, and how this is reported in Omniperf.

This example was compiled and run on a MI250 accelerator using ROCm v5.6.0, and Omniperf v2.0.0:
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -55,7 +55,7 @@

# frequently used external resources
extlinks = {
"dev-sample": ("https://github.com/ROCm/omniperf/blob/dev/sample/%s", "%s"),
"dev-sample": ("https://github.com/ROCm/omniperf/blob/amd-mainline/sample/%s", "%s"),
"prod-page": (
"https://www.amd.com/en/products/accelerators/instinct/%s.html",
"%s",
20 changes: 10 additions & 10 deletions docs/how-to/analyze/cli.rst
@@ -26,18 +26,18 @@ Run ``omniperf analyze -h`` for more details.
Walkthrough
===========

1. To begin, generate a high-level analysis report using Omniperf's ``-b`` (or ``--block``) flag.
1. To begin, generate a high-level analysis report using Omniperf's ``-b`` (or ``--block``) flag.

.. code-block:: shell
$ omniperf analyze -p workloads/vcopy/MI200/ -b 2
___ _ __
___ _ __
/ _ \ _ __ ___ _ __ (_)_ __ ___ _ __ / _|
| | | | '_ ` _ \| '_ \| | '_ \ / _ \ '__| |_
| | | | '_ ` _ \| '_ \| | '_ \ / _ \ '__| |_
| |_| | | | | | | | | | | |_) | __/ | | _|
\___/|_| |_| |_|_| |_|_| .__/ \___|_| |_|
|_|
\___/|_| |_| |_|_| |_|_| .__/ \___|_| |_|
|_|
Analysis mode = cli
[analysis] deriving Omniperf metrics...
@@ -138,12 +138,12 @@ Walkthrough
$ omniperf analyze -p workloads/vcopy/MI200/ --list-metrics gfx90a
___ _ __
___ _ __
/ _ \ _ __ ___ _ __ (_)_ __ ___ _ __ / _|
| | | | '_ ` _ \| '_ \| | '_ \ / _ \ '__| |_
| | | | '_ ` _ \| '_ \| | '_ \ / _ \ '__| |_
| |_| | | | | | | | | | | |_) | __/ | | _|
\___/|_| |_| |_|_| |_|_| .__/ \___|_| |_|
|_|
\___/|_| |_| |_|_| |_|_| .__/ \___|_| |_|
|_|
Analysis mode = cli
[analysis] deriving Omniperf metrics...
@@ -186,7 +186,7 @@ Walkthrough
3. Choose your own customized subset of metrics with the ``-b`` (or ``--block``)
option. Or, build your own configuration following
`config_template <https://github.com/ROCm/omniperf/blob/main/src/omniperf_analyze/configs/panel_config_template.yaml>`_.
`config_template <https://github.com/ROCm/omniperf/blob/amd-mainline/src/omniperf_soc/analysis_configs/panel_config_template.yaml>`_.
The following snippet shows how to generate a report containing only metric 2
(:doc:`System Speed-of-Light </conceptual/system-speed-of-light>`).

2 changes: 1 addition & 1 deletion docs/how-to/profile/mode.rst
@@ -38,7 +38,7 @@ Run ``omniperf profile -h`` for more details. See
Profiling example
-----------------

The `<https://github.com/ROCm/omniperf/blob/main/sample/vcopy.cpp>`__ repository
The `<https://github.com/ROCm/omniperf/blob/amd-mainline/sample/vcopy.cpp>`__ repository
includes source code for a sample GPU compute workload, ``vcopy.cpp``. A copy of
this file is available in the ``share/sample`` subdirectory after a normal
Omniperf installation, or via the ``$OMNIPERF_SHARE/sample`` directory when
@@ -623,7 +623,7 @@ manner. See
for further reading on this instruction type.

We develop a `simple
kernel <https://github.com/ROCm/omniperf/blob/dev/sample/stack.hip>`__
kernel <https://github.com/ROCm/omniperf/blob/amd-mainline/sample/stack.hip>`__
that uses stack memory:

.. code-block:: cpp
2 changes: 1 addition & 1 deletion docs/tutorial/profiling-by-example.rst
@@ -7,7 +7,7 @@ Profiling by example
********************

The following examples refer to sample :doc:`HIP <hip:index>` code located in
:fab:`github` :dev-sample:`ROCm/omniperf/blob/dev/sample <>` and distributed
:fab:`github` :dev-sample:`ROCm/omniperf/blob/amd-mainline/sample <>` and distributed
as part of Omniperf.

.. include:: ./includes/valu-arithmetic-instruction-mix.rst
