Refactored code for better build
1. created setup.py
2. fixed root directory to dlio_benchmark
3. renamed dlio_benchmark.py to main.py
4. renamed dlio_postprocessor.py to postprocessor.py
5. fixed documentation to use dlio_benchmark and dlio_postprocessor entry points.
hariharan-devarajan committed Jun 20, 2023
1 parent e897c9c commit 1be84f1
Showing 16 changed files with 37 additions and 32 deletions.
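For orientation, here is the layout this commit moves to, assembled from the paths that appear in the commit message and the diffs below; anything beyond those paths is an assumption:

```
dlio_benchmark/                  # package root
├── main.py                      # was dlio_benchmark.py; installed as the `dlio_benchmark` command
├── postprocessor.py             # was dlio_postprocessor.py; installed as the `dlio_postprocessor` command
├── configs/                     # hydra configuration files (e.g. hydra/help/dlio_benchmark_help.yaml)
└── utils/utility.py             # trace events gain hostname and CPU-affinity fields
setup.py                         # new: declares both console_scripts entry points
```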
6 changes: 2 additions & 4 deletions .github/workflows/python-package-conda.yml
@@ -9,7 +9,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- os: [ ubuntu-latest ]
+ os: [ ubuntu-20.04 ]
profiler: [ DEFAULT, DLIO_PROFILER ]
gcc: [10]
name: ${{ matrix.os }}-${{ matrix.profiler }}-${{ matrix.gcc }}
@@ -37,12 +37,10 @@ jobs:
python -m pip install --upgrade pip
pip install .[test]
if [[ $DLIO_PROFILER == 'DLIO_PROFILER' ]]; then
sudo apt-get install libhwloc-dev
git clone https://github.com/hariharan-devarajan/dlio-profiler /home/runner/work/dlio_profiler
cd /home/runner/work/dlio_profiler
git submodule update --init --recursive
pushd external/GOTCHA
git apply ../gotcha_glibc_workaround.patch
popd
mkdir build
cd build
cmake ../
6 changes: 3 additions & 3 deletions README.md
@@ -30,13 +30,13 @@ docker run -t dlio dlio_benchmark ++workload.workflow.generate_data=True
You can also pull rebuilt container from docker hub (might not reflect the most recent change of the code):
```bash
docker docker.io/zhenghh04/dlio:latest
- docker run -t docker.io/zhenghh04/dlio:latest python ./dlio_benchmark/benchmark.py ++workload.workflow.generate_data=True
+ docker run -t docker.io/zhenghh04/dlio:latest python ./dlio_benchmark/main.py ++workload.workflow.generate_data=True
```

One can also run interactively inside the container
```bash
docker run -t docker.io/zhenghh04/dlio:latest /bin/bash
- root@30358dd47935:/workspace/dlio$ python ./dlio_benchmark/benchmark.py ++workload.workflow.generate_data=True
+ root@30358dd47935:/workspace/dlio$ python ./dlio_benchmark/main.py ++workload.workflow.generate_data=True
```

## PowerPC
@@ -78,7 +78,7 @@ Finally, run the benchmark with ```iostat``` profiling, listing the io devices y

All the outputs will be stored in ```hydra_log/unet3d/$DATE-$TIME``` folder. To post process the data, one can do
```bash
- python3 dlio_postprocesser --output-folder hydra_log/unet3d/$DATE-$TIME
+ dlio_postprocessor --output-folder hydra_log/unet3d/$DATE-$TIME
```
This will generate ```DLIO_$model_report.txt``` in the output folder.

1 change: 1 addition & 0 deletions dev-requirements.txt
@@ -59,3 +59,4 @@ pytest-mpi
pytest-subtests
pytest-timeout
nvidia-dali-cuda110
+ psutil
4 changes: 2 additions & 2 deletions dlio_benchmark/configs/hydra/help/dlio_benchmark_help.yaml
@@ -26,13 +26,13 @@ template: |-
DLIO - an IO benchmark for deep learning applications.
- Running the benchmark: python dlio_benchmark/benchmark.py workload=unet3d
+ Running the benchmark: dlio_benchmark workload=unet3d
One can select the workload configuration using "workload={WORKLOAD}".
The corresponding YAML file is ./configs/workload/{WORKLOAD}.yaml folder.
Available choise for $APP_CONFIG_GROUPS
One can override everything in the command line, for example:
- python dlio_benchmark/benchmark.py workload.framework=tensorflow
+ dlio_benchmark workload.framework=tensorflow
One can also create a custom YAML file for a specific workload.
An example of a YAML file is as follows.
File renamed without changes.
File renamed without changes.
7 changes: 6 additions & 1 deletion dlio_benchmark/utils/utility.py
@@ -26,13 +26,16 @@

import numpy as np
import inspect

+ import psutil
+ import socket
# UTC timestamp format with microsecond precision
from dlio_benchmark.common.enumerations import LoggerType

LOG_TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"
from mpi4py import MPI

+ p = psutil.Process()

def add_padding(n, num_digits=None):
str_out = str(n)
if num_digits!=None:
@@ -137,6 +140,8 @@ def create_dur_event(name, cat, ts, dur, args={}):
tid = threading.get_ident()
else:
tid = 0
args["hostname"] = socket.gethostname()
args["cpu_affinity"] = p.cpu_affinity()
d = {
"name": name,
"cat": cat,
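The utility.py change above tags every trace event with where it ran. Below is a minimal, self-contained sketch of that behaviour; the surrounding fields (`pid`, `ts`, `dur`, `ph`) are paraphrased from the diff rather than copied from the repository, and note that `psutil.Process.cpu_affinity()` is only available on Linux and Windows:

```python
import socket
import threading

import psutil

p = psutil.Process()

def create_dur_event(name, cat, ts, dur, args={}):
    # New in this commit: record the host and the CPUs the process may run on,
    # so events from multi-node runs can be attributed to a machine.
    args["hostname"] = socket.gethostname()
    args["cpu_affinity"] = p.cpu_affinity()
    return {
        "name": name,
        "cat": cat,
        "pid": p.pid,
        "tid": threading.get_ident(),
        "ts": ts,
        "dur": dur,
        "ph": "X",  # Chrome-trace "complete" (duration) event
        "args": args,
    }

print(create_dur_event("read", "IO", ts=0, dur=125))
```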
2 changes: 1 addition & 1 deletion docs/source/config.rst
@@ -345,7 +345,7 @@ We support following I/O profiling using following profilers:
* ``pytorch`` (torch.profiler): https://pytorch.org/docs/stable/profiler.html. This works only for pytorch framework (and data loader).

The YAML files are stored in the `workload`_ folder.
- It then can be loaded by ```dlio_benchmark.py``` through hydra (https://hydra.cc/). This will override the default settings. One can override the configurations through command line (https://hydra.cc/docs/advanced/override_grammar/basic/).
+ It then can be loaded by ```dlio_benchmark``` through hydra (https://hydra.cc/). This will override the default settings. One can override the configurations through command line (https://hydra.cc/docs/advanced/override_grammar/basic/).


.. _workload: https://github.com/argonne-lcf/dlio_benchmark/tree/main/configs/workload
8 changes: 4 additions & 4 deletions docs/source/examples.rst
@@ -52,19 +52,19 @@ First, we generate the dataset with ```++workload.workflow.generate=False```

.. code-block :: bash
- mpirun -np 8 python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
+ mpirun -np 8 python dlio_benchmark workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
Then, we run the appliation with iostat profiling

.. code-block:: bash
- python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.profiling=iostat
+ dlio_benchmark workload=unet3d ++workload.workflow.profiling=iostat
To run in data parallel mode, one can do

.. code-block:: bash
- mpirun -np 8 dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.profiling=iostat
+ mpirun -np 8 dlio_benchmark workload=unet3d ++workload.workflow.profiling=iostat
This will run the benchmark and produce the following logging output:

@@ -144,7 +144,7 @@ One can then post processing the data with dlio_postprocessor.py

.. code-block:: bash
- python dlio_benchmark/dlio_postprocessor.py --output-folder hydra_log/unet3d/2022-11-09-17-55-44/
+ python postprocessor --output-folder hydra_log/unet3d/2022-11-09-17-55-44/
The output is

11 changes: 5 additions & 6 deletions docs/source/install.rst
@@ -6,9 +6,8 @@ DLIO itself should run directly after installing dependence python packages spec
git clone https://github.com/argonne-lcf/dlio_benchmark
cd dlio_benchmark/
- pip install -r requirements.txt
- export PYTHONPATH=$PWD/:$PYTHONPATH
- python ./dlio_benchmark/benchmark.py
+ pip install .
+ dlio_benchmark
One can build docker image run DLIO inside a docker container.

@@ -17,18 +16,18 @@ One can build docker image run DLIO inside a docker container.
git clone https://github.com/argonne-lcf/dlio_benchmark
cd dlio_benchmark/
docker build -t dlio .
- docker run -t dlio python ./dlio_benchmark/benchmark.py
+ docker run -t dlio dlio_benchmark
A prebuilt docker image is available in docker hub

.. code-block:: bash
docker pull docker.io/zhenghh04/dlio:latest
- docker run -t docker.io/zhenghh04/dlio:latest python ./dlio_benchmark/benchmark.py
+ docker run -t docker.io/zhenghh04/dlio:latest dlio_benchmark
To run interactively in the docker container.

.. code-block:: bash
docker run -t docker.io/zhenghh04/dlio:latest bash
- root@30358dd47935:/workspace/dlio# python ./dlio_benchmark/benchmark.py
+ root@30358dd47935:/workspace/dlio# dlio_benchmark
6 changes: 3 additions & 3 deletions docs/source/run.rst
@@ -16,7 +16,7 @@ Generate data

.. code-block:: bash
- mpirun -np 8 python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
+ mpirun -np 8 dlio_benchmark workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
In this case, we override ```workflow.generate_data``` and ```workflow.train``` in the configuration to perform the data generation.

@@ -26,7 +26,7 @@ Running benchmark

.. code-block:: bash
- mpirun -np 8 python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.generate_data=False ++workload.workflow.train=True ++workload.workflow.evaluation=True
+ mpirun -np 8 dlio_benchmark workload=unet3d ++workload.workflow.generate_data=False ++workload.workflow.train=True ++workload.workflow.evaluation=True
In this case, we set ```workflow.generate_data=False```, so it will perform training and evaluation with the data generated previously.

@@ -39,7 +39,7 @@ To post process the data, one only need to specify the output folder. All the ot

.. code-block:: bash
- python3 dlio_benchmark/dlio_postprocessor.py --output_folder=hydra_log/unet3d/$DATE-$TIME
+ dlio_postprocessor --output_folder=hydra_log/unet3d/$DATE-$TIME
This will generate DLIO_$model_report.txt inside the output folder.

1 change: 1 addition & 0 deletions requirements.txt
@@ -55,3 +55,4 @@ urllib3==1.26.12
Werkzeug==2.2.2
wrapt==1.14.1
nvidia-dali-cuda110
+ psutil
7 changes: 4 additions & 3 deletions setup.py
@@ -10,7 +10,8 @@
'mpi4py',
'numpy',
'h5py',
- 'pandas'
+ 'pandas',
+ 'psutil'
]
x86_deps = [
'hydra-core == 1.2.0',
@@ -51,8 +52,8 @@
extras_require=extras,
entry_points={
'console_scripts': [
- 'dlio_benchmark = dlio_benchmark.benchmark:main',
- 'dlio_postprocesser = dlio_benchmark.dlio_postprocesser:main',
+ 'dlio_benchmark = dlio_benchmark.main:main',
+ 'dlio_postprocessor = dlio_benchmark.postprocessor:main',
]
}
)
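With these entry points declared, `pip install .` puts `dlio_benchmark` and `dlio_postprocessor` commands on the PATH. A quick sanity check of the mapping — a sketch, assuming Python 3.8+ and that the package is installed in the current environment:

```python
from importlib.metadata import entry_points

eps = entry_points()
# Python 3.10+ exposes .select(); 3.8/3.9 return a dict keyed by group.
scripts = eps.select(group="console_scripts") if hasattr(eps, "select") else eps["console_scripts"]
for ep in scripts:
    if ep.name in ("dlio_benchmark", "dlio_postprocessor"):
        print(f"{ep.name} -> {ep.value}")
# Expected, per the setup.py hunk above:
#   dlio_benchmark -> dlio_benchmark.main:main
#   dlio_postprocessor -> dlio_benchmark.postprocessor:main
```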
4 changes: 2 additions & 2 deletions tests/dlio_benchmark_test.py
@@ -39,7 +39,7 @@
# logging's max timestamp resolution is msecs, we will pass in usecs in the message
)

- from dlio_benchmark.benchmark import DLIOBenchmark
+ from dlio_benchmark.main import DLIOBenchmark
import glob


@@ -181,7 +181,7 @@ def test_iostat_profiling() -> None:
with open(f"{hydra}/overrides.yaml", "w") as f:
f.write('[]')
subprocess.run(["ls", "-l", "/dev/null"], capture_output=True)
cmd = f"python dlio_benchmark/dlio_postprocessor.py --output-folder={benchmark.output_folder}"
cmd = f"dlio_postprocessor --output-folder={benchmark.output_folder}"
cmd = cmd.split()
subprocess.run(cmd, capture_output=True, timeout=10)
clean()
2 changes: 1 addition & 1 deletion tests/dlio_postprocessor_test.py
@@ -18,7 +18,7 @@
from collections import namedtuple
import unittest

- from dlio_benchmark.dlio_postprocessor import DLIOPostProcessor
+ from dlio_benchmark.postprocessor import DLIOPostProcessor
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['AUTOGRAPH_VERBOSITY'] = '0'
4 changes: 2 additions & 2 deletions tests/test_data/.hydra/hydra.yaml
@@ -16,11 +16,11 @@ hydra:
footer: "Please submit questions/bugs to \n https://github.com/argonne-lcf/dlio_benchmark/issues\n\
\n Copyright (c) 2021 UChicago Argonne, LLC"
template: "\n${hydra.help.header}\n\nDLIO - an IO benchmark for deep learning\
- \ applications. \n\nRunning the benchmark: python dlio_benchmark/benchmark.py workload=unet3d\n\
+ \ applications. \n\nRunning the benchmark: python dlio_benchmark/main.py workload=unet3d\n\
\nOne can select the workload configuration using \"workload={WORKLOAD}\". \n\
The corresponding YAML file is ./configs/workload/{WORKLOAD}.yaml folder. \n\
Available choise for $APP_CONFIG_GROUPS\nOne can override everything in the\
- \ command line, for example:\npython dlio_benchmark/benchmark.py workload.framework=tensorflow\n\
+ \ command line, for example:\npython dlio_benchmark/main.py workload.framework=tensorflow\n\
\nOne can also create a custom YAML file for a specific workload. \nAn example\
\ of a YAML file is as follows. \n\n-------\n$CONFIG\n-------\nA complete list\
\ of config options in the YAML file can be found: \nhttps://argonne-lcf.github.io/dlio_benchmark/config.html\n\
