Refactored code for better build
1. created setup.py
2. fixed root directory to dlio_benchmark
3. renamed dlio_benchmark.py to main.py
4. renamed dlio_postprocessor.py to postprocessor.py
5. fixed documentation to use dlio_benchmark and dlio_postprocessor entry points.
hariharan-devarajan committed Jun 20, 2023
1 parent e897c9c commit 1be84f1
Showing 16 changed files with 37 additions and 32 deletions.
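For orientation, here is the layout this commit moves to, assembled from the paths that appear in the commit message and the diffs below; anything beyond those paths is an assumption:

```
dlio_benchmark/                  # package root
├── main.py                      # was dlio_benchmark.py; installed as the `dlio_benchmark` command
├── postprocessor.py             # was dlio_postprocessor.py; installed as the `dlio_postprocessor` command
├── configs/                     # hydra configuration files (e.g. hydra/help/dlio_benchmark_help.yaml)
└── utils/utility.py             # trace events gain hostname and CPU-affinity fields
setup.py                         # new: declares both console_scripts entry points
```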
6 changes: 2 additions & 4 deletions .github/workflows/python-package-conda.yml
@@ -9,7 +9,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- os: [ ubuntu-latest ]
+ os: [ ubuntu-20.04 ]
profiler: [ DEFAULT, DLIO_PROFILER ]
gcc: [10]
name: ${{ matrix.os }}-${{ matrix.profiler }}-${{ matrix.gcc }}
@@ -37,12 +37,10 @@ jobs:
python -m pip install --upgrade pip
pip install .[test]
if [[ $DLIO_PROFILER == 'DLIO_PROFILER' ]]; then
sudo apt-get install libhwloc-dev
git clone https://github.com/hariharan-devarajan/dlio-profiler /home/runner/work/dlio_profiler
cd /home/runner/work/dlio_profiler
git submodule update --init --recursive
pushd external/GOTCHA
git apply ../gotcha_glibc_workaround.patch
popd
mkdir build
cd build
cmake ../
6 changes: 3 additions & 3 deletions README.md
@@ -30,13 +30,13 @@ docker run -t dlio dlio_benchmark ++workload.workflow.generate_data=True
You can also pull rebuilt container from docker hub (might not reflect the most recent change of the code):
```bash
docker docker.io/zhenghh04/dlio:latest
- docker run -t docker.io/zhenghh04/dlio:latest python ./dlio_benchmark/benchmark.py ++workload.workflow.generate_data=True
+ docker run -t docker.io/zhenghh04/dlio:latest python ./dlio_benchmark/main.py ++workload.workflow.generate_data=True
```

One can also run interactively inside the container
```bash
docker run -t docker.io/zhenghh04/dlio:latest /bin/bash
- root@30358dd47935:/workspace/dlio$ python ./dlio_benchmark/benchmark.py ++workload.workflow.generate_data=True
+ root@30358dd47935:/workspace/dlio$ python ./dlio_benchmark/main.py ++workload.workflow.generate_data=True
```

## PowerPC
@@ -78,7 +78,7 @@ Finally, run the benchmark with ```iostat``` profiling, listing the io devices y

All the outputs will be stored in ```hydra_log/unet3d/$DATE-$TIME``` folder. To post process the data, one can do
```bash
- python3 dlio_postprocesser --output-folder hydra_log/unet3d/$DATE-$TIME
+ dlio_postprocessor --output-folder hydra_log/unet3d/$DATE-$TIME
```
This will generate ```DLIO_$model_report.txt``` in the output folder.

1 change: 1 addition & 0 deletions dev-requirements.txt
@@ -59,3 +59,4 @@ pytest-mpi
pytest-subtests
pytest-timeout
nvidia-dali-cuda110
+ psutil
4 changes: 2 additions & 2 deletions dlio_benchmark/configs/hydra/help/dlio_benchmark_help.yaml
@@ -26,13 +26,13 @@ template: |-
DLIO - an IO benchmark for deep learning applications.
- Running the benchmark: python dlio_benchmark/benchmark.py workload=unet3d
+ Running the benchmark: dlio_benchmark workload=unet3d
One can select the workload configuration using "workload={WORKLOAD}".
The corresponding YAML file is ./configs/workload/{WORKLOAD}.yaml folder.
Available choise for $APP_CONFIG_GROUPS
One can override everything in the command line, for example:
- python dlio_benchmark/benchmark.py workload.framework=tensorflow
+ dlio_benchmark workload.framework=tensorflow
One can also create a custom YAML file for a specific workload.
An example of a YAML file is as follows.
File renamed without changes.
File renamed without changes.
7 changes: 6 additions & 1 deletion dlio_benchmark/utils/utility.py
@@ -26,13 +26,16 @@

import numpy as np
import inspect

+ import psutil
+ import socket
# UTC timestamp format with microsecond precision
from dlio_benchmark.common.enumerations import LoggerType

LOG_TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"
from mpi4py import MPI

+ p = psutil.Process()

def add_padding(n, num_digits=None):
str_out = str(n)
if num_digits!=None:
@@ -137,6 +140,8 @@ def create_dur_event(name, cat, ts, dur, args={}):
tid = threading.get_ident()
else:
tid = 0
args["hostname"] = socket.gethostname()
args["cpu_affinity"] = p.cpu_affinity()
d = {
"name": name,
"cat": cat,
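The utility.py change above tags every trace event with where it ran. Below is a minimal, self-contained sketch of that behaviour; the surrounding fields (`pid`, `ts`, `dur`, `ph`) are paraphrased from the diff rather than copied from the repository, and note that `psutil.Process.cpu_affinity()` is only available on Linux and Windows:

```python
import socket
import threading

import psutil

p = psutil.Process()

def create_dur_event(name, cat, ts, dur, args={}):
    # New in this commit: record the host and the CPUs the process may run on,
    # so events from multi-node runs can be attributed to a machine.
    args["hostname"] = socket.gethostname()
    args["cpu_affinity"] = p.cpu_affinity()
    return {
        "name": name,
        "cat": cat,
        "pid": p.pid,
        "tid": threading.get_ident(),
        "ts": ts,
        "dur": dur,
        "ph": "X",  # Chrome-trace "complete" (duration) event
        "args": args,
    }

print(create_dur_event("read", "IO", ts=0, dur=125))
```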
2 changes: 1 addition & 1 deletion docs/source/config.rst
@@ -345,7 +345,7 @@ We support following I/O profiling using following profilers:
* ``pytorch`` (torch.profiler): https://pytorch.org/docs/stable/profiler.html. This works only for pytorch framework (and data loader).

The YAML files are stored in the `workload`_ folder.
- It then can be loaded by ```dlio_benchmark.py``` through hydra (https://hydra.cc/). This will override the default settings. One can override the configurations through command line (https://hydra.cc/docs/advanced/override_grammar/basic/).
+ It then can be loaded by ```dlio_benchmark``` through hydra (https://hydra.cc/). This will override the default settings. One can override the configurations through command line (https://hydra.cc/docs/advanced/override_grammar/basic/).


.. _workload: https://github.com/argonne-lcf/dlio_benchmark/tree/main/configs/workload
8 changes: 4 additions & 4 deletions docs/source/examples.rst
@@ -52,19 +52,19 @@ First, we generate the dataset with ```++workload.workflow.generate=False```

.. code-block :: bash
- mpirun -np 8 python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
+ mpirun -np 8 python dlio_benchmark workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
Then, we run the appliation with iostat profiling

.. code-block:: bash
- python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.profiling=iostat
+ dlio_benchmark workload=unet3d ++workload.workflow.profiling=iostat
To run in data parallel mode, one can do

.. code-block:: bash
- mpirun -np 8 dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.profiling=iostat
+ mpirun -np 8 dlio_benchmark workload=unet3d ++workload.workflow.profiling=iostat
This will run the benchmark and produce the following logging output:

@@ -144,7 +144,7 @@ One can then post processing the data with dlio_postprocessor.py

.. code-block:: bash
- python dlio_benchmark/dlio_postprocessor.py --output-folder hydra_log/unet3d/2022-11-09-17-55-44/
+ python postprocessor --output-folder hydra_log/unet3d/2022-11-09-17-55-44/
The output is

11 changes: 5 additions & 6 deletions docs/source/install.rst
@@ -6,9 +6,8 @@ DLIO itself should run directly after installing dependence python packages spec
git clone https://github.com/argonne-lcf/dlio_benchmark
cd dlio_benchmark/
- pip install -r requirements.txt
- export PYTHONPATH=$PWD/:$PYTHONPATH
- python ./dlio_benchmark/benchmark.py
+ pip install .
+ dlio_benchmark
One can build docker image run DLIO inside a docker container.

@@ -17,18 +16,18 @@ One can build docker image run DLIO inside a docker container.
git clone https://github.com/argonne-lcf/dlio_benchmark
cd dlio_benchmark/
docker build -t dlio .
- docker run -t dlio python ./dlio_benchmark/benchmark.py
+ docker run -t dlio dlio_benchmark
A prebuilt docker image is available in docker hub

.. code-block:: bash
docker pull docker.io/zhenghh04/dlio:latest
- docker run -t docker.io/zhenghh04/dlio:latest python ./dlio_benchmark/benchmark.py
+ docker run -t docker.io/zhenghh04/dlio:latest dlio_benchmark
To run interactively in the docker container.

.. code-block:: bash
docker run -t docker.io/zhenghh04/dlio:latest bash
- root@30358dd47935:/workspace/dlio# python ./dlio_benchmark/benchmark.py
+ root@30358dd47935:/workspace/dlio# dlio_benchmark
6 changes: 3 additions & 3 deletions docs/source/run.rst
@@ -16,7 +16,7 @@ Generate data

.. code-block:: bash
- mpirun -np 8 python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
+ mpirun -np 8 dlio_benchmark workload=unet3d ++workload.workflow.generate_data=True ++workload.workflow.train=False
In this case, we override ```workflow.generate_data``` and ```workflow.train``` in the configuration to perform the data generation.

@@ -26,7 +26,7 @@ Running benchmark

.. code-block:: bash
- mpirun -np 8 python dlio_benchmark/benchmark.py workload=unet3d ++workload.workflow.generate_data=False ++workload.workflow.train=True ++workload.workflow.evaluation=True
+ mpirun -np 8 dlio_benchmark workload=unet3d ++workload.workflow.generate_data=False ++workload.workflow.train=True ++workload.workflow.evaluation=True
In this case, we set ```workflow.generate_data=False```, so it will perform training and evaluation with the data generated previously.

@@ -39,7 +39,7 @@ To post process the data, one only need to specify the output folder. All the ot

.. code-block:: bash
- python3 dlio_benchmark/dlio_postprocessor.py --output_folder=hydra_log/unet3d/$DATE-$TIME
+ dlio_postprocessor --output_folder=hydra_log/unet3d/$DATE-$TIME
This will generate DLIO_$model_report.txt inside the output folder.

1 change: 1 addition & 0 deletions requirements.txt
@@ -55,3 +55,4 @@ urllib3==1.26.12
Werkzeug==2.2.2
wrapt==1.14.1
nvidia-dali-cuda110
+ psutil
7 changes: 4 additions & 3 deletions setup.py
@@ -10,7 +10,8 @@
'mpi4py',
'numpy',
'h5py',
- 'pandas'
+ 'pandas',
+ 'psutil'
]
x86_deps = [
'hydra-core == 1.2.0',
@@ -51,8 +52,8 @@
extras_require=extras,
entry_points={
'console_scripts': [
- 'dlio_benchmark = dlio_benchmark.benchmark:main',
- 'dlio_postprocesser = dlio_benchmark.dlio_postprocesser:main',
+ 'dlio_benchmark = dlio_benchmark.main:main',
+ 'dlio_postprocessor = dlio_benchmark.postprocessor:main',
]
}
)
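With these entry points declared, `pip install .` puts `dlio_benchmark` and `dlio_postprocessor` commands on the PATH. A quick sanity check of the mapping — a sketch, assuming Python 3.8+ and that the package is installed in the current environment:

```python
from importlib.metadata import entry_points

eps = entry_points()
# Python 3.10+ exposes .select(); 3.8/3.9 return a dict keyed by group.
scripts = eps.select(group="console_scripts") if hasattr(eps, "select") else eps["console_scripts"]
for ep in scripts:
    if ep.name in ("dlio_benchmark", "dlio_postprocessor"):
        print(f"{ep.name} -> {ep.value}")
# Expected, per the setup.py hunk above:
#   dlio_benchmark -> dlio_benchmark.main:main
#   dlio_postprocessor -> dlio_benchmark.postprocessor:main
```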
4 changes: 2 additions & 2 deletions tests/dlio_benchmark_test.py
@@ -39,7 +39,7 @@
# logging's max timestamp resolution is msecs, we will pass in usecs in the message
)

- from dlio_benchmark.benchmark import DLIOBenchmark
+ from dlio_benchmark.main import DLIOBenchmark
import glob


@@ -181,7 +181,7 @@ def test_iostat_profiling() -> None:
with open(f"{hydra}/overrides.yaml", "w") as f:
f.write('[]')
subprocess.run(["ls", "-l", "/dev/null"], capture_output=True)
cmd = f"python dlio_benchmark/dlio_postprocessor.py --output-folder={benchmark.output_folder}"
cmd = f"dlio_postprocessor --output-folder={benchmark.output_folder}"
cmd = cmd.split()
subprocess.run(cmd, capture_output=True, timeout=10)
clean()
2 changes: 1 addition & 1 deletion tests/dlio_postprocessor_test.py
@@ -18,7 +18,7 @@
from collections import namedtuple
import unittest

- from dlio_benchmark.dlio_postprocessor import DLIOPostProcessor
+ from dlio_benchmark.postprocessor import DLIOPostProcessor
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['AUTOGRAPH_VERBOSITY'] = '0'
4 changes: 2 additions & 2 deletions tests/test_data/.hydra/hydra.yaml
@@ -16,11 +16,11 @@ hydra:
footer: "Please submit questions/bugs to \n https://github.com/argonne-lcf/dlio_benchmark/issues\n\
\n Copyright (c) 2021 UChicago Argonne, LLC"
template: "\n${hydra.help.header}\n\nDLIO - an IO benchmark for deep learning\
- \ applications. \n\nRunning the benchmark: python dlio_benchmark/benchmark.py workload=unet3d\n\
+ \ applications. \n\nRunning the benchmark: python dlio_benchmark/main.py workload=unet3d\n\
\nOne can select the workload configuration using \"workload={WORKLOAD}\". \n\
The corresponding YAML file is ./configs/workload/{WORKLOAD}.yaml folder. \n\
Available choise for $APP_CONFIG_GROUPS\nOne can override everything in the\
- \ command line, for example:\npython dlio_benchmark/benchmark.py workload.framework=tensorflow\n\
+ \ command line, for example:\npython dlio_benchmark/main.py workload.framework=tensorflow\n\
\nOne can also create a custom YAML file for a specific workload. \nAn example\
\ of a YAML file is as follows. \n\n-------\n$CONFIG\n-------\nA complete list\
\ of config options in the YAML file can be found: \nhttps://argonne-lcf.github.io/dlio_benchmark/config.html\n\
