
Update examples to execute from the root of the repo #1674

Merged

Changes from 25 commits (38 total)
4d27cbe
Update abp_pcap_detection example to be runnable from the project root
dagardner-nv Apr 29, 2024
3fd330d
Enable running the GNN fraud pipeline from the root of the Morpheus repo
dagardner-nv Apr 29, 2024
6c1d80e
Remove unnecessary --num_threads=1, remove need for setting MORPHEUS_…
dagardner-nv Apr 30, 2024
aae1664
Don't set flags where a reasonable default exists
dagardner-nv Apr 30, 2024
3231572
Add helper method to check if a typeid is fully supported in C++
dagardner-nv Apr 30, 2024
89bb5a7
Only build a C++ node if all of the types are supported in C++
dagardner-nv Apr 30, 2024
ba503e9
Don't construct a C++ node if either force_convert_inputs or use_shar…
dagardner-nv Apr 30, 2024
68bfa3c
Don't disable C++ mode
dagardner-nv Apr 30, 2024
8dee598
Don't disable C++ mode
dagardner-nv Apr 30, 2024
0d1c49d
Don't set threads to 8, since we have 9 stages. Update port num to th…
dagardner-nv Apr 30, 2024
2e7e1e4
Support running the ransomware example from the root of the repo
dagardner-nv Apr 30, 2024
ebfec0f
Remove the --force_convert_inputs=True flag this isn't needed and jus…
dagardner-nv Apr 30, 2024
4ebb47f
Remove unused import
dagardner-nv Apr 30, 2024
60cf6ed
Only force python for shared mem
dagardner-nv May 1, 2024
3ca7f07
Not sure why, but this pipeline returns different results in C++ mode
dagardner-nv May 1, 2024
20ea8b2
Revert port-change
dagardner-nv May 1, 2024
b869c35
Use port 8000 to avoid warning about port change
dagardner-nv May 1, 2024
5458c41
Add missing gnn-fraud-classification stage
dagardner-nv May 1, 2024
df2a4b7
Remove setting MORPHEUS_ROOT
dagardner-nv May 1, 2024
ab2a996
Merge adjacent shell blocks
dagardner-nv May 1, 2024
1840ed1
Remove reference to the data dir, as we just have two csv files
dagardner-nv May 1, 2024
f0641f1
Remove --force_convert_inputs=True flag since implicit casting is alw…
dagardner-nv May 1, 2024
c49d3a7
Remove --force_convert_inputs=True flag since implicit casting is alw…
dagardner-nv May 1, 2024
3931259
Add test for is_fully_supported
dagardner-nv May 1, 2024
20ba1cc
Add missing includes
dagardner-nv May 1, 2024
ba96b47
Remove broken link
dagardner-nv May 1, 2024
02f3ca7
Replace out of date info with a table of the available examples
dagardner-nv May 1, 2024
af30f91
Add DFP
dagardner-nv May 1, 2024
8f43451
Merge branch 'branch-24.06' of github.com:nv-morpheus/Morpheus into e…
dagardner-nv May 1, 2024
2729764
Add doxygen formatted comments to DType and TypeId
dagardner-nv May 1, 2024
82c6301
Log a warning when we fall-back to the python impl due to use_shared_…
dagardner-nv May 1, 2024
5fc775c
First pass at only casting when force_convert_inputs=true
dagardner-nv May 1, 2024
f4749d6
Revert to specifying the --force_convert_inputs=True flag
dagardner-nv May 1, 2024
75ce7d3
Add test for force_convert_inputs feature
dagardner-nv May 2, 2024
084fbe8
Throw an invalid_argument rather than a runtime_error to distinguish …
dagardner-nv May 2, 2024
9f7c95c
Refactor FakeTritonClient into a second class ErrorProneTritonClient …
dagardner-nv May 2, 2024
180fc96
IWYU fixes
dagardner-nv May 2, 2024
20bb6b0
Merge branch 'branch-24.06' of github.com:nv-morpheus/Morpheus into e…
dagardner-nv May 2, 2024
41 changes: 15 additions & 26 deletions examples/abp_pcap_detection/README.md
@@ -27,14 +27,9 @@ docker pull nvcr.io/nvidia/tritonserver:23.06-py3
```

##### Deploy Triton Inference Server
From the root of the Morpheus repo, navigate to the anomalous behavior profiling example directory:
From the root of the Morpheus repo, run the following to launch Triton and load the `abp-pcap-xgb` model:
```bash
cd examples/abp_pcap_detection
```

The following creates the Triton container, mounts the `abp-pcap-xgb` directory to `/models/abp-pcap-xgb` in the Triton container, and starts the Triton server:
```bash
docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/abp-pcap-xgb:/models/abp-pcap-xgb --name tritonserver nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models --exit-on-error=false
docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/examples/abp_pcap_detection/abp-pcap-xgb:/models/abp-pcap-xgb --name tritonserver nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models --exit-on-error=false
```

##### Verify Model Deployment
@@ -53,53 +48,49 @@ Use Morpheus to run the Anomalous Behavior Profiling Detection Pipeline with the

From the root of the Morpheus repo, run:
```bash
cd examples/abp_pcap_detection
python run.py --help
python examples/abp_pcap_detection/run.py --help
```

Output:
```
Usage: run.py [OPTIONS]

Options:
--num_threads INTEGER RANGE Number of internal pipeline threads to use
--num_threads INTEGER RANGE Number of internal pipeline threads to use.
[x>=1]
--pipeline_batch_size INTEGER RANGE
Internal batch size for the pipeline. Can be
much larger than the model batch size. Also
used for Kafka consumers [x>=1]
used for Kafka consumers. [x>=1]
--model_max_batch_size INTEGER RANGE
Max batch size to use for the model [x>=1]
--input_file PATH Input filepath [required]
Max batch size to use for the model. [x>=1]
--input_file PATH Input filepath. [required]
--output_file TEXT The path to the file where the inference
output will be saved.
--model_fea_length INTEGER RANGE
Features length to use for the model [x>=1]
Features length to use for the model.
[x>=1]
--model_name TEXT The name of the model that is deployed on
Tritonserver
Tritonserver.
--iterative Iterative mode will emit dataframes one at a
time. Otherwise a list of dataframes is
emitted. Iterative mode is good for
interleaving source stages.
--server_url TEXT Tritonserver url [required]
--file_type [auto|json|csv] Indicates what type of file to read.
--server_url TEXT Tritonserver url. [required]
--file_type [auto|csv|json] Indicates what type of file to read.
Specifying 'auto' will determine the file
type from the extension.
--help Show this message and exit.
```

To launch the configured Morpheus pipeline with the sample data that is provided in `examples/data`, from the `examples/abp_pcap_detection` directory run the following:
To launch the configured Morpheus pipeline with the sample data that is provided in `examples/data`, run the following:

```bash
python run.py \
--input_file ../data/abp_pcap_dump.jsonlines \
--output_file ./pcap_out.jsonlines \
--model_name 'abp-pcap-xgb' \
--server_url localhost:8001
python examples/abp_pcap_detection/run.py
```
Note: Both Morpheus and Triton Inference Server containers must have access to the same GPUs in order for this example to work.

The pipeline will process the input `pcap_dump.jsonlines` sample data and write it to `pcap_out.jsonlines`.
The pipeline will process the input `abp_pcap_dump.jsonlines` sample data and write it to `pcap_out.jsonlines`.

### CLI Example
The above example is illustrative of using the Python API to build a custom Morpheus Pipeline.
@@ -123,5 +114,3 @@ morpheus --log_level INFO --plugin "examples/abp_pcap_detection/abp_pcap_preproc
to-file --filename "pcap_out.jsonlines" --overwrite \
monitor --description "Write to file rate" --unit "to-file"
```

Note: Triton is still needed to be launched from the `examples/abp_pcap_detection` directory.
7 changes: 5 additions & 2 deletions examples/abp_pcap_detection/run.py
@@ -33,6 +33,9 @@
from morpheus.stages.preprocess.deserialize_stage import DeserializeStage
from morpheus.utils.logger import configure_logging

CUR_DIR = os.path.dirname(__file__)
EX_DATA_DIR = os.path.join(CUR_DIR, "../data")


@click.command()
@click.option(
@@ -57,7 +60,7 @@
@click.option(
"--input_file",
type=click.Path(exists=True, readable=True),
default="pcap.jsonlines",
default=os.path.join(EX_DATA_DIR, "abp_pcap_dump.jsonlines"),
required=True,
help="Input filepath.",
)
@@ -84,7 +87,7 @@
help=("Iterative mode will emit dataframes one at a time. Otherwise a list of dataframes is emitted. "
"Iterative mode is good for interleaving source stages."),
)
@click.option("--server_url", required=True, help="Tritonserver url.")
@click.option("--server_url", required=True, help="Tritonserver url.", default="localhost:8001")
@click.option(
"--file_type",
type=click.Choice(FILE_TYPE_NAMES, case_sensitive=False),
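The `CUR_DIR`/`EX_DATA_DIR` change above is the heart of this PR: option defaults are resolved relative to the script file rather than the current working directory, so the example runs from the repo root. A minimal stdlib sketch of that pattern (no click or Morpheus required; the data filename mirrors the diff above):

```python
import os

# Pattern from the PR: anchor default data paths to the script's own
# directory, so the example can be launched from the repo root (or anywhere).
CUR_DIR = os.path.dirname(os.path.abspath(__file__))
EX_DATA_DIR = os.path.join(CUR_DIR, "../data")


def default_path(filename: str) -> str:
    """Absolute path to a data file that ships alongside this script."""
    return os.path.normpath(os.path.join(EX_DATA_DIR, filename))


# The resolved default does not change when the caller changes directory.
before = default_path("abp_pcap_dump.jsonlines")
os.chdir(os.path.expanduser("~"))  # simulate launching from somewhere else
after = default_path("abp_pcap_dump.jsonlines")
assert before == after
assert os.path.isabs(before)
```

In the actual diffs this shows up as `default=os.path.join(EX_DATA_DIR, "abp_pcap_dump.jsonlines")` on the `@click.option` decorator, which lets the README drop the `cd examples/...` step.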
15 changes: 4 additions & 11 deletions examples/gnn_fraud_detection_pipeline/README.md
@@ -28,17 +28,10 @@ mamba env update \
```

## Running

##### Setup Env Variable
```bash
export MORPHEUS_ROOT=$(pwd)
```

Use Morpheus to run the GNN fraud detection Pipeline with the transaction data. A pipeline has been configured in `run.py` with several command line options:

```bash
cd ${MORPHEUS_ROOT}/examples/gnn_fraud_detection_pipeline
python run.py --help
python examples/gnn_fraud_detection_pipeline/run.py --help
```
```
Usage: run.py [OPTIONS]
@@ -63,11 +56,10 @@ Options:
--help Show this message and exit.
```

To launch the configured Morpheus pipeline with the sample data that is provided at `$MORPHEUS_ROOT/models/dataset`, run the following:
To launch the configured Morpheus pipeline, run the following:

```bash
cd ${MORPHEUS_ROOT}/examples/gnn_fraud_detection_pipeline
python run.py
python examples/gnn_fraud_detection_pipeline/run.py
```
```
====Registering Pipeline====
@@ -125,6 +117,7 @@ morpheus --log_level INFO \
monitor --description "Graph construction rate" \
gnn-fraud-sage --model_dir examples/gnn_fraud_detection_pipeline/model/ \
monitor --description "Inference rate" \
gnn-fraud-classification --model_xgb_file examples/gnn_fraud_detection_pipeline/model/xgb.pt \
monitor --description "Add classification rate" \
serialize \
to-file --filename "output.csv" --overwrite
8 changes: 5 additions & 3 deletions examples/gnn_fraud_detection_pipeline/run.py
@@ -32,6 +32,8 @@
from stages.graph_construction_stage import FraudGraphConstructionStage
from stages.graph_sage_stage import GraphSAGEStage

CUR_DIR = os.path.dirname(__file__)


@click.command()
@click.option(
@@ -62,21 +64,21 @@
@click.option(
"--input_file",
type=click.Path(exists=True, readable=True, dir_okay=False),
default="validation.csv",
default=os.path.join(CUR_DIR, "validation.csv"),
required=True,
help="Input data filepath.",
)
@click.option(
"--training_file",
type=click.Path(exists=True, readable=True, dir_okay=False),
default="training.csv",
default=os.path.join(CUR_DIR, "training.csv"),
required=True,
help="Training data filepath.",
)
@click.option(
"--model_dir",
type=click.Path(exists=True, readable=True, file_okay=False, dir_okay=True),
default="model",
default=os.path.join(CUR_DIR, "model"),
required=True,
help="Path to trained Hinsage & XGB models.",
)
21 changes: 6 additions & 15 deletions examples/log_parsing/README.md
@@ -29,11 +29,6 @@ Example:
docker pull nvcr.io/nvidia/tritonserver:23.06-py3
```

##### Setup Env Variable
```bash
export MORPHEUS_ROOT=$(pwd)
```

##### Start Triton Inference Server Container
From the Morpheus repo root directory, run the following to launch Triton and load the `log-parsing-onnx` model:

@@ -56,19 +51,15 @@ Once Triton server finishes starting up, it will display the status of all loaded models.

### Run Log Parsing Pipeline

Run the following from the `examples/log_parsing` directory to start the log parsing pipeline:
Run the following from the root of the Morpheus repo to start the log parsing pipeline:

```bash
python run.py \
--num_threads 1 \
--input_file ${MORPHEUS_ROOT}/models/datasets/validation-data/log-parsing-validation-data-input.csv \
--output_file ./log-parsing-output.jsonlines \
python examples/log_parsing/run.py \
--input_file=./models/datasets/validation-data/log-parsing-validation-data-input.csv \
--model_vocab_hash_file=data/bert-base-cased-hash.txt \
--model_vocab_file=${MORPHEUS_ROOT}/models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt \
--model_seq_length=256 \
--model_vocab_file=./models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt \
--model_name log-parsing-onnx \
--model_config_file=${MORPHEUS_ROOT}/models/log-parsing-models/log-parsing-config-20220418.json \
--server_url localhost:8001
--model_config_file=./models/log-parsing-models/log-parsing-config-20220418.json
```

Use `--help` to display information about the command line options:
@@ -110,7 +101,7 @@ PYTHONPATH="examples/log_parsing" \
morpheus --log_level INFO \
--plugin "inference" \
--plugin "postprocessing" \
run --num_threads 1 --pipeline_batch_size 1024 --model_max_batch_size 32 \
run --pipeline_batch_size 1024 --model_max_batch_size 32 \
pipeline-nlp \
from-file --filename ./models/datasets/validation-data/log-parsing-validation-data-input.csv \
deserialize \
8 changes: 7 additions & 1 deletion examples/log_parsing/run.py
@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import logging
import os

import click
@@ -28,6 +29,7 @@
from morpheus.stages.output.write_to_file_stage import WriteToFileStage
from morpheus.stages.preprocess.deserialize_stage import DeserializeStage
from morpheus.stages.preprocess.preprocess_nlp_stage import PreprocessNLPStage
from morpheus.utils.logger import configure_logging


@click.command()
@@ -79,7 +81,7 @@
help="The name of the model that is deployed on Tritonserver.",
)
@click.option("--model_config_file", required=True, help="Model config file.")
@click.option("--server_url", required=True, help="Tritonserver url.")
@click.option("--server_url", required=True, help="Tritonserver url.", default="localhost:8001")
def run_pipeline(
num_threads,
pipeline_batch_size,
@@ -93,6 +95,10 @@ def run_pipeline(
model_config_file,
server_url,
):

# Enable the default logger.
configure_logging(log_level=logging.INFO)

config = Config()
config.mode = PipelineModes.NLP
config.num_threads = num_threads
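The `log_parsing/run.py` change above enables Morpheus' default logger via `configure_logging(log_level=logging.INFO)` so the pipeline prints progress. A stdlib-only stand-in showing the same idea (the real `morpheus.utils.logger.configure_logging` does more; this minimal version is an assumption for illustration):

```python
import io
import logging


def configure_logging(log_level: int = logging.INFO, stream=None) -> logging.Logger:
    """Minimal stand-in for morpheus.utils.logger.configure_logging:
    install a stream handler at the requested level on a named logger."""
    logger = logging.getLogger("pipeline_sketch")
    logger.setLevel(log_level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


# At INFO, debug messages are suppressed while progress messages get through.
buf = io.StringIO()
log = configure_logging(logging.INFO, stream=buf)
log.debug("suppressed at INFO")
log.info("pipeline started")
assert "INFO: pipeline started" in buf.getvalue()
assert "suppressed" not in buf.getvalue()
```

Calling such a helper once at the top of `run_pipeline()` is enough; stages then log through the standard hierarchy without further setup.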
7 changes: 3 additions & 4 deletions examples/nlp_si_detection/README.md
@@ -103,11 +103,10 @@ The following command line is the entire command to build and launch the pipeline

From the Morpheus repo root directory, run:
```bash
export MORPHEUS_ROOT=$(pwd)
# Launch Morpheus printing debug messages
morpheus --log_level=DEBUG \
`# Run a pipeline with 8 threads and a model batch size of 32 (Must match Triton config)` \
run --num_threads=8 --pipeline_batch_size=1024 --model_max_batch_size=32 \
`# Run a pipeline with a model batch size of 32 (Must match Triton config)` \
run --pipeline_batch_size=1024 --model_max_batch_size=32 \
`# Specify a NLP pipeline with 256 sequence length (Must match Triton config)` \
pipeline-nlp --model_seq_length=256 \
`# 1st Stage: Read from file` \
@@ -117,7 +116,7 @@ morpheus --log_level=DEBUG \
`# 3rd Stage: Preprocessing converts the input data into BERT tokens` \
preprocess --vocab_hash_file=data/bert-base-uncased-hash.txt --do_lower_case=True --truncation=True \
`# 4th Stage: Send messages to Triton for inference. Specify the model loaded in Setup` \
inf-triton --model_name=sid-minibert-onnx --server_url=localhost:8000 --force_convert_inputs=True \
inf-triton --model_name=sid-minibert-onnx --server_url=localhost:8000 \
monitor --description "Inference Rate" --smoothing=0.001 --unit inf \
`# 6th Stage: Add results from inference to the messages` \
4 changes: 2 additions & 2 deletions examples/nlp_si_detection/run.sh
@@ -19,12 +19,12 @@ SCRIPT_DIR=${SCRIPT_DIR:-"$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null
export MORPHEUS_ROOT=${MORPHEUS_ROOT:-"$(realpath ${SCRIPT_DIR}/../..)"}

morpheus --log_level=DEBUG \
run --num_threads=8 --pipeline_batch_size=1024 --model_max_batch_size=32 \
run --pipeline_batch_size=1024 --model_max_batch_size=32 \
pipeline-nlp --model_seq_length=256 \
from-file --filename=${MORPHEUS_ROOT}/examples/data/pcap_dump.jsonlines \
deserialize \
preprocess --vocab_hash_file=data/bert-base-uncased-hash.txt --do_lower_case=True --truncation=True \
inf-triton --model_name=sid-minibert-onnx --server_url=localhost:8000 --force_convert_inputs=True \
inf-triton --model_name=sid-minibert-onnx --server_url=localhost:8000 \
monitor --description "Inference Rate" --smoothing=0.001 --unit inf \
add-class \
filter --filter_source=TENSOR \
21 changes: 10 additions & 11 deletions examples/ransomware_detection/README.md
@@ -35,15 +35,15 @@ export MORPHEUS_ROOT=$(pwd)
```

##### Start Triton Inference Server Container
Run the following from the `examples/ransomware_detection` directory to launch Triton and load the `ransomw-model-short-rf` model:

From the Morpheus repo root directory, run the following to launch Triton and load the `ransomw-model-short-rf` model:
```bash
# Run Triton in explicit mode
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models/triton-model-repo nvcr.io/nvidia/tritonserver:23.06-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--model-control-mode=explicit \
--load-model ransomw-model-short-rf
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
-v $PWD/examples/ransomware_detection/models:/models/triton-model-repo nvcr.io/nvidia/tritonserver:23.06-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--model-control-mode=explicit \
--load-model ransomw-model-short-rf
```

##### Verify Model Deployment
@@ -67,14 +67,13 @@ mamba install 'dask>=2023.1.1' 'distributed>=2023.1.1'
```

## Run Ransomware Detection Pipeline
Run the following from the `examples/ransomware_detection` directory to start the ransomware detection pipeline:
Run the following from the root of the Morpheus repo to start the ransomware detection pipeline:

```bash
python run.py --server_url=localhost:8001 \
python examples/ransomware_detection/run.py --server_url=localhost:8001 \
--sliding_window=3 \
--model_name=ransomw-model-short-rf \
--conf_file=./config/ransomware_detection.yaml \
--input_glob=${MORPHEUS_ROOT}/examples/data/appshield/*/snapshot-*/*.json \
--input_glob=./examples/data/appshield/*/snapshot-*/*.json \
--output_file=./ransomware_detection_output.jsonlines
```

4 changes: 3 additions & 1 deletion examples/ransomware_detection/run.py
@@ -33,6 +33,8 @@
from stages.create_features import CreateFeaturesRWStage
from stages.preprocessing import PreprocessingRWStage

CUR_DIR = os.path.dirname(__file__)


@click.command()
@click.option('--debug', default=False)
@@ -64,7 +66,7 @@
@click.option(
"--conf_file",
type=click.STRING,
default="./config/ransomware_detection.yaml",
default=os.path.join(CUR_DIR, "config/ransomware_detection.yaml"),
help="Ransomware detection configuration filepath.",
)
@click.option(
5 changes: 1 addition & 4 deletions examples/root_cause_analysis/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -98,9 +98,6 @@ From the Morpheus repo root directory, run:

```bash
export MORPHEUS_ROOT=$(pwd)
```

```bash
morpheus --log_level=DEBUG \
`# Run a pipeline with 5 threads and a model batch size of 32 (Must match Triton config)` \
run --num_threads=8 --edge_buffer_size=4 --use_cpp=True --pipeline_batch_size=1024 --model_max_batch_size=32 \
@@ -113,7 +110,7 @@ deserialize \
`# 3rd Stage: Preprocessing converts the input data into BERT tokens` \
preprocess --column=log --vocab_hash_file=./data/bert-base-uncased-hash.txt --truncation=True --do_lower_case=True --add_special_tokens=False \
`# 4th Stage: Send messages to Triton for inference. Specify the binary model loaded in Setup` \
inf-triton --force_convert_inputs=True --model_name=root-cause-binary-onnx --server_url=localhost:8001 \
inf-triton --model_name=root-cause-binary-onnx --server_url=localhost:8000 \
`# 5th Stage: Monitor stage prints throughput information to the console` \
monitor --description='Inference rate' --smoothing=0.001 --unit inf \
`# 6th Stage: Add scores from inference to the messages` \