From ae0c2f42078411a4637535cddf7b4eee5f92e281 Mon Sep 17 00:00:00 2001
From: swetha097 <59434434+swetha097@users.noreply.github.com>
Date: Fri, 16 Dec 2022 03:24:20 +0530
Subject: [PATCH 1/3] rocAL: Classification Training Related changes (#1001)

* Zen DNN - Docker & Tests (#924)
* Zen DNN - Docker Updates
* Zen DNN - Sample Updates
* Codacy - Fix
* Zen DNN - Cleanup
* Zen DNN - single layer sample
* Rocal Updates (#921)
* rocal updates for tf training
* updates for rocal
* tf updates and pytorch bug fixes
* repo name change
* Update README.md
* dockerfile update
* [rocAL] Fix rocAL Pybind build issue.
* [rocAL] Remove unused function in pipeline.
* [rocAL] Change rocAL pybind installation from setup.py to wheel. setup.py install is deprecated in python 3.9
* [rocAL] Make TF pets example dataset compatible with tf2.
* [rocAL] Change getImageLabels() compatible with tf.
* [rocAL] Add fix to pick wheel from dist installation folder. Remove the old installation files in conda environment.
* [rocAL] Remove commented statement.
Co-authored-by: shobana-mcw * Docker Update (#928) * turboJPEG version update * turboJPEG version update * turboJPEG version update * TurboJPEG version update * Update mivisionx-opencl-on-ubuntu20.dockerfile * Update zenDNN-HIP.dockerfile * Update level-5.dockerfile * Update level-5.dockerfile * Zen DNN Updates Sync Co-authored-by: Kiriti Gowda * AMD OpenVX Custom Extension - implementation (#925) * custom node implementation files * fix build errors * custom extension changes for working implementation * add README and documentation * update readme * fix codacy issues and CPU flow * fix cadacy warning * Addressed review comments * minor change * fix formating * amd_migraphx - update readme for extension (#929) * amd_custom - fixes build issue (#935) * fixes build issue * Update CMakeLists.txt * migraphx extension - update the readme(#936) * vx_amd_migraphx - tests (#923) * batch size support for migraphx * changing to accept tensors of all batch sizes * creates file with results * bug fix * changes to singular test cases - mnist and resnet50 * readme updates * resolving PR comments * resolving PR comments * resolving PR comments * Readme update to reflect tot * formatting * fixing typo * readme update * readme update * OpenVX HIP backend - report correct number of CUs for gfx10+ in the logs (#930) * Add API to get reader config and decoder config * Add API to obtain max and min aspect ratio from image source evaluator * Add scaling modes support Add support to pass the resize scaling modes Add support to calculate the normalized crop * Minor change * Remove crop parameters and related changes for resize * Fix segmentation fault * Fix error with resize modes * MInor fix : update tensor ROI * Minor fix - center crop * Update the python API for resize with scaling modes and interpolation param * Minor changes * Minor changes * Minor change * Remove center crop related changes * Remove redundant max size check * Remove crop param from node resize * Remove source 
evaluator * Minor fix * Remove the get decoder and reader config API * Remove aspect ratio calculations in source evaluator * Remove decoder and reader config variables Remove crop related changes * Remove decoder and reader config variables Remove crop related changes * Minor fix for max size * Code clean up * Minor change * Minor changes * Minor changes * Minor change * Fix python codacy warnings * Minor codacy fix * Revert "Minor codacy fix" This reverts commit df1dd28427e5e133f23247c0dcece744e9ebf5b1. * Minor change * Minor code changes * Remove API to get max width and height for resize node * Minor fix * Minor changes * Working Image Classification Chnages * Working Image Classification USER GIVEN PARAMS * Add changes in types.py * 1. Code clean up 2. Centre Crop bug fix * Code Clean Up * Add centre_crop changes * ResizeTensor addition * Minor changes in PR * ResizeTensor.cpp - Removing OpenCL backend support * Code Clean Up * Resolving internal PR comments * Resolve the internal review comments -2 * Reesolve build error * runVisionTest - add a new test (#979) * rocAL PyBind - Wheel Package Fix (#982) Co-authored-by: Swetha B S * amd media - device support (#983) * amd_media decoder add parameter for passing deviceid * minor cleanup * fix for review comments * docker update - rpp version update (#986) * Update mivisionx-with-pytorch.dockerfile * Update mivisionx-with-tensorflow.dockerfile * Update level-5.dockerfile * Update mivisionx-on-ubuntu20.dockerfile * Update mivisionx-opencl-on-ubuntu20.dockerfile * rocal - README updates and directory name change (#981) * Update README.md * Update README.md * folder name change * Update README.md * Delete PYTHON_UNITTEST_TEST_FILE.sh * Delete rocAL/rocAL_pybind/example/new_api directory * Update README.md * Update README.md * Update README.md * Create README.md * added new random_crop_dec parameter class * Set the crop values to partial decoder. [rocAL] * Fix undefined reference error in random number generator. 
[rocAL] * Change parameters for rocalFusedCropDecoder wrt new randomgenrator changes. [rocAL] Remove unused paramaters. * Clean up wrt Random number generator. [rocAl] * Convert double to float for aspect ration and random area parameters in fused crop. [rocAL] * Set seed for every batch in paramater random crop. [rocAL] * Clean up. [rocAL] * Fix Bug with seed generation for RNG * rocAL - hardware decoder python support (#987) * rocAL - removing references (#954) * rocAL - Tf pets training (#947) * Zen DNN - Docker & Tests (#924) * Zen DNN - Docker Updates * Zen DNN - Sample Updates * Codacy - Fix * Zen DNN - Cleanup * Zen DNN - single layer sample * Rocal Updates (#921) * rocal updates for tf training * updates for rocal * tf updates and pytorch bug fixes * repo name change * Update README.md * dockerfile update * [rocAL] Fix rocAL Pybind build issue. * [rocAL] Remove unused function in pipeline. * [rocAL] Change rocAL pybind installation from setup.py to wheel. setup.py install is deprecated in python 3.9 * [rocAL] Make TF pets example dataset compatible with tf2. * [rocAL] Change getImageLabels() compatible with tf. * [rocAL] Add fix to pick wheel from dist installation folder. Remove the old installation files in conda environment. * [rocAL] Remove commented statement. 
Co-authored-by: shobana-mcw * Docker Update (#928) * turboJPEG version update * turboJPEG version update * turboJPEG version update * TurboJPEG version update * Update mivisionx-opencl-on-ubuntu20.dockerfile * Update zenDNN-HIP.dockerfile * Update level-5.dockerfile * Update level-5.dockerfile * Zen DNN Updates Sync Co-authored-by: Kiriti Gowda * AMD OpenVX Custom Extension - implementation (#925) * custom node implementation files * fix build errors * custom extension changes for working implementation * add README and documentation * update readme * fix codacy issues and CPU flow * fix cadacy warning * Addressed review comments * minor change * fix formating * amd_migraphx - update readme for extension (#929) * amd_custom - fixes build issue (#935) * fixes build issue * Update CMakeLists.txt * tf_pets_v2 * code_cleanup * minor code cleanup * migraphx extension - update the readme(#936) * vx_amd_migraphx - tests (#923) * batch size support for migraphx * changing to accept tensors of all batch sizes * creates file with results * bug fix * changes to singular test cases - mnist and resnet50 * readme updates * resolving PR comments * resolving PR comments * resolving PR comments * Readme update to reflect tot * formatting * fixing typo * readme update * readme update * OpenVX HIP backend - report correct number of CUs for gfx10+ in the logs (#930) * PyTorch docker file - add argument for specifying version (#938) * add argument for specifying pytorch version for building docker file * add readme for pytorch * rocAL - Fix ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED (#940) * Docker - codacy fix for pr937 (#942) * rocAL - Adding NCHW FP16 SIMD kernel (#926) * Adding NCHW FP16 SIMD kernel for normalization and buffer copy * Fixed some codestyle issues with FP16 kernel * Using FMA SSE instruction for multiply-add ops * Removed extra spaces * Adding fma flag to rocAL CMakeLists * Adding FP16 intrinsics for buffer copies * Setting rounding mode to _MM_FROUND_TO_ZERO * rocAL - 
README updates for video unit test (#939) * Add README support for video unit test * Update Readme for video unit test * Minor test_suite fix * Update video unit test Readme * Update Readme * Updated README with the explation of test cases and arguments * Minor fix * Add test case samples to video unit test README Also add images for README * Modify sample images * Change sample image dimension * Minor README changes * Minor README changes * Minor change * Minor fix to handle relative input path in video unit test * Resolve codacy warnings * Minor change * Add correct video reader outputs * MIVisionX - cmake cleanup (#943) * OpenCV EXT - Updates & Tests (#944) * OpenCV - Readme updates * Updates - Readme & Tests * OpenCV - Tests Added * Updates * ZenDNN - model compiler (#941) * model compiler - zendnn - mnist layers * bug fix + lrn * layers: batch norm, sum ; bug fixes * fixes lgtm errors * bug fixes * codacy fixes * bug fix * codacy fixes * Update train_withROCAL_withTFRecordReader.py * Resolved PR comments Co-authored-by: Kiriti Gowda Co-authored-by: LakshmiKumar23 Co-authored-by: shobana-mcw Co-authored-by: Kiriti Gowda Co-authored-by: Rajy Rawther Co-authored-by: Aryan Salmanpour Co-authored-by: root Co-authored-by: Sundar Rajan Vaithiyanathan <99159823+SundarRajan28@users.noreply.github.com> Co-authored-by: Fiona-MCW <70996026+fiona-gladwin@users.noreply.github.com> * rocAL - fix bug in the usage of GetImageName (#955) * fix bug in the usage of GetImageName * add ground-truth labels .txt file for tinydataset * rename file to all smaller case * AMD - OpenVX Float16 Support (#956) * AMD - Float16 Support * Remove redundant def * OpenVX FP16 - CPP FP16 support * AMD Media Decoder - Measure Decode Time (#964) * added what Aryan recommended to decoder performance measure code * added transfer time measure * rocAL - Changing Python Lib Path (#959) * Changing Python Lib Path * Keep the checks for different env intact Co-authored-by: Swetha B S * MIVisionX - 
CMakeList Updates (#967) * CMakeList Updates * CMakeList - Cleanup * Setup - Updates * rocAL - CMakeList Cleanup * rocAL - Resize scaling modes support (#950) * Zen DNN - Docker & Tests (#924) * Zen DNN - Docker Updates * Zen DNN - Sample Updates * Codacy - Fix * Zen DNN - Cleanup * Zen DNN - single layer sample * Rocal Updates (#921) * rocal updates for tf training * updates for rocal * tf updates and pytorch bug fixes * repo name change * Update README.md * dockerfile update * [rocAL] Fix rocAL Pybind build issue. * [rocAL] Remove unused function in pipeline. * [rocAL] Change rocAL pybind installation from setup.py to wheel. setup.py install is deprecated in python 3.9 * [rocAL] Make TF pets example dataset compatible with tf2. * [rocAL] Change getImageLabels() compatible with tf. * [rocAL] Add fix to pick wheel from dist installation folder. Remove the old installation files in conda environment. * [rocAL] Remove commented statement. Co-authored-by: shobana-mcw * Docker Update (#928) * turboJPEG version update * turboJPEG version update * turboJPEG version update * TurboJPEG version update * Update mivisionx-opencl-on-ubuntu20.dockerfile * Update zenDNN-HIP.dockerfile * Update level-5.dockerfile * Update level-5.dockerfile * Zen DNN Updates Sync Co-authored-by: Kiriti Gowda * AMD OpenVX Custom Extension - implementation (#925) * custom node implementation files * fix build errors * custom extension changes for working implementation * add README and documentation * update readme * fix codacy issues and CPU flow * fix cadacy warning * Addressed review comments * minor change * fix formating * amd_migraphx - update readme for extension (#929) * amd_custom - fixes build issue (#935) * fixes build issue * Update CMakeLists.txt * migraphx extension - update the readme(#936) * vx_amd_migraphx - tests (#923) * batch size support for migraphx * changing to accept tensors of all batch sizes * creates file with results * bug fix * changes to singular test cases - mnist 
and resnet50 * readme updates * resolving PR comments * resolving PR comments * resolving PR comments * Readme update to reflect tot * formatting * fixing typo * readme update * readme update * OpenVX HIP backend - report correct number of CUs for gfx10+ in the logs (#930) * Add API to get reader config and decoder config * Add API to obtain max and min aspect ratio from image source evaluator * Add scaling modes support Add support to pass the resize scaling modes Add support to calculate the normalized crop * Minor change * Remove crop parameters and related changes for resize * Fix segmentation fault * Fix error with resize modes * MInor fix : update tensor ROI * Minor fix - center crop * Update the python API for resize with scaling modes and interpolation param * Minor changes * Minor changes * Minor change * Remove center crop related changes * Remove redundant max size check * Remove crop param from node resize * Remove source evaluator * Minor fix * Remove the get decoder and reader config API * Remove aspect ratio calculations in source evaluator * Remove decoder and reader config variables Remove crop related changes * Remove decoder and reader config variables Remove crop related changes * Minor fix for max size * Code clean up * Minor change * Minor changes * Minor changes * Minor change * Fix python codacy warnings * Minor codacy fix * Revert "Minor codacy fix" This reverts commit df1dd28427e5e133f23247c0dcece744e9ebf5b1. 
* Minor change * Minor code changes * Remove API to get max width and height for resize node * Resize ROI changes * Code cleanup * Rename variables * Code cleanup * MInor changes * Minor change * Minor fix * Minor changes * Modify logic to calculate max size for each mode * Fix max_size calculation algorithm * Fix max_size calculation logic * Minor changes * Minor change * Add space after if * Minor change * Minor changes Co-authored-by: Kiriti Gowda Co-authored-by: LakshmiKumar23 Co-authored-by: shobana-mcw Co-authored-by: Kiriti Gowda Co-authored-by: Rajy Rawther Co-authored-by: Aryan Salmanpour Co-authored-by: IndumathiR * rocAL - fix for copy-write violation (#968) * fix for copywrite violation * fix for review comments and other clean_up * minor clean_up * revert run.sh changes * fix codacy warnings * add jupyter notebook for decoder * rocAL - add missing header (#972) * rocAL - add missing header * Tested Config Updates * OpenVX Framework - update max tensor dims to 6 (#970) * add pipeline decorator for rocal * fix build error * fix script for jupyter notebook * changes to Jupyter notebook to support HW decoder * fixed review comments * hardcoding decoder device to cpu for python unit tests * add option for decoder.py to run on gpu/cpu Co-authored-by: LakshmiKumar23 Co-authored-by: swetha097 <59434434+swetha097@users.noreply.github.com> Co-authored-by: Kiriti Gowda Co-authored-by: shobana-mcw Co-authored-by: Kiriti Gowda Co-authored-by: Aryan Salmanpour Co-authored-by: root Co-authored-by: Sundar Rajan Vaithiyanathan <99159823+SundarRajan28@users.noreply.github.com> Co-authored-by: Fiona-MCW <70996026+fiona-gladwin@users.noreply.github.com> Co-authored-by: Pavel Tcherniaev Co-authored-by: Swetha B S Co-authored-by: IndumathiR * Fix Python build * Wrap long lines of code * Fix spacing & add copyright in pybind * amd-openvx-hip: create a separate stream for graph (#996) * rocAL - CMake and header files Clean up (#991) * rocAL - removing references (#954) * 
rocAL - Tf pets training (#947) * Zen DNN - Docker & Tests (#924) * Zen DNN - Docker Updates * Zen DNN - Sample Updates * Codacy - Fix * Zen DNN - Cleanup * Zen DNN - single layer sample * Rocal Updates (#921) * rocal updates for tf training * updates for rocal * tf updates and pytorch bug fixes * repo name change * Update README.md * dockerfile update * [rocAL] Fix rocAL Pybind build issue. * [rocAL] Remove unused function in pipeline. * [rocAL] Change rocAL pybind installation from setup.py to wheel. setup.py install is deprecated in python 3.9 * [rocAL] Make TF pets example dataset compatible with tf2. * [rocAL] Change getImageLabels() compatible with tf. * [rocAL] Add fix to pick wheel from dist installation folder. Remove the old installation files in conda environment. * [rocAL] Remove commented statement. Co-authored-by: shobana-mcw * Docker Update (#928) * turboJPEG version update * turboJPEG version update * turboJPEG version update * TurboJPEG version update * Update mivisionx-opencl-on-ubuntu20.dockerfile * Update zenDNN-HIP.dockerfile * Update level-5.dockerfile * Update level-5.dockerfile * Zen DNN Updates Sync Co-authored-by: Kiriti Gowda * AMD OpenVX Custom Extension - implementation (#925) * custom node implementation files * fix build errors * custom extension changes for working implementation * add README and documentation * update readme * fix codacy issues and CPU flow * fix cadacy warning * Addressed review comments * minor change * fix formating * amd_migraphx - update readme for extension (#929) * amd_custom - fixes build issue (#935) * fixes build issue * Update CMakeLists.txt * tf_pets_v2 * code_cleanup * minor code cleanup * migraphx extension - update the readme(#936) * vx_amd_migraphx - tests (#923) * batch size support for migraphx * changing to accept tensors of all batch sizes * creates file with results * bug fix * changes to singular test cases - mnist and resnet50 * readme updates * resolving PR comments * resolving PR comments * 
resolving PR comments * Readme update to reflect tot * formatting * fixing typo * readme update * readme update * OpenVX HIP backend - report correct number of CUs for gfx10+ in the logs (#930) * PyTorch docker file - add argument for specifying version (#938) * add argument for specifying pytorch version for building docker file * add readme for pytorch * rocAL - Fix ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED (#940) * Docker - codacy fix for pr937 (#942) * rocAL - Adding NCHW FP16 SIMD kernel (#926) * Adding NCHW FP16 SIMD kernel for normalization and buffer copy * Fixed some codestyle issues with FP16 kernel * Using FMA SSE instruction for multiply-add ops * Removed extra spaces * Adding fma flag to rocAL CMakeLists * Adding FP16 intrinsics for buffer copies * Setting rounding mode to _MM_FROUND_TO_ZERO * rocAL - README updates for video unit test (#939) * Add README support for video unit test * Update Readme for video unit test * Minor test_suite fix * Update video unit test Readme * Update Readme * Updated README with the explation of test cases and arguments * Minor fix * Add test case samples to video unit test README Also add images for README * Modify sample images * Change sample image dimension * Minor README changes * Minor README changes * Minor change * Minor fix to handle relative input path in video unit test * Resolve codacy warnings * Minor change * Add correct video reader outputs * MIVisionX - cmake cleanup (#943) * OpenCV EXT - Updates & Tests (#944) * OpenCV - Readme updates * Updates - Readme & Tests * OpenCV - Tests Added * Updates * ZenDNN - model compiler (#941) * model compiler - zendnn - mnist layers * bug fix + lrn * layers: batch norm, sum ; bug fixes * fixes lgtm errors * bug fixes * codacy fixes * bug fix * codacy fixes * Update train_withROCAL_withTFRecordReader.py * Resolved PR comments Co-authored-by: Kiriti Gowda Co-authored-by: LakshmiKumar23 Co-authored-by: shobana-mcw Co-authored-by: Kiriti Gowda Co-authored-by: Rajy Rawther 
Co-authored-by: Aryan Salmanpour Co-authored-by: root Co-authored-by: Sundar Rajan Vaithiyanathan <99159823+SundarRajan28@users.noreply.github.com> Co-authored-by: Fiona-MCW <70996026+fiona-gladwin@users.noreply.github.com> * rocAL - fix bug in the usage of GetImageName (#955) * fix bug in the usage of GetImageName * add ground-truth labels .txt file for tinydataset * rename file to all smaller case * AMD - OpenVX Float16 Support (#956) * AMD - Float16 Support * Remove redundant def * OpenVX FP16 - CPP FP16 support * migraphx - palamida scan fix (#984) * Delete image_0.jpg * Delete image_1.jpg * Delete image_4.jpg * image update * Readme updates - OpenVX Trademark Updates (#989) * Readme updates - OpenVX Trademark Updates * Readme - Attribution Updates * Readme - Codacy Fix * Media - License Issue Fix (#990) * Fix include path issue in image augmentation app.[rocAL] * CMake clean up. [rocAL] * Clean up. Introduce header files to include all nodes and meta nodes headers. [rocAL] * Change include directories path in image_augmentation app. * CMake clean up in rocAL utilities. * Clean up. 
Co-authored-by: LakshmiKumar23 Co-authored-by: swetha097 <59434434+swetha097@users.noreply.github.com> Co-authored-by: Kiriti Gowda Co-authored-by: Kiriti Gowda Co-authored-by: Rajy Rawther Co-authored-by: Aryan Salmanpour Co-authored-by: root Co-authored-by: Sundar Rajan Vaithiyanathan <99159823+SundarRajan28@users.noreply.github.com> Co-authored-by: Fiona-MCW <70996026+fiona-gladwin@users.noreply.github.com> * Resolve the PR comments * Resolve PR Comments * Fix the bug with Resize Node * AMD OpenVX - HIP cleanup (#997) * amd-openvx-hip: create a separate stream for graph * removed hipstream associated with context since it is not used * fix for review comments * docker - Pytorch with mesa driver (#998) * Create mivisionx-with-pytorch-with-mesa.dockerfile Adding dockerfile for pytorch with mesa driver for hardware decode * bug fixes to dockerfile Co-authored-by: Lakshmi * OS Support - Updates (#994) * Docker - Archive Old OS * Docker Updates - Fix Support * Setup - Updates * OpenCV - Upgrade to 4.6.0 * Docker - Name Fix * U20 Fix * Docker Readme - Updates * Minor change in the unittest * Remove RPATH/RUNPATH - Adding SKIP RPATH flag (#995) * Adding SKIP RPATH flag * Update Review Comments-SKIP_RPATH replaced with SKIP_INSTALL_RPATH, disable use_link_path * Review Comments Updated * Resolve the internal PR comments * Minor change in image.cpp * Minor change in decoder.h * Minor change in fused_crop_decoder.cpp * Minor changes * Minor changes * Minor changes * Correct spacing issues * Wrap long lines of code in decoders.py * Remove extra line in readers.py * Removes extra line from fused_crop_decoder.cpp * Remove Trailing white space in rocal_pybind.cpp * Wrapping up the long lines of code in decoders.py * Resolving PR comments * Update decoders.py Co-authored-by: Kiriti Gowda Co-authored-by: LakshmiKumar23 Co-authored-by: shobana-mcw Co-authored-by: Kiriti Gowda Co-authored-by: Rajy Rawther Co-authored-by: Aryan Salmanpour Co-authored-by: fiona-gladwin 
Co-authored-by: root Co-authored-by: Swetha B S Co-authored-by: Swetha B S Co-authored-by: root Co-authored-by: Sundar Rajan Vaithiyanathan <99159823+SundarRajan28@users.noreply.github.com> Co-authored-by: Fiona-MCW <70996026+fiona-gladwin@users.noreply.github.com> Co-authored-by: Pavel Tcherniaev Co-authored-by: IndumathiR Co-authored-by: Lakshmi Co-authored-by: arvindcheru <90783369+arvindcheru@users.noreply.github.com> --- .../include/api/rocal_api_data_loaders.h | 100 +++---- rocAL/rocAL/include/decoders/image/decoder.h | 28 +- .../decoders/image/fused_crop_decoder.h | 8 +- .../include/decoders/image/open_cv_decoder.h | 6 +- .../decoders/image/turbo_jpeg_decoder.h | 8 +- .../include/decoders/video/hw_jpeg_decoder.h | 6 +- .../loaders/image/image_read_and_decode.h | 2 + .../loaders/image/node_fused_jpeg_crop.h | 12 +- .../image/node_fused_jpeg_crop_single_shard.h | 15 +- .../rocAL/include/parameters/parameter_crop.h | 5 +- .../include/parameters/parameter_factory.h | 1 + .../parameter_random_crop_decoder.h | 63 +++++ rocAL/rocAL/include/pipeline/image.h | 4 +- .../source/api/rocal_api_data_loaders.cpp | 106 +++----- .../geometry_augmentations/node_crop.cpp | 10 +- .../node_crop_mirror_normalize.cpp | 4 +- .../geometry_augmentations/node_resize.cpp | 7 +- .../decoders/image/fused_crop_decoder.cpp | 105 ++------ .../loaders/image/image_read_and_decode.cpp | 28 +- .../loaders/image/node_fused_jpeg_crop.cpp | 16 +- .../node_fused_jpeg_crop_single_shard.cpp | 18 +- .../source/parameters/parameter_factory.cpp | 11 +- .../parameter_random_crop_decoder.cpp | 112 ++++++++ ...rali_crop.cpp => parameter_rocal_crop.cpp} | 67 +++-- rocAL/rocAL/source/pipeline/image.cpp | 4 +- rocAL/rocAL/source/pipeline/master_graph.cpp | 2 +- rocAL/rocAL_pybind/amd/rocal/decoders.py | 246 +++++++++--------- rocAL/rocAL_pybind/amd/rocal/fn.py | 5 +- rocAL/rocAL_pybind/amd/rocal/pipeline.py | 1 + rocAL/rocAL_pybind/amd/rocal/readers.py | 73 +++--- rocAL/rocAL_pybind/rocal_pybind.cpp | 212 
++------------- rocAL/rocAL_pybind/run.sh | 4 +- .../rocAL/rocAL_unittests/rocAL_unittests.cpp | 26 +- 33 files changed, 615 insertions(+), 700 deletions(-) create mode 100644 rocAL/rocAL/include/parameters/parameter_random_crop_decoder.h create mode 100644 rocAL/rocAL/source/parameters/parameter_random_crop_decoder.cpp rename rocAL/rocAL/source/parameters/{parameter_rali_crop.cpp => parameter_rocal_crop.cpp} (58%) diff --git a/rocAL/rocAL/include/api/rocal_api_data_loaders.h b/rocAL/rocAL/include/api/rocal_api_data_loaders.h index 4efb236b38..f15676eee1 100644 --- a/rocAL/rocAL/include/api/rocal_api_data_loaders.h +++ b/rocAL/rocAL/include/api/rocal_api_data_loaders.h @@ -155,13 +155,12 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSource(RocalContext cont /// \param rocal_color_format The color format the images will be decoded to. /// \param shard_count Defines the parallelism level by internally sharding the input dataset and load/decode using multiple decoder/loader instances. Using shard counts bigger than 1 improves the load/decode performance if compute resources (CPU cores) are available. /// \param is_output Determines if the user wants the loaded images to be part of the output or not. +/// \param area_factor Determines how much area to be cropped. Ranges from from 0.08 - 1. +/// \param aspect_ratio Determines the aspect ration of crop. Ranges from 0.75 to 1.33. +/// \param num_attempts Maximum number of attempts to generate crop. Default 10 /// \param decode_size_policy /// \param max_width The maximum width of the decoded images, larger or smaller will be resized to closest /// \param max_height The maximum height of the decoded images, larger or smaller will be resized to closest -/// \param area_factor Determines how much area to be cropped. Ranges from from 0.08 - 1. -/// \param aspect_ratio Determines the aspect ration of crop. Ranges from 0.75 to 1.33. 
-/// \param y_drift_factor - Determines from top left corder to height (crop_height), where to start cropping other wise try for a central crop or take image dims. Ranges from 0 to 1. -/// \param x_drift_factor - Determines from top left corder to width (crop_width), where to start cropping other wise try for a central crop or take image dims. Ranges from 0 to 1. /// \return Reference to the output image extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourcePartial(RocalContext p_context, const char* source_path, @@ -169,12 +168,13 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourcePartial(RocalConte RocalImageColor rocal_color_format, unsigned internal_shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle = false, bool loop = false, - RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MAX_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalFloatParam area_factor = NULL, RocalFloatParam aspect_ratio = NULL, - RocalFloatParam y_drift_factor = NULL, RocalFloatParam x_drift_factor = NULL); + RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, + unsigned max_width = 0, unsigned max_height = 0); /// Creates JPEG image reader and partial decoder. It allocates the resources and objects required to read and decode COCO Jpeg images stored on the file systems. It has internal sharding capability to load/decode in parallel is user wants. /// If images are not Jpeg compressed they will be ignored. @@ -190,8 +190,6 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourcePartial(RocalConte /// \param max_height The maximum height of the decoded images, larger or smaller will be resized to closest /// \param area_factor Determines how much area to be cropped. Ranges from from 0.08 - 1. /// \param aspect_ratio Determines the aspect ration of crop. Ranges from 0.75 to 1.33. 
-/// \param y_drift_factor - Determines from top left corder to height (crop_height), where to start cropping other wise try for a central crop or take image dims. Ranges from 0 to 1. -/// \param x_drift_factor - Determines from top left corder to width (crop_width), where to start cropping other wise try for a central crop or take image dims. Ranges from 0 to 1. /// \return Reference to the output image extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourcePartialSingleShard(RocalContext p_context, const char* source_path, @@ -200,14 +198,13 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourcePartialSingleShard unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle = false, bool loop = false, - RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MAX_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalFloatParam area_factor = NULL, RocalFloatParam aspect_ratio = NULL, - RocalFloatParam y_drift_factor = NULL, RocalFloatParam x_drift_factor = NULL); - -/// Creates JPEG image reader and decoder. It allocates the resources and objects required to read and decode COCO Jpeg images stored on the file systems. It accepts external sharding information to load a singe shard. 
only + RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, + unsigned max_width = 0, unsigned max_height = 0); /// \param rocal_context Rocal context /// \param source_path A NULL terminated char string pointing to the location on the disk /// \param json_path Path to the COCO Json File @@ -220,18 +217,18 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourcePartialSingleShard /// \param max_height The maximum height of the decoded images, larger or smaller will be resized to closest /// \param rocal_decoder_type Determines the decoder_type, tjpeg or hwdec /// \return Reference to the output image -extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourceSingleShard(RocalContext context, - const char* source_path, - const char* json_path, - RocalImageColor color_format, - unsigned shard_id, - unsigned shard_count, - bool is_output , - bool shuffle = false, - bool loop = false, - RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalDecoderType rocal_decoder_type=RocalDecoderType::ROCAL_DECODER_TJPEG); +extern "C" RocalImage ROCAL_API_CALL rocalJpegCOCOFileSourceSingleShard(RocalContext context, + const char *source_path, + const char *json_path, + RocalImageColor color_format, + unsigned shard_id, + unsigned shard_count, + bool is_output, + bool shuffle = false, + bool loop = false, + RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, + unsigned max_width = 0, unsigned max_height = 0, + RocalDecoderType rocal_decoder_type = RocalDecoderType::ROCAL_DECODER_TJPEG); /// Creates JPEG image reader and decoder for Caffe LMDB records. It allocates the resources and objects required to read and decode Jpeg images stored in Caffe LMDB Records. It has internal sharding capability to load/decode in parallel is user wants. /// If images are not Jpeg compressed they will be ignored. 
@@ -392,27 +389,27 @@ extern "C" RocalImage ROCAL_API_CALL rocalMXNetRecordSourceSingleShard(RocalCo /// \param rocal_color_format The color format the images will be decoded to. /// \param num_threads Defines the parallelism level by internally sharding the input dataset and load/decode using multiple decoder/loader instances. Using shard counts bigger than 1 improves the load/decode performance if compute resources (CPU cores) are available. /// \param is_output Determines if the user wants the loaded images to be part of the output or not. +/// \param area_factor Determines how much area to be cropped. Ranges from from 0.08 - 1. +/// \param aspect_ratio Determines the aspect ration of crop. Ranges from 0.75 to 1.33. +/// \param num_attempts Maximum number of attempts to generate crop. Default 10 /// \param shuffle Determines if the user wants to shuffle the dataset or not. /// \param loop Determines if the user wants to indefinitely loops through images or not. /// \param decode_size_policy /// \param max_width The maximum width of the decoded images, larger or smaller will be resized to closest /// \param max_height The maximum height of the decoded images, larger or smaller will be resized to closest -/// \param area_factor Determines how much area to be cropped. Ranges from from 0.08 - 1. -/// \param aspect_ratio Determines the aspect ration of crop. Ranges from 0.75 to 1.33. -/// \param y_drift_factor - Determines from top left corder to height (crop_height), where to start cropping other wise try for a central crop or take image dims. Ranges from 0 to 1. -/// \param x_drift_factor - Determines from top left corder to width (crop_width), where to start cropping other wise try for a central crop or take image dims. Ranges from 0 to 1. 
/// \return Reference to the output image extern "C" RocalImage ROCAL_API_CALL rocalFusedJpegCrop(RocalContext context, const char* source_path, RocalImageColor rocal_color_format, unsigned num_threads, bool is_output , + std::vector<float>& area_factor, + std::vector<float>& aspect_ratio, + unsigned num_attempts, bool shuffle = false, bool loop = false, - RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MAX_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalFloatParam area_factor = NULL, RocalFloatParam aspect_ratio = NULL, - RocalFloatParam y_drift_factor = NULL, RocalFloatParam x_drift_factor = NULL); + RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, + unsigned max_width = 0, unsigned max_height = 0); /// Creates JPEG image reader and partial decoder. It allocates the resources and objects required to read and decode Jpeg images stored on the file systems. It accepts external sharding information to load a singe shard. only /// \param context Rocal context @@ -421,6 +418,9 @@ extern "C" RocalImage ROCAL_API_CALL rocalFusedJpegCrop(RocalContext context, /// \param shard_id Shard id for this loader /// \param shard_count Total shard count /// \param is_output Determines if the user wants the loaded images to be part of the output or not. +/// \param area_factor Determines how much area is to be cropped. Ranges from 0.08 to 1. +/// \param aspect_ratio Determines the aspect ratio of the crop. Ranges from 0.75 to 1.33. +/// \param num_attempts Maximum number of attempts to generate a crop.
Default 10 /// \param decode_size_policy /// \param max_width The maximum width of the decoded images, larger or smaller will be resized to closest /// \param max_height The maximum height of the decoded images, larger or smaller will be resized to closest @@ -431,12 +431,13 @@ extern "C" RocalImage ROCAL_API_CALL rocalFusedJpegCropSingleShard(RocalContex unsigned shard_id, unsigned shard_count, bool is_output , + std::vector<float>& area_factor, + std::vector<float>& aspect_ratio, + unsigned num_attempts, bool shuffle = false, bool loop = false, - RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MAX_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalFloatParam area_factor = NULL, RocalFloatParam aspect_ratio = NULL, - RocalFloatParam y_drift_factor = NULL, RocalFloatParam x_drift_factor = NULL); + RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, + unsigned max_width = 0, unsigned max_height = 0); /// Creates TensorFlow records JPEG image reader and decoder. It allocates the resources and objects required to read and decode Jpeg images stored on the file systems. It has internal sharding capability to load/decode in parallel is user wants. /// If images are not Jpeg compressed they will be ignored. @@ -686,6 +687,9 @@ extern "C" RocalStatus ROCAL_API_CALL rocalResetLoaders(RocalContext context); /// \param shard_id Shard id for this loader /// \param shard_count Total shard count /// \param is_output Determines if the user wants the loaded images to be part of the output or not. +/// \param area_factor Determines how much area is to be cropped. Ranges from 0.08 to 1. +/// \param aspect_ratio Determines the aspect ratio of the crop. Ranges from 0.75 to 1.33. +/// \param num_attempts Maximum number of attempts to generate a crop. Default 10 /// \param shuffle Determines if the user wants to shuffle the dataset or not. /// \param loop Determines if the user wants to indefinitely loops through images or not.
/// \param decode_size_policy @@ -698,12 +702,13 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCaffeLMDBRecordSourcePartialSing unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle = false, bool loop = false, RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalFloatParam area_factor = NULL, RocalFloatParam aspect_ratio = NULL, - RocalFloatParam y_drift_factor = NULL, RocalFloatParam x_drift_factor = NULL ); + unsigned max_width = 0, unsigned max_height = 0); /// Creates JPEG image reader and partial decoder for Caffe2 LMDB records. It allocates the resources and objects required to read and decode Jpeg images stored in Caffe22 LMDB Records. It has internal sharding capability to load/decode in parallel is user wants. /// \param rocal_context Rocal context @@ -724,12 +729,13 @@ extern "C" RocalImage ROCAL_API_CALL rocalJpegCaffe2LMDBRecordSourcePartialSin unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle = false, bool loop = false, - RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MAX_SIZE, - unsigned max_width = 0, unsigned max_height = 0, - RocalFloatParam area_factor = NULL, RocalFloatParam aspect_ratio = NULL, - RocalFloatParam y_drift_factor = NULL, RocalFloatParam x_drift_factor = NULL ); + RocalImageSizeEvaluationPolicy decode_size_policy = ROCAL_USE_MOST_FREQUENT_SIZE, + unsigned max_width = 0, unsigned max_height = 0); #endif //MIVISIONX_ROCAL_API_DATA_LOADERS_H diff --git a/rocAL/rocAL/include/decoders/image/decoder.h b/rocAL/rocAL/include/decoders/image/decoder.h index 31ac09165b..8e21f32a79 100644 --- a/rocAL/rocAL/include/decoders/image/decoder.h +++ b/rocAL/rocAL/include/decoders/image/decoder.h @@ -26,6 +26,7 @@ THE SOFTWARE. 
#include #include #include "parameter_factory.h" +#include "parameter_random_crop_decoder.h" enum class DecoderType { @@ -45,20 +46,18 @@ class DecoderConfig explicit DecoderConfig(DecoderType type):_type(type){} virtual DecoderType type() {return _type; }; DecoderType _type = DecoderType::TURBO_JPEG; - std::vector*> _crop_param; - void set_crop_param(std::vector*> crop_param) { _crop_param = std::move(crop_param); }; - std::vector get_crop_param(){ - std::vector crop_mul(4); - _crop_param[0]->renew(); - crop_mul[0] = _crop_param[0]->get(); - _crop_param[1]->renew(); - crop_mul[1] = _crop_param[1]->get(); - _crop_param[2]->renew(); - crop_mul[2] = _crop_param[2]->get(); - _crop_param[3]->renew(); - crop_mul[3] = _crop_param[3]->get(); - return crop_mul; - }; + void set_random_area(std::vector &random_area) { _random_area = std::move(random_area); } + void set_random_aspect_ratio(std::vector &random_aspect_ratio) { _random_aspect_ratio = std::move(random_aspect_ratio); } + void set_num_attempts(unsigned num_attempts) { _num_attempts = num_attempts; } + std::vector get_random_area() { return _random_area; } + std::vector get_random_aspect_ratio() { return _random_aspect_ratio; } + unsigned get_num_attempts() { return _num_attempts; } + void set_seed(int seed) { _seed = seed; } + int get_seed() { return _seed; } +private: + std::vector _random_area, _random_aspect_ratio; + unsigned _num_attempts = 10; + int _seed = std::time(0); //seed for decoder random crop }; @@ -114,4 +113,5 @@ class Decoder virtual bool is_partial_decoder() = 0; virtual void set_bbox_coords(std::vector bbox_coords) = 0; virtual std::vector get_bbox_coords() = 0; + virtual void set_crop_window(CropWindow &crop_window) = 0; }; diff --git a/rocAL/rocAL/include/decoders/image/fused_crop_decoder.h b/rocAL/rocAL/include/decoders/image/fused_crop_decoder.h index cdc4167ecf..5bebce408e 100644 --- a/rocAL/rocAL/include/decoders/image/fused_crop_decoder.h +++ 
b/rocAL/rocAL/include/decoders/image/fused_crop_decoder.h @@ -56,9 +56,10 @@ class FusedCropTJDecoder : public Decoder { ~FusedCropTJDecoder() override; void initialize(int device_id) override {}; - bool is_partial_decoder() override { return _is_partial_decoder; }; - void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord;}; - std::vector get_bbox_coords() override { return _bbox_coord;} + bool is_partial_decoder() override { return _is_partial_decoder; } + void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord; } + std::vector get_bbox_coords() override { return _bbox_coord; } + void set_crop_window(CropWindow &crop_window) override { _crop_window = crop_window; } private: tjhandle m_jpegDecompressor; @@ -83,4 +84,5 @@ class FusedCropTJDecoder : public Decoder { }; bool _is_partial_decoder = true; std::vector _bbox_coord; + CropWindow _crop_window; }; diff --git a/rocAL/rocAL/include/decoders/image/open_cv_decoder.h b/rocAL/rocAL/include/decoders/image/open_cv_decoder.h index 95692a1df3..ced1e95c6c 100644 --- a/rocAL/rocAL/include/decoders/image/open_cv_decoder.h +++ b/rocAL/rocAL/include/decoders/image/open_cv_decoder.h @@ -64,8 +64,9 @@ class CVDecoder : public Decoder { Decoder::ColorFormat desired_decoded_color_format, DecoderConfig config, bool keep_original_size=false) override; bool is_partial_decoder() override { return _is_partial_decoder; } - void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord;} - std::vector get_bbox_coords() override { return _bbox_coord;} + void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord; } + void set_crop_window(CropWindow &crop_window) override { _crop_window = crop_window; } + std::vector get_bbox_coords() override { return _bbox_coord; } //virtual Status decode(unsigned char* input_buffer, size_t input_size, unsigned char* output_buffer,int desired_width, int desired_height, ColorFormat desired_color); void 
initialize(int device_id) override {}; ~CVDecoder() override; @@ -76,5 +77,6 @@ class CVDecoder : public Decoder { cv::Mat m_mat_orig; bool _is_partial_decoder = false; std::vector _bbox_coord; + CropWindow _crop_window; }; #endif diff --git a/rocAL/rocAL/include/decoders/image/turbo_jpeg_decoder.h b/rocAL/rocAL/include/decoders/image/turbo_jpeg_decoder.h index 86777edb27..9f333ecb67 100644 --- a/rocAL/rocAL/include/decoders/image/turbo_jpeg_decoder.h +++ b/rocAL/rocAL/include/decoders/image/turbo_jpeg_decoder.h @@ -57,9 +57,10 @@ class TJDecoder : public Decoder { ~TJDecoder() override; void initialize(int device_id) override {}; - bool is_partial_decoder() override { return _is_partial_decoder; } ; - void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord;}; - std::vector get_bbox_coords() override { return _bbox_coord;} + bool is_partial_decoder() override { return _is_partial_decoder; } + void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord; } + void set_crop_window(CropWindow &crop_window) override { _crop_window = crop_window; } + std::vector get_bbox_coords() override { return _bbox_coord; } private: tjhandle m_jpegDecompressor; const static unsigned SCALING_FACTORS_COUNT = 16; @@ -84,4 +85,5 @@ class TJDecoder : public Decoder { bool _is_partial_decoder = false; std::vector _bbox_coord; const static unsigned _max_scaling_factor = 8; + CropWindow _crop_window; }; diff --git a/rocAL/rocAL/include/decoders/video/hw_jpeg_decoder.h b/rocAL/rocAL/include/decoders/video/hw_jpeg_decoder.h index c1221bdd0b..db72f8bf07 100644 --- a/rocAL/rocAL/include/decoders/video/hw_jpeg_decoder.h +++ b/rocAL/rocAL/include/decoders/video/hw_jpeg_decoder.h @@ -68,8 +68,9 @@ class HWJpegDecoder : public Decoder { ~HWJpegDecoder() override; void initialize(int device_id=0); - bool is_partial_decoder() override { return _is_partial_decoder; }; - void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord;}; + 
bool is_partial_decoder() override { return _is_partial_decoder; } + void set_bbox_coords(std::vector bbox_coord) override { _bbox_coord = bbox_coord;} + void set_crop_window(CropWindow &crop_window) override { _crop_window = crop_window;} std::vector get_bbox_coords() override { return _bbox_coord;} private: @@ -87,6 +88,7 @@ class HWJpegDecoder : public Decoder { bool _is_partial_decoder = false; std::vector _bbox_coord; + CropWindow _crop_window; }; #endif diff --git a/rocAL/rocAL/include/loaders/image/image_read_and_decode.h b/rocAL/rocAL/include/loaders/image/image_read_and_decode.h index fb90f38973..531948adcf 100644 --- a/rocAL/rocAL/include/loaders/image/image_read_and_decode.h +++ b/rocAL/rocAL/include/loaders/image/image_read_and_decode.h @@ -29,6 +29,7 @@ THE SOFTWARE. #include "reader_factory.h" #include "timing_debug.h" #include "loader_module.h" +#include "parameter_random_crop_decoder.h" /** * Compute the scaled value of dimension using the given scaling @@ -95,5 +96,6 @@ class ImageReadAndDecode std::vector> _bbox_coords, _crop_coords_batch; std::shared_ptr _randombboxcrop_meta_data_reader = nullptr; pCropCord _CropCord; + RocalRandomCropDecParam *_random_crop_dec_param = nullptr; }; diff --git a/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop.h b/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop.h index b13e93a4a0..1d3823c2eb 100644 --- a/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop.h +++ b/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop.h @@ -43,7 +43,7 @@ class FusedJpegCropNode: public Node /// for example if there are 10 images in the dataset and load_batch_count is 3, the loader repeats 2 images as if there are 12 images available. 
void init(unsigned internal_shard_count, const std::string &source_path, const std::string &json_path, StorageType storage_type, DecoderType decoder_type, bool shuffle, bool loop, size_t load_batch_count, RocalMemType mem_type, std::shared_ptr meta_data_reader, - FloatParam *area_factor, FloatParam *aspect_ratio, FloatParam *x_drift, FloatParam *y_drift); + unsigned num_attempts, std::vector &random_area, std::vector &random_aspect_ratio); std::shared_ptr get_loader_module(); protected: @@ -51,12 +51,6 @@ class FusedJpegCropNode: public Node void update_node() override {}; private: std::shared_ptr _loader_module = nullptr; - Parameter* _x_drift; - Parameter* _y_drift; - Parameter* _area_factor; - Parameter* _aspect_ratio; - constexpr static float X_DRIFT_RANGE [2] = {0, 1}; - constexpr static float Y_DRIFT_RANGE [2] = {0, 1}; - constexpr static float AREA_FACTOR_RANGE[2] = {0.08, 0.99}; - constexpr static float ASPECT_RATIO_RANGE[2] = {0.75, 1.33}; + std::vector _random_area, _random_aspect_ratio; + unsigned _num_attempts; }; diff --git a/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop_single_shard.h b/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop_single_shard.h index c2f90c2f80..766db74da5 100644 --- a/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop_single_shard.h +++ b/rocAL/rocAL/include/loaders/image/node_fused_jpeg_crop_single_shard.h @@ -39,9 +39,8 @@ class FusedJpegCropSingleShardNode: public Node /// The loader will repeat images if necessary to be able to have images in multiples of the load_batch_count, /// for example if there are 10 images in the dataset and load_batch_count is 3, the loader repeats 2 images as if there are 12 images available. 
void init(unsigned shard_id, unsigned shard_count, const std::string &source_path, const std::string &json_path, StorageType storage_type, - DecoderType decoder_type, bool shuffle, bool loop, size_t load_batch_count, RocalMemType mem_type, std::shared_ptr meta_data_reader, - FloatParam *area_factor, FloatParam *aspect_ratio, FloatParam *x_drift, FloatParam *y_drift); - + DecoderType decoder_type, bool shuffle, bool loop, size_t load_batch_count, RocalMemType mem_type, std::shared_ptr meta_data_reader, + unsigned num_attempts, std::vector &random_area, std::vector &random_aspect_ratio); std::shared_ptr get_loader_module(); protected: @@ -49,12 +48,6 @@ class FusedJpegCropSingleShardNode: public Node void update_node() override {}; private: std::shared_ptr _loader_module = nullptr; - Parameter* _x_drift; - Parameter* _y_drift; - Parameter* _area_factor; - Parameter* _aspect_ratio; - constexpr static float X_DRIFT_RANGE [2] = {0, 1}; - constexpr static float Y_DRIFT_RANGE [2] = {0, 1}; - constexpr static float AREA_FACTOR_RANGE[2] = {0.08, 0.99}; - constexpr static float ASPECT_RATIO_RANGE[2] = {0.75, 1.33}; + std::vector _random_area, _random_aspect_ratio; + unsigned _num_attempts; }; diff --git a/rocAL/rocAL/include/parameters/parameter_crop.h b/rocAL/rocAL/include/parameters/parameter_crop.h index 3b4591bab8..058184fee4 100644 --- a/rocAL/rocAL/include/parameters/parameter_crop.h +++ b/rocAL/rocAL/include/parameters/parameter_crop.h @@ -45,7 +45,7 @@ class CropParam // V Y directoin public: CropParam() = delete; - CropParam(unsigned int batch_size): batch_size(batch_size), _random(false) + CropParam(unsigned int batch_size): batch_size(batch_size), _random(false), _is_center_crop(false) { x_drift_factor = default_x_drift_factor(); y_drift_factor = default_y_drift_factor(); @@ -58,6 +58,7 @@ class CropParam in_height = in_height_; } void set_random() {_random = true;} + void set_center_crop() { _is_center_crop = true; } void set_x_drift_factor(Parameter* x_drift); 
void set_y_drift_factor(Parameter* y_drift); std::vector in_width, in_height; @@ -82,7 +83,7 @@ class CropParam Parameter* default_x_drift_factor(); Parameter* default_y_drift_factor(); std::vector x1_arr_val, y1_arr_val, croph_arr_val, cropw_arr_val, x2_arr_val, y2_arr_val; - bool _random; + bool _random, _is_center_crop; virtual void fill_crop_dims(){}; void update_crop_array(); }; diff --git a/rocAL/rocAL/include/parameters/parameter_factory.h b/rocAL/rocAL/include/parameters/parameter_factory.h index 8090842edf..ec6ed2a695 100644 --- a/rocAL/rocAL/include/parameters/parameter_factory.h +++ b/rocAL/rocAL/include/parameters/parameter_factory.h @@ -77,6 +77,7 @@ class ParameterFactory { void renew_parameters(); void set_seed(unsigned seed); unsigned get_seed(); + void generate_seed(); template Parameter* create_uniform_rand_param(T start, T end){ diff --git a/rocAL/rocAL/include/parameters/parameter_random_crop_decoder.h b/rocAL/rocAL/include/parameters/parameter_random_crop_decoder.h new file mode 100644 index 0000000000..29e87fb861 --- /dev/null +++ b/rocAL/rocAL/include/parameters/parameter_random_crop_decoder.h @@ -0,0 +1,63 @@ +/* +Copyright (c) 2019 - 2022 Advanced Micro Devices, Inc. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. +*/ + + +#pragma once +#include "parameter_factory.h" +#include +#include + +struct CropWindow { + unsigned x, y, H, W; + CropWindow() {} + CropWindow(unsigned x1, unsigned y1, unsigned h, unsigned w) { x = x1, y = y1, H = h, W = w ; } + void set_shape(unsigned h, unsigned w) { H = h, W = w; } +}; + +typedef std::vector Shape; +using AspectRatioRange = std::pair; +using AreaRange = std::pair; + +class RocalRandomCropDecParam { + public: + explicit RocalRandomCropDecParam( + AspectRatioRange aspect_ratio_range = { 3.0f / 4, 4.0f / 3 }, + AreaRange area_range = { 0.08, 1 }, + int64_t seed = time(0), + int num_attempts = 10, + int batch_size = 256); + CropWindow generate_crop_window(const Shape& shape, const int instance); + void generate_random_seeds(); + private: + CropWindow generate_crop_window_implementation(const Shape& shape); + AspectRatioRange _aspect_ratio_range; + // Aspect ratios are uniformly distributed on logarithmic scale. + // This provides natural symmetry and smoothness of the distribution. 
+ std::uniform_real_distribution _aspect_ratio_log_dis; + std::uniform_real_distribution _area_dis; + // thread_local is needed to call it from multiple threads async, so each thread will have its own copy + static thread_local std::mt19937 _rand_gen; + int64_t _seed; + std::vector _seeds; + int _num_attempts; + int _batch_size; +}; diff --git a/rocAL/rocAL/include/pipeline/image.h b/rocAL/rocAL/include/pipeline/image.h index 8dba62c175..ae919db4de 100644 --- a/rocAL/rocAL/include/pipeline/image.h +++ b/rocAL/rocAL/include/pipeline/image.h @@ -104,7 +104,7 @@ struct ImageInfo Type type() const { return _type; } unsigned batch_size() const {return _batch_size;} RocalMemType mem_type() const { return _mem_type; } - unsigned data_size() const { return _data_size; } + uint64_t data_size() const { return _data_size; } RocalColorFormat color_format() const {return _color_fmt; } unsigned get_roi_width(int image_batch_idx) const; unsigned get_roi_height(int image_batch_idx) const; @@ -118,7 +118,7 @@ struct ImageInfo unsigned _height;//!< image height for a single image in the batch unsigned _color_planes;//!< number of color planes unsigned _batch_size;//!< the batch size (images in the batch are stacked on top of each other) - unsigned _data_size;//!< total size of the memory needed to keep the image's data in bytes including all planes + uint64_t _data_size;//!< total size of the memory needed to keep the image's data in bytes including all planes RocalMemType _mem_type;//!< memory type, currently either OpenCL or Host RocalColorFormat _color_fmt;//!< color format of the image std::shared_ptr> _roi_width;//!< The actual image width stored in the buffer, it's always smaller than _width/_batch_size. It's created as a vector of pointers to integers, so that if it's passed from one image to another and get updated by one and observed for all. 
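Review note on parameter_random_crop_decoder.h: the header declares `RocalRandomCropDecParam` with an area range, an aspect-ratio range sampled uniformly on a logarithmic scale, and a bounded number of attempts, but this part of the patch does not show how a window is generated. The block below is a minimal sketch of one plausible sampling loop under those constraints, not the implementation in this PR. `CropWindowSketch`, `sample_crop_window`, and the centered fallback are hypothetical and exist only for illustration; the sketch also assumes the aspect ratio means width divided by height.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <utility>

// Hypothetical stand-in for the patch's CropWindow (same field names, no methods).
struct CropWindowSketch {
    unsigned x, y, H, W;
};

// Sketch only: draws the crop area uniformly from area_range (as a fraction of the
// image area) and the aspect ratio (assumed W / H) log-uniformly from ratio_range.
// After num_attempts failed draws it falls back to a centered crop.
inline CropWindowSketch sample_crop_window(unsigned img_h, unsigned img_w,
                                           std::pair<float, float> area_range,
                                           std::pair<float, float> ratio_range,
                                           int num_attempts, std::mt19937 &gen) {
    assert(img_h > 0 && img_w > 0);
    assert(area_range.first > 0 && area_range.first <= area_range.second && area_range.second <= 1);
    assert(ratio_range.first > 0 && ratio_range.first <= ratio_range.second);

    std::uniform_real_distribution<float> area_dis(area_range.first, area_range.second);
    // Sampling log(ratio) uniformly makes a ratio r and its inverse 1/r equally likely.
    std::uniform_real_distribution<float> log_ratio_dis(std::log(ratio_range.first),
                                                        std::log(ratio_range.second));

    for (int i = 0; i < num_attempts; ++i) {
        const float ratio = std::exp(log_ratio_dis(gen));
        const float target_area = area_dis(gen) * static_cast<float>(img_h) * static_cast<float>(img_w);
        const long w = std::lround(std::sqrt(target_area * ratio));
        const long h = std::lround(std::sqrt(target_area / ratio));
        // Reject windows that are empty or do not fit inside the image.
        if (w < 1 || h < 1 || w > static_cast<long>(img_w) || h > static_cast<long>(img_h))
            continue;
        std::uniform_int_distribution<unsigned> x_dis(0, img_w - static_cast<unsigned>(w));
        std::uniform_int_distribution<unsigned> y_dis(0, img_h - static_cast<unsigned>(h));
        const unsigned x = x_dis(gen);
        const unsigned y = y_dis(gen);
        return CropWindowSketch{x, y, static_cast<unsigned>(h), static_cast<unsigned>(w)};
    }

    // Fallback (an assumption of this sketch): the largest centered window whose
    // aspect ratio is the image's own ratio clamped into ratio_range.
    const float in_ratio = static_cast<float>(img_w) / static_cast<float>(img_h);
    const float ratio = std::min(std::max(in_ratio, ratio_range.first), ratio_range.second);
    unsigned w = img_w, h = img_h;
    if (in_ratio > ratio)
        w = static_cast<unsigned>(std::lround(static_cast<float>(img_h) * ratio));
    else if (in_ratio < ratio)
        h = static_cast<unsigned>(std::lround(static_cast<float>(img_w) / ratio));
    w = std::max(1u, std::min(w, img_w));
    h = std::max(1u, std::min(h, img_h));
    return CropWindowSketch{(img_w - w) / 2, (img_h - h) / 2, h, w};
}
```

A caller would seed one `std::mt19937` per worker and call `sample_crop_window(height, width, {0.08f, 1.0f}, {0.75f, 4.0f / 3}, 10, gen)` once per image. Keeping the generator outside the function is a choice made for this sketch so that results are reproducible from a seed; the header in the patch uses a `static thread_local` generator instead.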
diff --git a/rocAL/rocAL/source/api/rocal_api_data_loaders.cpp b/rocAL/rocAL/source/api/rocal_api_data_loaders.cpp index 176501249c..142a0d9641 100644 --- a/rocAL/rocAL/source/api/rocal_api_data_loaders.cpp +++ b/rocAL/rocAL/source/api/rocal_api_data_loaders.cpp @@ -771,21 +771,16 @@ rocalJpegCaffeLMDBRecordSourcePartialSingleShard( unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor ) + unsigned max_height) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -830,7 +825,7 @@ rocalJpegCaffeLMDBRecordSourcePartialSingleShard( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); @@ -858,21 +853,16 @@ rocalJpegCaffe2LMDBRecordSourcePartialSingleShard( unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor ) + unsigned max_height) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto 
x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -917,7 +907,7 @@ rocalJpegCaffe2LMDBRecordSourcePartialSingleShard( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); if(is_output) @@ -943,21 +933,16 @@ rocalMXNetRecordSource( unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor ) + unsigned max_height) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -1002,7 +987,7 @@ rocalMXNetRecordSource( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); @@ -1359,22 +1344,17 @@ rocalFusedJpegCrop( RocalImageColor rocal_color_format, unsigned internal_shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor + 
unsigned max_height ) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -1412,7 +1392,7 @@ rocalFusedJpegCrop( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); if(is_output) @@ -1438,21 +1418,16 @@ rocalJpegCOCOFileSourcePartial( RocalImageColor rocal_color_format, unsigned internal_shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor ) + unsigned max_height) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -1494,7 +1469,7 @@ rocalJpegCOCOFileSourcePartial( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); @@ -1523,21 +1498,16 @@ rocalJpegCOCOFileSourcePartialSingleShard( unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned 
max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor ) + unsigned max_height) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -1580,7 +1550,7 @@ rocalJpegCOCOFileSourcePartialSingleShard( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); @@ -1913,22 +1883,16 @@ rocalFusedJpegCropSingleShard( unsigned shard_id, unsigned shard_count, bool is_output, + std::vector& area_factor, + std::vector& aspect_ratio, + unsigned num_attempts, bool shuffle, bool loop, RocalImageSizeEvaluationPolicy decode_size_policy, unsigned max_width, - unsigned max_height, - RocalFloatParam p_area_factor, - RocalFloatParam p_aspect_ratio, - RocalFloatParam p_x_drift_factor, - RocalFloatParam p_y_drift_factor - ) + unsigned max_height) { Image* output = nullptr; - auto area_factor = static_cast(p_area_factor); - auto aspect_ratio = static_cast(p_aspect_ratio); - auto x_drift_factor = static_cast(p_x_drift_factor); - auto y_drift_factor = static_cast(p_y_drift_factor); auto context = static_cast(p_context); try { @@ -1969,7 +1933,7 @@ rocalFusedJpegCropSingleShard( context->user_batch_size(), context->master_graph->mem_type(), context->master_graph->meta_data_reader(), - area_factor, aspect_ratio, x_drift_factor, y_drift_factor); + num_attempts, area_factor, aspect_ratio); context->master_graph->set_loop(loop); if(is_output) diff --git a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp 
b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp index 6185ca09c3..1e987f2a82 100644 --- a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp +++ b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp @@ -73,16 +73,14 @@ void CropNode::init(unsigned int crop_h, unsigned int crop_w, float x_drift_, fl _crop_param->set_y_drift_factor(core(y_drift)); } +// This init is used only for centre crop void CropNode::init(unsigned int crop_h, unsigned int crop_w) { _crop_param->crop_w = crop_w; _crop_param->crop_h = crop_h; - _crop_param->x1 = 0; - _crop_param->y1 = 0; - FloatParam *x_drift = ParameterFactory::instance()->create_single_value_float_param(0.5); - FloatParam *y_drift = ParameterFactory::instance()->create_single_value_float_param(0.5); - _crop_param->set_x_drift_factor(core(x_drift)); - _crop_param->set_y_drift_factor(core(y_drift)); + _crop_param->x1 = 0; + _crop_param->y1 = 0; + _crop_param->set_center_crop(); } diff --git a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp index cdf63ddd5a..6398893438 100644 --- a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp +++ b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp @@ -79,9 +79,11 @@ void CropMirrorNormalizeNode::update_node() void CropMirrorNormalizeNode::init(int crop_h, int crop_w, float start_x, float start_y, float mean, float std_dev, IntParam *mirror) { + _crop_param->x1 = start_x; + _crop_param->y1 = start_y; _crop_param->crop_h = crop_h; _crop_param->crop_w = crop_w; - _mean = mean; + _mean = mean; _std_dev = std_dev; _mirror.set_param(core(mirror)); } diff --git a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_resize.cpp b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_resize.cpp index 078087b6df..6dd57ba51e 100644 
--- a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_resize.cpp +++ b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_resize.cpp @@ -55,7 +55,7 @@ void ResizeNode::create_node() { #endif vx_status status; if((status = vxGetStatus((vx_reference)_node)) != VX_SUCCESS) - THROW("Adding the resize (vxExtrppNode_ResizebatchPD) node failed: "+ TOSTR(status)) + THROW("Adding the resize (vxExtrppNode_Resizetensor) node failed: "+ TOSTR(status)) } void ResizeNode::update_node() { @@ -128,10 +128,11 @@ void ResizeNode::adjust_out_roi_size() { } if (has_max_size) { - if (_max_width != 0) scale = std::min(scale, static_cast(_max_width) / _src_width); - if (_max_height != 0) scale = std::min(scale, static_cast(_max_height) / _src_height); + if (_max_width) scale = std::min(scale, static_cast(_max_width) / _src_width); + if (_max_height) scale = std::min(scale, static_cast(_max_height) / _src_height); } if ((scale_h != scale) || (!_dst_height)) _dst_height = std::lround(_src_height * scale); + if ((scale_w != scale) || (!_dst_width)) _dst_width = std::lround(_src_width * scale); } } diff --git a/rocAL/rocAL/source/decoders/image/fused_crop_decoder.cpp b/rocAL/rocAL/source/decoders/image/fused_crop_decoder.cpp index bf05032c3d..e179bed494 100644 --- a/rocAL/rocAL/source/decoders/image/fused_crop_decoder.cpp +++ b/rocAL/rocAL/source/decoders/image/fused_crop_decoder.cpp @@ -27,21 +27,9 @@ THE SOFTWARE. 
FusedCropTJDecoder::FusedCropTJDecoder(){ m_jpegDecompressor = tjInitDecompress(); - -#if 0 - int num_avail_scalings = 0; - auto scaling_factors = tjGetScalingFactors (&num_avail_scalings); - for(int i = 0; i < num_avail_scalings; i++) { - if(scaling_factors[i].num < scaling_factors[i].denom) { - - printf("%d / %d - ",scaling_factors[i].num, scaling_factors[i].denom ); - } - } -#endif }; -Decoder::Status FusedCropTJDecoder::decode_info(unsigned char* input_buffer, size_t input_size, int* width, int* height, int* color_comps) -{ +Decoder::Status FusedCropTJDecoder::decode_info(unsigned char* input_buffer, size_t input_size, int* width, int* height, int* color_comps) { //TODO : Use the most recent TurboJpeg API tjDecompressHeader3 which returns the color components if(tjDecompressHeader2(m_jpegDecompressor, input_buffer, @@ -60,8 +48,7 @@ Decoder::Status FusedCropTJDecoder::decode(unsigned char *input_buffer, size_t i size_t max_decoded_width, size_t max_decoded_height, size_t original_image_width, size_t original_image_height, size_t &actual_decoded_width, size_t &actual_decoded_height, - Decoder::ColorFormat desired_decoded_color_format, DecoderConfig decoder_config, bool keep_original_size) -{ + Decoder::ColorFormat desired_decoded_color_format, DecoderConfig decoder_config, bool keep_original_size) { int tjpf = TJPF_RGB; int planes = 1; switch (desired_decoded_color_format) { @@ -83,64 +70,17 @@ Decoder::Status FusedCropTJDecoder::decode(unsigned char *input_buffer, size_t i // You need get the output of random bbox crop // check the vector size for bounding box. 
If its more than zero go for random bbox crop // else go to random crop - unsigned int crop_width, crop_height, x1, y1, x1_diff, crop_width_diff; - if(_bbox_coord.size() != 0) - { + unsigned int x1_diff, crop_width_diff; + if (_bbox_coord.size() != 0) { // Random bbox crop returns normalized crop cordinates - // hence bringing it back to absolute cordinates - x1 = std::lround(_bbox_coord[0] * original_image_width); - y1 = std::lround(_bbox_coord[1] * original_image_height); - crop_width = std::lround((_bbox_coord[2]) * original_image_width); - crop_height = std::lround((_bbox_coord[3]) * original_image_height); + // hence bringing it back to absolute cordinates + _crop_window.x = std::lround(_bbox_coord[0] * original_image_width); + _crop_window.y = std::lround(_bbox_coord[1] * original_image_height); + _crop_window.W = std::lround((_bbox_coord[2]) * original_image_width); + _crop_window.H = std::lround((_bbox_coord[3]) * original_image_height); } - else - { - std::vector crop_mul_param = decoder_config.get_crop_param(); - auto is_valid_crop = [](uint h, uint w, uint height, uint width) - { - return (h < height && w < width); - }; - bool bvalid_crop = false; - int num_of_attempts = 5; - for(int i = 0; i < num_of_attempts; i++) - { - double target_area = crop_mul_param[0] * original_image_width * original_image_height; - crop_width = static_cast(std::sqrt(target_area * crop_mul_param[1])); - crop_height = static_cast(std::sqrt(target_area * (1 / crop_mul_param[1]))); - if(is_valid_crop(crop_height, crop_width, original_image_height, original_image_width)) - { - x1 = static_cast(crop_mul_param[2] * (original_image_width - crop_width)); - y1 = static_cast(crop_mul_param[3] * (original_image_height - crop_height)); - bvalid_crop = true; - break ; - } - } - constexpr static float ASPECT_RATIO_RANGE[2] = {0.75, 1.33}; - // Fallback on Central Crop - if(!bvalid_crop){ - float in_ratio; - in_ratio = static_cast(original_image_width) / original_image_height; - if(in_ratio < 
ASPECT_RATIO_RANGE[0]) - { - crop_width = original_image_width; - crop_height = crop_width / ASPECT_RATIO_RANGE[0]; - } - else if(in_ratio > ASPECT_RATIO_RANGE[1]) - { - crop_height = original_image_height; - crop_width = crop_height * ASPECT_RATIO_RANGE[1]; - } - else - { - crop_height = original_image_height; - crop_width = original_image_width; - } - x1 = (original_image_width - crop_width) / 2; - y1 = (original_image_height - crop_height) / 2; - } - } - - // std::cout<<"Fused Crop Decoder : " << x1 << " " << y1 << " " << crop_width << " " << crop_height << std::endl; + _crop_window.W = std::min(_crop_window.W, (unsigned int)max_decoded_width); + _crop_window.H = std::min(_crop_window.H, (unsigned int)max_decoded_height); //TODO : Turbo Jpeg supports multiple color packing and color formats, add more as an option to the API TJPF_RGB, TJPF_BGR, TJPF_RGBX, TJPF_BGRX, TJPF_RGBA, TJPF_GRAY, TJPF_CMYK , ... if( tjDecompress2_partial(m_jpegDecompressor, input_buffer, @@ -151,32 +91,27 @@ Decoder::Status FusedCropTJDecoder::decode(unsigned char *input_buffer, size_t i max_decoded_height, tjpf, TJFLAG_FASTDCT, &x1_diff, &crop_width_diff, - x1, y1, crop_width, crop_height) != 0) - - { + _crop_window.x, _crop_window.y, _crop_window.W, _crop_window.H) != 0) { WRN("Jpeg image decode failed " + STR(tjGetErrorStr2(m_jpegDecompressor))) return Status::CONTENT_DECODE_FAILED; } - if (x1 != x1_diff) { - //std::cout << "x_off changed by tjpeg decoder " << x1 << " " << x1_diff << std::endl; + // x1-diff should be set to x offset in tensor pipeline and removed. 
+ if (_crop_window.x != x1_diff) { unsigned char *src_ptr_temp, *dst_ptr_temp; unsigned int elements_in_row = max_decoded_width * planes; - unsigned int elements_in_crop_row = crop_width * planes; - //unsigned int remainingElements = elements_in_row - elements_in_crop_row; - unsigned int xoffs = (x1-x1_diff) * planes; // in case x1 gets adjusted by tjpeg decoder - src_ptr_temp = output_buffer + xoffs; + unsigned int elements_in_crop_row = _crop_window.W * planes; + unsigned int xoffs = (_crop_window.x - x1_diff) * planes; // in case _crop_window.x gets adjusted by tjpeg decoder + src_ptr_temp = output_buffer; dst_ptr_temp = output_buffer; - for (unsigned int i = 0; i < crop_height; i++) - { + for (unsigned int i = 0; i < _crop_window.H; i++) { memcpy(dst_ptr_temp, src_ptr_temp + xoffs, elements_in_crop_row * sizeof(unsigned char)); - //memset(dst_ptr_temp + elements_in_crop_row, 0, remainingElements * sizeof(unsigned char)); src_ptr_temp += elements_in_row; dst_ptr_temp += elements_in_row; } } - actual_decoded_width = crop_width; - actual_decoded_height = crop_height; + actual_decoded_width = _crop_window.W; + actual_decoded_height = _crop_window.H; return Status::OK; } diff --git a/rocAL/rocAL/source/loaders/image/image_read_and_decode.cpp b/rocAL/rocAL/source/loaders/image/image_read_and_decode.cpp index fc43571e0b..72a93b6a87 100644 --- a/rocAL/rocAL/source/loaders/image/image_read_and_decode.cpp +++ b/rocAL/rocAL/source/loaders/image/image_read_and_decode.cpp @@ -83,6 +83,14 @@ ImageReadAndDecode::create(ReaderConfig reader_config, DecoderConfig decoder_con _original_height.resize(_batch_size); _original_width.resize(_batch_size); _decoder_config = decoder_config; + _random_crop_dec_param = nullptr; + if (_decoder_config._type == DecoderType::FUSED_TURBO_JPEG) { + auto random_aspect_ratio = decoder_config.get_random_aspect_ratio(); + auto random_area = decoder_config.get_random_area(); + AspectRatioRange aspect_ratio_range = 
std::make_pair((float)random_aspect_ratio[0], (float)random_aspect_ratio[1]); + AreaRange area_range = std::make_pair((float)random_area[0], (float)random_area[1]); + _random_crop_dec_param = new RocalRandomCropDecParam(aspect_ratio_range, area_range, (int64_t)decoder_config.get_seed(), decoder_config.get_num_attempts(), _batch_size); + } if ((_decoder_config._type != DecoderType::SKIP_DECODE)) { for (int i = 0; i < batch_size; i++) { _compressed_buff[i].resize(MAX_COMPRESSED_SIZE); // If we don't need MAX_COMPRESSED_SIZE we can remove this & resize in load module @@ -179,8 +187,7 @@ ImageReadAndDecode::load(unsigned char* buff, } //_file_load_time.end();// Debug timing //return LoaderModuleStatus::OK; - } - else { + } else { while ((file_counter != _batch_size) && _reader->count_items() > 0) { size_t fsize = _reader->open(); if (fsize == 0) { @@ -194,12 +201,12 @@ ImageReadAndDecode::load(unsigned char* buff, _compressed_image_size[file_counter] = fsize; file_counter++; } - - if (_randombboxcrop_meta_data_reader) - { + if (_randombboxcrop_meta_data_reader) { //Fetch the crop co-ordinates for a batch of images _bbox_coords = _randombboxcrop_meta_data_reader->get_batch_crop_coords(_image_names); set_batch_random_bbox_crop_coords(_bbox_coords); + } else if (_random_crop_dec_param) { + _random_crop_dec_param->generate_random_seeds(); } } @@ -225,9 +232,14 @@ ImageReadAndDecode::load(unsigned char* buff, _original_width[i] = original_width; // decode the image and get the actual decoded image width and height size_t scaledw, scaledh; - if(_decoder[i]->is_partial_decoder() && _randombboxcrop_meta_data_reader) - { - _decoder[i]->set_bbox_coords(_bbox_coords[i]); + if (_decoder[i]->is_partial_decoder()) { + if (_randombboxcrop_meta_data_reader) { + _decoder[i]->set_bbox_coords(_bbox_coords[i]); + } else if (_random_crop_dec_param) { + Shape dec_shape = {_original_height[i], _original_width[i]}; + auto crop_window = _random_crop_dec_param->generate_crop_window(dec_shape, 
i); + _decoder[i]->set_crop_window(crop_window); + } } if (_decoder[i]->decode(_compressed_buff[i].data(), _compressed_image_size[i], _decompressed_buff_ptrs[i], max_decoded_width, max_decoded_height, diff --git a/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop.cpp b/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop.cpp index cc052013cf..e483384269 100644 --- a/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop.cpp +++ b/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop.cpp @@ -31,7 +31,7 @@ FusedJpegCropNode::FusedJpegCropNode(Image *output, void *device_resources): void FusedJpegCropNode::init(unsigned internal_shard_count, const std::string &source_path, const std::string &json_path, StorageType storage_type, DecoderType decoder_type, bool shuffle, bool loop, size_t load_batch_count, RocalMemType mem_type, std::shared_ptr meta_data_reader, - FloatParam *area_factor, FloatParam *aspect_ratio, FloatParam *x_drift, FloatParam *y_drift) + unsigned num_attempts, std::vector &random_area, std::vector &random_aspect_ratio) { if(!_loader_module) THROW("ERROR: loader module is not set for FusedJpegCropNode, cannot initialize") @@ -45,16 +45,10 @@ void FusedJpegCropNode::init(unsigned internal_shard_count, const std::string &s reader_cfg.set_meta_data_reader(meta_data_reader); auto decoder_cfg = DecoderConfig(decoder_type); - std::vector*> crop_param; - _area_factor = ParameterFactory::instance()->create_uniform_float_rand_param(AREA_FACTOR_RANGE[0], AREA_FACTOR_RANGE[1])->core; - _aspect_ratio = ParameterFactory::instance()->create_uniform_float_rand_param(ASPECT_RATIO_RANGE[0], ASPECT_RATIO_RANGE[1])->core; - _x_drift = ParameterFactory::instance()->create_uniform_float_rand_param(X_DRIFT_RANGE[0], X_DRIFT_RANGE[1])->core; - _y_drift = ParameterFactory::instance()->create_uniform_float_rand_param(Y_DRIFT_RANGE[0], Y_DRIFT_RANGE[1])->core; - crop_param.push_back(_area_factor); - crop_param.push_back(_aspect_ratio); - crop_param.push_back(_x_drift); - 
crop_param.push_back(_y_drift); - decoder_cfg.set_crop_param(crop_param); + decoder_cfg.set_random_area(random_area); + decoder_cfg.set_random_aspect_ratio(random_aspect_ratio); + decoder_cfg.set_num_attempts(num_attempts); + decoder_cfg.set_seed(ParameterFactory::instance()->get_seed()); _loader_module->initialize(reader_cfg, decoder_cfg, mem_type, _batch_size); diff --git a/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop_single_shard.cpp b/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop_single_shard.cpp index 5e210dcc13..42501c0c01 100644 --- a/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop_single_shard.cpp +++ b/rocAL/rocAL/source/loaders/image/node_fused_jpeg_crop_single_shard.cpp @@ -31,8 +31,8 @@ FusedJpegCropSingleShardNode::FusedJpegCropSingleShardNode(Image *output, void * } void FusedJpegCropSingleShardNode::init(unsigned shard_id, unsigned shard_count, const std::string &source_path, const std::string &json_path, StorageType storage_type, - DecoderType decoder_type, bool shuffle, bool loop, size_t load_batch_count, RocalMemType mem_type, std::shared_ptr meta_data_reader, - FloatParam *area_factor, FloatParam *aspect_ratio, FloatParam *x_drift, FloatParam *y_drift) + DecoderType decoder_type, bool shuffle, bool loop, size_t load_batch_count, RocalMemType mem_type, std::shared_ptr meta_data_reader, + unsigned num_attempts, std::vector &area_factor, std::vector &aspect_ratio) { if(!_loader_module) THROW("ERROR: loader module is not set for FusedJpegCropSingleShardNode, cannot initialize") @@ -50,16 +50,10 @@ void FusedJpegCropSingleShardNode::init(unsigned shard_id, unsigned shard_count, auto decoder_cfg = DecoderConfig(decoder_type); - std::vector*> crop_param; - _area_factor = ParameterFactory::instance()->create_uniform_float_rand_param(AREA_FACTOR_RANGE[0], AREA_FACTOR_RANGE[1])->core; - _aspect_ratio = ParameterFactory::instance()->create_uniform_float_rand_param(ASPECT_RATIO_RANGE[0], ASPECT_RATIO_RANGE[1])->core; - _x_drift = 
ParameterFactory::instance()->create_uniform_float_rand_param(X_DRIFT_RANGE[0], X_DRIFT_RANGE[1])->core; - _y_drift = ParameterFactory::instance()->create_uniform_float_rand_param(Y_DRIFT_RANGE[0], Y_DRIFT_RANGE[1])->core; - crop_param.push_back(_area_factor); - crop_param.push_back(_aspect_ratio); - crop_param.push_back(_x_drift); - crop_param.push_back(_y_drift); - decoder_cfg.set_crop_param(crop_param); + decoder_cfg.set_random_area(area_factor); + decoder_cfg.set_random_aspect_ratio(aspect_ratio); + decoder_cfg.set_num_attempts(num_attempts); + decoder_cfg.set_seed(ParameterFactory::instance()->get_seed()); _loader_module->initialize(reader_cfg, decoder_cfg, mem_type, _batch_size); diff --git a/rocAL/rocAL/source/parameters/parameter_factory.cpp b/rocAL/rocAL/source/parameters/parameter_factory.cpp index 44a84911d1..1e6c3a67e6 100644 --- a/rocAL/rocAL/source/parameters/parameter_factory.cpp +++ b/rocAL/rocAL/source/parameters/parameter_factory.cpp @@ -72,10 +72,8 @@ bool validate_uniform_rand_param(pParam rand_obj) ParameterFactory::ParameterFactory() { - std::random_device rd; - _seed = rd(); + generate_seed(); } - ParameterFactory* ParameterFactory::instance() { if(_instance == nullptr)// For performance reasons @@ -116,6 +114,13 @@ ParameterFactory::get_seed() return _seed; } +void +ParameterFactory::generate_seed() +{ + std::random_device rd; + _seed = rd(); +} + void ParameterFactory::set_seed(unsigned seed) { diff --git a/rocAL/rocAL/source/parameters/parameter_random_crop_decoder.cpp b/rocAL/rocAL/source/parameters/parameter_random_crop_decoder.cpp new file mode 100644 index 0000000000..f6d0462940 --- /dev/null +++ b/rocAL/rocAL/source/parameters/parameter_random_crop_decoder.cpp @@ -0,0 +1,112 @@ +/* +Copyright (c) 2019 - 2022 Advanced Micro Devices, Inc. All rights reserved. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. +*/ + + +#include "parameter_random_crop_decoder.h" +#include + +// Initializing the random generator so all objects of the class can share it. 
+thread_local std::mt19937 RocalRandomCropDecParam::_rand_gen(time(0));
+
+RocalRandomCropDecParam::RocalRandomCropDecParam(
+    AspectRatioRange aspect_ratio_range,
+    AreaRange area_range,
+    int64_t seed,
+    int num_attempts,
+    int batch_size)
+    : _aspect_ratio_range(aspect_ratio_range)
+    , _aspect_ratio_log_dis(std::log(aspect_ratio_range.first), std::log(aspect_ratio_range.second))
+    , _area_dis(area_range.first, area_range.second)
+    , _seed(seed)
+    , _num_attempts(num_attempts)
+    , _batch_size(batch_size) {
+    _seeds.resize(_batch_size);
+}
+
+
+CropWindow RocalRandomCropDecParam::generate_crop_window_implementation(const Shape& shape) {
+    assert(shape.size() == 2);
+    CropWindow crop;
+    int H = shape[0], W = shape[1];
+    if (W <= 0 || H <= 0) {
+        return crop;
+    }
+    float min_wh_ratio = _aspect_ratio_range.first;
+    float max_wh_ratio = _aspect_ratio_range.second;
+    float max_hw_ratio = 1 / _aspect_ratio_range.first;
+    float min_area = W * H * _area_dis.a();
+    int maxW = std::max<int>(1, H * max_wh_ratio);
+    int maxH = std::max<int>(1, W * max_hw_ratio);
+    // detect two impossible cases early
+    if (H * maxW < min_area) { // image too wide
+        crop.set_shape(H, maxW);
+    } else if (W * maxH < min_area) { // image too tall
+        crop.set_shape(maxH, W);
+    } else { // it can still fail for very small images when size granularity matters
+        int attempts_left = _num_attempts;
+        for (; attempts_left > 0; attempts_left--) {
+            float scale = _area_dis(_rand_gen);
+            size_t original_area = H * W;
+            float target_area = scale * original_area;
+            float ratio = std::exp(_aspect_ratio_log_dis(_rand_gen));
+            auto w = static_cast<int>(
+                std::roundf(sqrtf(target_area * ratio)));
+            auto h = static_cast<int>(
+                std::roundf(sqrtf(target_area / ratio)));
+            w = std::max(1, w);
+            h = std::max(1, h);
+            crop.set_shape(h, w);
+            ratio = static_cast<float>(w) / h;
+            if (w <= W && h <= H && ratio >= min_wh_ratio && ratio <= max_wh_ratio)
+                break;
+        }
+        if (attempts_left <= 0) {
+            float max_area = _area_dis.b() * W * H;
+            float ratio = static_cast<float>(W) / H;
+            if (ratio > max_wh_ratio) {
+                crop.set_shape(H, maxW);
+            } else if (ratio < min_wh_ratio) {
+                crop.set_shape(maxH, W);
+            } else {
+                crop.set_shape(H, W);
+            }
+            float scale = std::min(1.0f, max_area / (crop.W * crop.H));
+            crop.W = std::max<int>(1, crop.W * std::sqrt(scale));
+            crop.H = std::max<int>(1, crop.H * std::sqrt(scale));
+        }
+    }
+    crop.x = std::uniform_int_distribution<int>(0, W - crop.W)(_rand_gen);
+    crop.y = std::uniform_int_distribution<int>(0, H - crop.H)(_rand_gen);
+    return crop;
+ }
+
+// seed the rng for the instance and return the random crop window.
+CropWindow RocalRandomCropDecParam::generate_crop_window(const Shape& shape, const int instance) {
+    _rand_gen.seed(_seeds[instance]);
+    return generate_crop_window_implementation(shape);
+}
+
+void RocalRandomCropDecParam::generate_random_seeds() {
+    ParameterFactory::instance()->generate_seed(); // Renew and regenerate
+    std::seed_seq seq{ParameterFactory::instance()->get_seed()};
+    seq.generate(_seeds.begin(), _seeds.end());
+}
diff --git a/rocAL/rocAL/source/parameters/parameter_rali_crop.cpp b/rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp
similarity index 58%
rename from rocAL/rocAL/source/parameters/parameter_rali_crop.cpp
rename to rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp
index 715c73601e..1efa52b06b 100644
--- a/rocAL/rocAL/source/parameters/parameter_rali_crop.cpp
+++ b/rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp
@@ -27,62 +27,59 @@ THE SOFTWARE.
#include "parameter_rocal_crop.h" #include "commons.h" -void RocalCropParam::set_crop_height_factor(Parameter* crop_h_factor) -{ +void RocalCropParam::set_crop_height_factor(Parameter* crop_h_factor) { if(!crop_h_factor) return ; ParameterFactory::instance()->destroy_param(crop_height_factor); crop_height_factor = crop_h_factor; } -void RocalCropParam::set_crop_width_factor(Parameter* crop_w_factor) -{ +void RocalCropParam::set_crop_width_factor(Parameter* crop_w_factor) { if(!crop_w_factor) return ; ParameterFactory::instance()->destroy_param(crop_width_factor); crop_width_factor = crop_w_factor; } -void RocalCropParam::update_array() -{ +void RocalCropParam::update_array() { fill_crop_dims(); update_crop_array(); } -void RocalCropParam::fill_crop_dims() -{ - for(uint img_idx =0; img_idx < batch_size; img_idx++) - { - if(!(_random)) - { +void RocalCropParam::fill_crop_dims() { + for(uint img_idx = 0; img_idx < batch_size; img_idx++) { + if (!(_random)) { // Evaluating user given crop - (crop_w > in_width[img_idx]) ? (cropw_arr_val[img_idx] = in_width[img_idx]) : (cropw_arr_val[img_idx] = crop_w); - (crop_h > in_height[img_idx]) ? (croph_arr_val[img_idx] = in_height[img_idx]) : (croph_arr_val[img_idx] = crop_h); - (x1 >= in_width[img_idx]) ? (x1_arr_val[img_idx] = 0) : (x1_arr_val[img_idx] = x1); - (y1 >= in_height[img_idx]) ? 
(y1_arr_val[img_idx] = 0) : (y1_arr_val[img_idx] = y1); - } - else - { - float crop_h_factor_, crop_w_factor_, x_drift, y_drift; - crop_height_factor->renew(); - crop_h_factor_ = crop_height_factor->get(); - crop_width_factor->renew(); - crop_w_factor_ = crop_width_factor->get(); - cropw_arr_val[img_idx] = static_cast (crop_w_factor_ * in_width[img_idx]); - croph_arr_val[img_idx] = static_cast (crop_h_factor_ * in_height[img_idx]); - x_drift_factor->renew(); - y_drift_factor->renew(); - y_drift_factor->renew(); - x_drift = x_drift_factor->get(); - y_drift = y_drift_factor->get(); - x1_arr_val[img_idx] = static_cast(x_drift * (in_width[img_idx] - cropw_arr_val[img_idx])); - y1_arr_val[img_idx] = static_cast(y_drift * (in_height[img_idx] - croph_arr_val[img_idx])); + cropw_arr_val[img_idx] = (crop_w <= in_width[img_idx] && crop_w > 0) ? crop_w : in_width[img_idx]; + croph_arr_val[img_idx] = (crop_h <= in_height[img_idx] && crop_h > 0) ? crop_h : in_height[img_idx]; + if (_is_center_crop) { + x1_arr_val[img_idx] = static_cast(0.5 * (in_width[img_idx] - cropw_arr_val[img_idx])); + y1_arr_val[img_idx] = static_cast(0.5 * (in_height[img_idx] - croph_arr_val[img_idx])); + } else { + x1_arr_val[img_idx] = (x1 >= in_width[img_idx]) ? 0 : x1; + y1_arr_val[img_idx] = (y1 >= in_height[img_idx]) ? 
0 : y1; + } + } else { + float crop_h_factor_, crop_w_factor_, x_drift, y_drift; + crop_height_factor->renew(); + crop_h_factor_ = crop_height_factor->get(); + crop_width_factor->renew(); + crop_w_factor_ = crop_width_factor->get(); + cropw_arr_val[img_idx] = static_cast (crop_w_factor_ * in_width[img_idx]); + croph_arr_val[img_idx] = static_cast (crop_h_factor_ * in_height[img_idx]); + x_drift_factor->renew(); + y_drift_factor->renew(); + y_drift_factor->renew(); + x_drift = x_drift_factor->get(); + y_drift = y_drift_factor->get(); + x1_arr_val[img_idx] = static_cast(x_drift * (in_width[img_idx] - cropw_arr_val[img_idx])); + y1_arr_val[img_idx] = static_cast(y_drift * (in_height[img_idx] - croph_arr_val[img_idx])); } x2_arr_val[img_idx] = x1_arr_val[img_idx] + cropw_arr_val[img_idx]; y2_arr_val[img_idx] = y1_arr_val[img_idx] + croph_arr_val[img_idx]; // Evaluating the crop - (x2_arr_val[img_idx] > in_width[img_idx]) ? x2_arr_val[img_idx] = in_width[img_idx] : x2_arr_val[img_idx] = x2_arr_val[img_idx]; - (y2_arr_val[img_idx] > in_height[img_idx]) ? 
y2_arr_val[img_idx] = in_height[img_idx] : y2_arr_val[img_idx] = y2_arr_val[img_idx]; + x2_arr_val[img_idx] = std::min(x2_arr_val[img_idx], in_width[img_idx]); + y2_arr_val[img_idx] = std::min(y2_arr_val[img_idx], in_height[img_idx]); } } diff --git a/rocAL/rocAL/source/pipeline/image.cpp b/rocAL/rocAL/source/pipeline/image.cpp index b0a673825d..691ca021fd 100644 --- a/rocAL/rocAL/source/pipeline/image.cpp +++ b/rocAL/rocAL/source/pipeline/image.cpp @@ -132,7 +132,7 @@ ImageInfo::ImageInfo( _height(height_), _color_planes(planes), _batch_size(batches), - _data_size(width_ * height_ * _batch_size * planes), + _data_size((static_cast(width_ * height_ * _batch_size * planes))), _mem_type(mem_type_), _color_fmt(col_fmt_) { @@ -164,7 +164,7 @@ void Image::update_image_roi(const std::vector &width, const std::vect if(height[i] > _info.height_single()) { - ERR("Given ROI height is larger than buffer with for image[" + TOSTR(i) + "] " + TOSTR(height[i]) +" > " + TOSTR(_info.height_single())) + ERR("Given ROI height is larger than buffer height for image[" + TOSTR(i) + "] " + TOSTR(height[i]) +" > " + TOSTR(_info.height_single())) _info._roi_height->at(i) = _info.height_single(); } else diff --git a/rocAL/rocAL/source/pipeline/master_graph.cpp b/rocAL/rocAL/source/pipeline/master_graph.cpp index 0e0e0853a2..b6b50e1827 100644 --- a/rocAL/rocAL/source/pipeline/master_graph.cpp +++ b/rocAL/rocAL/source/pipeline/master_graph.cpp @@ -667,7 +667,7 @@ MasterGraph::copy_out_tensor(void *out_ptr, RocalTensorFormat format, float mult for( auto&& out_image: output_buffers) { unsigned int single_image_size = w * c * h; - #pragma omp parallel for + #pragma omp parallel for num_threads(_internal_batch_size) for(unsigned int batchCount = 0; batchCount < n; batchCount ++) { size_t dest_buf_offset = dest_buf_offset_start + single_image_size*batchCount; diff --git a/rocAL/rocAL_pybind/amd/rocal/decoders.py b/rocAL/rocAL_pybind/amd/rocal/decoders.py index e4d4c1bb1a..d9606f8f79 100644 --- 
a/rocAL/rocAL_pybind/amd/rocal/decoders.py +++ b/rocAL/rocAL_pybind/amd/rocal/decoders.py @@ -22,16 +22,17 @@ import rocal_pybind as b from amd.rocal.pipeline import Pipeline -def image(*inputs, user_feature_key_map = None, path='', file_root ='', annotations_file= '', shard_id = 0, num_shards = 1, random_shuffle = False, affine=True, bytes_per_sample_hint=0, cache_batch_copy= True, cache_debug = False, cache_size = 0, cache_threshold = 0, - cache_type='', device_memory_padding=16777216, host_memory_padding=8388608, hybrid_huffman_threshold= 1000000, output_type = types.RGB, - preserve=False, seed=-1, split_stages=False, use_chunk_allocator= False, use_fast_idct = False, device = None): +def image(*inputs, user_feature_key_map=None, path='', file_root='', annotations_file='', shard_id=0, num_shards=1, random_shuffle=False, + affine=True, bytes_per_sample_hint=0, cache_batch_copy=True, cache_debug=False, cache_size=0, cache_threshold=0, cache_type='', + device_memory_padding=16777216, host_memory_padding=8388608, hybrid_huffman_threshold=1000000, output_type=types.RGB, + decoder_type=types.DECODER_TJPEG, preserve=False, seed=1, split_stages=False, use_chunk_allocator=False, use_fast_idct=False, + device=None, decode_size_policy=types.USER_GIVEN_SIZE_ORIG, max_decoded_width=1000, max_decoded_height=1000): reader = Pipeline._current_pipeline._reader if (device == "gpu"): - decoder_type = types.DECODER_HW_JEPG + decoder_type = types.DECODER_HW_JEPG else: - decoder_type = types.DECODER_TJPEG - - if( reader == 'COCOReader'): + decoder_type = types.DECODER_TJPEG + if(reader == 'COCOReader'): kwargs_pybind = { "source_path": file_root, "json_path": annotations_file, @@ -41,10 +42,11 @@ def image(*inputs, user_feature_key_map = None, path='', file_root ='', annotati 'is_output': False, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0} - decoded_image = 
b.COCO_ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height, + "dec_type": decoder_type} + decoded_image = b.COCO_ImageDecoderShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) elif (reader == "TFRecordReaderClassification" or reader == "TFRecordReaderDetection"): kwargs_pybind = { @@ -56,10 +58,11 @@ def image(*inputs, user_feature_key_map = None, path='', file_root ='', annotati "user_key_for_filename": user_feature_key_map["image/filename"], "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.USER_GIVEN_SIZE, - "max_width": 2000, - "max_height": 2000} - decoded_image = b.TF_ImageDecoder(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height, + "dec_type": decoder_type} + decoded_image = b.TF_ImageDecoder(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) elif (reader == "Caffe2Reader" or reader == "Caffe2ReaderDetection"): kwargs_pybind = { @@ -70,10 +73,11 @@ def image(*inputs, user_feature_key_map = None, path='', file_root ='', annotati 'is_output': False, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0} - decoded_image = b.Caffe2_ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height, + "dec_type" : decoder_type} + decoded_image = b.Caffe2_ImageDecoderShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) elif reader == "CaffeReader" or reader == "CaffeReaderDetection": kwargs_pybind = { @@ -84,10 +88,11 @@ def image(*inputs, user_feature_key_map = None, path='', file_root ='', annotati 'is_output': False, "shuffle": random_shuffle, 
"loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0} - decoded_image = b.Caffe_ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height, + "dec_type" : decoder_type} + decoded_image = b.Caffe_ImageDecoderShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) else: kwargs_pybind = { @@ -98,18 +103,20 @@ def image(*inputs, user_feature_key_map = None, path='', file_root ='', annotati 'is_output': False, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.USER_GIVEN_SIZE, - "max_width": 2000, - "max_height":2000, - "dec_type":decoder_type} - decoded_image = b.ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height, + "dec_type": decoder_type} + decoded_image = b.ImageDecoderShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) return (decoded_image) -def image_raw(*inputs, user_feature_key_map = None, path='', file_root ='', annotations_file= '', shard_id = 0, num_shards = 1, random_shuffle = False, affine=True, bytes_per_sample_hint=0, cache_batch_copy= True, cache_debug = False, cache_size = 0, cache_threshold = 0, - cache_type='', device_memory_padding=16777216, host_memory_padding=8388608, hybrid_huffman_threshold= 1000000, output_type = types.RGB, - preserve=False, seed=-1, split_stages=False, use_chunk_allocator= False, use_fast_idct = False, device = None): +def image_raw(*inputs, user_feature_key_map=None, path='', file_root='', annotations_file='', shard_id=0, num_shards=1, random_shuffle=False, + affine=True, bytes_per_sample_hint=0, cache_batch_copy=True, cache_debug=False, cache_size=0, cache_threshold=0, cache_type='', + device_memory_padding=16777216, host_memory_padding=8388608, 
hybrid_huffman_threshold=1000000, output_type=types.RGB, + preserve=False, seed=1, split_stages=False, use_chunk_allocator=False, use_fast_idct=False, device=None, + decode_size_policy=types.USER_GIVEN_SIZE_ORIG, max_decoded_width=1000, max_decoded_height=1000): reader = Pipeline._current_pipeline._reader if (reader == "TFRecordReaderClassification" or reader == "TFRecordReaderDetection"): @@ -121,22 +128,20 @@ def image_raw(*inputs, user_feature_key_map = None, path='', file_root ='', anno 'is_output': False, "shuffle": random_shuffle, "loop": False, - "out_width": 2000, - "out_height": 2000} - decoded_image = b.TF_ImageDecoderRaw(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) - # decoded_image = b.TF_ImageDecoderRaw(handle, input_image, self._user_feature_key_map["image/encoded"], self._user_feature_key_map["image/filename"], , is_output, shuffle, False, decode_width, decode_height) + "max_width": max_decoded_width, + "max_height": max_decoded_height} + decoded_image = b.TF_ImageDecoderRaw(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) return (decoded_image) - -def image_random_crop(*inputs,user_feature_key_map=None ,path = '', file_root= '', annotations_file='', num_shards = 1, shard_id = 0, random_shuffle = False, affine=True, bytes_per_sample_hint=0, device_memory_padding= 16777216, host_memory_padding = 8388608, hybrid_huffman_threshold = 1000000, - num_attempts=10, output_type=types.RGB, preserve=False, random_area = None, random_aspect_ratio = None, - seed=1, split_stages=False, use_chunk_allocator=False, use_fast_idct= False, device = None): +def image_random_crop(*inputs, user_feature_key_map=None, path='', file_root='', annotations_file='', num_shards=1, shard_id=0, + random_shuffle=False, affine=True, bytes_per_sample_hint=0, device_memory_padding=16777216, host_memory_padding=8388608, + hybrid_huffman_threshold=1000000, num_attempts=10, output_type=types.RGB, preserve=False, random_area=[0.08, 1.0], + 
random_aspect_ratio=[0.8, 1.25], seed=1, split_stages=False, use_chunk_allocator=False, use_fast_idct=False, device=None, + decode_size_policy=types.USER_GIVEN_SIZE_ORIG, max_decoded_width=1000, max_decoded_height=1000): reader = Pipeline._current_pipeline._reader - b.setSeed(seed) - #Creating 2 Nodes here (Image Decoder + Random Crop Node) - #Node 1 Image Decoder - if( reader == 'COCOReader'): + # Internally calls the C++ Partial decoder's + if(reader == 'COCOReader'): kwargs_pybind = { "source_path": file_root, "json_path": annotations_file, @@ -144,12 +149,15 @@ def image_random_crop(*inputs,user_feature_key_map=None ,path = '', file_root= ' "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0} - image_decoder_output_image = b.COCO_ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + crop_output_image = b.COCO_ImageDecoderSliceShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) elif (reader == "TFRecordReaderClassification" or reader == "TFRecordReaderDetection"): kwargs_pybind = { "source_path": path, @@ -160,39 +168,42 @@ def image_random_crop(*inputs,user_feature_key_map=None ,path = '', file_root= ' "user_key_for_filename": user_feature_key_map["image/filename"], "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.USER_GIVEN_SIZE, - "max_width": 2000, - "max_height": 2000} - image_decoder_output_image = b.TF_ImageDecoder(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) - - elif (reader == "Caffe2Reader" or reader == "Caffe2ReaderDetection"): + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": 
max_decoded_height} + crop_output_image = b.TF_ImageDecoder(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + elif (reader == "CaffeReader" or reader == "CaffeReaderDetection"): kwargs_pybind = { "source_path": path, "color_format": output_type, "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0} - image_decoder_output_image = b.Caffe2_ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) - - elif (reader == "CaffeReader" or reader == "CaffeReaderDetection"): + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + crop_output_image = b.Caffe_ImageDecoderPartialShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) + elif (reader == "Caffe2Reader" or reader == "Caffe2ReaderDetection"): kwargs_pybind = { "source_path": path, "color_format": output_type, "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0} - image_decoder_output_image = b.Caffe_ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) - + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + crop_output_image = b.Caffe2_ImageDecoderPartialShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) else: kwargs_pybind = { "source_path": file_root, @@ -200,42 +211,31 @@ def image_random_crop(*inputs,user_feature_key_map=None ,path = '', file_root= ' "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": 
random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 0, - "max_height":0, - "dec_type": types.DECODER_TJPEG} - image_decoder_output_image = b.ImageDecoderShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) - - - #Node 2 Random Crop - kwargs_pybind_2 = { - "input_image0": image_decoder_output_image, - 'is_output': False, - "crop_width": None, - "crop_height": None, - "crop_depth": None, - "crop_pox_x": None, - "crop_pos_y": None, - "crop_pox_z": None - } - crop_output_image = b.Crop(Pipeline._current_pipeline._handle ,*(kwargs_pybind_2.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + crop_output_image = b.FusedDecoderCropShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) return (crop_output_image) - -def image_slice(*inputs,file_root='',path='',annotations_file='',shard_id = 0, num_shards = 1, random_shuffle = False, affine = True, axes = None, axis_names = "WH",bytes_per_sample_hint = 0, device_memory_padding = 16777216, - device_memory_padding_jpeg2k = 0, host_memory_padding = 8388608, - host_memory_padding_jpeg2k = 0, hybrid_huffman_threshold = 1000000, - memory_stats = False, normalized_anchor = True, normalized_shape = True, output_type = types.RGB, - preserve = False, seed = -1, split_stages = False, use_chunk_allocator = False, use_fast_idct = False,device = None): - +def image_slice(*inputs, file_root='', path='', annotations_file='', shard_id=0, num_shards=1, random_shuffle=False, affine=True, axes=None, + axis_names="WH", bytes_per_sample_hint=0, device_memory_padding=16777216, device_memory_padding_jpeg2k=0, + host_memory_padding=8388608, random_aspect_ratio=[0.8, 1.25], random_area=[0.08, 1.0], num_attempts=100, + host_memory_padding_jpeg2k=0, hybrid_huffman_threshold=1000000, memory_stats=False, 
normalized_anchor=True, + normalized_shape=True, output_type=types.RGB, preserve=False, seed=1, split_stages=False, use_chunk_allocator=False, + use_fast_idct=False, device=None, decode_size_policy=types.USER_GIVEN_SIZE_ORIG, max_decoded_width=1000, max_decoded_height=1000): reader = Pipeline._current_pipeline._reader - #Reader -> Randon BBox Crop -> ImageDecoderSlice - if( reader == 'COCOReader'): + #TODO:To pass the crop co-ordinates from random_bbox_crop to image_slice + #in tensor branch integration, + #for now calling partial decoder to match SSD training outer API's . + if(reader == 'COCOReader'): kwargs_pybind = { "source_path": file_root, @@ -244,16 +244,15 @@ def image_slice(*inputs,file_root='',path='',annotations_file='',shard_id = 0, n "shard_id": shard_id, "shard_count": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 1200, #TODO: what happens when we give user given size = multiplier * max_decoded_width - "max_height":1200, #TODO: what happens when we give user given size = multiplier * max_decoded_width - "area_factor": None, - "aspect_ratio": None, - "x_drift_factor": None, - "y_drift_factor": None} - image_decoder_slice = b.COCO_ImageDecoderSliceShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + image_decoder_slice = b.COCO_ImageDecoderSliceShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) elif (reader == "CaffeReader" or reader == "CaffeReaderDetection"): kwargs_pybind = { "source_path": path, @@ -261,16 +260,15 @@ def image_slice(*inputs,file_root='',path='',annotations_file='',shard_id = 0, n "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": 
random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 1200, - "max_height":1200, - "area_factor": None, - "aspect_ratio": None, - "x_drift_factor": None, - "y_drift_factor": None} - image_decoder_slice = b.Caffe_ImageDecoderPartialShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + image_decoder_slice = b.Caffe_ImageDecoderPartialShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) elif (reader == "Caffe2Reader" or reader == "Caffe2ReaderDetection"): kwargs_pybind = { "source_path": path, @@ -278,31 +276,29 @@ def image_slice(*inputs,file_root='',path='',annotations_file='',shard_id = 0, n "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.MAX_SIZE, - "max_width": 1200, - "max_height":1200, - "area_factor": None, - "aspect_ratio": None, - "x_drift_factor": None, - "y_drift_factor": None} - image_decoder_slice = b.Caffe2_ImageDecoderPartialShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) - else : + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + image_decoder_slice = b.Caffe2_ImageDecoderPartialShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) + else: kwargs_pybind = { "source_path": file_root, "color_format": output_type, "shard_id": shard_id, "num_shards": num_shards, 'is_output': False, + "area_factor": random_area, + "aspect_ratio": random_aspect_ratio, + "num_attempts": num_attempts, "shuffle": random_shuffle, "loop": False, - "decode_size_policy": types.USER_GIVEN_SIZE, - "max_width": 3000, - "max_height":3000, - "area_factor": 
None, - "aspect_ratio": None, - "x_drift_factor": None, - "y_drift_factor": None} - image_decoder_slice = b.FusedDecoderCropShard(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) + "decode_size_policy": decode_size_policy, + "max_width": max_decoded_width, + "max_height": max_decoded_height} + image_decoder_slice = b.FusedDecoderCropShard(Pipeline._current_pipeline._handle, *(kwargs_pybind.values())) return (image_decoder_slice) diff --git a/rocAL/rocAL_pybind/amd/rocal/fn.py b/rocAL/rocAL_pybind/amd/rocal/fn.py index 6db180673a..cac62ec570 100644 --- a/rocAL/rocAL_pybind/amd/rocal/fn.py +++ b/rocAL/rocAL_pybind/amd/rocal/fn.py @@ -416,8 +416,6 @@ def crop_mirror_normalize(*inputs, bytes_per_sample_hint=0, crop=[0, 0], crop_d= crop_depth = crop_d crop_height = crop_h crop_width = crop_w - #Set Seed - b.setSeed(seed) if isinstance(mirror,int): if(mirror == 0): @@ -426,9 +424,8 @@ def crop_mirror_normalize(*inputs, bytes_per_sample_hint=0, crop=[0, 0], crop_d= mirror = b.CreateIntParameter(1) # pybind call arguments - kwargs_pybind = {"input_image0": inputs[0], "crop_depth":crop_depth, "crop_height":crop_height, "crop_width":crop_width, "start_x":1, "start_y":1, "start_z":1, "mean":mean, "std_dev":std, + kwargs_pybind = {"input_image0": inputs[0], "crop_depth":crop_depth, "crop_height":crop_height, "crop_width":crop_width, "start_x":0, "start_y":0, "start_z":0, "mean":mean, "std_dev":std, "is_output": False, "mirror": mirror} - b.setSeed(seed) cmn = b.CropMirrorNormalize(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) Pipeline._current_pipeline._tensor_layout = output_layout Pipeline._current_pipeline._tensor_dtype = output_dtype diff --git a/rocAL/rocAL_pybind/amd/rocal/pipeline.py b/rocAL/rocAL_pybind/amd/rocal/pipeline.py index 441fc0cb2e..bc55d911c2 100644 --- a/rocAL/rocAL_pybind/amd/rocal/pipeline.py +++ b/rocAL/rocAL_pybind/amd/rocal/pipeline.py @@ -143,6 +143,7 @@ def __init__(self, batch_size=-1, num_threads=-1, device_id=-1, 
seed=-1, self._current_pipeline = None self._reader = None self._define_graph_set = False + self.set_seed(self._seed) def build(self): """Build the pipeline using rocalVerify call diff --git a/rocAL/rocAL_pybind/amd/rocal/readers.py b/rocAL/rocAL_pybind/amd/rocal/readers.py index 05e730e326..01dcb5b3c7 100644 --- a/rocAL/rocAL_pybind/amd/rocal/readers.py +++ b/rocAL/rocAL_pybind/amd/rocal/readers.py @@ -22,23 +22,25 @@ from amd.rocal.pipeline import Pipeline import amd.rocal.types as types -def coco(*inputs,file_root, annotations_file='', bytes_per_sample_hint=0, dump_meta_files=False, dump_meta_files_path='', file_list='', initial_fill=1024, lazy_init=False, ltrb=False, masks=False, meta_files_path='', num_shards=1, pad_last_batch=False, prefetch_queue_depth=1, - preserve=False, random_shuffle=False, ratio=False, read_ahead=False, - save_img_ids=False, seed=-1, shard_id=0, shuffle_after_epoch=False, size_threshold=0.1, - skip_cached_images=False, skip_empty=False, stick_to_shard=False, tensor_init_bytes=1048576): +def coco(*inputs, file_root, annotations_file='', bytes_per_sample_hint=0, dump_meta_files=False, + dump_meta_files_path='', file_list='', initial_fill=1024, lazy_init=False, ltrb=False, masks=False, + meta_files_path='', num_shards=1, pad_last_batch=False, prefetch_queue_depth=1, preserve=False, + random_shuffle=False, ratio=False, read_ahead=False, save_img_ids=False, seed=-1, shard_id=0, + shuffle_after_epoch=False, size_threshold=0.1, skip_cached_images=False, skip_empty=False, + stick_to_shard=False, tensor_init_bytes=1048576): Pipeline._current_pipeline._reader = "COCOReader" #Output labels = [] bboxes = [] kwargs_pybind = {"source_path": annotations_file, "is_output":True} - b.setSeed(seed) meta_data = b.COCOReader(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) return (meta_data, labels, bboxes) -def file(*inputs, file_root, bytes_per_sample_hint=0, file_list='', initial_fill='', lazy_init='', num_shards=1, - pad_last_batch=False, 
prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, - seed=-1, shard_id=0, shuffle_after_epoch=False, skip_cached_images=False, stick_to_shard=False, tensor_init_bytes=1048576, device=None): +def file(*inputs, file_root, bytes_per_sample_hint=0, file_list='', initial_fill='', lazy_init='', + num_shards=1, pad_last_batch=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, + read_ahead=False, seed=-1, shard_id=0, shuffle_after_epoch=False, skip_cached_images=False, + stick_to_shard=False, tensor_init_bytes=1048576, device=None): Pipeline._current_pipeline._reader = "labelReader" #Output @@ -47,8 +49,11 @@ def file(*inputs, file_root, bytes_per_sample_hint=0, file_list='', initial_fill label_reader_meta_data = b.labelReader(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) return (label_reader_meta_data, labels) -def tfrecord(*inputs, path, user_feature_key_map, features, index_path="", reader_type=0, bytes_per_sample_hint=0, initial_fill=1024, lazy_init=False, - num_shards=1, pad_last_batch=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, seed=-1, shard_id=0, skip_cached_images=False, stick_to_shard=False, tensor_init_bytes=1048576, device=None): +def tfrecord(*inputs, path, user_feature_key_map, features, index_path="", reader_type=0, + bytes_per_sample_hint=0, initial_fill=1024, lazy_init=False, num_shards=1, pad_last_batch=False, + prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, seed=-1, shard_id=0, + skip_cached_images=False, stick_to_shard=False, tensor_init_bytes=1048576, device=None): + labels=[] if reader_type == 1: Pipeline._current_pipeline._reader = "TFRecordReaderDetection" @@ -76,10 +81,10 @@ def tfrecord(*inputs, path, user_feature_key_map, features, index_path="", reade features["image/class/label"] = labels return features -def caffe(*inputs, path, bbox=False, bytes_per_sample_hint=0, image_available=True, initial_fill=1024, 
label_available=True, - lazy_init=False, num_shards=1, - pad_last_batch=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, - seed=-1, shard_id=0, skip_cached_images=False, stick_to_shard=False, tensor_init_bytes=1048576, device=None): +def caffe(*inputs, path, bbox=False, bytes_per_sample_hint=0, image_available=True, initial_fill=1024, + label_available=True, lazy_init=False, num_shards=1, pad_last_batch=False, prefetch_queue_depth=1, + preserve=False, random_shuffle=False, read_ahead=False, seed=-1, shard_id=0, skip_cached_images=False, + stick_to_shard=False, tensor_init_bytes=1048576, device=None): #Output bboxes = [] @@ -98,11 +103,10 @@ def caffe(*inputs, path, bbox=False, bytes_per_sample_hint=0, image_available=Tr else: return (caffe_reader_meta_data, labels) - -def caffe2(*inputs, path, bbox=False, additional_inputs=0, bytes_per_sample_hint=0, image_available=True, initial_fill=1024, label_type=0, - lazy_init=False, num_labels=1, num_shards=1, - pad_last_batch=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, - seed=-1, shard_id=0, skip_cached_images=False, stick_to_shard=False, tensor_init_bytes=1048576, device=None): +def caffe2(*inputs, path, bbox=False, additional_inputs=0, bytes_per_sample_hint=0, image_available=True, + initial_fill=1024, label_type=0, lazy_init=False, num_labels=1, num_shards=1, pad_last_batch=False, + prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, seed=-1, shard_id=0, + skip_cached_images=False, stick_to_shard=False, tensor_init_bytes=1048576, device=None): #Output bboxes = [] @@ -119,10 +123,14 @@ def caffe2(*inputs, path, bbox=False, additional_inputs=0, bytes_per_sample_hint else: return (caffe2_meta_data, labels) -def video(*inputs,sequence_length, additional_decode_surfaces=2, bytes_per_sample_hint=0, channels=3, dont_use_mmap=False, dtype=types.FLOAT, enable_frame_num=False, enable_timestamps=False, file_list="", 
file_list_frame_num=False, file_list_include_preceding_frame=False, file_root="", filenames=[], image_type=types.RGB, - initial_fill=1024, labels="", lazy_init=False, normalized=False, - num_shards=1, pad_last_batch=False, pad_sequences=False, prefetch_queue_depth=1, preserve=False, - random_shuffle=False, read_ahead=False, seed=-1, shard_id=0, skip_cached_images=False, skip_vfr_check=False,step=1,stick_to_shard=False, stride=1, tensor_init_bytes = 1048576, decoder_mode = types.SOFTWARE_DECODE, device=None, name=None): +def video(*inputs, sequence_length, additional_decode_surfaces=2, bytes_per_sample_hint=0, channels=3, + dont_use_mmap=False, dtype=types.FLOAT, enable_frame_num=False, enable_timestamps=False, file_list="", + file_list_frame_num=False, file_list_include_preceding_frame=False, file_root="", filenames=[], + image_type=types.RGB, initial_fill=1024, labels="", lazy_init=False, normalized=False, num_shards=1, + pad_last_batch=False, pad_sequences=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, + read_ahead=False, seed=-1, shard_id=0, skip_cached_images=False, skip_vfr_check=False, step=1, + stick_to_shard=False, stride=1, tensor_init_bytes=1048576, decoder_mode=types.SOFTWARE_DECODE, + device=None, name=None): Pipeline._current_pipeline._reader = "VideoDecoder" #Output @@ -134,10 +142,13 @@ def video(*inputs,sequence_length, additional_decode_surfaces=2, bytes_per_sampl videos = b.VideoDecoder(Pipeline._current_pipeline._handle ,*(kwargs_pybind_decoder.values())) return (videos) -def video_resize(*inputs,sequence_length, resize_width, resize_height, additional_decode_surfaces=2, bytes_per_sample_hint=0, channels=3, dont_use_mmap=False, dtype=types.FLOAT, enable_frame_num=False, enable_timestamps=False, file_list="", file_list_frame_num=False, file_list_include_preceding_frame=False, file_root="", filenames=[], image_type=types.RGB, - initial_fill=1024, labels="", lazy_init=False, normalized=False, - num_shards=1, 
pad_last_batch=False, pad_sequences=False, prefetch_queue_depth=1, preserve=False, - random_shuffle=False, read_ahead=False, seed=-1, shard_id=0, skip_cached_images=False, skip_vfr_check=False,step=3,stick_to_shard=False, stride=3, tensor_init_bytes = 1048576, decoder_mode = types.SOFTWARE_DECODE, device=None, name=None): +def video_resize(*inputs, sequence_length, resize_width, resize_height, additional_decode_surfaces=2, + bytes_per_sample_hint=0, channels=3, dont_use_mmap=False, dtype=types.FLOAT, enable_frame_num=False, + enable_timestamps=False, file_list="", file_list_frame_num=False, file_list_include_preceding_frame=False, + file_root="", filenames=[], image_type=types.RGB, initial_fill=1024, labels="", lazy_init=False, normalized=False, + num_shards=1, pad_last_batch=False, pad_sequences=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, + read_ahead=False, seed=-1, shard_id=0, skip_cached_images=False, skip_vfr_check=False, step=3, stick_to_shard=False, + stride=3, tensor_init_bytes=1048576, decoder_mode=types.SOFTWARE_DECODE, device=None, name=None): Pipeline._current_pipeline._reader = "VideoDecoderResize" #Output @@ -149,9 +160,11 @@ def video_resize(*inputs,sequence_length, resize_width, resize_height, additiona videos = b.VideoDecoderResize(Pipeline._current_pipeline._handle ,*(kwargs_pybind_decoder.values())) return (videos, meta_data) -def sequence_reader(*inputs, file_root, sequence_length, bytes_per_sample_hint=0, dont_use_mmap=False, image_type=types.RGB, initial_fill='', lazy_init='', num_shards=1, - pad_last_batch=False, prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, - seed=-1, shard_id=0, skip_cached_images=False, step = 3, stick_to_shard=False, stride=1, tensor_init_bytes=1048576, device=None): +def sequence_reader(*inputs, file_root, sequence_length, bytes_per_sample_hint=0, dont_use_mmap=False, + image_type=types.RGB, initial_fill='', lazy_init='', num_shards=1, pad_last_batch=False, + 
prefetch_queue_depth=1, preserve=False, random_shuffle=False, read_ahead=False, seed=-1, + shard_id=0, skip_cached_images=False, step=3, stick_to_shard=False, stride=1, tensor_init_bytes=1048576, + device=None): Pipeline._current_pipeline._reader = "SequenceReader" #Output diff --git a/rocAL/rocAL_pybind/rocal_pybind.cpp b/rocAL/rocAL_pybind/rocal_pybind.cpp index c1db456e5c..13aea57bda 100644 --- a/rocAL/rocAL_pybind/rocal_pybind.cpp +++ b/rocAL/rocAL_pybind/rocal_pybind.cpp @@ -73,7 +73,7 @@ namespace rocal{ auto buf = array.request(); unsigned char* ptr = (unsigned char*) buf.ptr; // call pure C++ function - int status = rocalCopyToOutput(context,ptr, buf.size); + int status = rocalCopyToOutput(context, ptr, buf.size); return py::cast(Py_None); } @@ -242,10 +242,10 @@ namespace rocal{ return py::cast(Py_None); } - py::object wrapper_random_bbox_crop(RocalContext context, bool all_boxes_overlap, bool no_crop, RocalFloatParam p_aspect_ratio, bool has_shape, int crop_width, int crop_height, int num_attemps, RocalFloatParam p_scaling, int total_num_attempts ) + py::object wrapper_random_bbox_crop(RocalContext context, bool all_boxes_overlap, bool no_crop, RocalFloatParam p_aspect_ratio, bool has_shape, int crop_width, int crop_height, int num_attempts, RocalFloatParam p_scaling, int total_num_attempts ) { // call pure C++ function - rocalRandomBBoxCrop(context, all_boxes_overlap, no_crop, p_aspect_ratio, has_shape, crop_width, crop_height, num_attemps, p_scaling, total_num_attempts); + rocalRandomBBoxCrop(context, all_boxes_overlap, no_crop, p_aspect_ratio, has_shape, crop_width, crop_height, num_attempts, p_scaling, total_num_attempts); return py::cast(Py_None); } @@ -390,216 +390,39 @@ namespace rocal{ m.def("rocalCopyToOutputTensor16",&wrapper_tensor16); // rocal_api_data_loaders.h m.def("COCO_ImageDecoderSlice",&rocalJpegCOCOFileSourcePartial,"Reads file from the source given and decodes it according to the policy", - py::return_value_policy::reference, - 
py::arg("context"), - py::arg("source_path"), - py::arg("json_path"), - py::arg("color_format"), - py::arg("num_threads"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("area_factor") = NULL, - py::arg("aspect_ratio") = NULL, - py::arg("x_drift_factor") = NULL, - py::arg("y_drift_factor") = NULL - ); + py::return_value_policy::reference); m.def("COCO_ImageDecoderSliceShard",&rocalJpegCOCOFileSourcePartialSingleShard,"Reads file from the source given and decodes it according to the policy", - py::return_value_policy::reference, - py::arg("context"), - py::arg("source_path"), - py::arg("json_path"), - py::arg("color_format"), - py::arg("shard_id"), - py::arg("shard_count"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("area_factor") = NULL, - py::arg("aspect_ratio") = NULL, - py::arg("x_drift_factor") = NULL, - py::arg("y_drift_factor") = NULL - ); + py::return_value_policy::reference); m.def("ImageDecoder",&rocalJpegFileSource,"Reads file from the source given and decodes it according to the policy", py::return_value_policy::reference); m.def("ImageDecoderShard",&rocalJpegFileSourceSingleShard,"Reads file from the source given and decodes it according to the shard id and number of shards", py::return_value_policy::reference); m.def("COCO_ImageDecoder",&rocalJpegCOCOFileSource,"Reads file from the source given and decodes it according to the policy", - py::return_value_policy::reference, - py::arg("context"), - py::arg("source_path"), - py::arg("json_path"), - py::arg("color_format"), - py::arg("num_threads"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = 
ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = ROCAL_DECODER_TJPEG); + py::return_value_policy::reference); m.def("COCO_ImageDecoderShard",&rocalJpegCOCOFileSourceSingleShard,"Reads file from the source given and decodes it according to the shard id and number of shards", - py::return_value_policy::reference, - py::arg("context"), - py::arg("source_path"), - py::arg("json_path"), - py::arg("color_format"), - py::arg("shard_id"), - py::arg("shard_count"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = ROCAL_DECODER_TJPEG); + py::return_value_policy::reference); m.def("TF_ImageDecoder",&rocalJpegTFRecordSource,"Reads file from the source given and decodes it according to the policy only for TFRecords", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("rocal_color_format"), - py::arg("internal_shard_count"), - py::arg("is_output"), - py::arg("user_key_for_encoded"), - py::arg("user_key_for_filename"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = ROCAL_DECODER_TJPEG); + py::return_value_policy::reference); m.def("Caffe_ImageDecoder",&rocalJpegCaffeLMDBRecordSource,"Reads file from the source given and decodes it according to the policy only for TFRecords", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("rocal_color_format"), - py::arg("num_threads"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = 
ROCAL_DECODER_TJPEG); + py::return_value_policy::reference); m.def("Caffe_ImageDecoderShard",&rocalJpegCaffeLMDBRecordSourceSingleShard, "Reads file from the source given and decodes it according to the shard id and number of shards", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("rocal_color_format"), - py::arg("shard_id"), - py::arg("shard_count"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = ROCAL_DECODER_TJPEG); + py::return_value_policy::reference); m.def("Caffe_ImageDecoderPartialShard",&rocalJpegCaffeLMDBRecordSourcePartialSingleShard); m.def("Caffe2_ImageDecoder",&rocalJpegCaffe2LMDBRecordSource,"Reads file from the source given and decodes it according to the policy only for TFRecords", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("rocal_color_format"), - py::arg("num_threads"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = ROCAL_DECODER_TJPEG); + py::return_value_policy::reference); m.def("Caffe2_ImageDecoderShard",&rocalJpegCaffe2LMDBRecordSourceSingleShard,"Reads file from the source given and decodes it according to the shard id and number of shards", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("rocal_color_format"), - py::arg("shard_id"), - py::arg("shard_count"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("dec_type") = ROCAL_DECODER_TJPEG); + 
py::return_value_policy::reference); m.def("Caffe2_ImageDecoderPartialShard",&rocalJpegCaffe2LMDBRecordSourcePartialSingleShard); m.def("FusedDecoderCrop",&rocalFusedJpegCrop,"Reads file from the source and decodes them partially to output random crops", - py::return_value_policy::reference, - py::arg("context"), - py::arg("source_path"), - py::arg("color_format"), - py::arg("num_threads"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MOST_FREQUENT_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("area_factor") = NULL, - py::arg("aspect_ratio") = NULL, - py::arg("y_drift_factor") = NULL, - py::arg("x_drift_factor") = NULL); + py::return_value_policy::reference); m.def("FusedDecoderCropShard",&rocalFusedJpegCropSingleShard,"Reads file from the source and decodes them partially to output random crops", - py::return_value_policy::reference, - py::arg("context"), - py::arg("source_path"), - py::arg("color_format"), - py::arg("shard_id"), - py::arg("shard_count"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("decode_size_policy") = ROCAL_USE_MAX_SIZE, - py::arg("max_width") = 0, - py::arg("max_height") = 0, - py::arg("area_factor") = NULL, - py::arg("aspect_ratio") = NULL, - py::arg("y_drift_factor") = NULL, - py::arg("x_drift_factor") = NULL); + py::return_value_policy::reference); m.def("TF_ImageDecoderRaw",&rocalRawTFRecordSource,"Reads file from the source given and decodes it according to the policy only for TFRecords", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("user_key_for_encoded"), - py::arg("user_key_for_filename"), - py::arg("rocal_color_format"), - py::arg("is_output"), - py::arg("shuffle") = false, - py::arg("loop") = false, - py::arg("out_width") = 0, - py::arg("out_height") = 0, - py::arg("record_name_prefix") = ""); + 
py::return_value_policy::reference); m.def("Cifar10Decoder",&rocalRawCIFAR10Source,"Reads file from the source given and decodes it according to the policy only for TFRecords", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("rocal_color_format"), - py::arg("is_output"), - py::arg("out_width") = 0, - py::arg("out_height") = 0, - py::arg("file_name_prefix") = "", - py::arg("loop") = false); + py::return_value_policy::reference); m.def("VideoDecoder",&rocalVideoFileSource,"Reads videos from the source given and decodes it according to the policy only for Videos as inputs", - py::return_value_policy::reference, - py::arg("p_context"), - py::arg("source_path"), - py::arg("color_format"), - py::arg("decoder_mode"), - py::arg("shard_count"), - py::arg("sequence_length"), - py::arg("shuffle") = false, - py::arg("is_output"), - py::arg("loop") = false, - py::arg("frame_step"), - py::arg("frame_stride"), - py::arg("file_list_frame_num") = false); + py::return_value_policy::reference); m.def("VideoDecoderResize",&rocalVideoFileResize,"Reads videos from the source given and decodes it according to the policy only for Videos as inputs. 
Resizes the decoded frames to the dest width and height.", py::return_value_policy::reference, py::arg("p_context"), @@ -628,7 +451,6 @@ namespace rocal{ py::arg("loop") = false, py::arg("frame_step"), py::arg("frame_stride")); - m.def("rocalResetLoaders",&rocalResetLoaders); // rocal_api_augmentation.h m.def("SSDRandomCrop",&rocalSSDRandomCrop, diff --git a/rocAL/rocAL_pybind/run.sh b/rocAL/rocAL_pybind/run.sh index 7069c4600a..cba417c958 100755 --- a/rocAL/rocAL_pybind/run.sh +++ b/rocAL/rocAL_pybind/run.sh @@ -42,7 +42,7 @@ if [[ $# -eq 1 ]]; then do echo "Going to install $WHEEL_NAME" done - python$PYTHON_VERSION -m pip uninstall $WHEEL_NAME + python$PYTHON_VERSION -m pip uninstall -y $WHEEL_NAME python$PYTHON_VERSION -m pip install $WHEEL_NAME else echo @@ -63,6 +63,6 @@ else do echo "Going to install $WHEEL_NAME" done - python$PYTHON_VERSION -m pip uninstall $WHEEL_NAME + python$PYTHON_VERSION -m pip uninstall -y $WHEEL_NAME python$PYTHON_VERSION -m pip install $WHEEL_NAME fi \ No newline at end of file diff --git a/utilities/rocAL/rocAL_unittests/rocAL_unittests.cpp b/utilities/rocAL/rocAL_unittests/rocAL_unittests.cpp index a472f91756..03f5081be3 100644 --- a/utilities/rocAL/rocAL_unittests/rocAL_unittests.cpp +++ b/utilities/rocAL/rocAL_unittests/rocAL_unittests.cpp @@ -141,7 +141,9 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co { std::cout << ">>>>>>> Running PARTIAL DECODE" << std::endl; rocalCreateLabelReader(handle, path); - input1 = rocalFusedJpegCrop(handle, path, color_format, num_threads, false, false); + std::vector area = {0.08, 1}; + std::vector aspect_ratio = {3.0f/4, 4.0f/3}; + input1 = rocalFusedJpegCrop(handle, path, color_format, num_threads, false, area, aspect_ratio, 10, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 2: //coco detection @@ -157,7 +159,7 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co if 
(decode_max_height <= 0 || decode_max_width <= 0) input1 = rocalJpegCOCOFileSource(handle, path, json_path, color_format, num_threads, false, true, false); else - input1 = rocalJpegCOCOFileSource(handle, path, json_path, color_format, num_threads, false, true, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegCOCOFileSource(handle, path, json_path, color_format, num_threads, false, true, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 3: //coco detection partial @@ -173,7 +175,9 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co #if defined RANDOMBBOXCROP rocalRandomBBoxCrop(handle, all_boxes_overlap, no_crop); #endif - input1 = rocalJpegCOCOFileSourcePartial(handle, path, json_path, color_format, num_threads, false, true, false); + std::vector area = {0.08, 1}; + std::vector aspect_ratio = {3.0f/4, 4.0f/3}; + input1 = rocalJpegCOCOFileSourcePartial(handle, path, json_path, color_format, num_threads, false, area, aspect_ratio, 10, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 4: //tf classification @@ -183,7 +187,7 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co char key2[25] = "image/class/label"; char key8[25] = "image/filename"; rocalCreateTFReader(handle, path, true, key2, key8); - input1 = rocalJpegTFRecordSource(handle, path, color_format, num_threads, false, key1, key8, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegTFRecordSource(handle, path, color_format, num_threads, false, key1, key8, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 5: //tf detection @@ -198,35 +202,35 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co char key7[25] = "image/object/bbox/ymax"; char key8[25] = "image/filename"; 
rocalCreateTFReaderDetection(handle, path, true, key2, key3, key4, key5, key6, key7, key8); - input1 = rocalJpegTFRecordSource(handle, path, color_format, num_threads, false, key1, key8, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegTFRecordSource(handle, path, color_format, num_threads, false, key1, key8, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 6: //caffe classification { std::cout << ">>>>>>> Running CAFFE CLASSIFICATION READER" << std::endl; rocalCreateCaffeLMDBLabelReader(handle, path); - input1 = rocalJpegCaffeLMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegCaffeLMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 7: //caffe detection { std::cout << ">>>>>>> Running CAFFE DETECTION READER" << std::endl; rocalCreateCaffeLMDBReaderDetection(handle, path); - input1 = rocalJpegCaffeLMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegCaffeLMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 8: //caffe2 classification { std::cout << ">>>>>>> Running CAFFE2 CLASSIFICATION READER" << std::endl; rocalCreateCaffe2LMDBLabelReader(handle, path, true); - input1 = rocalJpegCaffe2LMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegCaffe2LMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; 
case 9: //caffe2 detection { std::cout << ">>>>>>> Running CAFFE2 DETECTION READER" << std::endl; rocalCreateCaffe2LMDBReaderDetection(handle, path, true); - input1 = rocalJpegCaffe2LMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegCaffe2LMDBRecordSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; case 10: //coco reader keypoints @@ -243,7 +247,7 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co if (decode_max_height <= 0 || decode_max_width <= 0) input1 = rocalJpegCOCOFileSource(handle, path, json_path, color_format, num_threads, false, true, false); else - input1 = rocalJpegCOCOFileSource(handle, path, json_path, color_format, num_threads, false, true, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegCOCOFileSource(handle, path, json_path, color_format, num_threads, false, true, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; default: //image pipeline @@ -253,7 +257,7 @@ int test(int test_case, int reader_type, int pipeline_type, const char *path, co if (decode_max_height <= 0 || decode_max_width <= 0) input1 = rocalJpegFileSource(handle, path, color_format, num_threads, false, true); else - input1 = rocalJpegFileSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE, decode_max_width, decode_max_height); + input1 = rocalJpegFileSource(handle, path, color_format, num_threads, false, false, false, ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED, decode_max_width, decode_max_height); } break; } From eae86b67a1b2d415e6ca02a18327034c3a224607 Mon Sep 17 00:00:00 2001 From: Kiriti Gowda Date: Thu, 19 Jan 2023 06:45:22 -0800 Subject: [PATCH 2/3] Setup & CMakeList - Updates (#1021) * RPP - Upgrade to V0.99 
(#1018) * CMakeList - Adding RPATH flag (#995)" (#1017) This reverts commit a5a4948f40ef1b50019137d6085e947d06d0d7e7. * Setup - Support for RedHat and Updates (#1020) * Setup - Updates * Setup - Fix MIOpen Install * Readme - Updates * RPP Find - Fix * RPP - Find Include files * RedHat - rocAL Install Fix * Setup - Add rocBLAS install * Setup - Install Inference Deps * Set - Inference Re-Install * CMakeList - Find AMDRPP Backend Fix * Backend Find - Updates Co-authored-by: arvindcheru <90783369+arvindcheru@users.noreply.github.com> --- MIVisionX-setup.py | 41 +++++++++++++++++++ README.md | 1 + amd_openvx_extensions/CMakeLists.txt | 6 +-- .../cloud_inference/server_app/CMakeLists.txt | 2 +- model_compiler/python/nnir_to_clib.py | 1 + model_compiler/python/nnir_to_openvx.py | 1 + rocAL/rocAL/CMakeLists.txt | 6 +-- rocAL/rocAL_pybind/CMakeLists.txt | 6 +-- samples/inference/mv_objdetect/CMakeLists.txt | 2 +- samples/model_compiler_samples/CMakeLists.txt | 2 +- 10 files changed, 56 insertions(+), 12 deletions(-) diff --git a/MIVisionX-setup.py b/MIVisionX-setup.py index 80421b5485..8136800a71 100644 --- a/MIVisionX-setup.py +++ b/MIVisionX-setup.py @@ -335,6 +335,47 @@ os.system('sudo '+linuxFlag+' '+linuxSystemInstall + ' '+linuxSystemInstall_check+' install -y rocblas-devel miopen-hip-devel migraphx-devel') + # Install Model Compiler Deps + if inferenceInstall == 'ON': + modelCompilerDeps = os.path.expanduser( + '~/.mivisionx-model-compiler-deps') + + # Delete previous install + if os.path.exists(modelCompilerDeps) and reinstall == 'ON': + os.system('sudo -v') + os.system('sudo rm -rf '+modelCompilerDeps) + print("\nMIVisionX Setup: Removing Previous Inference Install -- "+modelCompilerDeps+"\n") + + if not os.path.exists(modelCompilerDeps): + print("STATUS: Model Compiler Deps Install - " + + modelCompilerDeps+"\n") + os.makedirs(modelCompilerDeps) + os.system('sudo -v') + if "Ubuntu" in platfromInfo: + os.system( + 'sudo '+linuxSystemInstall+' ' + + 
linuxSystemInstall_check+' install git inxi python3 python3-pip protobuf-compiler libprotoc-dev') + elif "centos" in platfromInfo or "redhat" in platfromInfo: + os.system( + 'sudo '+linuxSystemInstall+' ' + + linuxSystemInstall_check+' install git inxi python3-devel python3-pip protobuf python3-protobuf') + os.system('sudo pip3 install future pytz numpy') + # Install CAFFE Deps + os.system('sudo pip3 install google protobuf==3.12.4') + # Install ONNX Deps + os.system('sudo pip3 install onnx') + # Install NNEF Deps + os.system('mkdir -p '+modelCompilerDeps+'/nnef-deps') + os.system( + '(cd '+modelCompilerDeps+'/nnef-deps; git clone https://github.com/KhronosGroup/NNEF-Tools.git)') + os.system( + '(cd '+modelCompilerDeps+'/nnef-deps/NNEF-Tools/parser/cpp; mkdir -p build && cd build; '+linuxCMake+' ..; make)') + os.system( + '(cd '+modelCompilerDeps+'/nnef-deps/NNEF-Tools/parser/python; sudo python3 setup.py install)') + else: + print("STATUS: Model Compiler Deps Pre-Installed - " + + modelCompilerDeps+"\n") + # Install OpenCV os.system('(cd '+deps_dir+'/build; mkdir OpenCV )') # Install pre-reqs diff --git a/README.md b/README.md index a1a1f70be9..5b2f390887 100644 --- a/README.md +++ b/README.md @@ -186,6 +186,7 @@ For the convenience of the developer, we here provide the setup script which wil --ffmpeg [FFMPEG V4.4.2 Installation - optional (default:ON) [options:ON/OFF]] --rocal [MIVisionX rocAL Dependency Install - optional (default:ON) [options:ON/OFF]] --neural_net[MIVisionX Neural Net Dependency Install - optional (default:ON) [options:ON/OFF]] + --inference [MIVisionX Neural Net Inference Dependency Install - optional (default:ON) [options:ON/OFF]] --reinstall [Remove previous setup and reinstall (default:OFF)[options:ON/OFF]] --backend [MIVisionX Dependency Backend - optional (default:HIP) [options:HIP/OCL/CPU]] --rocm_path [ROCm Installation Path - optional (default:/opt/rocm) - ROCm Installation Required] diff --git a/amd_openvx_extensions/CMakeLists.txt 
b/amd_openvx_extensions/CMakeLists.txt index 049f6072c2..0d3d6d3300 100644 --- a/amd_openvx_extensions/CMakeLists.txt +++ b/amd_openvx_extensions/CMakeLists.txt @@ -127,14 +127,14 @@ if(AMDRPP_FOUND AND GPU_SUPPORT) #find the RPP backend type set(RPP_BACKEND_OPENCL_FOUND 0) set(RPP_BACKEND_HIP_FOUND 0) - if(EXISTS ${ROCM_PATH}/include/rpp/rpp_backend.h) - file(READ ${ROCM_PATH}/include/rpp/rpp_backend.h RPP_BACKEND_FILE) + if(EXISTS ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h) + file(READ ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h RPP_BACKEND_FILE) string(REGEX MATCH "RPP_BACKEND_OPENCL ([0-9]*)" _ ${RPP_BACKEND_FILE}) set(RPP_BACKEND_OPENCL_FOUND ${CMAKE_MATCH_1}) string(REGEX MATCH "RPP_BACKEND_HIP ([0-9]*)" _ ${RPP_BACKEND_FILE}) set(RPP_BACKEND_HIP_FOUND ${CMAKE_MATCH_1}) else() - message("-- ${Red}WARNING: ${ROCM_PATH}/include/rpp/rpp_backend.h file Not Found. please run the setup script to install latest RPP package ${ColourReset}") + message("-- ${Red}WARNING: ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h file Not Found. 
please run the setup script to install latest RPP package ${ColourReset}") endif() if ("${BACKEND}" STREQUAL "OPENCL" AND OpenCL_FOUND) diff --git a/apps/cloud_inference/server_app/CMakeLists.txt b/apps/cloud_inference/server_app/CMakeLists.txt index 1081e444c3..1496aeef17 100644 --- a/apps/cloud_inference/server_app/CMakeLists.txt +++ b/apps/cloud_inference/server_app/CMakeLists.txt @@ -41,7 +41,7 @@ list(APPEND CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake) # OpenCV -- Display Component find_package(OpenCV REQUIRED) -# Choose Backend +# Choose Backend - TBD: ADD FindMIVisionX.cmake set(MIVISIONX_BACKEND_OPENCL_FOUND 0) set(MIVISIONX_BACKEND_HIP_FOUND 0) if(EXISTS ${ROCM_PATH}/${CMAKE_INSTALL_INCLUDEDIR}/mivisionx/openvx_backend.h) diff --git a/model_compiler/python/nnir_to_clib.py b/model_compiler/python/nnir_to_clib.py index ab0296457a..4e5b404654 100644 --- a/model_compiler/python/nnir_to_clib.py +++ b/model_compiler/python/nnir_to_clib.py @@ -135,6 +135,7 @@ def generateCMakeFiles(graph,outputFolder): #find the OPENVX backend type set(OPENVX_BACKEND_OPENCL_FOUND 0) set(OPENVX_BACKEND_HIP_FOUND 0) +#TBD: ADD FindMIVisionX.cmake if(EXISTS ${ROCM_PATH}/include/mivisionx/openvx_backend.h) file(READ ${ROCM_PATH}/include/mivisionx/openvx_backend.h OPENVX_BACKEND_FILE) string(REGEX MATCH "ENABLE_OPENCL ([0-9]*)" _ ${OPENVX_BACKEND_FILE}) diff --git a/model_compiler/python/nnir_to_openvx.py b/model_compiler/python/nnir_to_openvx.py index 07df9c6024..d3fc390737 100644 --- a/model_compiler/python/nnir_to_openvx.py +++ b/model_compiler/python/nnir_to_openvx.py @@ -122,6 +122,7 @@ def generateCMakeFiles(graph,outputFolder): #find the OPENVX backend type set(OPENVX_BACKEND_OPENCL_FOUND 0) set(OPENVX_BACKEND_HIP_FOUND 0) +#TBD: ADD FindMIVisionX.cmake if(EXISTS ${ROCM_PATH}/include/mivisionx/openvx_backend.h) file(READ ${ROCM_PATH}/include/mivisionx/openvx_backend.h OPENVX_BACKEND_FILE) string(REGEX MATCH "ENABLE_OPENCL ([0-9]*)" _ ${OPENVX_BACKEND_FILE}) diff --git 
a/rocAL/rocAL/CMakeLists.txt b/rocAL/rocAL/CMakeLists.txt index 155e7d629b..1c5c379c67 100644 --- a/rocAL/rocAL/CMakeLists.txt +++ b/rocAL/rocAL/CMakeLists.txt @@ -73,14 +73,14 @@ else() #find the RPP backend type set(RPP_BACKEND_OPENCL_FOUND 0) set(RPP_BACKEND_HIP_FOUND 0) - if(EXISTS ${ROCM_PATH}/include/rpp/rpp_backend.h) - file(READ ${ROCM_PATH}/include/rpp/rpp_backend.h RPP_BACKEND_FILE) + if(EXISTS ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h) + file(READ ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h RPP_BACKEND_FILE) string(REGEX MATCH "RPP_BACKEND_OPENCL ([0-9]*)" _ ${RPP_BACKEND_FILE}) set(RPP_BACKEND_OPENCL_FOUND ${CMAKE_MATCH_1}) string(REGEX MATCH "RPP_BACKEND_HIP ([0-9]*)" _ ${RPP_BACKEND_FILE}) set(RPP_BACKEND_HIP_FOUND ${CMAKE_MATCH_1}) else() - message("-- ${Red}WARNING: ${ROCM_PATH}/include/rpp/rpp_backend.h file Not Found. please run the setup script to install latest RPP package ${ColourReset}") + message("-- ${Red}WARNING: ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h file Not Found. please run the setup script to install latest RPP package ${ColourReset}") endif() if ("${BACKEND}" STREQUAL "OPENCL" AND OpenCL_FOUND) diff --git a/rocAL/rocAL_pybind/CMakeLists.txt b/rocAL/rocAL_pybind/CMakeLists.txt index c2766e3243..d507281b21 100644 --- a/rocAL/rocAL_pybind/CMakeLists.txt +++ b/rocAL/rocAL_pybind/CMakeLists.txt @@ -62,14 +62,14 @@ else() #find the RPP backend type set(RPP_BACKEND_OPENCL_FOUND 0) set(RPP_BACKEND_HIP_FOUND 0) - if(EXISTS ${ROCM_PATH}/include/rpp/rpp_backend.h) - file(READ ${ROCM_PATH}/include/rpp/rpp_backend.h RPP_BACKEND_FILE) + if(EXISTS ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h) + file(READ ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h RPP_BACKEND_FILE) string(REGEX MATCH "RPP_BACKEND_OPENCL ([0-9]*)" _ ${RPP_BACKEND_FILE}) set(RPP_BACKEND_OPENCL_FOUND ${CMAKE_MATCH_1}) string(REGEX MATCH "RPP_BACKEND_HIP ([0-9]*)" _ ${RPP_BACKEND_FILE}) set(RPP_BACKEND_HIP_FOUND ${CMAKE_MATCH_1}) else() - message("-- ${Red}WARNING: ${ROCM_PATH}/include/rpp/rpp_backend.h file Not 
Found. please run the setup script to install latest RPP package ${ColourReset}") + message("-- ${Red}WARNING: ${AMDRPP_INCLUDE_DIRS}/rpp_backend.h file Not Found. please run the setup script to install latest RPP package ${ColourReset}") endif() if ("${BACKEND}" STREQUAL "OPENCL" AND OpenCL_FOUND) if (NOT RPP_BACKEND_OPENCL_FOUND) diff --git a/samples/inference/mv_objdetect/CMakeLists.txt b/samples/inference/mv_objdetect/CMakeLists.txt index 5b22a86ceb..2975ba0234 100644 --- a/samples/inference/mv_objdetect/CMakeLists.txt +++ b/samples/inference/mv_objdetect/CMakeLists.txt @@ -32,7 +32,7 @@ list(APPEND CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake) find_package(OpenCV QUIET) set(ROCM_PATH /opt/rocm CACHE PATH "ROCm Installation Path") -#find the OPENVX backend type +#find the OPENVX backend type - TBD: ADD FindMIVisionX.cmake set(OPENVX_BACKEND_OPENCL_FOUND 0) set(OPENVX_BACKEND_HIP_FOUND 0) if(EXISTS ${ROCM_PATH}/include/mivisionx/openvx_backend.h) diff --git a/samples/model_compiler_samples/CMakeLists.txt b/samples/model_compiler_samples/CMakeLists.txt index 734d288520..f79a080a8e 100644 --- a/samples/model_compiler_samples/CMakeLists.txt +++ b/samples/model_compiler_samples/CMakeLists.txt @@ -33,7 +33,7 @@ list(APPEND CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake) find_package(OpenCV REQUIRED) - #find the OPENVX backend type + #find the OPENVX backend type - TBD: ADD FindMIVisionX.cmake set(OPENVX_BACKEND_OPENCL_FOUND 0) set(OPENVX_BACKEND_HIP_FOUND 0) if(EXISTS ${ROCM_PATH}/include/mivisionx/openvx_backend.h) From a1e8bb5de005cc018d685ba5c257f6546546be93 Mon Sep 17 00:00:00 2001 From: rrawther Date: Fri, 20 Jan 2023 10:42:23 -0800 Subject: [PATCH 3/3] fix crop_mirror_normalize node to do center_crop by default --- .../image_processing/inference_pipeline.py | 135 ++++++++++++++++++ .../rocAL/include/parameters/parameter_crop.h | 7 +- .../source/api/rocal_api_augmentation.cpp | 5 +- .../geometry_augmentations/node_crop.cpp | 2 +- 
.../node_crop_mirror_normalize.cpp | 8 +- .../parameters/parameter_rocal_crop.cpp | 6 +- rocAL/rocAL_pybind/amd/rocal/fn.py | 2 +- 7 files changed, 152 insertions(+), 13 deletions(-) create mode 100644 rocAL/docs/examples/image_processing/inference_pipeline.py diff --git a/rocAL/docs/examples/image_processing/inference_pipeline.py b/rocAL/docs/examples/image_processing/inference_pipeline.py new file mode 100644 index 0000000000..a7db74e167 --- /dev/null +++ b/rocAL/docs/examples/image_processing/inference_pipeline.py @@ -0,0 +1,135 @@ +# Copyright (c) 2018 - 2023 Advanced Micro Devices, Inc. All rights reserved. +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +# THE SOFTWARE. 
+ +import sys +from tkinter import W +from amd.rocal.pipeline import pipeline_def +import numpy as np +import rocal_pybind as b +from amd.rocal.plugin.pytorch import ROCALClassificationIterator +import amd.rocal.fn as fn +import amd.rocal.types as types +import matplotlib.gridspec as gridspec +import matplotlib.pyplot as plt + + +seed = 1549361629 +image_dir = "../../../../data/images/AMD-tinyDataSet/" +batch_size = 4 +gpu_id = 0 + +def show_images(image_batch, device): + columns = 4 + rows = (batch_size + 1) // (columns) + #fig = plt.figure(figsize = (32,(32 // columns) * rows)) + gs = gridspec.GridSpec(rows, columns) + for j in range(rows*columns): + #print('\n Display image: ', j) + plt.subplot(gs[j]) + img = image_batch[j] + plt.axis("off") + if device == "cpu": + plt.imshow(img) + else: + plt.imshow(img.cpu()) + plt.show() + +def show_images(image_batch, image_batch1, device): + columns = 4 + rows = (batch_size + 1) // (columns) + #fig = plt.figure(figsize = (32,(32 // columns) * rows)) + gs = gridspec.GridSpec(rows, columns*2) + for j in range(rows*columns): + #print('\n Display image: ', j) + k = j*2 + plt.subplot(gs[k]) + img = image_batch[j] + plt.axis("off") + if device == "cpu": + plt.imshow(img) + else: + plt.imshow(img.cpu()) + + plt.subplot(gs[k+1]) + img = image_batch1[j] + plt.axis("off") + if device == "cpu": + plt.imshow(img) + else: + plt.imshow(img.cpu()) + plt.show() + + + +def show_pipeline_output(pipe, device): + pipe.build() + data_loader = ROCALClassificationIterator(pipe, device) + images = next(iter(data_loader)) + show_images(images[0], device) + +def show_pipeline_outputs(pipe0, pipe1, device): + pipe0.build() + pipe1.build() + data_loader = ROCALClassificationIterator(pipe0, device) + data_loader1 = ROCALClassificationIterator(pipe1, device) + images = next(iter(data_loader)) + images1 = next(iter(data_loader1)) + show_images(images[0], images1[0], device) + +@pipeline_def(seed=seed) +def inference_pipeline(device="cpu", 
path=image_dir): + jpegs, labels = fn.readers.file(file_root=path, shard_id=0, num_shards=1, random_shuffle=False) + images = fn.decoders.image(jpegs, file_root=path, max_decoded_width=1024, max_decoded_height=1024, device="cpu", output_type=types.RGB, shard_id=0, num_shards=1, random_shuffle=False) + images_res = fn.resize(images, scaling_mode=types.SCALING_MODE_NOT_SMALLER, interpolation_type=types.TRIANGULAR_INTERPOLATION, resize_shorter=256) + return fn.centre_crop(images_res, crop=(224, 224)) + +@pipeline_def(seed=seed) +def inference_pipeline_cmn(device="cpu", path=image_dir): + jpegs, labels = fn.readers.file(file_root=path, shard_id=0, num_shards=1, random_shuffle=False) + images = fn.decoders.image(jpegs, file_root=path, max_decoded_width=1024, max_decoded_height=1024, device="cpu", output_type=types.RGB, shard_id=0, num_shards=1, random_shuffle=False) + images_res = fn.resize(images, scaling_mode=types.SCALING_MODE_NOT_SMALLER, interpolation_type=types.TRIANGULAR_INTERPOLATION, resize_shorter=256) + return fn.crop_mirror_normalize(images_res , device="cpu", + output_dtype=types.FLOAT, + output_layout=types.NHWC, + crop=(224, 224), + mirror=0, + image_type=types.RGB, + mean=[0,0,0], + std=[255.0,255.0,255.0]) + +def main(): + print ('Optional arguments: ') + bs = batch_size + rocal_device = "cpu" + img_folder = image_dir + if len(sys.argv) > 1: + if(sys.argv[1] == "gpu"): + rocal_device = "gpu" + if len(sys.argv) > 2: + img_folder = sys.argv[2] + + pipe = inference_pipeline(batch_size=bs, num_threads=1, device_id=gpu_id, rocal_cpu=True, tensor_layout=types.NHWC, + reverse_channels=True, multiplier = [0.00392,0.00392,0.00392], device=rocal_device, path=img_folder) + pipe1 = inference_pipeline_cmn(batch_size=bs, num_threads=1, device_id=gpu_id, rocal_cpu=True, tensor_layout=types.NHWC, + reverse_channels=True, multiplier = [0.00392,0.00392,0.00392], device=rocal_device, path=img_folder) + show_pipeline_outputs(pipe, pipe1, device=rocal_device) + +if __name__ 
== '__main__': + main() diff --git a/rocAL/rocAL/include/parameters/parameter_crop.h b/rocAL/rocAL/include/parameters/parameter_crop.h index 058184fee4..8ace2d04b3 100644 --- a/rocAL/rocAL/include/parameters/parameter_crop.h +++ b/rocAL/rocAL/include/parameters/parameter_crop.h @@ -45,7 +45,7 @@ class CropParam // V Y directoin public: CropParam() = delete; - CropParam(unsigned int batch_size): batch_size(batch_size), _random(false), _is_center_crop(false) + CropParam(unsigned int batch_size): batch_size(batch_size), _random(false), _is_fixed_crop(false) { x_drift_factor = default_x_drift_factor(); y_drift_factor = default_y_drift_factor(); @@ -58,7 +58,7 @@ class CropParam in_height = in_height_; } void set_random() {_random = true;} - void set_center_crop() { _is_center_crop = true; } + void set_fixed_crop(float anchor_x, float anchor_y) { _is_fixed_crop = true; _random = false; _crop_anchor[0] = anchor_x; _crop_anchor[1] = anchor_y;} void set_x_drift_factor(Parameter* x_drift); void set_y_drift_factor(Parameter* y_drift); std::vector in_width, in_height; @@ -83,7 +83,8 @@ class CropParam Parameter* default_x_drift_factor(); Parameter* default_y_drift_factor(); std::vector x1_arr_val, y1_arr_val, croph_arr_val, cropw_arr_val, x2_arr_val, y2_arr_val; - bool _random, _is_center_crop; + bool _random, _is_fixed_crop; + float _crop_anchor [2] = {0.5, 0.5}; virtual void fill_crop_dims(){}; void update_crop_array(); }; diff --git a/rocAL/rocAL/source/api/rocal_api_augmentation.cpp b/rocAL/rocAL/source/api/rocal_api_augmentation.cpp index 414d8a1583..02d47086e8 100644 --- a/rocAL/rocAL/source/api/rocal_api_augmentation.cpp +++ b/rocAL/rocAL/source/api/rocal_api_augmentation.cpp @@ -475,6 +475,7 @@ rocalResize( unsigned resize_shorter, unsigned resize_longer, RocalResizeInterpolationType interpolation_type) { + Image* output = nullptr; if ((p_context == nullptr) || (p_input == nullptr)) { ERR("Invalid ROCAL context or invalid input image") @@ -1747,7 +1748,7 @@ 
rocalCropFixed( ImageInfo output_info = input->info(); output_info.width(crop_width); output_info.height(crop_height); - output = context->master_graph->create_image(input->info(), is_output); + output = context->master_graph->create_image(output_info, is_output); output->reset_image_roi(); std::shared_ptr crop_node = context->master_graph->add_node({input}, {output}); crop_node->init(crop_height, crop_width, crop_pos_x, crop_pos_y); @@ -1786,7 +1787,7 @@ rocalCropCenterFixed( ImageInfo output_info = input->info(); output_info.width(crop_width); output_info.height(crop_height); - output = context->master_graph->create_image(input->info(), is_output); + output = context->master_graph->create_image(output_info, is_output); output->reset_image_roi(); std::shared_ptr crop_node = context->master_graph->add_node({input}, {output}); crop_node->init(crop_height, crop_width); diff --git a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp index 1e987f2a82..7ea434f6a6 100644 --- a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp +++ b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop.cpp @@ -80,7 +80,7 @@ void CropNode::init(unsigned int crop_h, unsigned int crop_w) _crop_param->crop_h = crop_h; _crop_param->x1 = 0; _crop_param->y1 = 0; - _crop_param->set_center_crop(); + _crop_param->set_fixed_crop(0.5, 0.5); // for center_crop } diff --git a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp index 6398893438..eff05b4da8 100644 --- a/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp +++ b/rocAL/rocAL/source/augmentations/geometry_augmentations/node_crop_mirror_normalize.cpp @@ -77,12 +77,14 @@ void CropMirrorNormalizeNode::update_node() _mirror.update_array(); } -void 
CropMirrorNormalizeNode::init(int crop_h, int crop_w, float start_x, float start_y, float mean, float std_dev, IntParam *mirror) +void CropMirrorNormalizeNode::init(int crop_h, int crop_w, float anchor_x, float anchor_y, float mean, float std_dev, IntParam *mirror) { - _crop_param->x1 = start_x; - _crop_param->y1 = start_y; + // current implementation does a fixed crop with specified dims and anchor + _crop_param->x1 = 0; + _crop_param->y1 = 0; _crop_param->crop_h = crop_h; _crop_param->crop_w = crop_w; + _crop_param->set_fixed_crop(anchor_x, anchor_y); _mean = mean; _std_dev = std_dev; _mirror.set_param(core(mirror)); diff --git a/rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp b/rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp index 1efa52b06b..ae38879729 100644 --- a/rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp +++ b/rocAL/rocAL/source/parameters/parameter_rocal_crop.cpp @@ -52,9 +52,9 @@ void RocalCropParam::fill_crop_dims() { // Evaluating user given crop cropw_arr_val[img_idx] = (crop_w <= in_width[img_idx] && crop_w > 0) ? crop_w : in_width[img_idx]; croph_arr_val[img_idx] = (crop_h <= in_height[img_idx] && crop_h > 0) ? crop_h : in_height[img_idx]; - if (_is_center_crop) { - x1_arr_val[img_idx] = static_cast(0.5 * (in_width[img_idx] - cropw_arr_val[img_idx])); - y1_arr_val[img_idx] = static_cast(0.5 * (in_height[img_idx] - croph_arr_val[img_idx])); + if (_is_fixed_crop) { + x1_arr_val[img_idx] = static_cast(_crop_anchor[0] * (in_width[img_idx] - cropw_arr_val[img_idx])); + y1_arr_val[img_idx] = static_cast(_crop_anchor[1] * (in_height[img_idx] - croph_arr_val[img_idx])); } else { x1_arr_val[img_idx] = (x1 >= in_width[img_idx]) ? 0 : x1; y1_arr_val[img_idx] = (y1 >= in_height[img_idx]) ? 
0 : y1; diff --git a/rocAL/rocAL_pybind/amd/rocal/fn.py b/rocAL/rocAL_pybind/amd/rocal/fn.py index cac62ec570..b4a05c915c 100644 --- a/rocAL/rocAL_pybind/amd/rocal/fn.py +++ b/rocAL/rocAL_pybind/amd/rocal/fn.py @@ -424,7 +424,7 @@ def crop_mirror_normalize(*inputs, bytes_per_sample_hint=0, crop=[0, 0], crop_d= mirror = b.CreateIntParameter(1) # pybind call arguments - kwargs_pybind = {"input_image0": inputs[0], "crop_depth":crop_depth, "crop_height":crop_height, "crop_width":crop_width, "start_x":0, "start_y":0, "start_z":0, "mean":mean, "std_dev":std, + kwargs_pybind = {"input_image0": inputs[0], "crop_depth":crop_depth, "crop_height":crop_height, "crop_width":crop_width, "start_x":crop_pos_x, "start_y":crop_pos_y, "start_z":crop_pos_z, "mean":mean, "std_dev":std, "is_output": False, "mirror": mirror} cmn = b.CropMirrorNormalize(Pipeline._current_pipeline._handle ,*(kwargs_pybind.values())) Pipeline._current_pipeline._tensor_layout = output_layout
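Note on the fixed-crop change in PATCH 3/3: the new `_crop_anchor` logic in `RocalCropParam::fill_crop_dims()` can be read as the small sketch below. This is an illustration written for this note, not code from the repository. The function name `fixed_crop_origin` is hypothetical, and truncating with `int()` is an assumption, because the target type of the `static_cast` is not visible in the patch text.

```python
def fixed_crop_origin(in_width, in_height, crop_w, crop_h,
                      anchor_x=0.5, anchor_y=0.5):
    """Return (x1, y1, out_w, out_h) for an anchored fixed crop.

    Hypothetical helper mirroring the arithmetic shown in the patch:
    the crop size falls back to the input size when the request is
    zero or larger than the input, and the anchor (0.0 to 1.0) picks
    where the crop window sits inside the remaining slack.
    """
    # Use the requested size only if it fits inside the input image.
    out_w = crop_w if 0 < crop_w <= in_width else in_width
    out_h = crop_h if 0 < crop_h <= in_height else in_height

    # Anchor 0.5 centres the window; 0.0 pins it to the top-left corner.
    # int() truncation is an assumption about the C++ cast.
    x1 = int(anchor_x * (in_width - out_w))
    y1 = int(anchor_y * (in_height - out_h))
    return x1, y1, out_w, out_h


if __name__ == "__main__":
    # A 256x341 image after resize_shorter=256, cropped to 224x224.
    print(fixed_crop_origin(256, 341, 224, 224))
```

With the default anchor of (0.5, 0.5) this reproduces the previous `_is_center_crop` behaviour, which is why `CropNode::init` can pass `set_fixed_crop(0.5, 0.5)` without changing its output.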