Include path reorg - update pending file #672
Merged: kiritigowda merged 101 commits into ROCm:topic/ROCm_folder_reorg from arvindcheru:topic/ROCm_folder_reorg on Nov 7, 2021
* Backend Support - AMD EXT Expanded Support * MIVisionX Backend - Cleanup * rocAL CMakeList - Fix MSG * CPU Backend - Fix CMakeList * AMD RPP - Warning MSG for CPU Backend
* optimize ColorDepth kernels * Add new coding style for arithmetic/logical/color hip kernels * Merge pull request ROCm#32 from asalmanp/as/hip_kernels_style Add new coding style for arithmetic/logical/color hip kernels * Add auto OCL dump generator script * Add gdfs for arithmetic, logical, color kernels * Modify arithmetic kernels as per new std * Add the missing buffer_offset to the hip_memory * Arithmetic kernels fixes * Modify logical kernels as per new std * Revert to previous min max impl * changed Threshold to support new OpenVX 1.3 format (ROCm#38) Co-authored-by: paveltc <pavel.tcherniaev@amd.com> * add the optimized ChannelExtract_U8_U32_Pos0 and ChannelExtract_U8_U24_Pos0 color kernels * Threshold - Update to 1.3 * Add new gdfs and modify generator script * Jenkins - Check Build & Artifacts * Tests - Fix platform name * Modify generator script for ocl/hip dumps and fixes for gdfs * Add optimized box filter * Modify kernelGDFs, automate script for OCL/HIP bin dumps for different image sizes * Optimize phase, magnitude, weighted average and remove trailing spaces * Optimize magnitude, phase, weighted_average, Minor fix * Formatting fixes * Formatting changes * modify hip pack_ function to fix SAT issue in some kernels * Place kernelGDFs in independent folders * Fix runvxTestAllScript, readme and Modify gitignore * Revert "Optimize phase, magnitude, weighted average and remove trailing spaces" This reverts commit ae97d35. 
* Move all common types/device codes into a new header * GPU Fix - multiply gpu (ROCm#39) * CMake * multiply fix * code cleanup * GPU Flow - Canny Fix (ROCm#36) * CMake * canny fix * code cleanup * optimize hip_clamp function * Partial changes to color kernels * Optimize color kernels * Cleanup * Change typecast float to make_float4() * Add UYVY/YUYV options for ChannelExtract * Modify globalThreads_x and globalThreads_y * Kernel GDF modifications * Script enhancements - add support for single kernel testing, optional build * Edit script readme * minor optimization for Phase kernel * fix comment * GPU Flow - Bug Fixes (ROCm#35) * fixes GraphROI.Simple & vxMapRemapPatch.MapRandomRemap * Graph.GraphState * fixes Threshold.OnRandom/4/Graph/BINARY/U8/U8 * removing unwanted commits * fixes Threshold.OnRandom/5/Graph/BINARY/S16/U8 * fixes Threshold.OnRandom/7/Graph/RANGE/S16/U8 * removing unnecessary changes * Add filter kernel GDFs * Add test script support for filter kernel diff checks * Optimizations for filter kernels - initial commit * Optimize ScaleGaussianHalf, other minor fixes * Correct some test names in runVisionTests script * Disable ScaleGaussianHalf temporarily * Optimize Median3_/min3_/max3_ * Fix convolotion issue for hip * fix seg fault for ScaleGaussian * Add support for channelCopy and Lut * Minor change * Optimize statistical kernels * Optimize UV12/UV/IUV and ScaleUp2x2 * Minor change * Add kernelGDFs for IUV/UV12/UV converts, threshold, convolve * Update runVisionTests.py and runvxTestAllScript.sh to run with arithmetic/logical/color/filter/statistical kernels * Add uniform-image inputs with hex pixel values * Remove all U1 kernel testing * Test script mods * Uncomment all kernels except geometric/vision * Minor fix * Optimize geometric kernels - initial commit * Minor changes * Mods to use floorf, mul24, mad24, Scale_U8_U8_Area * ScaleImage_U8_U8_Area fixes and Remap initial commit * Remove #defines for remap * Pass hip_memory for remap * Enable 
scale, warpAffine, warpPerspective testing * Add kernelGDFs for geometric functions, runvxTestAllScript.sh update * Fix the bug for ScaleImage_Bilinear_Constant and ScaleImage_Bilinear_Replicate * GDF and test script corrections * Disable kernels with attr * Disable UV12/UV/IUV converts and ScaleUp2x2 * Add vision kernelGDFs * Vision kernels - initial commit * Modify helpers to use hip built in functions * Remove code used for testing * Minor changes * use consistent device function names and code clean up * remove extra semicolon * switch to builtin functions for hip_lerp * Formatting fixes * minor cmake change to print HIP path/version correctly * Modify harris corners * Test script mod * cmake file changes for building GPU backends and CPU properly * code clean up to make it more readable that there will be a fatal error if OPENCL or HIP not found in the case of the default GPU_SUPPORT=ON * Remove samples/hip_samples, Add openvx_runvx_tests * Enhance runvxTestAllScript, Change ReadMe * Formatting fixes, Code cleanup * Rename openvx_runvx_tests to openvx_node_tests * fix a seg fault for Canny node * remove unused parameter from CannySuppThreshold * Delete vision_tests outer folder * Enhancements to runVisionTests.py * Remove blank lines * Vision kernel mods * Formatting fix * Codacy fixes 1 * Codacy fixes 2 * Codacy fixes 3 * fix cmake * Make pandas optional * Code cleanup * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Add backend_type OCL * Fix CMake issues for HIP backend build. Fix issues caused by merge. * Add support for HIP backend. * add support for VX_DIRECTIVE_AMD_COPY_TO_HIPMEM * Add HIP backend support for Resize crop function. Modify unittest to save all images in local folder (test HIP support). * Fix minor issues in HIP backend. * Fix rocAL Pybind build issue. Update rocAL README.md for TurboJpeg installation. * Fix brightness updation issue. 
Set random seed in paramter factory constructor. * Fix issue with CMake to work for OCL and HIP backend. * Fix requested deviceID not found error. * Fix issue with HIP load routine. * Rename rali to rocAL. * Fix merge issues. * Fix build issue for rocAL pybind module. (cherry picked from commit 0e1a43a) * Add prefetching support in RALI pipeline. (cherry picked from commit 0d5cf66) * Fix build warnings. (cherry picked from commit b063ca6) * Fix warnings. * Clean up. * Fix merge issues. * Made suggested PR changes. * Fix build error. * Added HIP functionality to AbsoluteDifference * added HIP support for some functions * Added HIP support for another batch of functions * Add HIP supprt for last batch of functions * Set correct affinity to the below amd_rpp nodes. 1. AbsoluteDifference 2. AccumulateSquared 3. AccumulateWeighted 4. Accumulate 5. Add * Set correct affinity to the below amd_rpp nodes. 1. BilateralFilter 2. BitwiseAND 3. BitwiseNOT 4. Blend 5. Blur 6. BoxFilter 7. Brightness * Set correct affinity to the below amd_rpp nodes. 1. CannyEdgeDetector. 2. ChannelCombine. 3. ChannelExtract. 4. ColorTemperature. 5. ColorTwist. 6. Contrast. 7. ControlFlow. 8. CropMirrorNormalize. 9. Crop. 10. CustomConvolution. * Set correct affinity to the below amd_rpp nodes. 1. DataObjectCopy. 2. Dilate. 3. Erode. 4. ExclusiveOR. 5. Exposure. * Set correct affinity to the below amd_rpp nodes. 1. FastCornerDetector. 2. Fisheye. 3. Flip. 4. Fog. 5. GammaCorrection. 6. GaussianFilter. 7. GaussianImagePyramid. * Set correct affinity to the below amd_rpp nodes. 1. HarrisCornerDetector 2. Histogram 3. HistogramBalance 4. Hue 5. WarpPerspective * Set correct affinity to the below amd_rpp nodes. 1. InclusiveOR 2. Jitter 3. LaplacianImagePyramid 4. LensCorrection 5. LocalBinaryPattern 6. LookUpTable * Set correct affinity to the below amd_rpp nodes. 1. Magnitude 2. Max 3. MeanStddev 4. MedianFilter 5. MinMaxLoc 6. Min 7. Multiply * Set correct affinity to the below amd_rpp nodes. 1. 
Noise 2. NonLinearFilter 3. NonMaxSupression 4. nop 5. Occlusion 6. Phase 7. Pixelate * Set correct affinity to the below amd_rpp nodes. 1. Rain 2. RandomCropLetterBox 3. RandomShadow 4. Remap 5. ResizeCropMirror 6. ResizeCrop 7. Rotate * Set correct affinity to the below amd_rpp nodes. 1. Saturation 2. Scale 3. Snow 4. Sobel 5. Subtract 6. TensorAdd * Set correct affinity to the below amd_rpp nodes. 1. TensorLookup 2. TensorMatrixMultiply 3. TensorMultiply 4. TensorSubtract 5. Thresholding 6. Vignette 7. WarpAffine * Clean up by reducing the variants from 4 -> 1 in amd_rpp. 1. Retain only batchPD variant and delete all the single, batchPS and batchPDROID variants. 2. Remove the support in header and other files. * Set affinity to CPU for OCL backend for all nodes in amd_rpp to run without codegen. * Fix issue with rocAL pybind installation. * Fix indendation issue with nodes in amd_rpp. * Add HIP backend support for single nodes in amd_rpp * Code clean up for amd_rpp nodes. 1. Move memory allocations to initialize function. 2. Add calls to free up memory in uninitialze function. 3. Remove unused declarations. 4. Move batchsize querying to initialize. * Error handling in amd_rpp nodes. Add return error status for functions which do not have GPU support in RPP. * Fix formatting for all amd_rpp nodes. * Fix codacy issue. Change copy_status to STATUS_ERROR_CHECK. Co-authored-by: Kiriti Nagesh Gowda <kiritigowda@gmail.com> Co-authored-by: Aryan Salmanpour <aryan.salmanpour@amd.com> Co-authored-by: Abishek <52214183+r-abishekmcw@users.noreply.github.com> Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com> Co-authored-by: Pavel Tcherniaev <ptcherni@amd.com> Co-authored-by: paveltc <pavel.tcherniaev@amd.com> Co-authored-by: Hansel Yang <hansyang@amd.com> Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com>
* Neural Network Extension - add CMake support for HIP GPU backend - This PR also adds initial HIP kernel support for the gather layer. - The support for executing the gather layer with HIP GPU backend will be added in the next PR. * add backend type(OpenCL/HIP) in the message
* CMakeLists - Set all warnings RED * Docker - Updates * Readme - Docker Updates * Docker - Fix RPP install Location * Pytorch Docker - RPP Location
* coco meta_reader optimization to remove reading metadata many times * fix build error * fix bug in checking reader type * adjust spacing
…OCm#575) * Neural Network Extension, HIP GPU backend - add support for gather layer * Fix a bug for registering NN kernels for OCL backend
* CentOS 7 - L3 fix * CentOS 7 - L4 Fix * CentOS 7 - MIVisionX Docker
* fixes RGBX to NV12 and IYUV for OCL * fixes RGBX to NV12 and IYUV for HIP * fixes RGBX to iYUV HIP
* Docker - CentOS 7/8 Updates * CentOS 8 - Fix ROCm install * CentOS 8 - Docker updates * Docker - Typo Fix
…d by PR#596 (ROCm#598) * OpenVX GPU backend - fix a regression for HarrisCorner node introduced by PR#596 * set the environment variable only if the target is GPU * add ENABLE_OPENCL/ENABLE_HIP guards for HarrisCorner workaround * set the AGO_DEFAULT_TARGET if it is not set by the user
…er (ROCm#595) * Neural Network Extension - add support for cast layer * fix a typo * add missing return
* Code clean up * Fix codacy issue * Fix build issues Modify few function calls and its arguments to match with latest RPP changes Remove Bilateral filter Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com>
* OpenVX 1.3 - No Test Filters * HIP - Backend Install Path * AMD EXT - Fix OpenCL Flow * AMD RPP - CMakeLists Updates
* Jenkins - OpenVX 1.3 CTS Logs to artifacts * CTS LOG - Name Fix
* OpenVX HIP GPU backend - code clean up for filter kernels * remove extra space
…sor and tensor to image layers (ROCm#605)
* CI Code Coverage * CI - Upgrade to Python3
…EN backend (ROCm#610) * Neural Network Extension - add support for finding the installed MIOPEN backend * add missing check for miopen before finding it's backend type
…efore reading it (ROCm#661) * Neural Network extension - check if the MIOpen's config file exists before reading it * clean up
* fix for some codacy warnings * get rid of codacy bug in context null ref checking
…e/Tensor_log/Tensor_exp layers (ROCm#664)
* Fix a few bugs * caffe and caffe2 changes for optimization * removed commented code * Fix issue with SSD meta node. * Add support for original width and height for TF Detection * Add change required to reflect mean and std values in an images pixel value * Clean codes * Hardcoding the key values for tf detection and classification * Release RingBuffer memory * Modify RALI API's to return the bbox coords and bbox labels for all images in the output batch * Update rali_unittest.cpp * Clean repo * Add batching support in PYTHON API for labels,bboxes,img_sizes excluding image_names * Add support for image names for batching support * Merge RALI_Upgrade * Update raliunittest.cpp * Add support for bytes instead of str in rali pybind * Code clean Up * Fix codacy issues * Fix codacy issues * Fix indendation error * Fix codacy warnings * Fix trailing spaces warnings * Fix scope of the variable can be reduced warning * Fix errors in the RALI API * Fix Codacy warnings * Remove extra empty lines * Add support for Box Encoder in coco_pipeline.py * Add codes to retrieve meta data information in loader module * Add support for One Hot Labels for all classification based Readers * Add codes to retrieve meta data information in decoder module * Introduce data loader for coco reader using partail decoder * Add casting Support for Encoded Labels * Add support for RandomBBoxCrop augmenation * Clean codes * Introduce RandomBBoxCrop_MetaData Reader to store the crop params returned by RandomBBoxCrop function * Update RandomBBoxCrop_MetaData Reader * Add meta data update support for both vertical and horizontal flip. * Add RandomBBox support. Introduced map to store image name and crop generated by randomBbox. Look up to fetch CropCordsBatch. Get functionality to get crop wrt to image_name as key. Fixed the seg fault. * Add support for BBFlip * Add support for Random BBox Crop Reader to be part of load routine. 
Fetches the crop of the image to be decodes and does partial decoding for crop part. * Fix the warnings. * Fix issue with the meta data updation in master graph. * Add API changes. Fix issues with RandomBBoxCrop algorithm. * Add support for Random BBox Crop & ImageDecoderSlice * Small changes in Box Encoder * Small Change in Anchor boxes input comment * Add changes for Crop Dim exceding Image Dim * Minor Changes for RandomBboxCrop * Fix issues with RandomBboxCrop. * Fix Reader seg fault issue. * Add minor Changes for fp16 * Minor Changes * Clean Codes * Minor Changes for Random BBox Crop * Add support for Multi-GPU * Minor Changes for Multi-GPU support in COCO file souce partial * Remove unwanted code. * Minor Changes in RandomBBoxCrop * Fix Random BBox Crop * Comment out the print statements * Minor Changes for RBBOX * Code clean up. * Fix warnings wrt Ubuntu 20.04 * Resolve codacy warnings * Resolve codacy warnings * Fix PR issues. * Revert back Slice to Crop * Resolve Codacy warnings * Resolve codacy issues * Resolve Minor codacy issue * Fix issue to make branch compatible with AMDRPP master TOT. * Fix the crop_x difference in partial decoding image crop * Fix issue with crop fixed. * Crop_x & crop_y value fixed. * Check bounday conditions and update crop params. * Fix the crop_width difference in partial decoding image crop * Change wrt invalid_bboxes. * Rewrite RandomBBoxCrop Code * Fix issue with random generation. Used randomdevice for seed when initializing random param. * Fix Key value zero error. Error introduced by partial decoding crop correction. Fixed by adjusting the calculation of top and right. 
* Add codes for Video Reader and Loader module (cherry picked from commit d97ab927e9a6597fde666f7dec50ba987259dccf) * Changes in image_augmentation library (cherry picked from commit 153f59bd377cf7cbd87ecf58d722cd88ba79ff22) * Fix Build issues (cherry picked from commit e023b104c9a220e4745b83693b6c38a7411b54fc) * Fix Build errors (cherry picked from commit 908c599fe7b300dfd837a87215b33295bc2752a9) * Add reader type for video reader (cherry picked from commit 843a6cd6a6436513b4c000813a48b66461768ecc) * Adding codes to decode video input file (cherry picked from commit 2c249d7312e0c746fac600b5e38c5b4cb16f1910) * Introduce Video Decoder module to decode video files (cherry picked from commit 4d18ea384a2599aed3b0a0c975d7cd0a343720d2) * Add decoder functions in FFMPEG_VIDEO_DECODER (cherry picked from commit 02224f9601ee4269c68f6725a19468a171718276) * Clean Video Decoder codes (cherry picked from commit 8699179b282aa7870f60ca670a19b512ca45d9ac) * Clean codes to remove build issues (cherry picked from commit 39c5f45ff875d111f2977af0487e9cf8d22174c2) * Clean codes * Initial changes for video reader pipeline. [NBC] * To handle sequence length. * To handle shuffle. * Temp local changes. [NYC] * Changes in the video reader pipeline. [NWC] * Video Reader changes * Fix the segmentation fault in the video reader pipeline * Add support to save decoded output frames in video decoder * Working Pipeline - Single Video file input Add support to modify internal and user batch size in master graph Add ffmpeg seek operation * Minor Changes * Add support for decoding multiple video files and shuffle * Add support to initialize ffmpeg context for each video decoder instance * Code cleanup * Fix issue in Shuffling the images in video reader * Add seek_frame function in video decoder * Code clean up * Update rali_unittest * Add folder based label meta data reader for video reader * Add support for Sequence Reader in RALI * Fix codacy issue * Add Sequence Rearrange initial setup. 
Works only for sequence length equal to video reader. Introduce ovx node sequence rearrange to support. Introduce API in rali_api_augmentations. * Fix issue with Sequence Rearrange with different sequence length. * Introduce raliVideoFileResize node in RALI to fuse video decoding and resize * Add new_sequence_length parameter to sequence rearrange * Add sequence rearrange algorithm for RGB images * Add support for Sequence Reader in RALI * Fix random shuffling of sequences in video reader * Add support for folder based reader and label support for video decoder and labels. * Clean codes * Fix issue in raliVideoReaderResize * Code clean up. * set batchsize to internal batch size in video pipeline loaders. * Add flag in master graph to switch between video and image pipelines. * Add step and stride parameter to VideoReader and SequenceReader * Fix issue with the sequence rearrange. * Adjust remaining image count in master graph wrt sequence rearrange. * Add meta data support for video reader folder based. * Update decode image info name according to stride * Minor bug fix * Add support for text file input Add support to fetch video properties from text file Modify reader to read from the start to end frame specified in text file Add meta data support for text file input to the video reader * Add support to process repeated file inputs in text file * Add meta data reader support to parse timestamps from text file Introduce enable_timestamps parameter and set_timestamps_bool to the meta data readers * Add rali_video_unittests Video Reader Vidoe Reader Resize Sequence Reader Sequence Rearramge * Code clean up * Fix maximum limit for decoder instance creation. Check if instance is there for the video file if not initialize one using previously created instance. * Fix warnings. 
* Minor fix * Add file_list_frame_num parameter To switch between timestamp or frame number input passed with text file * Add data samples for testing Add video samples Add coco sample data with 10 images for train and val * Add support to generate frame number and timestamps output * Fix multiple video file input to video pipeline * Add labelled video folder samples * Modified test suite Modified rali_video_unittests.cpp Add testScript.sh to build and execute rali_video_unittests Remove video pipeline tests from rali_unittests.cpp * Code clean up * Modify frame_rate variable * Add step and stride parameters to SequenceReaderSingleSharded * Minor fix * Modify ffmpeg video decoder functions Initialize the ffmpeg context once for each video file * Fix ffmpeg deprecation warnings * Modify ffmpeg video decoder Add width, height, stride and pixel format paramters to Decode * Code clean up Change Video label reader folders to Video label reader * Remove text file input parameter to dataloader * Add support to check variable frame rate videos * Minor changes * Minor fix * Code clean up * Code clean up * Change rali to rocAL * Merge branch 'AMD-Master' into video_devel * Resolve build issues Code clean up * Fix bug with Sequence Rearrange * Add sharding support to Video Reader * Add sharding support for Sequence Reader * Introduce decoder mode parameter * Add U8 support for Sequence Rearrange Minor changes * Add SingleShard API for video readers * Add support to decode more than one sequence Modify the load routine to decode more than one sequence Add sequence count parameter to Sequence rearrange * Merge branch 'video_devel_PR' of https://github.com/MCW-Dev/MIVISION into video_devel_PR * Fix SequenceReader and SequenceReaderSingleShard * Resolve merge conflicts * Minor fix * Add codes for multithreading * Fix build isssue with HIP backend * Fix warnings * Resolve codacy issues Remove blank lines Adjust spacing * Resolve codacy issues * Modify the sequence reader arguments 
of the ImageLoaderNode * Remove rocAL sample data * Minor changes Add RALI_VIDEO flag to few files * Add seperate VideoReader Introduce VideoFilesourceReader and VideoReaderConfig * Introduce SequenceInfo struct Minor changes * Fix codacy issues in video unit test testScript.sh * Minor fix * Minor bug fix * Introduce the latest FFmpeg API in ffmpeg_video_decoder.cpp * Merge branch 'PR_changes' of https://github.com/fiona-gladwin/MIVisionX into video_devel_PR * Video Pipeline changes * Batch size changes for Video Reader * Video Pipeline Meta data reader changes to store the meta data for each sequence and not for every frame in the sequence * Code cleanup - Video Reader changes * Batch size variable changes Change batch size and internal batch size variables to constant. Introduce batch size and batch ratio variables for the Sequence Reader in master graph. * Change datatype of frame_rate to float * Enable HIP Backend support for video pipeline * Add codes to dump the images in each batch as AVI video file * Minor change in video unit tests * Add HIP backend support for sequence rearrange * Add OpenCL backend support for sequence rearrange * Add condtion to disable ResizeNode update in raliVideoFileResize if resize width and height is same as the videos * PR changes * Fix single folder of images issue in Sequence Reader * Fix codacy issues * PR changes * API changes * Introduce separate output routine for the video pipeline * PR changes * PR changes * Fix for codacy issues Co-authored-by: LokeshBonta <lokeswara@multicorewareinc.com> Co-authored-by: Swetha B S <swetha@multicorewareinc.com> Co-authored-by: shobana-mcw <shobana@multicorewareinc.com> Co-authored-by: fionagladwin <fionagladwin@multicorewareinc.com> Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com>
* Update README.md * Update README.md
…_reorg' into topic/ROCm_folder_reorg
Include Path Reorg Changes - moved to include/<compnm>/
Update CMakeLists.txt
No description provided.