rocAL - MCW changes (#562)
* optimize ColorDepth kernels

* Add new coding style for arithmetic/logical/color hip kernels

* Merge pull request #32 from asalmanp/as/hip_kernels_style

Add new coding style for arithmetic/logical/color hip kernels

* Add auto OCL dump generator script

* Add gdfs for arithmetic, logical, color kernels

* Modify arithmetic kernels as per new std

* Add the missing buffer_offset to the hip_memory

* Arithmetic kernels fixes

* Modify logical kernels as per new std

* Revert to previous min max impl

* changed Threshold to support new OpenVX 1.3 format (#38)

Co-authored-by: paveltc <pavel.tcherniaev@amd.com>

* add the optimized ChannelExtract_U8_U32_Pos0 and ChannelExtract_U8_U24_Pos0 color kernels

* Threshold - Update to 1.3

* Add new gdfs and modify generator script

* Jenkins - Check Build & Artifacts

* Tests - Fix platform name

* Modify generator script for ocl/hip dumps and fixes for gdfs

* Add optimized box filter

* Modify kernelGDFs, automate script for OCL/HIP bin dumps for different image sizes

* Optimize phase, magnitude, weighted average and remove trailing spaces

* Optimize magnitude, phase, weighted_average, Minor fix

* Formatting fixes

* Formatting changes

* modify hip pack_ function to fix SAT issue in some kernels

* Place kernelGDFs in independent folders

* Fix runvxTestAllScript, readme and Modify gitignore

* Revert "Optimize phase, magnitude, weighted average and remove trailing spaces"

This reverts commit ae97d35.

* Move all common types/device codes into a new header

* GPU Fix - multiply gpu (#39)

* CMake

* multiply fix

* code cleanup

* GPU Flow - Canny Fix (#36)

* CMake

* canny fix

* code cleanup

* optimize hip_clamp function

* Partial changes to color kernels

* Optimize color kernels

* Cleanup

* Change typecast float to make_float4()

* Add UYVY/YUYV options for ChannelExtract

* Modify globalThreads_x and globalThreads_y

* Kernel GDF modifications

* Script enhancements - add support for single kernel testing, optional build

* Edit script readme

* minor optimization for Phase kernel

* fix comment

* GPU Flow - Bug Fixes (#35)

* fixes GraphROI.Simple & vxMapRemapPatch.MapRandomRemap

* Graph.GraphState

* fixes Threshold.OnRandom/4/Graph/BINARY/U8/U8

* removing unwanted commits

* fixes Threshold.OnRandom/5/Graph/BINARY/S16/U8

* fixes Threshold.OnRandom/7/Graph/RANGE/S16/U8

* removing unnecessary changes

* Add filter kernel GDFs

* Add test script support for filter kernel diff checks

* Optimizations for filter kernels - initial commit

* Optimize ScaleGaussianHalf, other minor fixes

* Correct some test names in runVisionTests script

* Disable ScaleGaussianHalf temporarily

* Optimize Median3_/min3_/max3_

* Fix convolution issue for hip

* fix seg fault for ScaleGaussian

* Add support for channelCopy and Lut

* Minor change

* Optimize statistical kernels

* Optimize UV12/UV/IUV and ScaleUp2x2

* Minor change

* Add kernelGDFs for IUV/UV12/UV converts, threshold, convolve

* Update runVisionTests.py and runvxTestAllScript.sh to run with arithmetic/logical/color/filter/statistical kernels

* Add uniform-image inputs with hex pixel values

* Remove all U1 kernel testing

* Test script mods

* Uncomment all kernels except geometric/vision

* Minor fix

* Optimize geometric kernels - initial commit

* Minor changes

* Mods to use floorf, mul24, mad24, Scale_U8_U8_Area

* ScaleImage_U8_U8_Area fixes and Remap initial commit

* Remove #defines for remap

* Pass hip_memory for remap

* Enable scale, warpAffine, warpPerspective testing

* Add kernelGDFs for geometric functions, runvxTestAllScript.sh update

* Fix the bug for ScaleImage_Bilinear_Constant and ScaleImage_Bilinear_Replicate

* GDF and test script corrections

* Disable kernels with attr

* Disable UV12/UV/IUV converts and ScaleUp2x2

* Add vision kernelGDFs

* Vision kernels - initial commit

* Modify helpers to use hip built in functions

* Remove code used for testing

* Minor changes

* use consistent device function names and code clean up

* remove extra semicolon

* switch to builtin functions for hip_lerp

* Formatting fixes

* minor cmake change to print HIP path/version correctly

* Modify harris corners

* Test script mod

* cmake file changes for building GPU backends and CPU properly

* code cleanup to make it clearer that a fatal error is raised if OPENCL or HIP is not found when GPU_SUPPORT=ON (the default)

* Remove samples/hip_samples, Add openvx_runvx_tests

* Enhance runvxTestAllScript, Change ReadMe

* Formatting fixes, Code cleanup

* Rename openvx_runvx_tests to openvx_node_tests

* fix a seg fault for Canny node

* remove unused parameter from CannySuppThreshold

* Delete vision_tests outer folder

* Enhancements to runVisionTests.py

* Remove blank lines

* Vision kernel mods

* Formatting fix

* Codacy fixes 1

* Codacy fixes 2

* Codacy fixes 3

* fix cmake

* Make pandas optional

* Code cleanup

* Codacy issue fix

* Codacy issue fix

* Codacy issue fix

* Codacy issue fix

* Codacy issue fix

* Codacy issue fix

* Add backend_type OCL

* Fix CMake issues for HIP backend build.
Fix issues caused by merge.

* Add support for HIP backend.

* add support for VX_DIRECTIVE_AMD_COPY_TO_HIPMEM

* Add HIP backend support for Resize crop function.
Modify unittest to save all images in local folder (test HIP support).

* Fix minor issues in HIP backend.

* Fix rocAL Pybind build issue.
Update rocAL README.md for TurboJpeg installation.

* Fix brightness update issue.
Set random seed in parameter factory constructor.

* Fix issue with CMake to work for OCL and HIP backend.

* Fix requested deviceID not found error.

* Fix issue with HIP load routine.

* Rename rali to rocAL.

* Fix merge issues.

* Fix build issue for rocAL pybind module.

(cherry picked from commit 0e1a43a)

* Add prefetching support in RALI pipeline.

(cherry picked from commit 0d5cf66)

* Fix build warnings.

(cherry picked from commit b063ca6)

* Fix warnings.

* Clean up.

* Fix merge issues.

* Made suggested PR changes.

* Fix build error.

* Added HIP functionality to AbsoluteDifference

* added HIP support for some functions

* Added HIP support for another batch of functions

* Add HIP support for last batch of functions

* Set correct affinity to the below amd_rpp nodes.
1. AbsoluteDifference
2. AccumulateSquared
3. AccumulateWeighted
4. Accumulate
5. Add

* Set correct affinity to the below amd_rpp nodes.
1. BilateralFilter
2. BitwiseAND
3. BitwiseNOT
4. Blend
5. Blur
6. BoxFilter
7. Brightness

* Set correct affinity to the below amd_rpp nodes.
1. CannyEdgeDetector.
2. ChannelCombine.
3. ChannelExtract.
4. ColorTemperature.
5. ColorTwist.
6. Contrast.
7. ControlFlow.
8. CropMirrorNormalize.
9. Crop.
10. CustomConvolution.

* Set correct affinity to the below amd_rpp nodes.
1. DataObjectCopy.
2. Dilate.
3. Erode.
4. ExclusiveOR.
5. Exposure.

* Set correct affinity to the below amd_rpp nodes.
1. FastCornerDetector.
2. Fisheye.
3. Flip.
4. Fog.
5. GammaCorrection.
6. GaussianFilter.
7. GaussianImagePyramid.

* Set correct affinity to the below amd_rpp nodes.
1. HarrisCornerDetector
2. Histogram
3. HistogramBalance
4. Hue
5. WarpPerspective

* Set correct affinity to the below amd_rpp nodes.
1. InclusiveOR
2. Jitter
3. LaplacianImagePyramid
4. LensCorrection
5. LocalBinaryPattern
6. LookUpTable

* Set correct affinity to the below amd_rpp nodes.
1. Magnitude
2. Max
3. MeanStddev
4. MedianFilter
5. MinMaxLoc
6. Min
7. Multiply

* Set correct affinity to the below amd_rpp nodes.
1. Noise
2. NonLinearFilter
3. NonMaxSupression
4. nop
5. Occlusion
6. Phase
7. Pixelate

* Set correct affinity to the below amd_rpp nodes.
1. Rain
2. RandomCropLetterBox
3. RandomShadow
4. Remap
5. ResizeCropMirror
6. ResizeCrop
7. Rotate

* Set correct affinity to the below amd_rpp nodes.
1. Saturation
2. Scale
3. Snow
4. Sobel
5. Subtract
6. TensorAdd

* Set correct affinity to the below amd_rpp nodes.
1. TensorLookup
2. TensorMatrixMultiply
3. TensorMultiply
4. TensorSubtract
5. Thresholding
6. Vignette
7. WarpAffine

* Clean up by reducing the variants from 4 -> 1 in amd_rpp.
1. Retain only batchPD variant and delete all the single, batchPS and batchPDROID variants.
2. Remove the support in header and other files.

* Set affinity to CPU for OCL backend for all nodes in amd_rpp to run without codegen.

* Fix issue with rocAL pybind installation.

* Fix indentation issue with nodes in amd_rpp.

* Add HIP backend support for single nodes in amd_rpp

* Code clean up for amd_rpp nodes.
1. Move memory allocations to initialize function.
2. Add calls to free up memory in uninitialize function.
3. Remove unused declarations.
4. Move batchsize querying to initialize.

* Error handling in amd_rpp nodes. Add return error status for functions which do not have GPU support in RPP.

* Fix formatting for all amd_rpp nodes.

* Fix codacy issue. Change copy_status to STATUS_ERROR_CHECK.

Co-authored-by: Kiriti Nagesh Gowda <kiritigowda@gmail.com>
Co-authored-by: Aryan Salmanpour <aryan.salmanpour@amd.com>
Co-authored-by: Abishek <52214183+r-abishekmcw@users.noreply.github.com>
Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com>
Co-authored-by: Pavel Tcherniaev <ptcherni@amd.com>
Co-authored-by: paveltc <pavel.tcherniaev@amd.com>
Co-authored-by: Hansel Yang <hansyang@amd.com>
Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com>
9 people authored Jul 24, 2021
1 parent dcd3b9f commit d39a4b8
Showing 281 changed files with 19,210 additions and 63,004 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,4 +1,6 @@
build*
docker_build*
hip_build*
.vscode*
opencv-3.4.0/*
openvx_test_results*
190 changes: 0 additions & 190 deletions amd_openvx_extensions/amd_rpp/CMakeLists.txt
@@ -49,276 +49,86 @@ include_directories(../../amd_openvx/openvx/include/
)

list(APPEND SOURCES
source/AbsoluteDifference.cpp
source/AbsoluteDifferencebatchPD.cpp
source/AbsoluteDifferencebatchPDROID.cpp
source/AbsoluteDifferencebatchPS.cpp
source/Accumulate.cpp
source/AccumulatebatchPD.cpp
source/AccumulatebatchPDROID.cpp
source/AccumulatebatchPS.cpp
source/AccumulateSquared.cpp
source/AccumulateSquaredbatchPD.cpp
source/AccumulateSquaredbatchPDROID.cpp
source/AccumulateSquaredbatchPS.cpp
source/AccumulateWeighted.cpp
source/AccumulateWeightedbatchPD.cpp
source/AccumulateWeightedbatchPDROID.cpp
source/AccumulateWeightedbatchPS.cpp
source/Add.cpp
source/AddbatchPD.cpp
source/AddbatchPDROID.cpp
source/AddbatchPS.cpp
source/BilateralFilter.cpp
source/BilateralFilterbatchPD.cpp
source/BilateralFilterbatchPDROID.cpp
source/BilateralFilterbatchPS.cpp
source/BitwiseAND.cpp
source/BitwiseANDbatchPD.cpp
source/BitwiseANDbatchPDROID.cpp
source/BitwiseANDbatchPS.cpp
source/BitwiseNOT.cpp
source/BitwiseNOTbatchPD.cpp
source/BitwiseNOTbatchPDROID.cpp
source/BitwiseNOTbatchPS.cpp
source/Blend.cpp
source/BlendbatchPD.cpp
source/BlendbatchPS.cpp
source/BlendbatchPDROID.cpp
source/Blur.cpp
source/BlurbatchPD.cpp
source/BlurbatchPDROID.cpp
source/BlurbatchPS.cpp
source/BoxFilter.cpp
source/BoxFilterbatchPD.cpp
source/BoxFilterbatchPDROID.cpp
source/BoxFilterbatchPS.cpp
source/Brightness.cpp
source/BrightnessbatchPD.cpp
source/BrightnessbatchPDROID.cpp
source/BrightnessbatchPS.cpp
source/CannyEdgeDetector.cpp
source/CannyEdgeDetector.cpp
source/ChannelCombine.cpp
source/ChannelCombinebatchPD.cpp
source/ChannelCombinebatchPS.cpp
source/ChannelExtract.cpp
source/ChannelExtractbatchPD.cpp
source/ChannelExtractbatchPS.cpp
source/ColorTemperature.cpp
source/ColorTemperaturebatchPD.cpp
source/ColorTemperaturebatchPDROID.cpp
source/ColorTemperaturebatchPS.cpp
source/ColorTwist.cpp
source/ColorTwistbatchPD.cpp
source/Contrast.cpp
source/ContrastbatchPD.cpp
source/ContrastbatchPDROID.cpp
source/ContrastbatchPS.cpp
source/ControlFlow.cpp
source/ControlFlowbatchPD.cpp
source/ControlFlowbatchPDROID.cpp
source/ControlFlowbatchPS.cpp
source/copy.cpp
source/CropMirrorNormalizePD.cpp
source/CropPD.cpp
source/CustomConvolution.cpp
source/CustomConvolutionbatchPD.cpp
source/CustomConvolutionbatchPDROID.cpp
source/CustomConvolutionbatchPS.cpp
source/DataObjectCopy.cpp
source/DataObjectCopybatchPD.cpp
source/DataObjectCopybatchPDROID.cpp
source/DataObjectCopybatchPS.cpp
source/Dilate.cpp
source/DilatebatchPD.cpp
source/DilatebatchPDROID.cpp
source/DilatebatchPS.cpp
source/Erode.cpp
source/ErodebatchPD.cpp
source/ErodebatchPDROID.cpp
source/ErodebatchPS.cpp
source/ExclusiveOR.cpp
source/ExclusiveORbatchPD.cpp
source/ExclusiveORbatchPDROID.cpp
source/ExclusiveORbatchPS.cpp
source/Exposure.cpp
source/ExposurebatchPD.cpp
source/ExposurebatchPDROID.cpp
source/ExposurebatchPS.cpp
source/FastCornerDetector.cpp
source/Fisheye.cpp
source/FisheyebatchPD.cpp
source/FisheyebatchPDROID.cpp
source/FisheyebatchPS.cpp
source/Flip.cpp
source/FlipbatchPD.cpp
source/FlipbatchPDROID.cpp
source/FlipbatchPS.cpp
source/Fog.cpp
source/FogbatchPD.cpp
source/FogbatchPDROID.cpp
source/FogbatchPS.cpp
source/GammaCorrection.cpp
source/GammaCorrectionbatchPD.cpp
source/GammaCorrectionbatchPDROID.cpp
source/GammaCorrectionbatchPS.cpp
source/GaussianFilter.cpp
source/GaussianFilterbatchPD.cpp
source/GaussianFilterbatchPDROID.cpp
source/GaussianFilterbatchPS.cpp
source/GaussianImagePyramid.cpp
source/GaussianImagePyramidbatchPD.cpp
source/GaussianImagePyramidbatchPS.cpp
source/HarrisCornerDetector.cpp
source/Histogram.cpp
source/HistogramBalance.cpp
source/HistogramBalancebatchPD.cpp
source/HistogramBalancebatchPDROID.cpp
source/HistogramBalancebatchPS.cpp
source/HistogramEqualize.cpp
source/HistogramEqualizebatchPD.cpp
source/HistogramEqualizebatchPDROID.cpp
source/HistogramEqualizebatchPS.cpp
source/Hue.cpp
source/HuebatchPD.cpp
source/HuebatchPDROID.cpp
source/HuebatchPS.cpp
source/InclusiveOR.cpp
source/InclusiveORbatchPD.cpp
source/InclusiveORbatchPDROID.cpp
source/InclusiveORbatchPS.cpp
source/Jitter.cpp
source/JitterbatchPD.cpp
source/JitterbatchPDROID.cpp
source/JitterbatchPS.cpp
source/LaplacianImagePyramid.cpp
source/LensCorrection.cpp
source/LensCorrectionbatchPD.cpp
source/LensCorrectionbatchPDROID.cpp
source/LensCorrectionbatchPS.cpp
source/LocalBinaryPattern.cpp
source/LocalBinaryPatternbatchPD.cpp
source/LocalBinaryPatternbatchPDROID.cpp
source/LocalBinaryPatternbatchPS.cpp
source/LookUpTable.cpp
source/LookUpTablebatchPD.cpp
source/LookUpTablebatchPDROID.cpp
source/LookUpTablebatchPS.cpp
source/Magnitude.cpp
source/MagnitudebatchPD.cpp
source/MagnitudebatchPDROID.cpp
source/MagnitudebatchPS.cpp
source/Max.cpp
source/MaxbatchPD.cpp
source/MaxbatchPDROID.cpp
source/MaxbatchPS.cpp
source/MeanStddev.cpp
source/MedianFilter.cpp
source/MedianFilterbatchPD.cpp
source/MedianFilterbatchPDROID.cpp
source/MedianFilterbatchPS.cpp
source/Min.cpp
source/MinbatchPD.cpp
source/MinbatchPDROID.cpp
source/MinbatchPS.cpp
source/MinMaxLoc.cpp
source/Multiply.cpp
source/MultiplybatchPD.cpp
source/MultiplybatchPDROID.cpp
source/MultiplybatchPS.cpp
source/Noise.cpp
source/NoisebatchPD.cpp
source/NoisebatchPDROID.cpp
source/NoisebatchPS.cpp
source/NonLinearFilter.cpp
source/NonLinearFilterbatchPD.cpp
source/NonLinearFilterbatchPDROID.cpp
source/NonLinearFilterbatchPS.cpp
source/NonMaxSupression.cpp
source/NonMaxSupressionbatchPD.cpp
source/NonMaxSupressionbatchPDROID.cpp
source/NonMaxSupressionbatchPS.cpp
source/nop.cpp
source/Occlusion.cpp
source/OcclusionbatchPD.cpp
source/OcclusionbatchPDROID.cpp
source/OcclusionbatchPS.cpp
source/Phase.cpp
source/PhasebatchPD.cpp
source/PhasebatchPDROID.cpp
source/PhasebatchPS.cpp
source/Pixelate.cpp
source/PixelatebatchPD.cpp
source/PixelatebatchPDROID.cpp
source/PixelatebatchPS.cpp
source/Rain.cpp
source/RainbatchPD.cpp
source/RainbatchPDROID.cpp
source/RainbatchPS.cpp
source/RandomCropLetterBox.cpp
source/RandomCropLetterBoxbatchPD.cpp
source/RandomCropLetterBoxbatchPDROID.cpp
source/RandomCropLetterBoxbatchPS.cpp
source/RandomShadow.cpp
source/RandomShadowbatchPD.cpp
source/RandomShadowbatchPDROID.cpp
source/RandomShadowbatchPS.cpp
source/Remap.cpp
source/Resize.cpp
source/ResizebatchPD.cpp
source/ResizebatchPDROID.cpp
source/ResizebatchPS.cpp
source/ResizeCrop.cpp
source/ResizeCropbatchPD.cpp
source/ResizeCropbatchPDROID.cpp
source/ResizeCropbatchPS.cpp
source/ResizeCropMirrorPD.cpp
source/Rotate.cpp
source/RotatebatchPD.cpp
source/RotatebatchPDROID.cpp
source/RotatebatchPS.cpp
source/Saturation.cpp
source/SaturationbatchPD.cpp
source/SaturationbatchPDROID.cpp
source/SaturationbatchPS.cpp
source/Scale.cpp
source/ScalebatchPD.cpp
source/ScalebatchPDROID.cpp
source/ScalebatchPS.cpp
source/Snow.cpp
source/SnowbatchPD.cpp
source/SnowbatchPDROID.cpp
source/SnowbatchPS.cpp
source/Sobel.cpp
source/SobelbatchPD.cpp
source/SobelbatchPDROID.cpp
source/SobelbatchPS.cpp
source/Subtract.cpp
source/SubtractbatchPD.cpp
source/SubtractbatchPDROID.cpp
source/SubtractbatchPS.cpp
source/TensorAdd.cpp
source/TensorLookup.cpp
source/TensorMatrixMultiply.cpp
source/TensorMultiply.cpp
source/TensorSubtract.cpp
source/Thresholding.cpp
source/ThresholdingbatchPD.cpp
source/ThresholdingbatchPDROID.cpp
source/ThresholdingbatchPS.cpp
source/Vignette.cpp
source/VignettebatchPD.cpp
source/VignettebatchPDROID.cpp
source/VignettebatchPS.cpp
source/WarpAffine.cpp
source/WarpAffinebatchPD.cpp
source/WarpAffinebatchPDROID.cpp
source/WarpAffinebatchPS.cpp
source/WarpPerspective.cpp
source/WarpPerspectivebatchPD.cpp
source/WarpPerspectivebatchPDROID.cpp
source/WarpPerspectivebatchPS.cpp
source/kernel_rpp.cpp
source/internal_publishKernels.cpp
)
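
The commit message says that only the batchPD variant of each amd_rpp node is retained, while the single, batchPS and batchPDROID variants are deleted. The diff view above has lost its +/- markers, so the following is only a rough sketch of how the listed file names could be grouped by variant to see what that cleanup touches. The suffix rule, the `unsuffixed` label, and the helper names `classify` and `group_by_variant` are assumptions made for illustration; they are not part of the repository.

```python
import re

# Variant suffixes as they appear in the file names of the SOURCES list.
# batchPDROID is listed first so it is not mistaken for batchPD.
VARIANT_RE = re.compile(
    r"^(?P<op>.+?)(?P<variant>batchPDROID|batchPD|batchPS)\.cpp$"
)


def classify(path):
    """Return (operation, variant) for one source path.

    Files without a batch suffix (for example Add.cpp, nop.cpp, CropPD.cpp)
    are labeled "unsuffixed"; that label is hypothetical.
    """
    name = path.rsplit("/", 1)[-1]
    match = VARIANT_RE.match(name)
    if match:
        return match.group("op"), match.group("variant")
    return name[: -len(".cpp")], "unsuffixed"


def group_by_variant(paths):
    """Group operation names by variant, preserving input order."""
    groups = {}
    for path in paths:
        op, variant = classify(path)
        groups.setdefault(variant, []).append(op)
    return groups


if __name__ == "__main__":
    sample = [
        "source/Add.cpp",
        "source/AddbatchPD.cpp",
        "source/AddbatchPDROID.cpp",
        "source/AddbatchPS.cpp",
        "source/nop.cpp",
    ]
    for variant, ops in group_by_variant(sample).items():
        print(variant, ops)
```

Under this rule, the entries in the `batchPD` group would correspond to the variant the commit message says is kept. Names such as `CropPD.cpp` do not follow the `batch` suffix pattern and fall into `unsuffixed`, so the grouping should be read as a rough aid, not as a statement of which files the commit actually removed.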