forked from ROCm/MIVisionX
swapImageHandle #32
Merged: kiritigowda merged 6 commits into kiritigowda:kg/openvx-1.3-port from LakshmiKumar23:lk/swapImageHandle on Nov 17, 2020
Conversation
kiritigowda approved these changes on Nov 17, 2020
kiritigowda added a commit that referenced this pull request on Mar 15, 2021
* OpenVX 1.3 Headers Added * Temp Changes to build MIVisionX with OpenVX 1.3 Headers * VX NN 1.3 Port * Lk/openvx port 1.3 (#2) * changes for object-array * threshold functions & adding objectarray to ago_util * remap functions * advanced array functions * vxCreateMatrixFromPatternAndOrigin * setImagePixelValues * OpenVX 1.3 Conformance build * createvirtualscalar * vxCreateVirtualConvolution * syntax fix * vxCreateVirtiaulDistribution, scalar syntax fix * vxCreateVirtiaulDistribution, scalar syntax fix * vxCreateVirtiaulMatrix * vxWeightedAverageNode vxuWeightedAverage * vxNonLinearFilterNode, vxuNonLinearFilter * vxLaplacianPyramidNode, vxuLaplacianPyramid * vxLaplacianReconstructNode, vxuLaplacianReconstruct * type fix * latest changes (#5) * createvirtualscalar * vxCreateVirtualConvolution * syntax fix * vxCreateVirtiaulDistribution, scalar syntax fix * vxCreateVirtiaulDistribution, scalar syntax fix * vxCreateVirtiaulMatrix * vxWeightedAverageNode vxuWeightedAverage * vxNonLinearFilterNode, vxuNonLinearFilter * vxLaplacianPyramidNode, vxuLaplacianPyramid * vxLaplacianReconstructNode, vxuLaplacianReconstruct * type fix * fixes vxGetUserStructNameByEnm and EnumByName * fixes vxCreateObjectArray and vxCreateVirtualObjectArray * conformance test nodes * bug ifx * declaration added * threshold functions * fixes matrix functions * fixes vxSetImagePixelValues * fixes remap functions * threshold typo fix * changed all of objectarray to replicate delay * threshold kernels * weighted average invalid format fix * fixes vxCreateVirtualScalar * fixes vxCReateVirtualConvolution * fixes vxCopyLUT & vxMapLUT * fixing * weightedaverage passed * fix comment * graph/context/refernce+ fixes * target base passes all - vx_context * waitGraph and verifyGraphBase * undo verifyGraphBase * nonlinearfilter passing * latest chanegs * Update CMakeLists.txt * Update ago_platform.h * Update vx_api.cpp * moving #def from VX/include to vx_ext_amd.h * moving #def from VX/include to 
vx_ext_amd.h * adjusting spaces Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * Lk/port1.3 fix new (#6) * createvirtualscalar * vxCreateVirtualConvolution * syntax fix * vxCreateVirtiaulDistribution, scalar syntax fix * vxCreateVirtiaulDistribution, scalar syntax fix * vxCreateVirtiaulMatrix * vxWeightedAverageNode vxuWeightedAverage * vxNonLinearFilterNode, vxuNonLinearFilter * vxLaplacianPyramidNode, vxuLaplacianPyramid * vxLaplacianReconstructNode, vxuLaplacianReconstruct * type fix * fixes vxGetUserStructNameByEnm and EnumByName * fixes vxCreateObjectArray and vxCreateVirtualObjectArray * conformance test nodes * bug ifx * declaration added * threshold functions * fixes matrix functions * fixes vxSetImagePixelValues * fixes remap functions * threshold typo fix * changed all of objectarray to replicate delay * threshold kernels * weighted average invalid format fix * fixes vxCreateVirtualScalar * fixes vxCReateVirtualConvolution * fixes vxCopyLUT & vxMapLUT * fixing * weightedaverage passed * fix comment * graph/context/refernce+ fixes * target base passes all - vx_context * waitGraph and verifyGraphBase * undo verifyGraphBase * nonlinearfilter passing * latest chanegs * Update CMakeLists.txt * Update ago_platform.h * Update vx_api.cpp * moving #def from VX/include to vx_ext_amd.h * moving #def from VX/include to vx_ext_amd.h * adjusting spaces * moving things from openvx/include to ext_amd * laplacian update * disable opencl * Update CMakeLists.txt * Update ago_interface.cpp * Update vx_api.cpp * LaplacianReconstruct pass * bug fix * bracket fix * remove debug comments * Update ago_haf_cpu_generic_functions.cpp Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * changes from Hansel and threshold changes (#7) * remove debug comments * Update ago_haf_cpu_generic_functions.cpp * changes to threshold Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * OpenVX 1.3 port - Laplacian Pyramid (#8) * Laplacian pyramid fix (#9) * LaplacianPyramid Pass * 
CMake fix * Revert change Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * OpenVX Port - fix build failures (#10) * Revert change * opencl changes to threshold * threshold with C code Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * mean std deviation fix (#11) * fixes meanstddev Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * user node 20/74 tests passes (#12) * usernode 20 out of 74 tests pass * removing printf statements * spacing fix Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * USerNode: All tests pass (#13) * spacing fix * adding verification path to fix user node * fixes all tests of the user node Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * SwapImageHandle roi=false cases passes (#14) * fixes all tests of userNode * swapImageHandle - roi=false cases Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * Fixes ReplicateNode-ObjectArray (#17) * swapImageHandle - roi=false cases * fixes replicateNode object array Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * fixes LUT type=S16 failures (#18) * fixes LUT failures Co-authored-by: Hansel Yang <hanselyang123@gmail.com> * H/openvx 1.3 port new (#19) * wait graph fixed * deadlock debug * deadlock fixed * cmake fix * comments added Co-authored-by: LakshmiKumar23 <kumar.lakshmi1994@gmail.com> * OpenVX 1.3 Fix * weighted average validation fix * graphstate fixed * check fix * bug fix * SmokeTestBase.vxReleaseReferenceBase fix (#21) * SmokeTestBase.vxReleaseReferenceBase fix * SmokeTestBase.vxSetReferenceName fix * Graph State - Fix & code cleanup (#22) * Graph Tests - Fix (#23) * code cleanup * user node fixes * graph state fixes * smoketestbase - vxRetainReferenceBase fix (#24) * smoketestbase-vxRetainReferenceBase fix * histogram - Fix CTS Errors(#25) * debugging histogram * histogram fix * code cleanup * Resource Release Fix * OpenVX 1.3 - half scale gaussian (#27) * halfscalegaussian fix * code cleanup * OpenVX 1.3 - smokeTest.vxRetainReference & 
smokeTestBase.vxSetReferenceName fix (#26) * vxRetainRef line 371 fix * vxSetReferenceName fix * pyramid fixes * smoke test fixes for dangling references * resolves all dangling reference issues * vxUnloadKernels Fix (#28) * CTS - graph delay with pyramid fix (#29) * Canny - CTS bug fix (#30) * divide by 4 when grad size 7 * canny fix * merge * canny fix * code cleanup * Run OpenVX 1.3 CPU Conformance * Travis Fix - OpenVX 1.3 * Travis Cleanup * OpenVX 1.3 - Harris corner CTS Fix (#31) * CMake * harris fix * code cleanup * swapImageHandle (#32) * changing buffers to reflect right values * getting correct pointers for ROI * buffer values for multiple plane images * fixes 12/39 swapImageHandle cases * fixes all swapImageHandle errors * fixes color convert errors (#33) * fixes conversion from RGBX to NV12&IYUV * fixes conversion from UYVY to NV12&IYUV * fixes conversion from YUYV to NV12&IYUV * fixes converion from NV12,NV21&IYUV to RGBX * fixes conversion from NV12,NV21&IYUV to RGB * warp affine - fix (#34) * warp affine fix * code cleanup * Travis - Trace Error * Travis - Check VGA * Travis Updates * Fix - variable scope * CXX Flags & OpenVX Version Update * changed Threshold to support new OpenVX 1.3 format (#38) Co-authored-by: paveltc <pavel.tcherniaev@amd.com> * Threshold - Update to 1.3 * Jenkins - Check Build & Artifacts * Tests - Fix platform name * GPU Fix - multiply gpu (#39) * CMake * multiply fix * code cleanup * GPU Flow - Canny Fix (#36) * CMake * canny fix * code cleanup * GPU Flow - Bug Fixes (#35) * fixes GraphROI.Simple & vxMapRemapPatch.MapRandomRemap * Graph.GraphState * fixes Threshold.OnRandom/4/Graph/BINARY/U8/U8 * removing unwanted commits * fixes Threshold.OnRandom/5/Graph/BINARY/S16/U8 * fixes Threshold.OnRandom/7/Graph/RANGE/S16/U8 * removing unnecessary changes * GPU Flow - fixes LUT for data type s16 (#40) * GPU Flow - channel combine (#41) * channel combine fix * merge fix Co-authored-by: LakshmiKumar23 <kumar.lakshmi1994@gmail.com> 
Co-authored-by: Hansel Yang <hanselyang123@gmail.com> Co-authored-by: Kiriti Gowda <kiriti@Kiritis-MacBook-Pro.local> Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com> Co-authored-by: Hansel Yang <hansyang@amd.com> Co-authored-by: Kiriti Nagesh Gowda <kiritigowda@Kiritis-MacBook-Pro.local> Co-authored-by: Pavel Tcherniaev <ptcherni@amd.com> Co-authored-by: paveltc <pavel.tcherniaev@amd.com>
kiritigowda added a commit that referenced this pull request on Apr 14, 2021
* Add support for agoKernel_HarrisScore_HVC_HG3_5x5 * Add initial support for integral image * Histogram Node Test Suite Changes & some minor changes in test suite * Minor Changes for Color Convert Cases 128 & 129 * Add test case for integral, fix 7x3 case * Fix integral image * Merge branch 'hip-porting' of https://github.com/MCW-Dev/MIVISION into hip-porting * fixes magnitude kernel * fixes case 128&129 * fixes 119&120 * pass a stream for launching hip arithmetic kernels * fix some formatting issues * remove hip events from some vision kernels * use hipStreamSynchronize and cpu wall time to measure time/wait for launching/completion of hip kernels * Add basic profiling scripts * threshold verifyGraph fix * Add stream parameter to all Hip Kernels Fix formatting issues * Minor Magnitude and Color convert kernel fix * minor modification of runVisionTests script to group nodes for better comparison * Fix Threshold U1 HIP kernel and test suite * Fix - variable scope * Automate OCL/HIP rocprof with runvx * Minor Changes * Add host test cases for Dilate and Erode * Add profiling option param to script * Optimize scale, warpaffine, warpperspective, lut * Optimize filters - sobel, median, erode, dilate, box * cherry-pick "Build Fix - Release/Debug (ROCm#423)" from MIVisionX/master branch * Release/Debug Build Fix * CMakeList.txt cleanup * Readme Updates * cmake clean up for hip * CXX Flags & OpenVX Version Update * Add support for HarrisScore_HVC_HG3_7x7 * Add lut and convolve memory support in HIP * optimize float4_to_s16s function for arithmetic kernels - use vector data type for writting to oa buffer for better performance compared to pixel by pixel write * use make_short4 * optimize s16s_to_float4_ungrouped function to use vector read for s16 data type * Optimized Color Convert kernels * Modifiied LUT kernel * Modifiied LUT kernel * update node names in VisionTests script * optimize ColorDepth kernels * Add new coding style for arithmetic/logical/color hip kernels * 
Merge pull request #32 from asalmanp/as/hip_kernels_style Add new coding style for arithmetic/logical/color hip kernels * Add auto OCL dump generator script * Add gdfs for arithmetic, logical, color kernels * Modify arithmetic kernels as per new std * Add the missing buffer_offset to the hip_memory * Arithmetic kernels fixes * Modify logical kernels as per new std * Revert to previous min max impl * changed Threshold to support new OpenVX 1.3 format (#38) Co-authored-by: paveltc <pavel.tcherniaev@amd.com> * add the optimized ChannelExtract_U8_U32_Pos0 and ChannelExtract_U8_U24_Pos0 color kernels * Threshold - Update to 1.3 * Add new gdfs and modify generator script * Jenkins - Check Build & Artifacts * Tests - Fix platform name * Modify generator script for ocl/hip dumps and fixes for gdfs * Add optimized box filter * Modify kernelGDFs, automate script for OCL/HIP bin dumps for different image sizes * Optimize phase, magnitude, weighted average and remove trailing spaces * Optimize magnitude, phase, weighted_average, Minor fix * Formatting fixes * Formatting changes * modify hip pack_ function to fix SAT issue in some kernels * Place kernelGDFs in independent folders * Fix runvxTestAllScript, readme and Modify gitignore * Revert "Optimize phase, magnitude, weighted average and remove trailing spaces" This reverts commit ae97d35. 
* Move all common types/device codes into a new header * GPU Fix - multiply gpu (#39) * CMake * multiply fix * code cleanup * GPU Flow - Canny Fix (#36) * CMake * canny fix * code cleanup * optimize hip_clamp function * Partial changes to color kernels * Optimize color kernels * Cleanup * Change typecast float to make_float4() * Add UYVY/YUYV options for ChannelExtract * Modify globalThreads_x and globalThreads_y * Kernel GDF modifications * Script enhancements - add support for single kernel testing, optional build * Edit script readme * minor optimization for Phase kernel * fix comment * GPU Flow - Bug Fixes (#35) * fixes GraphROI.Simple & vxMapRemapPatch.MapRandomRemap * Graph.GraphState * fixes Threshold.OnRandom/4/Graph/BINARY/U8/U8 * removing unwanted commits * fixes Threshold.OnRandom/5/Graph/BINARY/S16/U8 * fixes Threshold.OnRandom/7/Graph/RANGE/S16/U8 * removing unnecessary changes * Add filter kernel GDFs * Add test script support for filter kernel diff checks * Optimizations for filter kernels - initial commit * Optimize ScaleGaussianHalf, other minor fixes * Correct some test names in runVisionTests script * Disable ScaleGaussianHalf temporarily * Optimize Median3_/min3_/max3_ * Fix convolotion issue for hip * fix seg fault for ScaleGaussian * Add support for channelCopy and Lut * Minor change * Optimize statistical kernels * Optimize UV12/UV/IUV and ScaleUp2x2 * Minor change * Add kernelGDFs for IUV/UV12/UV converts, threshold, convolve * Update runVisionTests.py and runvxTestAllScript.sh to run with arithmetic/logical/color/filter/statistical kernels * Add uniform-image inputs with hex pixel values * Remove all U1 kernel testing * Test script mods * Uncomment all kernels except geometric/vision * Minor fix * Optimize geometric kernels - initial commit * Minor changes * Mods to use floorf, mul24, mad24, Scale_U8_U8_Area * ScaleImage_U8_U8_Area fixes and Remap initial commit * Remove #defines for remap * Pass hip_memory for remap * Enable scale, 
warpAffine, warpPerspective testing * Add kernelGDFs for geometric functions, runvxTestAllScript.sh update * Fix the bug for ScaleImage_Bilinear_Constant and ScaleImage_Bilinear_Replicate * GDF and test script corrections * Disable kernels with attr * Disable UV12/UV/IUV converts and ScaleUp2x2 * Add vision kernelGDFs * Vision kernels - initial commit * Modify helpers to use hip built in functions * Remove code used for testing * Minor changes * use consistent device function names and code clean up * remove extra semicolon * switch to builtin functions for hip_lerp * Formatting fixes * minor cmake change to print HIP path/version correctly * Modify harris corners * Test script mod * cmake file changes for building GPU backends and CPU properly * code clean up to make it more readable that there will be a fatal error if OPENCL or HIP not found in the case of the default GPU_SUPPORT=ON * Remove samples/hip_samples, Add openvx_runvx_tests * Enhance runvxTestAllScript, Change ReadMe * Formatting fixes, Code cleanup * Rename openvx_runvx_tests to openvx_node_tests * fix a seg fault for Canny node * remove unused parameter from CannySuppThreshold * Delete vision_tests outer folder * Enhancements to runVisionTests.py * Remove blank lines * Vision kernel mods * Formatting fix * Codacy fixes 1 * Codacy fixes 2 * Codacy fixes 3 * fix cmake * Make pandas optional * Code cleanup * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Add backend_type OCL Co-authored-by: Swetha B S <swetha@multicorewareinc.com> Co-authored-by: Abishek <52214183+r-abishekmcw@users.noreply.github.com> Co-authored-by: kiritigowda <kiritigowda@gmail.com> Co-authored-by: Kiriti Nagesh Gowda <kiritigowda@Kiritis-MacBook-Pro.local> Co-authored-by: LakshmiKumar23 <kumar.lakshmi1994@gmail.com> Co-authored-by: Aryan Salmanpour <aryan.salmanpour@amd.com> Co-authored-by: fiona-gladwin <fionagladwin@multicorewareinc.com> Co-authored-by: Kiriti 
Gowda <kiriti.nageshgowda@amd.com> Co-authored-by: rrawther <Rajy.MeeyakhanRawther@amd.com> Co-authored-by: Ulagammai <ulagammai@multicorewareinc.com> Co-authored-by: Ulagammai <--local> Co-authored-by: Pavel Tcherniaev <ptcherni@amd.com> Co-authored-by: paveltc <pavel.tcherniaev@amd.com> Co-authored-by: Hansel Yang <hansyang@amd.com> Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com>
kiritigowda added a commit that referenced this pull request on Jun 29, 2021
* Optimize scale, warpaffine, warpperspective, lut * Optimize filters - sobel, median, erode, dilate, box * cherry-pick "Build Fix - Release/Debug (ROCm#423)" from MIVisionX/master branch * Release/Debug Build Fix * CMakeList.txt cleanup * Readme Updates * cmake clean up for hip * CXX Flags & OpenVX Version Update * Add support for HarrisScore_HVC_HG3_7x7 * Add lut and convolve memory support in HIP * optimize float4_to_s16s function for arithmetic kernels - use vector data type for writting to oa buffer for better performance compared to pixel by pixel write * use make_short4 * optimize s16s_to_float4_ungrouped function to use vector read for s16 data type * Optimized Color Convert kernels * Modifiied LUT kernel * Modifiied LUT kernel * update node names in VisionTests script * optimize ColorDepth kernels * Add new coding style for arithmetic/logical/color hip kernels * Merge pull request #32 from asalmanp/as/hip_kernels_style Add new coding style for arithmetic/logical/color hip kernels * Add auto OCL dump generator script * Add gdfs for arithmetic, logical, color kernels * Modify arithmetic kernels as per new std * Add the missing buffer_offset to the hip_memory * Arithmetic kernels fixes * Modify logical kernels as per new std * Revert to previous min max impl * changed Threshold to support new OpenVX 1.3 format (#38) Co-authored-by: paveltc <pavel.tcherniaev@amd.com> * add the optimized ChannelExtract_U8_U32_Pos0 and ChannelExtract_U8_U24_Pos0 color kernels * Threshold - Update to 1.3 * Add new gdfs and modify generator script * Jenkins - Check Build & Artifacts * Tests - Fix platform name * Modify generator script for ocl/hip dumps and fixes for gdfs * Add optimized box filter * Modify kernelGDFs, automate script for OCL/HIP bin dumps for different image sizes * Optimize phase, magnitude, weighted average and remove trailing spaces * Optimize magnitude, phase, weighted_average, Minor fix * Formatting fixes * Formatting changes * modify hip pack_ function to 
fix SAT issue in some kernels * Place kernelGDFs in independent folders * Fix runvxTestAllScript, readme and Modify gitignore * Revert "Optimize phase, magnitude, weighted average and remove trailing spaces" This reverts commit ae97d35. * Move all common types/device codes into a new header * GPU Fix - multiply gpu (#39) * CMake * multiply fix * code cleanup * GPU Flow - Canny Fix (#36) * CMake * canny fix * code cleanup * optimize hip_clamp function * Partial changes to color kernels * Optimize color kernels * Cleanup * Change typecast float to make_float4() * Add UYVY/YUYV options for ChannelExtract * Modify globalThreads_x and globalThreads_y * Kernel GDF modifications * Script enhancements - add support for single kernel testing, optional build * Edit script readme * minor optimization for Phase kernel * fix comment * GPU Flow - Bug Fixes (#35) * fixes GraphROI.Simple & vxMapRemapPatch.MapRandomRemap * Graph.GraphState * fixes Threshold.OnRandom/4/Graph/BINARY/U8/U8 * removing unwanted commits * fixes Threshold.OnRandom/5/Graph/BINARY/S16/U8 * fixes Threshold.OnRandom/7/Graph/RANGE/S16/U8 * removing unnecessary changes * Add filter kernel GDFs * Add test script support for filter kernel diff checks * Optimizations for filter kernels - initial commit * Optimize ScaleGaussianHalf, other minor fixes * Correct some test names in runVisionTests script * Disable ScaleGaussianHalf temporarily * Optimize Median3_/min3_/max3_ * Fix convolotion issue for hip * fix seg fault for ScaleGaussian * Add support for channelCopy and Lut * Minor change * Optimize statistical kernels * Optimize UV12/UV/IUV and ScaleUp2x2 * Minor change * Add kernelGDFs for IUV/UV12/UV converts, threshold, convolve * Update runVisionTests.py and runvxTestAllScript.sh to run with arithmetic/logical/color/filter/statistical kernels * Add uniform-image inputs with hex pixel values * Remove all U1 kernel testing * Test script mods * Uncomment all kernels except geometric/vision * Minor fix * Optimize 
geometric kernels - initial commit * Minor changes * Mods to use floorf, mul24, mad24, Scale_U8_U8_Area * ScaleImage_U8_U8_Area fixes and Remap initial commit * Remove #defines for remap * Pass hip_memory for remap * Enable scale, warpAffine, warpPerspective testing * Add kernelGDFs for geometric functions, runvxTestAllScript.sh update * Fix the bug for ScaleImage_Bilinear_Constant and ScaleImage_Bilinear_Replicate * GDF and test script corrections * Disable kernels with attr * Disable UV12/UV/IUV converts and ScaleUp2x2 * Add vision kernelGDFs * Vision kernels - initial commit * Modify helpers to use hip built in functions * Remove code used for testing * Minor changes * use consistent device function names and code clean up * remove extra semicolon * switch to builtin functions for hip_lerp * Formatting fixes * minor cmake change to print HIP path/version correctly * Modify harris corners * Test script mod * cmake file changes for building GPU backends and CPU properly * code clean up to make it more readable that there will be a fatal error if OPENCL or HIP not found in the case of the default GPU_SUPPORT=ON * Remove samples/hip_samples, Add openvx_runvx_tests * Enhance runvxTestAllScript, Change ReadMe * Formatting fixes, Code cleanup * Rename openvx_runvx_tests to openvx_node_tests * fix a seg fault for Canny node * remove unused parameter from CannySuppThreshold * Delete vision_tests outer folder * Enhancements to runVisionTests.py * Remove blank lines * Vision kernel mods * Formatting fix * Codacy fixes 1 * Codacy fixes 2 * Codacy fixes 3 * fix cmake * Make pandas optional * Code cleanup * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Add backend_type OCL * Fix CMake issues for HIP backend build. Fix issues caused by merge. * Add support for HIP backend. * add support for VX_DIRECTIVE_AMD_COPY_TO_HIPMEM * Add HIP backend support for Resize crop function. 
Modify unittest to save all images in local folder (test HIP support). * Fix minor issues in HIP backend. * Fix rocAL Pybind build issue. Update rocAL README.md for TurboJpeg installation. * Fix brightness updation issue. Set random seed in paramter factory constructor. * Fix issue with CMake to work for OCL and HIP backend. * Fix requested deviceID not found error. * Fix issue with HIP load routine. * Rename rali to rocAL. * Fix merge issues. * Fix build issue for rocAL pybind module. (cherry picked from commit 0e1a43a) * Add prefetching support in RALI pipeline. (cherry picked from commit 0d5cf66) * Fix build warnings. (cherry picked from commit b063ca6) * Fix warnings. * Clean up. * Fix merge issues. * Made suggested PR changes. * Fix build error. * set correct affinity in amd_rpp * Add CMake changes and fix codacy warnings. * Fix core dump issue in rali unittest. * Fix build issue. * cmake cleanup * fix for review comments and unit_test change * fix build error for OpenCL backend Co-authored-by: Kiriti Nagesh Gowda <kiritigowda@gmail.com> Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com> Co-authored-by: Kiriti Gowda <kiriti.nageshgowda@amd.com> Co-authored-by: Abishek <52214183+r-abishekmcw@users.noreply.github.com> Co-authored-by: Aryan Salmanpour <aryan.salmanpour@amd.com> Co-authored-by: Swetha B S <swetha@multicorewareinc.com> Co-authored-by: Ulagammai <ulagammai@multicorewareinc.com> Co-authored-by: fiona-gladwin <fionagladwin@multicorewareinc.com> Co-authored-by: Ulagammai <--local> Co-authored-by: Pavel Tcherniaev <ptcherni@amd.com> Co-authored-by: paveltc <pavel.tcherniaev@amd.com> Co-authored-by: Hansel Yang <hansyang@amd.com> Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com> Co-authored-by: shobana-mcw <shobana@multicorewareinc.com>
kiritigowda added a commit that referenced this pull request on Jul 24, 2021
* optimize ColorDepth kernels * Add new coding style for arithmetic/logical/color hip kernels * Merge pull request #32 from asalmanp/as/hip_kernels_style Add new coding style for arithmetic/logical/color hip kernels * Add auto OCL dump generator script * Add gdfs for arithmetic, logical, color kernels * Modify arithmetic kernels as per new std * Add the missing buffer_offset to the hip_memory * Arithmetic kernels fixes * Modify logical kernels as per new std * Revert to previous min max impl * changed Threshold to support new OpenVX 1.3 format (#38) Co-authored-by: paveltc <pavel.tcherniaev@amd.com> * add the optimized ChannelExtract_U8_U32_Pos0 and ChannelExtract_U8_U24_Pos0 color kernels * Threshold - Update to 1.3 * Add new gdfs and modify generator script * Jenkins - Check Build & Artifacts * Tests - Fix platform name * Modify generator script for ocl/hip dumps and fixes for gdfs * Add optimized box filter * Modify kernelGDFs, automate script for OCL/HIP bin dumps for different image sizes * Optimize phase, magnitude, weighted average and remove trailing spaces * Optimize magnitude, phase, weighted_average, Minor fix * Formatting fixes * Formatting changes * modify hip pack_ function to fix SAT issue in some kernels * Place kernelGDFs in independent folders * Fix runvxTestAllScript, readme and Modify gitignore * Revert "Optimize phase, magnitude, weighted average and remove trailing spaces" This reverts commit ae97d35. 
* Move all common types/device codes into a new header * GPU Fix - multiply gpu (#39) * CMake * multiply fix * code cleanup * GPU Flow - Canny Fix (#36) * CMake * canny fix * code cleanup * optimize hip_clamp function * Partial changes to color kernels * Optimize color kernels * Cleanup * Change typecast float to make_float4() * Add UYVY/YUYV options for ChannelExtract * Modify globalThreads_x and globalThreads_y * Kernel GDF modifications * Script enhancements - add support for single kernel testing, optional build * Edit script readme * minor optimization for Phase kernel * fix comment * GPU Flow - Bug Fixes (#35) * fixes GraphROI.Simple & vxMapRemapPatch.MapRandomRemap * Graph.GraphState * fixes Threshold.OnRandom/4/Graph/BINARY/U8/U8 * removing unwanted commits * fixes Threshold.OnRandom/5/Graph/BINARY/S16/U8 * fixes Threshold.OnRandom/7/Graph/RANGE/S16/U8 * removing unnecessary changes * Add filter kernel GDFs * Add test script support for filter kernel diff checks * Optimizations for filter kernels - initial commit * Optimize ScaleGaussianHalf, other minor fixes * Correct some test names in runVisionTests script * Disable ScaleGaussianHalf temporarily * Optimize Median3_/min3_/max3_ * Fix convolotion issue for hip * fix seg fault for ScaleGaussian * Add support for channelCopy and Lut * Minor change * Optimize statistical kernels * Optimize UV12/UV/IUV and ScaleUp2x2 * Minor change * Add kernelGDFs for IUV/UV12/UV converts, threshold, convolve * Update runVisionTests.py and runvxTestAllScript.sh to run with arithmetic/logical/color/filter/statistical kernels * Add uniform-image inputs with hex pixel values * Remove all U1 kernel testing * Test script mods * Uncomment all kernels except geometric/vision * Minor fix * Optimize geometric kernels - initial commit * Minor changes * Mods to use floorf, mul24, mad24, Scale_U8_U8_Area * ScaleImage_U8_U8_Area fixes and Remap initial commit * Remove #defines for remap * Pass hip_memory for remap * Enable scale, 
warpAffine, warpPerspective testing * Add kernelGDFs for geometric functions, runvxTestAllScript.sh update * Fix the bug for ScaleImage_Bilinear_Constant and ScaleImage_Bilinear_Replicate * GDF and test script corrections * Disable kernels with attr * Disable UV12/UV/IUV converts and ScaleUp2x2 * Add vision kernelGDFs * Vision kernels - initial commit * Modify helpers to use hip built in functions * Remove code used for testing * Minor changes * use consistent device function names and code clean up * remove extra semicolon * switch to builtin functions for hip_lerp * Formatting fixes * minor cmake change to print HIP path/version correctly * Modify harris corners * Test script mod * cmake file changes for building GPU backends and CPU properly * code clean up to make it more readable that there will be a fatal error if OPENCL or HIP not found in the case of the default GPU_SUPPORT=ON * Remove samples/hip_samples, Add openvx_runvx_tests * Enhance runvxTestAllScript, Change ReadMe * Formatting fixes, Code cleanup * Rename openvx_runvx_tests to openvx_node_tests * fix a seg fault for Canny node * remove unused parameter from CannySuppThreshold * Delete vision_tests outer folder * Enhancements to runVisionTests.py * Remove blank lines * Vision kernel mods * Formatting fix * Codacy fixes 1 * Codacy fixes 2 * Codacy fixes 3 * fix cmake * Make pandas optional * Code cleanup * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Codacy issue fix * Add backend_type OCL * Fix CMake issues for HIP backend build. Fix issues caused by merge. * Add support for HIP backend. * add support for VX_DIRECTIVE_AMD_COPY_TO_HIPMEM * Add HIP backend support for Resize crop function. Modify unittest to save all images in local folder (test HIP support). * Fix minor issues in HIP backend. * Fix rocAL Pybind build issue. Update rocAL README.md for TurboJpeg installation. * Fix brightness updation issue. 
Set random seed in paramter factory constructor. * Fix issue with CMake to work for OCL and HIP backend. * Fix requested deviceID not found error. * Fix issue with HIP load routine. * Rename rali to rocAL. * Fix merge issues. * Fix build issue for rocAL pybind module. (cherry picked from commit 0e1a43a) * Add prefetching support in RALI pipeline. (cherry picked from commit 0d5cf66) * Fix build warnings. (cherry picked from commit b063ca6) * Fix warnings. * Clean up. * Fix merge issues. * Made suggested PR changes. * Fix build error. * Added HIP functionality to AbsoluteDifference * added HIP support for some functions * Added HIP support for another batch of functions * Add HIP supprt for last batch of functions * Set correct affinity to the below amd_rpp nodes. 1. AbsoluteDifference 2. AccumulateSquared 3. AccumulateWeighted 4. Accumulate 5. Add * Set correct affinity to the below amd_rpp nodes. 1. BilateralFilter 2. BitwiseAND 3. BitwiseNOT 4. Blend 5. Blur 6. BoxFilter 7. Brightness * Set correct affinity to the below amd_rpp nodes. 1. CannyEdgeDetector. 2. ChannelCombine. 3. ChannelExtract. 4. ColorTemperature. 5. ColorTwist. 6. Contrast. 7. ControlFlow. 8. CropMirrorNormalize. 9. Crop. 10. CustomConvolution. * Set correct affinity to the below amd_rpp nodes. 1. DataObjectCopy. 2. Dilate. 3. Erode. 4. ExclusiveOR. 5. Exposure. * Set correct affinity to the below amd_rpp nodes. 1. FastCornerDetector. 2. Fisheye. 3. Flip. 4. Fog. 5. GammaCorrection. 6. GaussianFilter. 7. GaussianImagePyramid. * Set correct affinity to the below amd_rpp nodes. 1. HarrisCornerDetector 2. Histogram 3. HistogramBalance 4. Hue 5. WarpPerspective * Set correct affinity to the below amd_rpp nodes. 1. InclusiveOR 2. Jitter 3. LaplacianImagePyramid 4. LensCorrection 5. LocalBinaryPattern 6. LookUpTable * Set correct affinity to the below amd_rpp nodes. 1. Magnitude 2. Max 3. MeanStddev 4. MedianFilter 5. MinMaxLoc 6. Min 7. Multiply * Set correct affinity to the below amd_rpp nodes. 1. 
Noise 2. NonLinearFilter 3. NonMaxSupression 4. nop 5. Occlusion 6. Phase 7. Pixelate * Set correct affinity to the below amd_rpp nodes. 1. Rain 2. RandomCropLetterBox 3. RandomShadow 4. Remap 5. ResizeCropMirror 6. ResizeCrop 7. Rotate * Set correct affinity to the below amd_rpp nodes. 1. Saturation 2. Scale 3. Snow 4. Sobel 5. Subtract 6. TensorAdd * Set correct affinity to the below amd_rpp nodes. 1. TensorLookup 2. TensorMatrixMultiply 3. TensorMultiply 4. TensorSubtract 5. Thresholding 6. Vignette 7. WarpAffine * Clean up by reducing the variants from 4 -> 1 in amd_rpp. 1. Retain only batchPD variant and delete all the single, batchPS and batchPDROID variants. 2. Remove the support in header and other files. * Set affinity to CPU for OCL backend for all nodes in amd_rpp to run without codegen. * Fix issue with rocAL pybind installation. * Fix indendation issue with nodes in amd_rpp. * Add HIP backend support for single nodes in amd_rpp * Code clean up for amd_rpp nodes. 1. Move memory allocations to initialize function. 2. Add calls to free up memory in uninitialze function. 3. Remove unused declarations. 4. Move batchsize querying to initialize. * Error handling in amd_rpp nodes. Add return error status for functions which do not have GPU support in RPP. * Fix formatting for all amd_rpp nodes. * Fix codacy issue. Change copy_status to STATUS_ERROR_CHECK. Co-authored-by: Kiriti Nagesh Gowda <kiritigowda@gmail.com> Co-authored-by: Aryan Salmanpour <aryan.salmanpour@amd.com> Co-authored-by: Abishek <52214183+r-abishekmcw@users.noreply.github.com> Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com> Co-authored-by: Pavel Tcherniaev <ptcherni@amd.com> Co-authored-by: paveltc <pavel.tcherniaev@amd.com> Co-authored-by: Hansel Yang <hansyang@amd.com> Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com>
fixes all swapImageHandle errors.
@kiritigowda Ready to be merged
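For readers unfamiliar with the problem area: the commit list for this PR mentions "getting correct pointers for ROI" and "buffer values for multiple plane images". The sketch below is not the MIVisionX code and does not use the OpenVX API. It is a minimal, self-contained illustration of the kind of address arithmetic involved when a region of interest has to point into a parent image's newly swapped plane buffer. The `PlaneLayout` struct and `roi_base` function are hypothetical names introduced here, and the subsampling handling assumes a simple per-plane step factor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-plane layout (illustration only, not an OpenVX or MIVisionX type).
struct PlaneLayout {
    std::ptrdiff_t stride_x; // bytes between horizontally adjacent samples
    std::ptrdiff_t stride_y; // bytes between vertically adjacent rows
    unsigned step_x;         // horizontal subsampling factor (1 for full-resolution planes)
    unsigned step_y;         // vertical subsampling factor (1 for full-resolution planes)
};

// Returns the address of the ROI's top-left sample inside a parent plane.
// (x, y) is the ROI origin in full-resolution pixel coordinates.
inline std::uint8_t* roi_base(std::uint8_t* parent, const PlaneLayout& p,
                              unsigned x, unsigned y) {
    assert(p.step_x > 0 && p.step_y > 0);
    const std::ptrdiff_t row = static_cast<std::ptrdiff_t>(y / p.step_y);
    const std::ptrdiff_t col = static_cast<std::ptrdiff_t>(x / p.step_x);
    return parent + row * p.stride_y + col * p.stride_x;
}
```

Under these assumptions, after a handle swap each plane of a multi-plane image would have its ROI pointer recomputed from that plane's new base pointer with that plane's own strides and subsampling steps, rather than reusing the offsets of plane 0.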