rocAL - Tf pets training (#947)

* Zen DNN - Docker & Tests (#924)
* Zen DNN - Docker Updates
* Zen DNN - Sample Updates
* Codacy - Fix
* Zen DNN - Cleanup
* Zen DNN - single layer sample
* Rocal Updates (#921)
* rocal updates for tf training
* updates for rocal
* tf updates and pytorch bug fixes
* repo name change
* Update README.md
* dockerfile update
* [rocAL] Fix rocAL Pybind build issue.
* [rocAL] Remove unused function in pipeline.
* [rocAL] Change rocAL pybind installation from setup.py to wheel. setup.py install is deprecated in python 3.9
* [rocAL] Make TF pets example dataset compatible with tf2.
* [rocAL] Change getImageLabels() compatible with tf.
* [rocAL] Add fix to pick wheel from dist installation folder. Remove the old installation files in conda environment.
* [rocAL] Remove commented statement.

Co-authored-by: shobana-mcw <shobana@multicorewareinc.com>

* Docker Update (#928)
* turboJPEG version update
* turboJPEG version update
* turboJPEG version update
* TurboJPEG version update
* Update mivisionx-opencl-on-ubuntu20.dockerfile
* Update zenDNN-HIP.dockerfile
* Update level-5.dockerfile
* Update level-5.dockerfile
* Zen DNN Updates Sync

Co-authored-by: Kiriti Gowda <kiriti.nageshgowda@amd.com>

* AMD OpenVX Custom Extension - implementation (#925)
* custom node implementation files
* fix build errors
* custom extension changes for working implementation
* add README and documentation
* update readme
* fix codacy issues and CPU flow
* fix codacy warning
* Addressed review comments
* minor change
* fix formatting
* amd_migraphx - update readme for extension (#929)
* amd_custom - fixes build issue (#935)
* fixes build issue
* Update CMakeLists.txt
* tf_pets_v2
* code_cleanup
* minor code cleanup
* migraphx extension - update the readme (#936)
* vx_amd_migraphx - tests (#923)
* batch size support for migraphx
* changing to accept tensors of all batch sizes
* creates file with results
* bug fix
* changes to singular test cases - mnist and resnet50
* readme updates
* resolving PR comments
* resolving PR comments
* resolving PR comments
* Readme update to reflect tot
* formatting
* fixing typo
* readme update
* readme update
* OpenVX HIP backend - report correct number of CUs for gfx10+ in the logs (#930)
* PyTorch docker file - add argument for specifying version (#938)
* add argument for specifying pytorch version for building docker file
* add readme for pytorch
* rocAL - Fix ROCAL_USE_USER_GIVEN_SIZE_RESTRICTED (#940)
* Docker - codacy fix for pr937 (#942)
* rocAL - Adding NCHW FP16 SIMD kernel (#926)
* Adding NCHW FP16 SIMD kernel for normalization and buffer copy
* Fixed some codestyle issues with FP16 kernel
* Using FMA SSE instruction for multiply-add ops
* Removed extra spaces
* Adding fma flag to rocAL CMakeLists
* Adding FP16 intrinsics for buffer copies
* Setting rounding mode to _MM_FROUND_TO_ZERO
* rocAL - README updates for video unit test (#939)
* Add README support for video unit test
* Update Readme for video unit test
* Minor test_suite fix
* Update video unit test Readme
* Update Readme
* Updated README with the explanation of test cases and arguments
* Minor fix
* Add test case samples to video unit test README. Also add images for README
* Modify sample images
* Change sample image dimension
* Minor README changes
* Minor README changes
* Minor change
* Minor fix to handle relative input path in video unit test
* Resolve codacy warnings
* Minor change
* Add correct video reader outputs
* MIVisionX - cmake cleanup (#943)
* OpenCV EXT - Updates & Tests (#944)
* OpenCV - Readme updates
* Updates - Readme & Tests
* OpenCV - Tests Added
* Updates
* ZenDNN - model compiler (#941)
* model compiler - zendnn - mnist layers
* bug fix + lrn
* layers: batch norm, sum ; bug fixes
* fixes lgtm errors
* bug fixes
* codacy fixes
* bug fix
* codacy fixes
* Update train_withROCAL_withTFRecordReader.py
* Resolved PR comments

Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: LakshmiKumar23 <lakshmi.kumar@amd.com>
Co-authored-by: shobana-mcw <shobana@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiriti.nageshgowda@amd.com>
Co-authored-by: Rajy Rawther <Rajy.MeeyakhanRawther@amd.com>
Co-authored-by: Aryan Salmanpour <aryan.salmanpour@amd.com>
Co-authored-by: root <root@jenkins-worker-rocm-amd-104.local.lan>
Co-authored-by: Sundar Rajan Vaithiyanathan <99159823+SundarRajan28@users.noreply.github.com>
Co-authored-by: Fiona-MCW <70996026+fiona-gladwin@users.noreply.github.com>
ROCm · Sep 22, 2022 · c6071b6
1 parent 1ab79b2
Showing 2 changed files with 35 additions and 14 deletions.
diff --git a/rocAL/rocAL_pybind/amd/rocal/plugin/tf.py b/rocAL/rocAL_pybind/amd/rocal/plugin/tf.py
@@ -49,9 +49,16 @@ def __init__(self, pipeline, tensor_layout = types.NCHW, reverse_channels = Fals
             self.loader._name = self.loader._reader
         color_format = b.getOutputColorFormat(self.loader._handle)
         self.p = (1 if (color_format == int(types.GRAY)) else 3)
-
-        self.out = np.zeros(( self.bs*self.n, self.p, int(self.h/self.bs), self.w,), dtype = "uint8")
-
+        if self.tensor_dtype == types.FLOAT:
+            data_type="float32"
+        elif self.tensor_dtype == types.FLOAT16:
+            data_type="float16"
+
+        if(types.NHWC == self.tensor_format):
+            self.out = np.zeros(( self.bs*self.n, int(self.h/self.bs), self.w, self.p), dtype = data_type)
+        else: 
+            self.out = np.zeros(( self.bs*self.n, self.p, int(self.h/self.bs), self.w), dtype = data_type)
+
     def next(self):
         return self.__next__()
 
@@ -68,8 +75,11 @@ def __next__(self):
         if self.loader.run() != 0:
             self.reset()
             raise StopIteration
-
-        self.loader.copyImage(self.out)
+
+        if(types.NCHW == self.tensor_format):
+            self.loader.copyToTensorNCHW(self.out, self.multiplier, self.offset, self.reverse_channels, int(self.tensor_dtype))
+        else:
+            self.loader.copyToTensorNHWC(self.out, self.multiplier, self.offset, self.reverse_channels, int(self.tensor_dtype))
 
         if(self.loader._name == "TFRecordReaderDetection"):
             self.bbox_list =[]

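Note on the tf.py hunks above: the iterator now allocates its output buffer according to the requested layout (NHWC or NCHW) and element type, which is what lets the training script drop its `np.transpose(..., [0, 2, 3, 1])` calls. As written in the hunk, `data_type` is only assigned for `types.FLOAT` and `types.FLOAT16`. The following is a minimal NumPy-only sketch of the same allocation logic with an explicit error for any other dtype. The constants `NCHW`, `NHWC`, `FLOAT`, `FLOAT16` and the helper `allocate_output` are hypothetical stand-ins for illustration, not the rocAL API.

```python
import numpy as np

# Hypothetical stand-ins for the rocAL layout and dtype constants
# (assumptions for this sketch, not the real amd.rocal.types values).
NCHW, NHWC = "NCHW", "NHWC"
FLOAT, FLOAT16 = "FLOAT", "FLOAT16"

# Map each supported tensor dtype to its NumPy equivalent.
_NUMPY_DTYPES = {FLOAT: np.float32, FLOAT16: np.float16}


def allocate_output(batch, channels, height, width, layout=NCHW, dtype=FLOAT):
    """Return a zero-filled batch buffer shaped for the requested layout."""
    if dtype not in _NUMPY_DTYPES:
        # Fail loudly instead of leaving the dtype unassigned.
        raise ValueError(f"unsupported tensor dtype: {dtype!r}")
    if layout == NHWC:
        shape = (batch, height, width, channels)
    elif layout == NCHW:
        shape = (batch, channels, height, width)
    else:
        raise ValueError(f"unsupported tensor layout: {layout!r}")
    return np.zeros(shape, dtype=_NUMPY_DTYPES[dtype])


nchw = allocate_output(2, 3, 4, 5, layout=NCHW)
nhwc = allocate_output(2, 3, 4, 5, layout=NHWC, dtype=FLOAT16)

# Transposing an NCHW batch with axes [0, 2, 3, 1] yields the NHWC shape,
# which is the conversion the training script previously did by hand.
transposed = np.transpose(nchw, [0, 2, 3, 1])
```

With the buffer produced directly in NHWC, a consumer that expects channels-last input (as the TensorFlow example in this commit does) can feed the array without an extra transpose.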
diff --git a/...rocAL_pybind/example/new_api/tf_petsTrainingExample/train_withROCAL_withTFRecordReader.py b/...rocAL_pybind/example/new_api/tf_petsTrainingExample/train_withROCAL_withTFRecordReader.py
@@ -4,9 +4,10 @@
 import amd.rocal.fn as fn
 import amd.rocal.types as types
 
-import tensorflow as tf
+import tensorflow.compat.v1 as tf
 tf.compat.v1.disable_v2_behavior()
 
+
 import numpy as np
 import tensorflow_hub as hub
 
@@ -104,7 +105,7 @@ def main():
 	}
 
 
-	trainPipe = Pipeline(batch_size=TRAIN_BATCH_SIZE, num_threads=1, rocal_cpu=RUN_ON_HOST)
+	trainPipe = Pipeline(batch_size=TRAIN_BATCH_SIZE, num_threads=1, rocal_cpu=RUN_ON_HOST, tensor_layout = types.NHWC)
 	with trainPipe:
 		inputs = fn.readers.tfrecord(path=TRAIN_RECORDS_DIR, index_path = "", reader_type=TFRecordReaderType, user_feature_key_map=featureKeyMap,
 		features={
@@ -117,11 +118,17 @@ def main():
 		images = fn.decoders.image(jpegs, user_feature_key_map=featureKeyMap, output_type=types.RGB, path=TRAIN_RECORDS_DIR)
 		resized = fn.resize(images, resize_x=crop_size[0], resize_y=crop_size[1])
 		flip_coin = fn.random.coin_flip(probability=0.5)
-		cmn_images = fn.crop_mirror_normalize(resized, crop=(crop_size[1], crop_size[0]), mean=[0,0,0], std=[255,255,255], mirror=flip_coin, output_dtype=types.FLOAT, output_layout=types.NCHW, pad_output=False)
+		cmn_images = fn.crop_mirror_normalize(resized, crop=(crop_size[1], crop_size[0]),
+                                              mean=[0,0,0],
+                                              std=[255,255,255],
+                                              mirror=flip_coin,
+                                              output_dtype=types.FLOAT,
+                                              output_layout=types.NHWC,
+                                              pad_output=False)
 		trainPipe.set_outputs(cmn_images)
 	trainPipe.build()
 
-	valPipe = Pipeline(batch_size=TRAIN_BATCH_SIZE, num_threads=1, rocal_cpu=RUN_ON_HOST)
+	valPipe = Pipeline(batch_size=TRAIN_BATCH_SIZE, num_threads=1, rocal_cpu=RUN_ON_HOST, tensor_layout = types.NHWC)
 	with valPipe:
 		inputs = fn.readers.tfrecord(path=VAL_RECORDS_DIR, index_path = "", reader_type=TFRecordReaderType, user_feature_key_map=featureKeyMap,
 		features={
@@ -134,7 +141,13 @@ def main():
 		images = fn.decoders.image(jpegs, user_feature_key_map=featureKeyMap, output_type=types.RGB, path=VAL_RECORDS_DIR)
 		resized = fn.resize(images, resize_x=crop_size[0], resize_y=crop_size[1])
 		flip_coin = fn.random.coin_flip(probability=0.5)
-		cmn_images = fn.crop_mirror_normalize(resized, crop=(crop_size[1], crop_size[0]), mean=[0,0,0], std=[255,255,255], mirror=flip_coin, output_dtype=types.FLOAT, output_layout=types.NCHW, pad_output=False)
+		cmn_images = fn.crop_mirror_normalize(resized, crop=(crop_size[1], crop_size[0]),
+                                              mean=[0,0,0],
+                                              std=[255,255,255],
+                                              mirror=flip_coin,
+                                              output_dtype=types.FLOAT,
+                                              output_layout=types.NHWC,
+                                              pad_output=False)
 		valPipe.set_outputs(cmn_images)
 	valPipe.build()
 
@@ -148,23 +161,21 @@ def main():
 		while i < NUM_TRAIN_STEPS:
 
 			for t, (train_image_ndArray, train_label_ndArray) in enumerate(trainIterator, 0):
-				train_image_ndArray_transposed = np.transpose(train_image_ndArray, [0, 2, 3, 1])
 				train_label_one_hot_list = get_label_one_hot(train_label_ndArray)
 				train_loss, _, train_accuracy = sess.run(
 					[cross_entropy_mean, train_op, accuracy],
-					feed_dict={decoded_images: train_image_ndArray_transposed, labels: train_label_one_hot_list})
+					feed_dict={decoded_images: train_image_ndArray, labels: train_label_one_hot_list})
 				print ("Step :: %s\tTrain Loss :: %.2f\tTrain Accuracy :: %.2f%%\t" % (i, train_loss, (train_accuracy * 100)))
 				is_final_step = (i == (NUM_TRAIN_STEPS - 1))
 				if i % EVAL_EVERY == 0 or is_final_step:
 					mean_acc = 0
 					mean_loss = 0
 					print("\n\n-------------------------------------------------------------------------------- BEGIN VALIDATION --------------------------------------------------------------------------------")
 					for j, (val_image_ndArray, val_label_ndArray) in enumerate(valIterator, 0):
-						val_image_ndArray_transposed = np.transpose(val_image_ndArray, [0, 2, 3, 1])
 						val_label_one_hot_list = get_label_one_hot(val_label_ndArray)
 						val_loss, val_accuracy, val_prediction, val_target, correct_predicate = sess.run(
 							[cross_entropy_mean, accuracy, prediction, correct_label, correct_prediction],
-							feed_dict={decoded_images: val_image_ndArray_transposed, labels: val_label_one_hot_list})
+							feed_dict={decoded_images: val_image_ndArray, labels: val_label_one_hot_list})
 						mean_acc += val_accuracy
 						mean_loss += val_loss
 						num_correct_predicate = 0