Open
Description
Hello! I've found a performance issue in tensorlayer/examples: batch() should be called before map(), which could make your program more efficient. Here is the TensorFlow documentation to support it.
A detailed description is listed below:
- examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size) (here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size) (here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size) (here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size) (here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_quanconv_cifar10.py: train_ds = train_ds.batch(batch_size) (here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_quanconv_cifar10.py: test_ds = test_ds.batch(batch_size) (here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size) (here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size) (here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/data_process/tutorial_fast_affine_transform.py: dataset = dataset.batch(batch_size) (here) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/data_process/tutorial_tf_dataset_voc.py: ds = ds.batch(batch_size) (here) should be called before ds = ds.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/basic_tutorials/tutorial_cifar10_cnn_static.py: train_ds = train_ds.batch(batch_size) (here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/basic_tutorials/tutorial_cifar10_cnn_static.py: test_ds = test_ds.batch(batch_size) (here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count()) (here).
- examples/deprecated_tutorials/tutorial_imagenet_inceptionV3_distributed.py: dataset = dataset.batch(batch_size) (here) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=max_cpus) (here).
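To make the proposed reordering concrete, here is a minimal sketch on synthetic data. The function name scale, the tensor shapes, and the toy dataset are made up for illustration and are not taken from the tensorlayer examples; the sketch only assumes TensorFlow 2.x with eager execution. It builds the same pipeline in both orders and shows that an elementwise map function gives identical batches either way, while the batch-first version invokes the function once per batch instead of once per example.

```python
import multiprocessing

import tensorflow as tf


def scale(images, labels):
    # Hypothetical map function: elementwise, so it works unchanged
    # on a single image or on a whole batch of images.
    return tf.cast(images, tf.float32) / 255.0, labels


# Synthetic stand-in data: 8 "images" of shape (4, 4, 3) with integer labels.
images = tf.random.uniform((8, 4, 4, 3), maxval=256, dtype=tf.int32)
labels = tf.range(8)
batch_size = 4
n_calls = multiprocessing.cpu_count()

# Current order in the examples: map runs once per example, then batch.
map_then_batch = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(scale, num_parallel_calls=n_calls)
    .batch(batch_size)
)

# Proposed order: batch first, so map runs once per batch.
batch_then_map = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(batch_size)
    .map(scale, num_parallel_calls=n_calls)
)

# Both pipelines should yield the same batches for this elementwise function.
for (a, la), (b, lb) in zip(map_then_batch, batch_then_map):
    same_images = bool(tf.reduce_all(a == b))
    same_labels = bool(tf.reduce_all(la == lb))
    print(a.shape, same_images, same_labels)
```

How much time this saves depends on how cheap the map function is relative to the per-call overhead, so it is worth measuring on the real pipelines before and after the change.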
Besides, you need to check whether the function called in map() (e.g., _map_fn called in dataset.map()) is affected by the change, so that the modified code still works properly. For example, if _map_fn takes input of shape (x, y, z) before the fix, it will receive input of shape (batch_size, x, y, z) after the fix.
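As a rough illustration of that shape change, the sketch below uses a made-up per-example function, center_per_example, which subtracts an image's own mean. None of these function names come from the tensorlayer examples. It shows two possible adaptations for batched input: rewriting the reduction over the per-image axes, or wrapping the unchanged function in tf.map_fn, which keeps the result correct but gives up most of the vectorization benefit.

```python
import tensorflow as tf


def center_per_example(img):
    # Hypothetical original map function: expects ONE image of shape (x, y, z)
    # and subtracts the mean of that image.
    return img - tf.reduce_mean(img)


def center_batched(imgs):
    # Vectorized rewrite: expects (batch_size, x, y, z) and reduces over
    # each image's own axes, keeping dims so the subtraction broadcasts.
    return imgs - tf.reduce_mean(imgs, axis=[1, 2, 3], keepdims=True)


def center_via_map_fn(imgs):
    # Fallback: apply the unchanged per-example function to each batch row.
    return tf.map_fn(center_per_example, imgs)


imgs = tf.random.uniform((4, 5, 5, 3))

# Reference result: the per-example function applied image by image.
expected = tf.stack([center_per_example(i) for i in tf.unstack(imgs)])

err_batched = float(tf.reduce_max(tf.abs(center_batched(imgs) - expected)))
err_map_fn = float(tf.reduce_max(tf.abs(center_via_map_fn(imgs) - expected)))
print(err_batched, err_map_fn)
```

Note that calling center_per_example directly on the batch would not raise an error: it would subtract the mean of the whole batch instead of each image's mean, so this kind of change can fail silently.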
Looking forward to your reply. By the way, I would be glad to create a PR to fix it if you are too busy.
Activity
zsdonghao commented on Aug 20, 2021
Thanks, we will check it ASAP.
DLPerf commented on Aug 31, 2021
Hello,
How long do you need to confirm this problem? @zsdonghao
Thank you~
hanjr92 commented on Sep 8, 2021
Sorry for the late reply! I will modify them and update. @DLPerf