
Update faq/local/index_en.rst #9947

Merged: 2 commits into PaddlePaddle:develop on Apr 23, 2018

Conversation

@jamesbing (Contributor) commented on Apr 16, 2018

translation version 1.0
fix #8953

@CLAassistant commented on Apr 16, 2018

CLA assistant check
All committers have signed the CLA.

@shanyi15 changed the title from "Update index_en.rst" to "Update faq/local/index_en.rst" on Apr 16, 2018
TBD
.. contents::

1. Reduce Memory Consuming
Review comment (Contributor): Consuming -> Consumption

1. Reduce Memory Consuming
-------------------

The training procedure of neural networks demands dozens gigabytes of host memory or serval gigabytes of device memory, which is a rather memory consuming work. The memory consumed by PaddlePaddle framework mainly includes:
Review comment (Contributor): dozens gigabytes -> dozens of gigabytes

Reduce DataProvider cache memory
++++++++++++++++++++++++++

PyDataProvider works under asynchronously mechanism, it loads together with the data fetch and shuffle procedure in host memory:
Review comment (Contributor): asynchronously -> asynchronous

Data Files -> Host Memory Pool -> PaddlePaddle Training

Thus the reduction of the DataProvider cache memory can reduce memory occupancy, meanwhile speed up the data loading procedure before training. However, the size of the memory pool can actually effect the granularity of shuffle,which means a shuffle operation is needed before each data file reading process to ensure the randomness of data when try to reduce the size of the memory pool.
Review comment (Contributor): effect -> affect


.. literalinclude:: src/reduce_min_pool_size.py

In such way, the memory consuming can be significantly reduced and hence the training procedure can be accelerated. More details are demonstrated in :ref:`api_pydataprovider2`.
Review comment (Contributor): In such way -> In this way

Review comment (Contributor): memory consuming -> memory consumption
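The literalinclude above points at src/reduce_min_pool_size.py, which is not reproduced in this page. As a rough, hedged sketch of the technique being discussed (assuming the legacy paddle.trainer.PyDataProvider2 API; the input types and the line parser below are made up for illustration), a provider can shrink the host memory pool and shuffle the data file itself beforehand:

.. code-block:: python

    import os
    from paddle.trainer.PyDataProvider2 import provider, dense_vector, integer_value

    def get_sample_from_line(line):
        # Hypothetical parser: ten float features followed by an integer label.
        fields = line.split()
        return {'x': [float(v) for v in fields[:10]], 'y': int(fields[10])}

    @provider(
        input_types={'x': dense_vector(10), 'y': integer_value(2)},
        min_pool_size=0)  # keep the host memory pool as small as possible
    def process(settings, filename):
        # With a tiny pool the framework can no longer shuffle across a large
        # cache, so shuffle the data file itself before each reading pass.
        os.system('shuf %s > %s.shuf' % (filename, filename))
        with open('%s.shuf' % filename) as f:
            for line in f:
                yield get_sample_from_line(line)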


* Parameters or gradients during training are oversize, which leads to floating overflow during calculation.
* The model failed to convergence and divert to a big value.
* Errors in training data leads to parameters converge to a singularity situation. This may also due to the large scale of input data, which contains millions of parameter values, and that will raise float overflow when operating matrix multiplication.
Review comment (Contributor): "Errors in training data leads to parameters converge to a singularity situation." This sentence does not make any sense. What are you trying to say here?

Review comment (Contributor): also due to -> also be due to


Details can refer to example `machine translation <https://github.com/PaddlePaddle/book/blob/develop/08.machine_translation/train.py#L66>`_ 。

The main difference of these two methods are:
Review comment (Contributor): of these two -> between these two


The main difference of these two methods are:

1. They both block the gradient, but within different occasion,the former one happens when then :code:`optimzier` updates the network parameters while the latter happens when the back propagation computing of activation functions.
Review comment (Contributor): "but within different occasion". What does this mean?
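For reference, a minimal sketch of the two clipping methods discussed above, assuming the legacy paddle.v2 API; the thresholds, layer sizes, and the input layer are placeholder values, not the exact code from the linked machine translation example:

.. code-block:: python

    import paddle.v2 as paddle

    paddle.init(use_gpu=False, trainer_count=1)
    data = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(128))

    # Method 1: clip parameter gradients when the optimizer applies the update.
    optimizer = paddle.optimizer.Adam(
        learning_rate=5e-4,
        gradient_clipping_threshold=10.0)

    # Method 2: clip the error (the gradient flowing through the activation)
    # of a single layer during back propagation.
    hidden = paddle.layer.fc(
        input=data,
        size=512,
        act=paddle.activation.Relu(),
        layer_attr=paddle.attr.ExtraLayerAttribute(error_clipping_threshold=100.0))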

* Output sequence layer and non sequence layer;
* Multiple output layers process multiple sequence with different length;

Such issue can be avoid by calling infer interface and set :code:`flatten_result=False`. Thus, the infer interface returns a python list, in which
Review comment (Contributor): avoid -> avoided
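A hedged sketch of the flatten_result suggestion above, again assuming the legacy paddle.v2 inference API; seq_out, non_seq_out, trained_parameters, and test_batch are placeholders for the user's own output layers, trained parameters, and input data, and the exact way flatten_result is passed may differ between versions:

.. code-block:: python

    import paddle.v2 as paddle

    # Build the inference object over both output layers, then request the
    # non-flattened results so outputs of different lengths stay separate.
    inferer = paddle.inference.Inference(
        output_layer=[seq_out, non_seq_out],   # placeholder output layers
        parameters=trained_parameters)         # placeholder trained parameters
    results = inferer.infer(input=test_batch, flatten_result=False)

    # results is now a Python list with one entry per output layer/field.
    for item in results:
        print(item)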

7. Fetch parameters’ weight and gradient during training
-----------------------------------------------

Under certain situations, know the weights of currently training mini-batch can provide more inceptions of many problems. Their value can be acquired by printing values in :code:`event_handler` (note that to gain such parameters when training on GPU, you should set :code:`paddle.event.EndForwardBackward`). Detailed code is as following:
Review comment (Contributor): know -> knowing
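A minimal sketch of such an event handler, assuming the legacy paddle.v2 trainer API; parameters is assumed to be the paddle.parameters.Parameters object the trainer was built with, and the printing interval of 25 batches is arbitrary:

.. code-block:: python

    import paddle.v2 as paddle

    def make_event_handler(parameters):
        def event_handler(event):
            # EndForwardBackward fires right after the backward pass of a
            # batch, which is when both weights and gradients can be read,
            # including when training on GPU (as the quoted text notes).
            if isinstance(event, paddle.event.EndForwardBackward):
                if event.batch_id % 25 == 0:
                    for name in parameters.keys():
                        print(name, parameters.get(name), parameters.get_grad(name))
        return event_handler

    # Usage sketch: pass it to the trainer, e.g.
    # trainer.train(reader=train_reader,
    #               event_handler=make_event_handler(parameters),
    #               num_passes=10)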

@abhinavarora merged commit d060a7f into PaddlePaddle:develop on Apr 23, 2018

Successfully merging this pull request may close these issues.

Translation Plan - Local Training and Inference - Chinese to English (#8953)
4 participants