Paddle API v4 Proposal #9912

cs2be · 2018-04-13T21:11:36Z

Training and Inference Program

train.py

READER_FILE_PATH = './data.recordio'
READER_BATCH_SIZE = 128
TRAIN_LOOP_STEPS = 100
PARAM_DIR = './params'

reader = fluid.batch_reader(file=READER_FILE_PATH,
                            batch_size=READER_BATCH_SIZE,
                            shape=[[13], [1]],
                            dtype=['float32', 'float32'],
                            format='recordio')

with reader.iterate() as (x, y):
  y_predict = fluid.layers.fc(input=x, size=1, act=None)

  cost = fluid.layers.square_error_cost(input=y_predict, label=y)
  avg_cost = fluid.layers.mean(cost)

  sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
  sgd_optimizer.minimize(avg_cost)

fluid.save_parameters(dir=PARAM_DIR)

infer.py

x = fluid.Tensor(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(name='y_predict', input=x, size=1, act=None)
return y_predict

Executing Training and Inference in Python

main.py

import paddle.fluid as fluid

# Training
program = fluid.compile('./train.py')
program.run()

# Inference
program = fluid.compile('./infer.py', params='./params')
inputs = {
  'x': [2,4,64,36,7,4,3,2,4,6,5,9,8]
}
results = program.run(input=inputs)
print(results['y_predict'])

Executing Training and Inference in Bash

>>> paddle train.py

>>> echo '{"x": [2,4,64,36,7,4,3,2,4,6,5,9,8]}' | paddle infer.py --format "json"
{"y_predict": 2.3}
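
To make the JSON exchange above concrete, here is a minimal sketch of how a request could be parsed and a response emitted. It is plain Python, not Fluid: the weights, the bias, and the function name infer_from_json are hypothetical stand-ins for the trained fc layer. Note that strict JSON requires double-quoted keys and no trailing commas, so Python's json module rejects input written with single quotes.

```python
import json

# Hypothetical stand-in for the trained fc layer: 13 weights and one bias.
WEIGHTS = [0.01] * 13
BIAS = 0.5


def infer_from_json(text):
    """Parse a JSON request, apply the toy linear model, return a JSON string."""
    request = json.loads(text)  # raises ValueError on malformed JSON
    x = request["x"]
    if len(x) != len(WEIGHTS):
        raise ValueError("expected %d features, got %d" % (len(WEIGHTS), len(x)))
    y = sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS
    return json.dumps({"y_predict": y})


if __name__ == "__main__":
    print(infer_from_json('{"x": [2,4,64,36,7,4,3,2,4,6,5,9,8]}'))
```

The response key mirrors the name given to the output layer in infer.py, so a caller can look up the result by layer name.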

Superjomn · 2018-04-15T11:36:56Z

It looks good, but I have a few minor questions:

fluid.io.create_checkpoint(file=CHECKPOINT_FILE)
fluid.io.create_checkpoint would be better as fluid.create_checkpoint. The API also needs some strategy built in, for example arguments that let the user control the checkpoint frequency.

>>> paddle train.py
Why not python train.py (and <<<)? Is paddle an alias of python, or another interpreter?

fluid.parse('./train.py')
train.py is a code file: it should be parsed, turned into underlying instructions, and finally executed by the interpreter. Parsing seems to be just one of those phases.
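
As an illustration of the checkpoint-frequency point above, here is a minimal sketch of a policy object that a checkpoint API could accept. Everything in it is hypothetical (the class name CheckpointPolicy and the arguments every_n_steps and every_n_seconds are invented for this example); it is not part of Fluid.

```python
import time


class CheckpointPolicy:
    """Hypothetical policy: checkpoint every N steps and/or every T seconds."""

    def __init__(self, every_n_steps=None, every_n_seconds=None,
                 clock=time.monotonic):
        self.every_n_steps = every_n_steps
        self.every_n_seconds = every_n_seconds
        self._clock = clock            # injectable so the policy can be tested
        self._last_time = clock()

    def should_save(self, step):
        """Return True if a checkpoint is due at this 1-based step."""
        due = False
        if self.every_n_steps and step % self.every_n_steps == 0:
            due = True
        if (self.every_n_seconds is not None
                and self._clock() - self._last_time >= self.every_n_seconds):
            due = True
        if due:
            # Restart the time window whenever a checkpoint is taken.
            self._last_time = self._clock()
        return due


if __name__ == "__main__":
    policy = CheckpointPolicy(every_n_steps=25)
    print([s for s in range(1, 101) if policy.should_save(s)])
```

Keeping the policy separate from the save call would let the training program stay declarative while the user still controls how often state is written.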

cs2be · 2018-04-16T22:02:00Z

@Superjomn Thanks for the feedback.

We've changed the implementation to use fluid.save_model. We realized that "checkpoint" is confusing, since we are not doing checkpoints in this example.
paddle in this case is an executable (it could be a Python script). Its job is to compile train.py/infer.py to a ProgramDesc and run it. In our discussion, we concluded that train.py shouldn't include the run logic, only the logic for the network and training.
We changed the API to fluid.compile, since it behaves somewhat like a compiler and could be optimized.
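
The parse, compile, and run phases discussed here have a close analogue in Python's own toolchain, which may help picture the split. The sketch below uses only the standard library (ast.parse, compile, exec); it is an analogy, not how Fluid builds a ProgramDesc, and the one-line SOURCE string is a made-up stand-in for train.py.

```python
import ast

# Made-up stand-in for the contents of a source file such as train.py.
SOURCE = "total = sum(range(5))\n"

# Phase 1: parse the text into a syntax tree. Nothing is executed yet.
tree = ast.parse(SOURCE, filename="<train.py>", mode="exec")

# Phase 2: compile the tree into a code object. In the proposal, the
# analogous artifact would be a ProgramDesc (analogy only).
code = compile(tree, "<train.py>", "exec")

# Phase 3: run the compiled artifact in a fresh namespace.
namespace = {}
exec(code, namespace)

if __name__ == "__main__":
    print(type(tree).__name__, type(code).__name__, namespace["total"])
```

Because the compiled artifact exists separately from the source, an optimizer can rewrite it between phase 2 and phase 3 without the author of the source file being involved.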

cs2be · 2018-04-18T22:55:11Z

Example of a ProgramDesc in imperative style:
batchLoopBlock.txt


panyx0718 · 2018-04-19T11:43:49Z

@PaddleCI

panyx0718 · 2018-04-19T11:52:36Z

Thanks for thinking about improving the API. I have a few questions:

Is it possible to feed/fetch data during the training loop?
Is it possible to have "python code" executed during the training loop? For example, I might want to use scipy or another library to preprocess the input data, or use numpy to calculate metrics on the result values.
Do we always need a separate main.py file to compile train.py and infer.py? Can we run train and test logic in a single source file in a loop? Say, I run train for 10 steps, and then run eval once.
What if I want to save the model once the training "loss" value is below a certain value? From your example, it seems I can only save the model after training has finished on all the data.
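
The control flow these questions ask for can be sketched in plain Python to show what a user might expect to express. This is not Fluid code: the functions predict, mse, and train, and the parameters eval_every, save_below, and save_fn, are all hypothetical. The loop runs one SGD step per iteration, evaluates every eval_every steps, and calls save_fn the first time the per-sample training loss drops below save_below.

```python
def predict(w, b, x):
    """Linear model: dot(w, x) + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mse(w, b, data):
    """Mean squared error over a list of (features, label) pairs."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)


def train(data, eval_data, steps=100, lr=0.05, eval_every=10,
          save_below=0.01, save_fn=None):
    """SGD loop with periodic eval and a loss-triggered, one-time save."""
    w, b = [0.0] * len(data[0][0]), 0.0
    eval_log, saved_at = [], None
    for step in range(1, steps + 1):
        x, y = data[(step - 1) % len(data)]      # "feed" one sample
        err = predict(w, b, x) - y
        train_loss = err ** 2                    # "fetch" the loss on the host
        if saved_at is None and save_fn is not None and train_loss < save_below:
            save_fn({"w": list(w), "b": b})      # save as soon as loss is low
            saved_at = step
        # Gradient step for squared error on this sample.
        w = [wi - lr * 2 * err * xi for wi, xi in zip(w, x)]
        b -= lr * 2 * err
        if step % eval_every == 0:               # run eval once every N steps
            eval_log.append((step, mse(w, b, eval_data)))
    return w, b, eval_log, saved_at


if __name__ == "__main__":
    points = [([i / 4.0], 2.0 * (i / 4.0) + 1.0) for i in range(5)]
    _, _, log, saved = train(points, points, save_fn=lambda params: None)
    print(log[-1], saved)
```

Arbitrary host-side code (preprocessing, metrics, conditional saves) sits naturally inside such a loop, which is the flexibility the compiled train.py design would need to match.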

wangkuiyi · 2018-04-20T03:15:56Z

@panyx0718 Very good summary! These are the points we must address confidently.

cs2be · 2018-04-25T17:41:12Z

This issue is closed. Please see #10152 for the current Fluid API proposal.

cs2be assigned cs2be, abhinavarora, varunarora, wangkuiyi, helinwang, tonyyang-svail and Superjomn Apr 13, 2018

cs2be changed the title ~~Fluid ProgramDesc Generation~~ Paddle API v4 Proposal Apr 13, 2018

panyx0718 self-assigned this Apr 17, 2018

cs2be added a commit to abhinavarora/Paddle that referenced this issue Apr 18, 2018

Please refer to PaddlePaddle#9912. Add in initial design doc for Impe…

fcd2543

…rative Programming. Add scaffold for the imperative Reader.

cs2be mentioned this issue Apr 18, 2018

[WIP] Imperative Programming Scaffold #10039

Closed

helinwang mentioned this issue Apr 20, 2018

Fluid data pipeline interface #10102

Closed

wangkuiyi mentioned this issue Apr 20, 2018

Design Doc: Complete Fluid #10103

Closed

This was referenced Apr 24, 2018

Paddle API v4 proposal #10152

Closed

How to initialize a Fluid program? #10177

Closed

cs2be closed this as completed Apr 25, 2018
