Paddle API v4 Proposal #9912

Closed
cs2be opened this issue Apr 13, 2018 · 7 comments

@cs2be
Contributor

cs2be commented Apr 13, 2018

Training and Inference Program

train.py

READER_FILE_PATH = './data.recordio'
READER_BATCH_SIZE = 128
TRAIN_LOOP_STEPS = 100
PARAM_DIR = './params'

reader = fluid.batch_reader(file=READER_FILE_PATH,
                            batch_size=READER_BATCH_SIZE,
                            shape=[[13], [1]],
                            dtype=['float32', 'float32'],
                            format='recordio')

with reader.iterate() as (x, y):
  y_predict = fluid.layers.fc(input=x, size=1, act=None)

  cost = fluid.layers.square_error_cost(input=y_predict, label=y)
  avg_cost = fluid.layers.mean(cost)

  sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
  sgd_optimizer.minimize(avg_cost)

fluid.save_parameters(dir=PARAM_DIR)
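
For readers less familiar with the layers used in train.py, the arithmetic it describes (one fully connected unit, squared error, mean, SGD) can be sketched in plain Python. This is only an illustration of the math under simple assumptions: it does not use Paddle, it keeps everything in lists, and the helper names fc, sgd_step and train are hypothetical.

```python
import random


def fc(x, w, b):
    """Fully connected layer with a single output unit and no activation."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def sgd_step(batch, w, b, learning_rate):
    """Apply one SGD update on the mean squared error of `batch`.

    Returns the updated weights, the updated bias, and the loss
    measured before the update.
    """
    n = len(batch)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    loss = 0.0
    for x, y in batch:
        err = fc(x, w, b) - y          # prediction error for one sample
        loss += err * err / n          # mean of the squared errors
        for j, xj in enumerate(x):
            grad_w[j] += 2.0 * err * xj / n
        grad_b += 2.0 * err / n
    new_w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b, loss


def train(data, steps=100, batch_size=128, learning_rate=0.001, seed=0):
    """Run `steps` mini-batch SGD updates over (features, label) pairs."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    losses = []
    for _ in range(steps):
        batch = rng.sample(data, min(batch_size, len(data)))
        w, b, loss = sgd_step(batch, w, b, learning_rate)
        losses.append(loss)
    return w, b, losses
```

The defaults mirror READER_BATCH_SIZE, TRAIN_LOOP_STEPS and the learning rate from train.py; the returned w and b play the role of the parameters that the proposal saves to PARAM_DIR.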

infer.py

x = fluid.Tensor(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(name='y_predict', input=x, size=1, act=None)
return y_predict

Executing Training and Inference in Python

main.py

# Assumes the proposed API is exposed under paddle.fluid
import paddle.fluid as fluid

# Training
program = fluid.compile('./train.py')
program.run()

# Inference
program = fluid.compile('./infer.py', params='./params')
inputs = {
  'x': [2,4,64,36,7,4,3,2,4,6,5,9,8]
}
results = program.run(input=inputs)
print(results['y_predict'])
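
The inference call in main.py feeds inputs by name and reads outputs by name. A minimal stand-in for that contract might look like the following, assuming the compiled program is a single linear layer. LinearProgram, its constructor arguments and the weights used here are hypothetical and are not part of the proposal.

```python
class LinearProgram:
    """Hypothetical stand-in for a compiled inference program."""

    def __init__(self, weights, bias, input_name='x', output_name='y_predict'):
        self.weights = list(weights)
        self.bias = bias
        self.input_name = input_name
        self.output_name = output_name

    def run(self, input):
        """Look up the named input, apply the layer, return named outputs."""
        x = input[self.input_name]
        if len(x) != len(self.weights):
            raise ValueError('expected %d features, got %d'
                             % (len(self.weights), len(x)))
        y = sum(xi * wi for xi, wi in zip(x, self.weights)) + self.bias
        return {self.output_name: y}


program = LinearProgram(weights=[0.1] * 13, bias=0.0)
results = program.run(input={'x': [2, 4, 64, 36, 7, 4, 3, 2, 4, 6, 5, 9, 8]})
print(results['y_predict'])
```

Keying both sides by name is what lets infer.py declare name='x' and name='y_predict' once and have callers refer to them without knowing the network's structure.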

Executing Training and Inference in Bash

>>> paddle train.py
>>> echo '{
  "x": [2,4,64,36,7,4,3,2,4,6,5,9,8]
}' | paddle infer.py --format "json"
{"y_predict": 2.3}
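
What the paddle executable might do with --format "json" can be sketched with Python's standard json module. This is an assumption about behavior the issue does not spell out; run_json and the predict callable are hypothetical names. One practical detail: json.loads accepts only strict JSON, so keys must be double-quoted and trailing commas are rejected.

```python
import json


def run_json(text, predict):
    """Parse a JSON request, run `predict` on input 'x', return a JSON reply."""
    inputs = json.loads(text)            # raises ValueError on invalid JSON
    y = predict(inputs['x'])
    return json.dumps({'y_predict': y})


# Example with a placeholder model that averages its input.
reply = run_json('{"x": [1, 2, 3]}', lambda x: sum(x) / len(x))
print(reply)
```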

cs2be changed the title from "Fluid ProgramDesc Generation" to "Paddle API v4 Proposal" on Apr 13, 2018
@Superjomn
Contributor

It looks good, but I have several trivial questions:

  1. fluid.io.create_checkpoint(file=CHECKPOINT_FILE)
    fluid.io.create_checkpoint would be better as fluid.create_checkpoint. The API also needs some strategy, such as arguments that let the user control the checkpoint frequency.

  2. >>> paddle train.py
    Why not python train.py? And regarding <<<: is paddle an alias of python, or another interpreter?

  3. fluid.parse('./train.py')
    train.py is a code file that should be parsed, translated into underlying instructions, and finally executed by the interpreter; parsing seems to be just one of those phases.
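
The checkpoint-frequency point in item 1 could be expressed as a small policy object that the training loop consults on every step. This is a sketch of one possible shape, not an API from the proposal; CheckpointPolicy and every_n_steps are hypothetical names.

```python
class CheckpointPolicy:
    """Hypothetical policy: checkpoint once every `every_n_steps` steps."""

    def __init__(self, every_n_steps):
        if every_n_steps <= 0:
            raise ValueError('every_n_steps must be positive')
        self.every_n_steps = every_n_steps

    def should_save(self, step):
        """Return True when `step` (counted from 1) is a checkpoint step."""
        return step > 0 and step % self.every_n_steps == 0


policy = CheckpointPolicy(every_n_steps=5)
checkpoint_steps = [s for s in range(1, 11) if policy.should_save(s)]
print(checkpoint_steps)
```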

@cs2be
Contributor Author

cs2be commented Apr 16, 2018

@Superjomn Thanks for the feedback.

  1. We've changed the implementation to use fluid.save_model. We realized that "checkpoint" is confusing, since this example does not do any checkpointing.

  2. paddle in this case is an executable (it could be a Python script). Its job is to compile train.py/infer.py to a ProgramDesc and run it. In our discussion, we concluded that train.py shouldn't include the run logic, only the logic for the network and training.

  3. We changed the API to fluid.compile, since it behaves somewhat like a compiler and could be optimized.
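
As an analogy for the parse/compile/run distinction discussed here, Python's own toolchain exposes the same three phases separately. The sketch below uses only the standard library and says nothing about how Paddle would actually generate a ProgramDesc; the one-line source string is a made-up stand-in for a model file.

```python
import ast

source = "y_predict = 2 * x + 1\n"

# Phase 1: parse the source text into a syntax tree.
tree = ast.parse(source, filename='infer.py')

# Phase 2: compile the tree into an executable code object.
code = compile(tree, filename='infer.py', mode='exec')

# Phase 3: execute the code object with the inputs bound by name.
namespace = {'x': 3}
exec(code, namespace)
result = namespace['y_predict']
print(result)
```

Because the compiled object exists separately from its execution, it can be inspected, cached or transformed before it runs, which is the property a name like compile suggests.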

panyx0718 self-assigned this on Apr 17, 2018
@cs2be
Contributor Author

cs2be commented Apr 18, 2018

Example of a ProgramDesc in the imperative style:
batchLoopBlock.txt

cs2be added a commit to abhinavarora/Paddle that referenced this issue Apr 18, 2018
…rative Programming. Add scaffold for the imperative Reader.
@panyx0718
Contributor

@PaddleCI

@panyx0718
Contributor

Thanks for thinking about improving the API. I have a few questions:

  1. Is it possible to feed/fetch data during the training loop?

  2. Is it possible to have "python code" executed during the training loop? For example, I might want to use scipy or another library to preprocess the input data, or use numpy to calculate some metrics on the result values.

  3. Do we always need a separate main.py file to compile train.py and infer.py? Can we run train and test logic in a single source file, in a loop? Say, I run train for 10 steps and then run eval once.

  4. What if I want to save the model once the training "loss" value falls below a certain value? From your example, it seems I can only save the model after training on all the data has finished.
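
The control flow these questions ask for can be made concrete with a plain Python loop. This is a sketch of the requirement, not of any existing or proposed Paddle API: run_training and its train_step, evaluate and save callables are hypothetical, and the loss function in the usage lines is a placeholder.

```python
def run_training(train_step, evaluate, save, max_steps,
                 eval_every=10, loss_threshold=0.1):
    """Train step by step, evaluate periodically, save once the loss is low.

    Returns the step at which the model was saved (or None) and the list
    of evaluation results collected along the way.
    """
    eval_results = []
    for step in range(1, max_steps + 1):
        loss = train_step(step)              # arbitrary Python can run here
        if step % eval_every == 0:
            eval_results.append(evaluate(step))
        if loss < loss_threshold:
            save(step)                       # save as soon as the loss is low enough
            return step, eval_results
    return None, eval_results


saved_steps = []
stopped_at, evals = run_training(
    train_step=lambda step: 10.0 / step,     # placeholder: loss shrinks over time
    evaluate=lambda step: step,              # placeholder: records when eval ran
    save=saved_steps.append,
    max_steps=100,
    eval_every=10,
    loss_threshold=0.45,
)
print(stopped_at, evals, saved_steps)
```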

@wangkuiyi
Collaborator

@panyx0718 Very good summary! These are the points we must be able to address confidently.

@cs2be
Contributor Author

cs2be commented Apr 25, 2018

This issue is closed. Please see #10152 for the current Fluid API proposal.

cs2be closed this as completed on Apr 25, 2018