
Model format proposal. #7329

Closed
kavyasrinet opened this issue Jan 8, 2018 · 7 comments
Assignees
Labels
Prediction (originally named "Inference"; includes C-API prediction issues, etc.)

Comments

@kavyasrinet

kavyasrinet commented Jan 8, 2018

As of now, when we save a model in Fluid, we store a separate file for each parameter. I want to look further into how we can structure this better.

@kavyasrinet kavyasrinet self-assigned this Jan 8, 2018
@kavyasrinet kavyasrinet added the Prediction (originally named "Inference"; includes C-API prediction issues, etc.) label Jan 8, 2018
@kavyasrinet
Author

kavyasrinet commented Jan 8, 2018

The current approach in Fluid is as follows:

import os

save_program = Program()
save_block = save_program.global_block()

for each_var in vars:
    # Clone the variable into the new program's global block.
    new_var = _clone_var_in_block_(save_block, each_var)
    # Append one 'save' op per variable; each op writes its variable
    # to a separate file named after the variable.
    save_block.append_op(
        type='save',
        inputs={'X': [new_var]},
        outputs={},
        attrs={'file_path': os.path.join(dirname, new_var.name)})

This is an excerpt taken from the save_vars function.

Here, we iterate over all the variables/parameters one by one and create a separate file for each variable in the parent directory.

I am looking at different ways of achieving this more efficiently, without having to create a new file for each variable.

Proposal 1

Inside each save op, we write to the file name, which is an attribute (file_path) of the operator. One proposal is to append to a single model file instead of writing to a per-variable file. So instead of opening the file in write mode here, we can open it in append mode, i.e. do something similar to the following:

#include <fstream>

std::ofstream outfile;
if (append) {
  // Append to the shared model file instead of truncating it.
  outfile.open(name, std::ios_base::app);
} else {
  // Current behavior: overwrite, one file per variable.
  outfile.open(name);
}

Overall, we end up with one model file: the save operators append their outputs one after another to the same file, and the final result is a single binary file.
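
A minimal Python sketch of this idea is below; serialize_tensor is a hypothetical helper standing in for however the save op actually encodes a tensor (name, shape, dtype, data):

def save_all_to_one_file(var_list, model_path, serialize_tensor):
    # Append every variable's serialized bytes to one binary model file.
    with open(model_path, "ab") as outfile:  # binary append mode
        for each_var in var_list:
            outfile.write(serialize_tensor(each_var))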

Proposal 2

This proposal is similar to how the most recent versions of TensorFlow store a model. TensorFlow's save method writes three different kinds of files because it stores the graph structure separately from the variable values. The .meta file describes the saved graph structure, so we need to import it before restoring the checkpoint (otherwise the restore step does not know which variables the saved checkpoint values correspond to).

In TensorFlow, we have the following files every time a checkpoint is saved:

  1. The meta file: This file describes the saved graph structure and includes GraphDef, SaverDef, and so on. Calling tf.train.import_meta_graph('/tmp/model.ckpt.meta') restores the Saver and the Graph.
  2. The index file: This file is an immutable string-to-string table (tensorflow::table::Table). Each key is the name of a tensor, and its value is a serialized BundleEntryProto. Each BundleEntryProto describes the metadata of a tensor: which of the "data" files contains the content of the tensor, the offset into that file, a checksum, some auxiliary data, etc.
  3. The data file: This file is a TensorBundle collection and holds the values of all saved variables.

When restoring the model, the following block of code first restores the graph structure from the meta file and then restores all the values of the variables:

with tf.Session() as sess:
    # First rebuild the graph (and the Saver) from the .meta file,
    saver = tf.train.import_meta_graph('/tmp/model.ckpt.meta')
    # then restore the variable values from the checkpoint prefix.
    saver.restore(sess, "/tmp/model.ckpt")

There is no file literally named model.ckpt, but we still refer to the saved checkpoint by that name when restoring it, since the Saver source code resolves the actual file names from that prefix. In our case, the meta file would be the ProgramDesc.
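
For completeness, the saving side that produces those three files looks roughly like the following sketch (TensorFlow 1.x API, matching the restore snippet above; the variable is just a placeholder example):

import tensorflow as tf

w = tf.Variable(tf.zeros([10, 10]), name="w")
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writes model.ckpt.meta, model.ckpt.index, and
    # model.ckpt.data-00000-of-00001 under /tmp.
    saver.save(sess, "/tmp/model.ckpt")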

@kexinzhao
Contributor

@kavyasrinet In your Proposal 1, would there be two files (one for model and one for parameters) or one file (for both model and parameters)?

@Xreki
Contributor

Xreki commented Jan 9, 2018

I think the design in @kavyasrinet's Proposal 1 is to save the protobuf message and the parameters into one file, a single binary file. In this design, the file_path attribute of save_op would be set to the same file for every variable, and we would need another attribute, offset, to record each variable's start offset in the saved file.
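
As a hedged sketch of that bookkeeping (the serialized blobs and attribute wiring are hypothetical, only to make the offset idea concrete):

def append_with_offsets(var_blobs, model_path):
    # var_blobs: iterable of (name, bytes) pairs, where the bytes are
    # the already-serialized tensor (produced elsewhere).
    offsets = {}
    with open(model_path, "ab") as f:
        for name, blob in var_blobs:
            offsets[name] = f.tell()  # start offset of this record
            f.write(blob)
    # Each save_op could then carry attrs like
    # {'file_path': model_path, 'offset': offsets[name]}.
    return offsets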

@abhinavarora abhinavarora self-assigned this Jan 9, 2018
@abhinavarora
Contributor

I prefer storing the parameters and the protobuf message separately, because users might want to try different parameters with the same inference model. For easy distribution of the model, we can always write a small wrapper that zips the model and the parameter files together, as in the sketch below.
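
A minimal sketch of such a wrapper, using Python's standard zipfile module (the __model__ archive name is just an illustrative choice):

import zipfile

def bundle_model(program_desc_path, param_paths, out_path):
    # Zip the serialized ProgramDesc together with the per-parameter
    # files; the pieces stay separate (and swappable) inside the archive.
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.write(program_desc_path, arcname="__model__")
        for p in param_paths:
            zf.write(p)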

@kavyasrinet
Author

kavyasrinet commented Jan 9, 2018

@Xreki Thanks for clarifying, yes that's the approach I was proposing.
@abhinavarora I have thought about that approach, but its design will need to be well thought out before proposing it, since a bunch of additional components would need to be written to make it work with PaddlePaddle. The first approach I proposed above is the fastest to implement. I will write more about other approaches today/tomorrow.

@kexinzhao kexinzhao self-assigned this Jan 9, 2018
@kavyasrinet
Author

kavyasrinet commented Jan 9, 2018

Hi @Xreki and @kexinzhao, I have added another proposal in the discussion above. I am working on understanding a couple more.

@sidgoyal78
Contributor

sidgoyal78 commented Jan 19, 2018

Even in Proposal 1, it is a bit more involved than simply appending, since we need to split the merged file back into separate parameter files. That seems non-trivial because different weights can have different numbers of parameters, so the record length won't be fixed. Hence, some extra information has to be kept somewhere to successfully recover the separate params; one way is sketched below.
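
For instance, a length-prefixed record format would make the merged file self-describing; this is a hypothetical encoding, not Fluid's actual format:

import struct

def write_record(f, name, blob):
    # Record layout: [name_len][name][data_len][data], little-endian uint64s.
    name_bytes = name.encode("utf-8")
    f.write(struct.pack("<Q", len(name_bytes)))
    f.write(name_bytes)
    f.write(struct.pack("<Q", len(blob)))
    f.write(blob)

def read_records(f):
    # Yield (name, blob) pairs until end of file.
    while True:
        header = f.read(8)
        if not header:
            break
        (name_len,) = struct.unpack("<Q", header)
        name = f.read(name_len).decode("utf-8")
        (data_len,) = struct.unpack("<Q", f.read(8))
        yield name, f.read(data_len)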
