# Model format proposal #7329
The current approach in Fluid is as follows:

```python
save_program = Program()
save_block = save_program.global_block()
for each_var in vars:
    new_var = _clone_var_in_block_(save_block, each_var)
    save_block.append_op(
        type='save',
        inputs={'X': [new_var]},
        outputs={},
        attrs={'file_path': os.path.join(dirname, new_var.name)})
```

This is an excerpt from Fluid's current save logic. Here, we iterate over all the variables/parameters one by one and create a separate file for each variable in the parent directory. I am looking at different ways of achieving this efficiently, without having to create a new file for each variable.

## Proposal 1

Inside each op, we write to the file name, which is an attribute of the operator:

```cpp
std::ofstream outfile;
if (append)
  outfile.open(name, std::ios_base::app);
else
  outfile.open(name);
```

Overall, we have one model file that contains a list of operators, appended one after another to the same file; the final file is a binary file.

## Proposal 2

This proposal is similar to how the most recent version of TensorFlow stores the model. In TensorFlow, the following files are created every time a checkpoint is saved: a `.meta` file holding the graph structure, `.data-XXXXX-of-XXXXX` and `.index` files holding the variable values, and a `checkpoint` file recording the most recent checkpoint prefix.

When restoring the model, the following block of code first restores the graph structure from the `.meta` file and then restores the variable values:

```python
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('/tmp/model.ckpt.meta')
    saver.restore(sess, "/tmp/model.ckpt")
```

There is no file specifically named `model.ckpt`; it is only the common prefix of the checkpoint files.
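The append behaviour in Proposal 1 can be sketched in Python (the function name and file path here are illustrative, not the actual Fluid API): each variable's serialized bytes are written into one binary model file, truncating on the first write and appending afterwards.

```python
def append_var(path, data, append=True):
    # Mirrors the C++ snippet above: open in append mode (std::ios_base::app)
    # or truncate, then write this variable's serialized bytes.
    mode = "ab" if append else "wb"
    with open(path, mode) as f:
        f.write(data)

# Illustrative use: write two serialized "tensors" into one model file.
append_var("model.bin", b"\x00\x01\x02", append=False)  # first write truncates
append_var("model.bin", b"\x03\x04")                    # later writes append
```

The result is a single binary file holding all variables back to back, which is exactly the layout Proposal 1 describes.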
@kavyasrinet In your Proposal 1, would there be two files (one for the model and one for the parameters) or one file (for both the model and the parameters)?
I think the design in @kavyasrinet's Proposal 1 is to save the protobuf message and the parameters into one file, a single binary file. In this design, the `append` attribute would control whether each save op writes into that same file.
I prefer storing the parameters and the protobuf message separately, because users might like to explore different parameters for the same inference model. For easy distribution of the model, we can always write a small wrapper that zips the model and the parameter files together.
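The small wrapper mentioned above could be little more than a zip step; a minimal sketch, where the file names (`inference_model`, `fc_0.w_0`) are hypothetical placeholders:

```python
import zipfile

# Placeholder model/parameter files standing in for real Fluid output.
with open("inference_model", "wb") as f:
    f.write(b"protobuf-bytes")
with open("fc_0.w_0", "wb") as f:
    f.write(b"weights")

def bundle(model_path, param_paths, out_path):
    # Pack the inference model file and its parameter files
    # into one distributable archive.
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(model_path)
        for p in param_paths:
            zf.write(p)

bundle("inference_model", ["fc_0.w_0"], "model_bundle.zip")
```

Unzipping restores the separate files, so the on-disk format for loading stays unchanged.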
@Xreki Thanks for clarifying; yes, that is the approach I was proposing.
Hi @Xreki and @kexinzhao, I have added another proposal in the discussion above. I am working on understanding a couple more.
Even in Proposal 1, it is a bit more involved than simply appending, since we need to split the merged file back into separate parameter files (and that seems non-trivial, because different weights can have different numbers of parameters, so the record length won't be fixed). Hence, some extra information has to be kept somewhere to successfully retrieve the separate parameters.
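One common way to keep that extra information is to length-prefix each record in the merged file, so it can be split back into per-parameter blobs. A minimal sketch (this framing is illustrative, not Fluid's actual format):

```python
import struct

def write_record(f, name, data):
    # Each record: name length (u32), name bytes, payload length (u64), payload.
    nb = name.encode("utf-8")
    f.write(struct.pack("<I", len(nb)))
    f.write(nb)
    f.write(struct.pack("<Q", len(data)))
    f.write(data)

def read_records(f):
    # Walk the file record by record, recovering each parameter by name.
    records = {}
    while True:
        hdr = f.read(4)
        if not hdr:
            break
        (nlen,) = struct.unpack("<I", hdr)
        name = f.read(nlen).decode("utf-8")
        (dlen,) = struct.unpack("<Q", f.read(8))
        records[name] = f.read(dlen)
    return records
```

With this layout, variable-length weights are no problem: the reader never guesses a record's size, it reads it from the prefix.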
As of now, we store a separate file for each parameter when we save the model in Fluid; I want to look further into how we can structure this better.