checkpoint doc #19

Merged
merged 8 commits into from
Jun 28, 2018

Conversation

seiriosPlus
Collaborator

No description provided.

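4. During **distributed training**, every Trainer saves its own parameters in the checkpoint_dir directory (only Trainer 0 saves the model parameters). A **distributed file system (HDFS, etc.)** is needed to merge the data written to the same checkpoint_dir directory into a complete set, and training must be resumed from that complete data.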
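
The merge step in item 4 can be sketched as follows. This is only an illustration under stated assumptions: `merge_checkpoint_dirs` is a hypothetical helper, the file names are made up, and it uses local directories to stand in for what a shared file system such as HDFS would provide. It is not PaddlePaddle's API or its on-disk checkpoint format.

```python
import filecmp
import shutil
from pathlib import Path


def merge_checkpoint_dirs(trainer_dirs, merged_dir):
    """Copy every file from each trainer's checkpoint_dir into merged_dir.

    Hypothetical helper: the directory layout is an assumption, not
    PaddlePaddle's actual checkpoint format.
    """
    merged = Path(merged_dir)
    merged.mkdir(parents=True, exist_ok=True)
    copied = []
    for trainer_dir in trainer_dirs:
        root = Path(trainer_dir)
        for src in sorted(p for p in root.rglob("*") if p.is_file()):
            dst = merged / src.relative_to(root)
            if dst.exists():
                # Two trainers wrote the same relative path:
                # accept it only if the contents are identical.
                if not filecmp.cmp(src, dst, shallow=False):
                    raise ValueError(f"conflicting checkpoint file: {dst}")
                continue
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(str(dst.relative_to(merged)))
    return sorted(copied)
```

Training would then be resumed from the merged directory rather than from any single trainer's partial copy. Raising on conflicting files is a design choice of this sketch, so that an inconsistent checkpoint is noticed before it is used.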

## Future plans
1. Support saving parameters via etcd.
Collaborator

This is user documentation; users do not care about future plans.

@reyoung
Collaborator

reyoung commented Jun 26, 2018

@seiriosPlus Please actually build the HTML and confirm that the document renders and is referenced correctly!

@seiriosPlus
Collaborator Author

@reyoung I'll revise it. When I pushed, I wasn't very familiar with how the docs should be written. Thanks.

@shanyi15
Contributor

After the site is generated, could you post a preview screenshot in this PR?

@seiriosPlus
Collaborator Author

OK, I'll rework the document to follow the conventions.

@seiriosPlus
Collaborator Author

checkpoint

@seiriosPlus
Collaborator Author

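@reyoung @shanyi15 I've reworked the formatting; the screenshot above is the preview of the Chinese version. I'll polish the grammar of the English version and post it afterwards.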

@seiriosPlus
Collaborator Author

checkpoint user guide en

@shanyi15
Contributor

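I only looked at the Chinese version.

  • Where does this document live? Please update the corresponding index.rst.
    image

  • This link would be better written as a hidden link rather than listing the URL.
    image

  • "Future plans" reads a bit like a design doc; it can actually be left out.

Everything else looks great. Nice work!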

@seiriosPlus
Collaborator Author

@shanyi15

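  1. Removed the future plans section.
  2. Hid the URL links.

Note: it is not yet clear whether the location of the Checkpoint feature will change, so I'm not sure yet where it belongs in index.rst.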

Collaborator

@reyoung reyoung left a comment

Excellent

@reyoung reyoung merged commit 4674602 into PaddlePaddle:develop Jun 28, 2018
@seiriosPlus seiriosPlus deleted the checkpoint_doc branch June 28, 2018 07:48
RichardWooSJTU pushed a commit to RichardWooSJTU/docs that referenced this pull request Apr 8, 2022
* add softnms, nonlocal, resnet200_vd_backbone
* add CBNet
* update model zoo
3 participants