Add start_record interface #3128
Thanks! One comment.
@@ -54,6 +54,9 @@ def main():

    # event_handler to print training and testing info
    def event_handler(event):
        if isinstance(event, paddle.event.BeginPass):
            master_client.paddle_start_get_records(event.pass_id)
I think paddle_start_get_records is an implementation detail of the cloud reader, so we need to hide it inside the reader implementation.
A new pass is also indicated by reader() being called again (see here), so maybe it's better to put the call here (in the def reader implementation). That way, every time a new pass begins, reader() is called, and then paddle_start_get_records is called.
Thanks, done.
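A minimal sketch of the suggestion above, assuming a stand-in client. FakeMasterClient, next_record, and passes_started are hypothetical names used only for illustration; only paddle_start_get_records comes from the diff, and its real behavior is not shown in this thread. The point is that the pass-start call lives inside reader(), so the training script never has to call it from an event handler.

```python
class FakeMasterClient(object):
    """Hypothetical stand-in for the master client; not the real Paddle API."""

    def __init__(self, records):
        self._records = list(records)
        self._cursor = 0
        self.passes_started = 0

    def paddle_start_get_records(self, pass_id):
        # Assumed behavior: rewind to the start of the dataset for a new pass.
        self._cursor = 0
        self.passes_started += 1

    def next_record(self):
        # Return the next record, or None when the pass is exhausted.
        if self._cursor >= len(self._records):
            return None
        record = self._records[self._cursor]
        self._cursor += 1
        return record


def cloud_reader(client):
    """Build a reader that signals the start of each pass by itself."""
    state = {"pass_id": 0}

    def reader():
        # reader() is a generator, so this runs when iteration of a new
        # pass begins; the caller never touches paddle_start_get_records.
        client.paddle_start_get_records(state["pass_id"])
        state["pass_id"] += 1
        while True:
            record = client.next_record()
            if record is None:
                break
            yield record

    return reader


client = FakeMasterClient(["a", "b", "c"])
reader = cloud_reader(client)
first_pass = list(reader())   # one full pass over the records
second_pass = list(reader())  # calling reader() again starts a new pass
```

With this shape, the event handler in the training script needs no BeginPass branch for the master client.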
Need demo
python/paddle/v2/reader/creator.py
Outdated
@@ -98,14 +102,17 @@ def recordio(paths, buf_size=100):
    if host_name not in os.environ.keys():
For the line host_name = "MASTER_SERVICE_HOST": we don't have a "master service" currently.
Done.
python/paddle/v2/reader/creator.py
Outdated
@@ -81,6 +81,9 @@ def reader():
    return dec.buffered(reader, buf_size)


pass_num = 0


def recordio(paths, buf_size=100):
Maybe renaming recordio to cloud_reader would be clearer?
Good idea, done.
python/paddle/v2/reader/creator.py
Outdated
    def reader():
        c = cloud(addr, buf_size)
        c.set_dataset(paths)
        c.paddle_start_get_records(pass_id)
        global pass_num
Put the global statement on the first line under the function definition.
Done.
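For illustration, a small sketch of the requested layout, assuming a module-level pass_num counter like the one in the diff. The records yielded here are placeholders, not what the real reader returns.

```python
pass_num = 0  # module-level pass counter, as in the diff


def reader():
    global pass_num  # declared first, before any other statement
    current_pass = pass_num
    pass_num += 1
    # Placeholder records; a real reader would fetch them from the master.
    for i in range(3):
        yield (current_pass, i)


first = list(reader())
second = list(reader())
```

Declaring global at the top makes it obvious to a reader of the function that it mutates module state, and avoids the SyntaxError Python raises when a name is used before its global declaration.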
@@ -90,6 +93,7 @@ def recordio(paths, buf_size=100):
    """
    import os
Need some demo code of how to use this reader.
Done.
python/paddle/v2/reader/creator.py
Outdated
@@ -90,6 +93,7 @@ def recordio(paths, buf_size=100):
    """
    import os
    import paddle.v2.master.client as cloud
    import cPickle as pickle
Do we need this? I don't see it being used anywhere.
Sorry, my mistake: we need pickle.loads() to decode the record fetched from the master.
Done.
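A short sketch of why the import is needed, assuming each record is stored as pickled bytes. The sample tuple and the decode helper are hypothetical; the real record layout is not shown in this thread.

```python
import pickle  # on Python 2 the diff uses: import cPickle as pickle

# Hypothetical sample; stands in for whatever the dataset writer pickled.
sample = ([0.1, 0.2, 0.3], 1)

# What the writer side would store: the pickled bytes of one sample.
raw_record = pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL)


def decode(raw):
    """Turn the bytes fetched from the master back into a Python object."""
    return pickle.loads(raw)


decoded = decode(raw_record)
```

Only unpickle records from a trusted source, since pickle.loads() can execute arbitrary code embedded in the data.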
LGTM after addressing Wuyi's comments.
LGTM except one comment.
python/paddle/v2/reader/creator.py
Outdated
    """
    Creates a data reader that outputs record one one by one
    from given local or cloud recordio path.
    Create a data reader that yield a record one bye one from
one bye one => one by one
Done. Thanks!
I deleted the unit test for the older |
Fixed #3127