Add the functionality to read chunk by chunk #439
Comments
I am okay with name changes.
Oops, I was not aware of that. The implementation looks like

```python
while True:
    d = stream.read(self.chunk)
    if not d:
        break
    yield (furl, d)
```

Compared to the proposal in #439 (comment), this is almost equivalent. The only difference is that … Are you ok with me sending a PR to adopt …?
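As a self-contained sketch, the loop above can be exercised against an in-memory stream (the generator name `read_chunks`, the `chunk_size` parameter, and the sample URL are illustrative stand-ins, not part of the proposal):

```python
import io

def read_chunks(furl, stream, chunk_size):
    # Same shape as the loop in the comment above: read fixed-size
    # chunks until read() returns an empty byte string.
    while True:
        d = stream.read(chunk_size)
        if not d:
            break
        yield (furl, d)

# 10 bytes read in chunks of 4 -> chunks of 4, 4, and 2 bytes.
chunks = list(read_chunks("file:///example", io.BytesIO(b"abcdefghij"), 4))
```

Each yielded item pairs the source URL with one chunk of at most `chunk_size` bytes, so the whole stream is never held in memory at once.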
That seems reasonable to me. We can have an argument `break_if_empty`:

```python
def __iter__(self):
    for furl, stream in self.datapipe:
        while True:
            data = stream.read(self.chunk_size)
            if not data:
                if self.break_if_empty:
                    break
                else:
                    continue
            yield furl, data
```

However, I think in that case, we will need some mechanism to terminate the iteration when `break_if_empty` is `False` (otherwise the `continue` loops forever once the stream is exhausted).

In your code snippet, I think the iterator would end as soon as `stream.read(self.chunk_size)` returns `b""`:

```python
iter(lambda: stream.read(self.chunk_size), b"")
```

We can continue our discussion here: …
🚀 The feature
This issue is a continuation of the discussion from here.
We are proposing to add some functionality to allow reading chunk by chunk from a stream. Ideally, this should be done without loading the entire stream/response into memory first, but this may not be possible in all cases.
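As a rough illustration of the proposal (the class name `StreamReader` and the `chunk` parameter here are assumptions for this sketch, not a settled API), a datapipe wrapping `(url, stream)` pairs could yield chunks lazily, so the full stream never needs to fit in memory:

```python
import io

class StreamReader:
    """Hypothetical sketch: wraps a datapipe of (url, stream) pairs
    and yields (url, bytes) chunks of at most `chunk` bytes each."""

    def __init__(self, datapipe, chunk=None):
        self.datapipe = datapipe
        self.chunk = chunk  # None would mean "read everything at once"

    def __iter__(self):
        for furl, stream in self.datapipe:
            while True:
                d = stream.read(self.chunk)
                if not d:
                    break
                yield (furl, d)

# 10 bytes in chunks of 3 -> chunks of 3, 3, 3, and 1 bytes.
chunks = list(StreamReader([("f", io.BytesIO(b"0123456789"))], chunk=3))
```

Downstream datapipes then see bounded-size byte chunks rather than one monolithic payload per stream.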
Motivation, pitch
From @pmeier
This can be useful for reading from a file stream in chunks (e.g. after `FileOpener`), writing to files in chunks (e.g. `Saver`), etc. (e.g. `HttpReader`).

Alternatives

Instead of adding a new DataPipe, we can consider modifying specific existing DataPipes to add the functionality of reading by chunk. This may be useful in avoiding the need to read the entire stream/response into memory (as I believe is the case in the current implementation of `HttpReader`; please correct me if this is wrong). We can also do both.
Additional context
No response
cc: @ejguan @VitalyFedyunin