Add --files-from and --files-from0 options #321
base: master
Changes from 2 commits
bbb53e5
c981347
c5fe1e7
da073af
ff04e05
@@ -528,6 +528,63 @@ def make_path_safe(path):
    """
    return _safe_re.sub('', path) or '.'

def iter_delim(f, delim='\n', delim_out=None, read_size=4096):
    """Iterate through a file object's contents, given a delimiter.

    This function returns an iterator based on the contents of the
    file-like object f. The contents will be split into chunks based
    on delim, and each chunk is returned by the iterator created. By
    default, the original delimiter is retained, but a replacement can
    be specified using delim_out.

    Both text and binary files are supported, but the type of delim
    and delim_out must match the file type, i.e. they must be strings
    for text files, and bytes for binary files.

    """
    if delim_out is None:
        delim_out = delim
    bufs = []
    empty = None
    while True:
        data = f.read(read_size)
        if not data:
            break
        if empty is None:
            empty = '' if isinstance(data, str) else b''
        start = 0
        while True:
            pos = data.find(delim, start)
            if pos < 0:
                break
            yield empty.join(bufs) + data[start:pos] + delim_out
            start = pos + len(delim)
            bufs = []
        if start < len(data):
            bufs.append(data[start:])
    if len(bufs) > 0:
        yield empty.join(bufs)

class FileType(argparse.FileType):
    """Extended version of argparse.FileType.

    Allows to specify additional attributes to be set on the returned
    file objects.

    """
    def __init__(self, mode='r', bufsize=-1, **kwargs):
        super().__init__(mode=mode, bufsize=bufsize)
        self._attrs = kwargs
        self._binary = 'b' in mode

    def __call__(self, string):
        result = super().__call__(string)
        # Work around http://bugs.python.org/issue14156
        if self._binary and result is sys.stdin or result is sys.stdout:

is the line above doing what you want?

Manual tests using stdin work (this is my primary use case, after all), however, to add automatic tests, one would need to extend 'ArchiverTestCaseBase.attic()' to emulate sys.stdin.buffer to enable binary input from stdin.

do you mean it like this: if (self._binary and result is sys.stdin) or result is sys.stdout: ?

TW notifications@github.com writes:

Thanks, Rotty

Another thing that occurred to me, and which I'm not exactly sure what to do about, is that the code in I think this monkey-patching is fine for newly-created file objects, but less so for

            result = result.buffer
        for key, value in self._attrs.items():
            setattr(result, key, value)
        return result

def daemonize():
    """Detach process from controlling terminal and run in background

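The precedence question raised in the inline comments on `FileType.__call__` can be checked in isolation. The sketch below is not part of the pull request: the function names are hypothetical, and the three booleans stand in for `self._binary`, `result is sys.stdin` and `result is sys.stdout`. It only shows how Python groups the condition as written and where that differs from the alternative grouping; which grouping the author intended is not stated on this page.

```python
import itertools


def as_written(binary, is_stdin, is_stdout):
    # Same shape as the condition in the diff. `and` binds tighter than
    # `or`, so Python reads this as (binary and is_stdin) or is_stdout.
    return binary and is_stdin or is_stdout


def explicit_or_grouping(binary, is_stdin, is_stdout):
    # The grouping spelled out in the review comment.
    return (binary and is_stdin) or is_stdout


def binary_guard_grouping(binary, is_stdin, is_stdout):
    # Alternative reading (an assumption): only swap in the underlying
    # buffer when the file was requested in binary mode.
    return binary and (is_stdin or is_stdout)


def differing_inputs():
    # All (binary, is_stdin, is_stdout) combinations where the condition
    # as written disagrees with the binary-guarded grouping.
    return [combo
            for combo in itertools.product([False, True], repeat=3)
            if as_written(*combo) != binary_guard_grouping(*combo)]


if __name__ == '__main__':
    print(differing_inputs())
```

The two groupings disagree only when the binary flag is false and the stdout test is true, that is, when stdout is requested in text mode. The combination with both the stdin and stdout tests true cannot occur for a single file object, so only the text-mode stdout case matters in practice.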
Please add some unit tests just for this function.
Consider edge cases like:
- empty input file
- file starts/ends with delim
- two delims directly following each other
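A minimal sketch of what such tests might look like, assuming the request refers to `iter_delim`. The function body is copied from the diff above with its indentation reconstructed (the page lost the original whitespace, so the nesting is an assumption). The `chunks` helper and the test names are hypothetical, and the expected values describe how the code as shown behaves rather than a specification from the project.

```python
import io


def iter_delim(f, delim='\n', delim_out=None, read_size=4096):
    # Copied from the diff; indentation reconstructed (assumption).
    if delim_out is None:
        delim_out = delim
    bufs = []
    empty = None
    while True:
        data = f.read(read_size)
        if not data:
            break
        if empty is None:
            empty = '' if isinstance(data, str) else b''
        start = 0
        while True:
            pos = data.find(delim, start)
            if pos < 0:
                break
            yield empty.join(bufs) + data[start:pos] + delim_out
            start = pos + len(delim)
            bufs = []
        if start < len(data):
            bufs.append(data[start:])
    if len(bufs) > 0:
        yield empty.join(bufs)


def chunks(text, **kwargs):
    """Hypothetical helper: run iter_delim over an in-memory text file."""
    return list(iter_delim(io.StringIO(text), **kwargs))


def test_empty_input():
    assert chunks('') == []


def test_leading_delim():
    # A leading delimiter yields a chunk holding only the delimiter.
    assert chunks('\na\n') == ['\n', 'a\n']


def test_trailing_delim():
    assert chunks('a\nb\n') == ['a\n', 'b\n']


def test_missing_trailing_delim():
    # The final chunk is returned without a delimiter appended.
    assert chunks('a\nb') == ['a\n', 'b']


def test_consecutive_delims():
    assert chunks('a\n\nb') == ['a\n', '\n', 'b']


def test_chunk_spanning_reads():
    # A small read_size forces chunks to be assembled across reads.
    assert chunks('abc\ndef\n', read_size=2) == ['abc\n', 'def\n']


def test_binary_nul_delim():
    # Bytes input with NUL delimiters, stripped via delim_out=b''.
    f = io.BytesIO(b'a\x00b\x00')
    assert list(iter_delim(f, delim=b'\x00', delim_out=b'')) == [b'a', b'b']
```

These are written as plain functions with bare asserts, so they can be called directly or adapted to whatever test framework the project uses. A delimiter longer than one character that straddles a read boundary is not covered here and may deserve its own test.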