######################################################################
Archive::Tar::Wrapper 0.16
######################################################################
NAME
Archive::Tar::Wrapper - API wrapper around the 'tar' utility
SYNOPSIS
use Archive::Tar::Wrapper;
my $arch = Archive::Tar::Wrapper->new();
# Open a tarball, expand it into a temporary directory
$arch->read("archive.tgz");
# Iterate over all entries in the archive
$arch->list_reset(); # Reset Iterator
# Iterate through archive
while(my $entry = $arch->list_next()) {
    my($tar_path, $phys_path) = @$entry;
    print "$tar_path\n";
}
# Get a huge list with all entries
for my $entry (@{$arch->list_all()}) {
    my($tar_path, $real_path) = @$entry;
    print "Tarpath: $tar_path Tempfile: $real_path\n";
}
# Add a new entry
$arch->add($logic_path, $file_or_stringref);
# Remove an entry
$arch->remove($logic_path);
# Find the physical location of a temporary file
my($tmp_path) = $arch->locate($tar_path);
# Create a tarball
$arch->write($tarfile, $compress);
DESCRIPTION
Archive::Tar::Wrapper is an API wrapper around the 'tar' command line
utility. It never stores anything in memory, but works on temporary
directory structures on disk instead. It provides a mapping between the
logical paths in the tarball and the 'real' files in the temporary
directory on disk.
It differs from Archive::Tar in two ways:
* Archive::Tar::Wrapper doesn't hold anything in memory. Everything is
stored on disk.
* Archive::Tar::Wrapper is 100% compliant with the platform's "tar"
utility, because it uses it internally.
METHODS
my $arch = Archive::Tar::Wrapper->new()
Constructor for the tar wrapper class. Finds the "tar" executable by
searching "PATH" and returning the first hit. In case you want to
use a different tar executable, you can specify it as a parameter:
my $arch = Archive::Tar::Wrapper->new(tar => '/path/to/tar');
Since "Archive::Tar::Wrapper" creates temporary directories to store
tar data, the location of the temporary directory can be specified:
my $arch = Archive::Tar::Wrapper->new(tmpdir => '/path/to/tmpdir');
Tremendous performance increases can be achieved if the temporary
directory is located on a ram disk. Check the "Using RAM Disks"
section below for details.
Additional options can be passed to the "tar" command by using the
"tar_read_options" and "tar_write_options" parameters. Example:
my $arch = Archive::Tar::Wrapper->new(
tar_read_options => "p"
);
will use "tar xfp archive.tgz" to extract the tarball instead of
just "tar xf archive.tgz". GNU tar supports even more options; these
can be passed in via
my $arch = Archive::Tar::Wrapper->new(
tar_gnu_read_options => ["--numeric-owner"],
);
By default, the "list_*()" functions will return only file entries.
Directories will be suppressed. To have "list_*()" return
directories as well, use
my $arch = Archive::Tar::Wrapper->new(
dirs => 1
);
If more files are added to a tarball than the command line can
handle, "Archive::Tar::Wrapper" will switch from using the command
tar cfv tarfile file1 file2 file3 ...
to
tar cfv tarfile -T filelist
where "filelist" is a file containing all files to be added. The
default threshold for this switch is 512 arguments, but it can be
changed by setting the parameter "max_cmd_line_args":
my $arch = Archive::Tar::Wrapper->new(
max_cmd_line_args => 1024
);
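The switch between the two command forms can be sketched roughly as
follows. This is not the module's actual code: "build_tar_command" is
a hypothetical helper, and only the threshold idea and the two command
shapes are taken from the description above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of Archive::Tar::Wrapper: returns the
# argument list for a tar invocation, switching to a file list when
# more than $max files are to be added.
sub build_tar_command {
    my ($tarfile, $max, @files) = @_;

    # Few enough files: name them directly on the command line.
    return ('tar', 'cfv', $tarfile, @files) if @files <= $max;

    # Too many files: write one name per line to a temporary list file
    # and let tar read the names from it via -T.
    my ($fh, $filelist) = tempfile(UNLINK => 1);
    print {$fh} "$_\n" for @files;
    close $fh or die "Cannot close $filelist: $!";
    return ('tar', 'cfv', $tarfile, '-T', $filelist);
}

# 600 files exceed the default threshold of 512, so the -T form is used.
my @cmd = build_tar_command('out.tar', 512, map { "file$_" } 1 .. 600);
print "@cmd\n";
```

Writing one name per line is also why file names with embedded
newlines are awkward for a list-file approach.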
$arch->read("archive.tgz")
"read()" opens the given tarball, expands it into a temporary
directory and returns 1 on success and "undef" on failure. The
temporary directory holding the tar data gets cleaned up when $arch
goes out of scope.
"read" handles both compressed and uncompressed files. To find out
if a file is compressed or uncompressed, it tries to guess by
extension, then by checking the first couple of bytes in the
tarfile.
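A two-step guess of this kind might look like the sketch below. It is
an illustration, not the module's implementation: "looks_gzipped" is a
hypothetical helper, the extension list is an assumption, and only
gzip is considered (gzip streams begin with the bytes 0x1f 0x8b).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Archive::Tar::Wrapper: guess whether
# a tarball is gzip-compressed, first by extension, then by magic bytes.
sub looks_gzipped {
    my ($path) = @_;

    # Step 1: trust well-known extensions (assumed list).
    return 1 if $path =~ /\.(?:tgz|tar\.gz)\z/i;
    return 0 if $path =~ /\.tar\z/i;

    # Step 2: inspect the first two bytes of the file.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read($fh, my $magic, 2);
    close $fh;
    return (defined $got && $got == 2 && $magic eq "\x1f\x8b") ? 1 : 0;
}
```

The extension check comes first because it needs no file access; the
byte check only runs for names that give no hint.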
If only a limited number of files is needed from a tarball, they can
be specified after the tarball name:
$arch->read("archive.tgz", "path/file.dat", "path/sub/another.txt");
The file names are passed unmodified to the "tar" command, so make
sure that the file paths match exactly what's in the tarball;
otherwise "read()" will fail.
$arch->list_reset()
Resets the list iterator. To be used before the first call to
"$arch->list_next()".
my $entry = $arch->list_next()
Returns the next item in the tarfile. As in the SYNOPSIS, the item
is a reference to an array, here with three scalars: the relative
path of the item in the tarfile, the physical path to the unpacked
file or directory on disk, and the type of the entry (f=file,
d=directory, l=symlink). Note that by default,
Archive::Tar::Wrapper won't return directories unless the "dirs"
parameter is set when running the constructor.
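As an illustration of the iterator and the type field, the sketch
below counts entries per type without building the full list. It
follows the SYNOPSIS in treating each value from "list_next()" as an
array reference; "count_entry_types" is a hypothetical helper, not
part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: count entries per type (f, d, l) using the
# iterator, so the full list never has to be held in memory.
# $arch is expected to provide list_reset() and list_next().
sub count_entry_types {
    my ($arch) = @_;
    my %count;
    $arch->list_reset();
    while (my $entry = $arch->list_next()) {
        my ($tar_path, $phys_path, $type) = @$entry;
        $count{$type}++;
    }
    return \%count;
}

# Possible use, given an Archive::Tar::Wrapper object in $arch:
#   my $count = count_entry_types($arch);
#   print "$_: $count->{$_}\n" for sort keys %$count;
```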
my $items = $arch->list_all()
Returns a reference to a (possibly huge) array of items in the
tarfile. Each item is a reference to an array, containing two
elements: the relative path of the item in the tarfile and the
physical path to the unpacked file or directory on disk.
To iterate over the list, the following construct can be used:
# Get a huge list with all entries
for my $entry (@{$arch->list_all()}) {
    my($tar_path, $real_path) = @$entry;
    print "Tarpath: $tar_path Tempfile: $real_path\n";
}
If the list of items in the tarfile is big, use "list_reset()" and
"list_next()" instead of "list_all".
$arch->add($logic_path, $file_or_stringref, [$options])
Add a new file to the tarball. $logic_path is the virtual path of
the file within the tarball. $file_or_stringref is either a scalar
holding the physical path of a file on disk to be transferred (i.e.
copied) to the tarball, or a reference to a scalar whose content is
taken as the data of the file.
If no additional parameters are given, permissions and user/group id
settings of a file to be added are copied. If you want different
settings, specify them in the options hash:
$arch->add($logic_path, $stringref,
{ perm => 0755, uid => 123, gid => 10 });
If $file_or_stringref is a reference to a Unicode string, the
"binmode" option has to be set to make sure the string gets written
as proper UTF-8 into the tarfile:
$arch->add($logic_path, $stringref, { binmode => ":utf8" });
$arch->remove($logic_path)
Removes a file from the tarball. $logic_path is the virtual path of
the file within the tarball.
$arch->locate($logic_path)
Finds the physical location of a file, specified by $logic_path,
which is the virtual path of the file within the tarball. Returns a
path to the temporary file "Archive::Tar::Wrapper" created to
manipulate the tarball on disk.
$arch->write($tarfile, $compress)
Write out the tarball by tarring up all temporary files and
directories and store it in $tarfile on disk. If $compress holds a
true value, compression is used.
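Taken together, "add()" with string references and "write()" allow a
tarball to be built from strings. The sketch below assumes text
content (hence the "binmode" option) and makes no claim about what
"write()" returns; "write_from_strings" is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: add each path => text pair as an entry, then
# write a compressed tarball. $arch is expected to provide the add()
# and write() methods described above.
sub write_from_strings {
    my ($arch, $tarfile, %content) = @_;
    for my $logic_path (sort keys %content) {
        # Pass a reference so the value is treated as file data, not
        # as the path of a file on disk.
        $arch->add($logic_path, \$content{$logic_path},
                   { binmode => ':utf8' });
    }
    return $arch->write($tarfile, 1);   # true second argument: compress
}

# Possible use, given an Archive::Tar::Wrapper object in $arch:
#   write_from_strings($arch, 'notes.tgz',
#       'notes/a.txt' => "first\n", 'notes/b.txt' => "second\n");
```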
$arch->tardir()
Return the directory the tarball was unpacked in. This is sometimes
useful to play dirty tricks on "Archive::Tar::Wrapper" by
mass-manipulating unpacked files before wrapping them back up into
the tarball.
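One such trick, sketched under assumptions: walk the unpacked tree
with File::Find and rewrite files in place before calling "write()".
The helper name and the CRLF-to-LF conversion are only examples.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: convert CRLF line endings to LF in every
# unpacked *.txt file below the directory returned by tardir().
# Returns the number of files that were changed.
sub crlf_to_lf_in_txt {
    my ($arch) = @_;
    my $changed = 0;
    find({ no_chdir => 1, wanted => sub {
        my $file = $File::Find::name;
        return unless -f $file && $file =~ /\.txt\z/;
        open my $in, '<:raw', $file or die "Cannot read $file: $!";
        my $data = do { local $/; <$in> };   # slurp the whole file
        close $in;
        # Skip files that contain no CRLF sequences.
        return unless defined $data && $data =~ s/\r\n/\n/g;
        open my $out, '>:raw', $file or die "Cannot write $file: $!";
        print {$out} $data;
        close $out or die "Cannot close $file: $!";
        $changed++;
    } }, $arch->tardir());
    return $changed;
}
```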
$arch->is_gnu()
Checks if the tar executable is a GNU tar by running 'tar --version'
and parsing the output for "GNU".
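The parsing step can be pictured with a small sketch. The helper name
is hypothetical and the check is reduced to a plain match on "GNU";
the module's own test may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper: decide from the text printed by "tar --version"
# whether the executable identifies itself as GNU tar.
sub version_text_is_gnu {
    my ($version_text) = @_;
    return (defined $version_text && $version_text =~ /GNU/) ? 1 : 0;
}

# Possible use with a tar binary found on PATH (output varies by system):
#   my $text = `tar --version 2>&1`;
#   print "GNU tar\n" if version_text_is_gnu($text);
```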
Using RAM Disks
On Linux, it's quite easy to create a RAM disk and achieve tremendous
speedups while untarring or modifying a tarball. You can either create
the RAM disk by hand by running
# mkdir -p /mnt/myramdisk
# mount -t tmpfs -o size=20m tmpfs /mnt/myramdisk
and then feeding the ramdisk as a temporary directory to
Archive::Tar::Wrapper, like
my $tar = Archive::Tar::Wrapper->new( tmpdir => '/mnt/myramdisk' );
or using Archive::Tar::Wrapper's built-in option 'ramdisk':
my $tar = Archive::Tar::Wrapper->new(
ramdisk => {
type => 'tmpfs',
size => '20m', # 20 MB
},
);
The only drawback of the latter option is that creating the RAM disk
must be performed as root, which often isn't desirable for security
reasons. For this reason, Archive::Tar::Wrapper offers a utility
function that mounts the ramdisk and returns the temporary directory
it's located in:
# Create new ramdisk (as root):
my $tmpdir = Archive::Tar::Wrapper->ramdisk_mount(
type => 'tmpfs',
size => '20m', # 20 MB
);
# Delete a ramdisk (as root):
Archive::Tar::Wrapper->ramdisk_unmount();
Optionally, the "ramdisk_mount()" command accepts a "tmpdir" parameter
pointing to a temporary directory for the ramdisk if you wish to set it
yourself instead of letting Archive::Tar::Wrapper create it
automatically.
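When a script cannot run as root, one option is to use an already
mounted RAM disk if one is available and writable. The sketch below
is an assumption-laden illustration: "first_writable_dir" is a
hypothetical helper, and "/dev/shm" is a tmpfs mount on many Linux
systems but is not guaranteed to exist.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first candidate that is an existing,
# writable directory, or nothing if none qualifies.
sub first_writable_dir {
    for my $dir (@_) {
        return $dir if defined $dir && -d $dir && -w $dir;
    }
    return;
}

# Possible use (candidate paths are assumptions):
#   my $tmpdir = first_writable_dir('/mnt/myramdisk', '/dev/shm');
#   my $arch = Archive::Tar::Wrapper->new(
#       $tmpdir ? (tmpdir => $tmpdir) : ()
#   );
```

If no candidate qualifies, the constructor is called without "tmpdir"
and the default temporary directory is used.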
KNOWN LIMITATIONS
* Currently, only "tar" programs supporting the "z" option (for
compressing/decompressing) are supported. A future version will use
"gzip" as an alternative.
* Currently, you can't add empty directories to a tarball directly.
You could add a temporary file within a directory, and then
"remove()" the file.
* If you delete a file, the empty directories it was located in stay
in the tarball. You could try to "locate()" them and delete them.
This will be fixed, though.
* Filenames containing newlines cause problems with the list
iterators. To be fixed.
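For the leftover empty directories, a possible workaround is to prune
them from the unpacked tree before writing the tarball. The sketch
below is not part of the module; "prune_empty_dirs" is a hypothetical
helper, and it assumes that changes made under "tardir()" are picked
up when the tarball is written.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: remove empty directories below $root, deepest
# first, so that a parent emptied by the removal is pruned as well.
# Returns the number of directories removed; $root itself is kept.
sub prune_empty_dirs {
    my ($root) = @_;
    my $removed = 0;
    finddepth({ no_chdir => 1, wanted => sub {
        my $dir = $File::Find::name;
        return if $dir eq $root || -l $dir || !-d $dir;
        # rmdir only succeeds on empty directories; other
        # directories are silently left in place.
        $removed++ if rmdir $dir;
    } }, $root);
    return $removed;
}

# Possible use, given an Archive::Tar::Wrapper object in $arch:
#   $arch->remove($logic_path);
#   prune_empty_dirs($arch->tardir());
```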
BUGS
Archive::Tar::Wrapper doesn't currently handle filenames with embedded
newlines.
LEGALESE
Copyright 2005 by Mike Schilli, all rights reserved. This program is
free software, you can redistribute it and/or modify it under the same
terms as Perl itself.
AUTHOR
2005, Mike Schilli <cpan@perlmeister.com>