INTRODUCTION
============
The `timeout` script is a resource monitoring program for limiting the time and
memory consumption of black-boxed processes under Linux. It runs the command
you specify on the command line and watches its memory and time
consumption, interrupting the process if it exceeds the limits and
notifying the user with a preset message.

The killer feature of this script (and, actually, the reason why it appeared)
is that it not only watches the process spawned directly, but also keeps track
of its subsequently forked children. You may choose whether the scope of the
watched processes is constrained by process group or by process tree.

`timeout` can optionally detect hangups and print a breakdown of time consumption.
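To illustrate the difference between the two scopes, here is a minimal Python sketch (not the script's actual Perl code). It selects the watched processes from a table of pid, parent pid and process group id, such as could be read from `/proc/<pid>/stat`. The names `descendants` and `group_members` and the sample table are hypothetical. In this made-up table, a grandchild that left the process group is still found by the tree scope but not by the group scope.

```python
from collections import defaultdict

def descendants(root, parent_of):
    """Return root plus every process whose ancestry chain reaches root.

    parent_of maps pid -> ppid (hypothetical input, e.g. from /proc/<pid>/stat).
    """
    children = defaultdict(list)
    for pid, ppid in parent_of.items():
        children[ppid].append(pid)
    seen, stack = set(), [root]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        stack.extend(children[pid])
    return seen

def group_members(pgid, group_of):
    """Return every process whose process group id equals pgid."""
    return {pid for pid, g in group_of.items() if g == pgid}

# Illustrative table: 100 is the watched command, 101 its child,
# 102 a grandchild that moved to its own process group,
# 200 an unrelated process.
parent_of = {100: 1, 101: 100, 102: 101, 200: 1}
group_of = {100: 100, 101: 100, 102: 102, 200: 200}

print(sorted(descendants(100, parent_of)))    # [100, 101, 102]
print(sorted(group_members(100, group_of)))   # [100, 101]
```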
## Note: CGroups
Linux now supports the CGroups feature, which is a much better way to track
memory and time usage and is not subject to the issues listed below. This `timeout`
script **does not use CGroups**. While this script continues to work, if
you are on Linux and need a more robust monitoring method, you **should use something
based on CGroups**. For system services, have a look at the [available systemd
options](https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html).
For a simple command-line script to limit memory usage, you could, for example,
use [runexec](https://github.com/sosy-lab/benchexec/blob/master/doc/runexec.md),
which is part of [BenchExec](https://github.com/sosy-lab/benchexec).

INSTALLATION
============
Installation is neither required nor supported. Just place the script
wherever is convenient and invoke it like any other Linux command.

You will need Perl 5 to run the script, and the `/proc` filesystem must be mounted.

USAGE
=====
Like most wrapper commands of this kind (`nice`, `ionice`, `nohup`), it is invoked as:

    timeout [options] command [arguments]

The basic options are:

* `-t T` - set the CPU+SYS time limit to T seconds
* `-m M` - set the virtual memory limit to M kilobytes
* `TIMEOUT_IDSTR` environment variable - a custom string prepended to the
  message about a resource violation (to distinguish it from the lines printed by
  the command itself). The message itself may be:
    - `TIMEOUT` - the time limit is exhausted
    - `MEM` - the memory limit is exhausted
    - `HANGUP` - a hangup was detected (see below)
    - `SIGNAL` - the timeout process was killed by a signal

  After the message, the number of seconds the process has been running for is
  printed.
Advanced options:

* `-p .*regexp1.*,NAME1;.*regexp2.*,NAME2` - collect statistics for children
  with the specified commands. NAMEs define buckets, and regexps (Perl format)
  define the matching children that fall into these buckets.

  If a pattern begins with `CHILD:`, then the runtime of the children of the
  matching process (the match being performed with the rest of the pattern) is
  collected under this category. Note that this is the only way to collect
  statistics on the time consumed by children that live for only fractions of a
  second.
* `-o outfile` - a file to dump the bucket statistics collected by the `-p` option.
* `--detect-hangups` - enable hangup detection. If you have specified
  buckets through the `-p` option and the CPU time in any of the buckets
  does not increase for some time, the script concludes that the
  controlled process has hung and terminates it.
* `--no-info-on-success` - disable printing usage statistics if the
  controlled process has terminated successfully.
* `--confess`, `-c` - when killing the controlled process, return its exit
  code or signal+128. This also makes `timeout` wait until the controlled
  process has terminated. Without this option, the script returns zero.
* `--memlimit-rss`, `-s` - monitor the RSS (resident set size) memory limit
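The `-p` syntax described above can be pictured with the following Python sketch. It is not taken from the script: `parse_buckets` and `bucket_for` are hypothetical names, the splitting rules (entries separated by `;`, the bucket name after the last `,`) and whole-string matching are assumptions read off the option's format, and Python's `re` module stands in for Perl regexps.

```python
import re

def parse_buckets(spec):
    """Split 'REGEX,NAME;REGEX,NAME' into (compiled_regex, name, is_child) tuples."""
    buckets = []
    for entry in spec.split(";"):
        if not entry:
            continue
        # Assumption: NAME is whatever follows the last comma of the entry.
        pattern, name = entry.rsplit(",", 1)
        is_child = pattern.startswith("CHILD:")
        if is_child:
            pattern = pattern[len("CHILD:"):]
        buckets.append((re.compile(pattern), name, is_child))
    return buckets

def bucket_for(cmdline, buckets):
    """Return the first non-CHILD bucket whose regex matches the whole command line."""
    for regex, name, is_child in buckets:
        if not is_child and regex.fullmatch(cmdline):
            return name
    return None

buckets = parse_buckets(".*perl.*,PERL;CHILD:.*perl.*,KIDS")
print([(r.pattern, n, c) for r, n, c in buckets])
# [('.*perl.*', 'PERL', False), ('.*perl.*', 'KIDS', True)]
print(bucket_for("perl -e 1", buckets))   # PERL
print(bucket_for("sleep 10", buckets))    # None
```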
More options can be found in the script itself. More documentation will be
added in future releases!
The exit code of the script is the exit code of the controlled process. If the
controlled process was killed by a signal, the exit code is 128+N, where N is
the number of the signal. This mimics the Bash exit code convention. If the
controlled process was terminated by the timeout script itself, the script
returns zero, because having the timeout terminate the child is expected
behavior. If you want the child's return code in such a situation (which may
be nonzero if the child handles SIGTERM), use the `--confess` option.
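The 128+N convention can be reproduced in a few lines. The Python sketch below is only an illustration (the script itself is Perl): `shell_style_exit_code` is a hypothetical helper that converts the `returncode` reported by Python's `subprocess` module, which is negative when the child died from a signal, into the Bash-style value.

```python
import signal

def shell_style_exit_code(returncode):
    """Map a subprocess returncode to the Bash-style convention.

    subprocess reports death by signal N as -N; Bash reports it as 128+N.
    """
    if returncode < 0:
        return 128 + (-returncode)
    return returncode

# A child that exited normally with status 3 keeps its status.
print(shell_style_exit_code(3))                 # 3

# A child killed by SIGKILL (signal 9) is reported as 137.
print(shell_style_exit_code(-signal.SIGKILL))   # 137
```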
EXAMPLES
========
Since you already have Perl to run the script itself, the examples use it.

Basic time limiting:

    ./timeout -t 2 perl -e 'while ($i<100000000) {$i++;}'

Outputs:

    TIMEOUT 2.04 CPU

Basic memory limiting (1000M of virtual memory):

    ./timeout -m 1000000 perl -e 'while ($i<100000000) {$a->{$i} = $i++;}'

Outputs:

    MEM 8.55

Limit both time and memory (adjust the numbers to match the timing of the command above):

    ./timeout -m 1000000 -t 9 perl -e 'while ($i<100000000) {$x->{$i} = $i++;}'

Outputs:

    MEM 8.57

With a lower time limit:

    ./timeout -m 1000000 -t 8 perl -e 'while ($i<100000000) {$x->{$i} = $i++;}'

Outputs:

    TIMEOUT 8.02 CPU

Limit time with a lot of short child processes:

    ./timeout -t 2 perl -e 'while(1){ system qw(perl -e while($i<500){$i++;}); }'

Outputs (in 4 seconds):

    TIMEOUT 2.01 CPU
Collect statistics for "heavy" processes:

    ./timeout -p '.*perl.*,PERL' perl -e 'for (1..20_000_000) {$i++;}'

Outputs:

    <time name="PERL">1400</time>

Collect statistics for "lightweight" children:

    ./timeout -t 10 -p '.*perl.*,PERL;CHILD:.*perl.*,KIDS' perl -e 'for (1..2_000) {system qw(perl -e while($i<500000){$i++;}); $i++;}'

Outputs:

    TIMEOUT 10.18 CPU
    <time name="PERL">640</time>
    <time name="KIDS">10160</time>

Lightweight children should be tracked with the special `CHILD:` prefix in their
pattern. Compare the above with:

    ./timeout -t 10 -p '.*perl.*,PERL' perl -e 'for (1..2_000) {system qw(perl -e while($i<500000){$i++;}); $i++;}'

Outputs:

    TIMEOUT 10.06 CPU
    <time name="PERL">830</time>

Why is the rest not shown in the bucket statistics? All the spawned processes are
Perl processes, but the short-lived ones aren't tracked fully, since timeout doesn't
wake up often enough.
Detect hangups:

    ./timeout --detect-hangups -p '.*sleep.*,SLEEP' -t 5 sleep 10000

Outputs:

    HANGUP CPU 0.00 MEM 19760 MAXMEM 19760 STALE 6
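The staleness rule behind `--detect-hangups` can be sketched in Python for a single CPU-time counter. This is an illustration only: `StaleDetector`, its `limit` parameter and the explicit timestamps are hypothetical, and the threshold the script actually uses is not stated in this README.

```python
class StaleDetector:
    """Report a hang when a CPU-time counter has not grown for `limit` seconds."""

    def __init__(self, limit):
        self.limit = limit
        self.last_cpu = None
        self.last_change = None

    def update(self, now, cpu):
        """Record a sample; return True if the counter has been stale too long."""
        if self.last_cpu is None or cpu > self.last_cpu:
            self.last_cpu = cpu
            self.last_change = now
            return False
        return now - self.last_change >= self.limit

detector = StaleDetector(limit=5)
# (wall-clock second, CPU seconds used so far) for a process that stops working at t=2
samples = [(0, 0.0), (1, 0.4), (2, 0.9), (4, 0.9), (6, 0.9), (7, 0.9)]
print([detector.update(t, cpu) for t, cpu in samples])
# [False, False, False, False, False, True]
```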
IMPLEMENTATION DETAILS
======================
The script wakes up several times a second and checks that the controlled
process tree (or group) does not violate the limits.
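One way a sampling monitor on Linux can obtain the numbers it checks on each wake-up is to read `/proc/<pid>/stat`. The Python sketch below illustrates that general technique and is not a transcription of the script: `cpu_seconds` and `own_cpu_seconds` are hypothetical helpers, the sample line is made up, and the field positions follow the proc(5) manual page (utime and stime are fields 14 and 15, measured in clock ticks).

```python
import os

def cpu_seconds(stat_line, ticks_per_second):
    """Return user+system CPU seconds from one line of /proc/<pid>/stat."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split only what follows the last ')'.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    utime, stime = int(rest[11]), int(rest[12])
    return (utime + stime) / ticks_per_second

def own_cpu_seconds():
    """Sample the current process (Linux only, needs /proc mounted)."""
    with open(f"/proc/{os.getpid()}/stat") as f:
        return cpu_seconds(f.read(), os.sysconf("SC_CLK_TCK"))

# Made-up stat line: utime=150 ticks, stime=50 ticks, 100 ticks per second.
sample = "1234 (my prog) S 1 1234 1234 0 -1 4194304 10 0 0 0 150 50 0 0 20 0 1 0 100 1000000 200"
print(cpu_seconds(sample, 100))   # 2.0
```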
A more detailed explanation of how the script works, and why it could not be
implemented differently, can be found here:
http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts/40

KNOWN ISSUES
============
The script is slow. It is implemented in Perl, but that in itself is not the
reason. The performance of certain portions of it could easily be improved. Our
measurements showed that it consumes up to 2% of the CPU time while
running.

The script can overlook some child processes (in the case of a double fork, for example).
The resources used by these processes will not be measured, and the limits will not apply to them.
Furthermore, these child processes will not be killed when the script terminates.

Due to the use of sampling, all measurements are somewhat imprecise.
Short-lived child processes and memory peaks can be overlooked.

Sometimes the `waitpid` call inside the SIGALRM handler returns -1 for no apparent
reason. This return value is ignored and an appropriate warning is printed,
but the cause of this behavior is still unknown.

The SIGTERM-sleep-SIGKILL termination sequence is probably implemented poorly, and
it sometimes does not get to sending SIGKILL if SIGTERM doesn't kill the
controlled process. The reasons are still unknown.
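For reference, the intended SIGTERM-sleep-SIGKILL sequence can be sketched as follows. This is an illustrative Python version built on the standard `subprocess` module, not a fix for the Perl script; `terminate_gracefully` and its `grace` parameter are hypothetical, and the sketch handles only a single direct child rather than a whole process tree.

```python
import subprocess
import sys

def terminate_gracefully(proc, grace=2.0):
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL.

    Returns the child's returncode (a negative signal number on POSIX).
    """
    proc.terminate()                      # SIGTERM
    try:
        # The "sleep" step; it ends early if the child exits on its own.
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL cannot be caught or ignored
        return proc.wait()

# Example: a child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(terminate_gracefully(child))
```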
The interface is a mess, as the script was developed in a requirements-driven
fashion without a specific plan.

ACKNOWLEDGMENTS
===============
The script was initially developed at the Institute for System Programming of
the Russian Academy of Sciences (http://ispras.ru/en/) for the Linux Driver
Verification project (http://forge.ispras.ru/projects/ldv) in 2010-2011 by
Pavel Shved, with some contributions from Alexander Strakh.