Description
LuaJIT memory profiler interface and parser
Product: Tarantool
Since: 2.7.1
Audience/target: developers, operations: what's making my Tarantool slow
Root document: where to add or update documentation
SME: @igormunkin @Buristan
Peer reviewer: @
Details
The Tarantool 2.7.1 release introduced the LuaJIT memory profiler (gh-5442) and the profile parser (gh-5490).
This version of the LuaJIT memory profiler does not support verbose reporting of allocations from traces: all allocations from traces are reported as internal. Trace code semantics, however, should be exactly the same as for the Lua interpreter (excluding sink optimizations). All deallocations are reported as internal as well.
Tail call optimization does not create a new call frame, so all allocations inside a function called via the CALLT/CALLMT bytecodes are attributed to its caller.
Developers are usually not interested in allocations made inside builtins. Therefore, if a builtin function is called from a Lua function, all of its allocations are attributed to that Lua function. Otherwise, the event is attributed to a C function.
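To illustrate the two attribution rules, here is a minimal sketch contrasting a tail call with a regular call. The function names are hypothetical, and the attribution described in the comments is what the rules above would be expected to produce, not a measured result.

```lua
-- Tail call: `return f(...)` is emitted as CALLT, so no new frame is
-- created and the allocations made by string.rep are attributed to the
-- scope that called append_tail().
local function append_tail(str, rep)
  return string.rep(str, rep)
end

-- Regular call: the result is stored in a local first, so this is an
-- ordinary call. string.rep is a builtin called from a Lua function, so
-- its allocations are expected to be attributed to the line below.
local function append_framed(str, rep)
  local result = string.rep(str, rep)
  return result
end
```

Both functions return the same string; only the place where the profiler reports the allocation is expected to differ.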
Assume we have the following Lua chunk named <test.lua>:
1 jit.off()
2 misc.memprof.start("memprof_new.bin")
3 -- Lua does not create a new frame to call string.rep and all allocations are
4 -- attributed not to `append()` function but to the parent scope.
5 local function append(str, rep)
6   return string.rep(str, rep)
7 end
8
9 local t = {}
10 for _ = 1, 1e5 do
11   -- table.insert is a builtin and all corresponding allocations
12   -- are reported in the scope of main chunk.
13   table.insert(t,
14     append('q', _)
15   )
16 end
17 misc.memprof.stop()
The binary dump can be parsed by Tarantool itself with the following command (NB: mind the dash before the dump filename):
$ tarantool -e 'require("memprof")(arg[1])' - memprof_new.bin
It parses the binary format produced by the memory profiler and renders it in a human-readable format.
Running the chunk above makes the profiler report approximately the following:
ALLOCATIONS
@test.lua:0, line 14: 1002 531818 0
@test.lua:0, line 13: 1 24 0
@test.lua:0, line 9: 1 32 0
@test.lua:0, line 7: 1 20 0
REALLOCATIONS
@test.lua:0, line 13: 9 16424 8248
Overrides:
@test.lua:0, line 13
@test.lua:0, line 14: 5 1984 992
Overrides:
@test.lua:0, line 14
DEALLOCATIONS
INTERNAL: 20 0 1481
@test.lua:0, line 14: 3 0 7168
Overrides:
@test.lua:0, line 14
The plain-text report has the following format:
@<filename>:<function_line>, line <line where event was detected>: <number of events> <allocated> <freed>
INTERNAL means that these allocations are caused by internal LuaJIT structures. Note that events are sorted from the most frequent to the least frequent.
Overrides shows which allocation the given reallocation overrides.
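For post-processing the report, a single event line in the format above can be split into fields with a Lua pattern. This is only a sketch: the helper name parse_event and the field names of the returned table are hypothetical, and the pattern assumes the fields are separated by whitespace as in the sample output.

```lua
-- Parse one event line of the plain-text report, for example
--   "@test.lua:0, line 14: 1002 531818 0"
-- Returns a table with the parsed fields, or nil if the line does not
-- match (section headers, "Overrides:" entries and INTERNAL lines).
local function parse_event(line)
  local file, func_line, event_line, count, allocated, freed =
    line:match("^%s*@(.-):(%d+), line (%d+):%s+(%d+)%s+(%d+)%s+(%d+)%s*$")
  if not file then
    return nil
  end
  return {
    file = file,
    func_line = tonumber(func_line),
    line = tonumber(event_line),
    count = tonumber(count),
    allocated = tonumber(allocated),
    freed = tonumber(freed),
  }
end

local ev = parse_event("@test.lua:0, line 14: 1002 531818 0")
print(ev.file, ev.line, ev.count, ev.allocated, ev.freed)
```

Returning nil for non-matching lines lets a caller feed the whole report line by line and keep only the per-line events.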
Starting the profiler from Lua is quite simple:
local started, err, errno = misc.memprof.start(fname)
where fname is the name of the file to which profile events are written. The writer for this function performs fwrite() for each call, retrying in case of EINTR. When profiling is stopped, fclose() is called. If the file cannot be opened for writing or the profiler fails to start, the function returns nil (plus an error message as a second result and a system-dependent error code as a third result). Otherwise it returns true.
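A minimal sketch of checking the three results of misc.memprof.start(). The helper name start_memprof and its optional memprof argument are hypothetical; the argument exists only so that the helper can be exercised with a stand-in table outside Tarantool.

```lua
-- Start the profiler and turn a failure into a single readable message.
-- `memprof` defaults to misc.memprof; passing it explicitly is an
-- assumption of this sketch, not part of the profiler API.
local function start_memprof(fname, memprof)
  memprof = memprof or misc.memprof
  local started, err, errno = memprof.start(fname)
  if not started then
    return nil, string.format("memprof.start(%q) failed: %s (errno %s)",
                              fname, tostring(err), tostring(errno))
  end
  return true
end
```

Inside Tarantool it would be called as start_memprof("memprof_new.bin"), with the second argument omitted.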
Stopping the profiler from Lua is simple too:
local stopped, err, errno = misc.memprof.stop()
If an error occurs while stopping the profiler (an error when the file descriptor is closed) or during reporting, memprof.stop() returns nil (plus an error message as a second result and a system-dependent error code as a third result). Otherwise it returns true.
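Since the dump file is closed only when the profiler is stopped, it can be convenient to make sure stop() is called even if the profiled code raises an error. The wrapper below is a sketch of that idea; the name with_memprof and the optional memprof argument (used to substitute a stand-in outside Tarantool) are hypothetical.

```lua
-- Run `workload` with the memory profiler enabled and always try to stop
-- the profiler afterwards, even if the workload raises an error.
-- `memprof` defaults to misc.memprof (an assumption of this sketch).
local function with_memprof(fname, workload, memprof)
  memprof = memprof or misc.memprof
  local started, err, errno = memprof.start(fname)
  if not started then
    return nil, err, errno
  end
  local ok, workload_err = pcall(workload)
  local stopped, stop_err, stop_errno = memprof.stop()
  if not ok then
    error(workload_err, 0) -- re-raise the workload error unchanged
  end
  if not stopped then
    return nil, stop_err, stop_errno
  end
  return true
end
```

The wrapper returns the same nil, message, code triple as start() and stop(), so the caller can handle both failure cases in one place.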
Also, see some possible FAQs in the issue comments
@veod32
Possibly, we could first create a short MVP document that is enough for our developers to try the feature, then publish it and collect their feedback on the functionality.