
Fix several issues when using bloaty with large numbers of input files #193

Merged (9 commits, May 8, 2020)

Conversation

@jchl (Contributor) commented May 7, 2020

Fix the following issues, which all arise when using bloaty with large numbers of input files.

  1. Signed integer overflow in a 32-bit build when the total file/VM size exceeds 2GB.
  2. An error about --debug-file arguments if there are more input files than threads.
  3. An assertion failure when using --source-filter if there are more input files than threads.
  4. Out-of-memory issues.

jchl added 5 commits May 7, 2020 09:30
Fix signed integer overflow in the 32-bit version of bloaty when
analyzing libraries with a combined size of >2GB, due to the use of
ssize_t (and size_t) when int64_t should have been used.
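
The root cause can be illustrated with a small standalone sketch (not bloaty's actual code): accumulating sizes in ssize_t, which is only 32 bits wide on a 32-bit build, wraps once the running total passes 2GB, while an explicit int64_t does not.

  #include <cstdint>
  #include <cstdio>
  #include <sys/types.h>   // ssize_t
  #include <vector>

  int main() {
    // Hypothetical per-file section sizes totalling ~2.5 GiB.
    std::vector<uint32_t> sizes(2500, 1024u * 1024u);

    ssize_t narrow_total = 0;   // 32 bits wide on a 32-bit build
    int64_t wide_total = 0;     // always 64 bits wide
    for (uint32_t s : sizes) {
      narrow_total += s;
      wide_total += s;
    }
    // On a 32-bit build the ssize_t total wraps to a negative value once it
    // passes 2 GiB, the same kind of wrap behind the -1.50Gi TOTAL below.
    std::printf("ssize_t total: %lld\n", (long long)narrow_total);
    std::printf("int64_t total: %lld\n", (long long)wide_total);  // 2621440000
  }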

Before this change:

$ bloaty /usr/lib/lib*.so.0.0.0
    FILE SIZE        VM SIZE
 --------------  --------------
   0.0%       0 -285.4% -1.82Gi    .bss
  45.6%  1.14Gi 179.3%  1.14Gi    .text
  15.6%   398Mi  61.1%   398Mi    .data
   9.5%   243Mi  37.4%   243Mi    .dynstr
   8.8%   225Mi  34.5%   225Mi    .rodata
   4.7%   121Mi  18.6%   121Mi    .rel.dyn
   4.3%   109Mi  16.7%   109Mi    .eh_frame
   2.8%  70.6Mi  10.8%  70.4Mi    .dynsym
   1.8%  47.1Mi   7.2%  46.9Mi    .data.rel.ro
   1.0%  25.4Mi   3.9%  25.2Mi    .gnu.hash
   1.0%  25.1Mi   3.8%  24.9Mi    .plt
   0.9%  23.6Mi   3.6%  23.4Mi    .gcc_except_table
   0.9%  22.5Mi   0.0%       0    .gnu_debugdata
   0.7%  18.5Mi   2.8%  18.3Mi    .eh_frame_hdr
   0.6%  14.4Mi   1.4%  8.80Mi    [39 Others]
   0.5%  12.6Mi   1.9%  12.4Mi    .rel.plt
   0.4%  9.69Mi   0.0%       0    [Unmapped]
   0.4%  8.98Mi   1.4%  8.80Mi    .gnu.version
   0.3%  6.43Mi   1.0%  6.25Mi    .got.plt
   0.2%  5.26Mi   0.0%       0    .debug_info
   0.2%  4.01Mi   0.0%       0    .debug_str
 100.0% -1.50Gi 100.0%   651Mi    TOTAL

After this change:

$ bloaty /usr/lib/lib*.so.0.0.0
    FILE SIZE        VM SIZE
 --------------  --------------
   0.0%       0  47.1%  2.18Gi    .bss
  45.6%  1.14Gi  24.6%  1.14Gi    .text
  15.6%   398Mi   8.4%   398Mi    .data
   9.5%   243Mi   5.1%   243Mi    .dynstr
   8.8%   225Mi   4.7%   225Mi    .rodata
   4.7%   121Mi   2.5%   121Mi    .rel.dyn
   4.3%   109Mi   2.3%   109Mi    .eh_frame
   2.8%  70.6Mi   1.5%  70.4Mi    .dynsym
   1.8%  47.1Mi   1.0%  46.9Mi    .data.rel.ro
   1.0%  25.4Mi   0.5%  25.2Mi    .gnu.hash
   1.0%  25.1Mi   0.5%  24.9Mi    .plt
   0.9%  23.6Mi   0.5%  23.4Mi    .gcc_except_table
   0.9%  22.5Mi   0.0%       0    .gnu_debugdata
   0.7%  18.5Mi   0.4%  18.3Mi    .eh_frame_hdr
   0.6%  14.4Mi   0.2%  8.80Mi    [39 Others]
   0.5%  12.6Mi   0.3%  12.4Mi    .rel.plt
   0.4%  9.69Mi   0.0%       0    [Unmapped]
   0.4%  8.98Mi   0.2%  8.80Mi    .gnu.version
   0.3%  6.43Mi   0.1%  6.25Mi    .got.plt
   0.2%  5.26Mi   0.0%       0    .debug_info
   0.2%  4.01Mi   0.0%       0    .debug_str
 100.0%  2.50Gi 100.0%  4.64Gi    TOTAL

When multiple `--debug-file` arguments were specified, bloaty would
sometimes incorrectly give errors of the form:

  bloaty: Debug file(s) did not match any input file

even if all the debug files did in fact match some input file.  This was
because the code incorrectly assumed that each thread would only scan a
single file/build-id.
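
A sketch of the faulty assumption, using hypothetical stand-in types rather than bloaty's real ones: with more input files than threads, a worker thread may be handed several files, so it has to report the build ID of every file it scanned, not just one.

  #include <set>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for bloaty's input-file representation.
  struct InputFile {
    std::string path;
    std::string build_id;
  };

  // Faulty assumption: one build ID per worker thread.
  //   std::string ScanFiles(const std::vector<InputFile>& files) {
  //     return files.front().build_id;   // IDs of the other files are lost
  //   }

  // Each worker may scan several files, so it must return every build ID it
  // saw; the main thread can then check that each --debug-file matched at
  // least one of them.
  std::set<std::string> ScanFiles(const std::vector<InputFile>& files) {
    std::set<std::string> ids;
    for (const InputFile& f : files) ids.insert(f.build_id);
    return ids;
  }
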
Fix an assertion failure if the `--source-filter` option is used with
more input files than there are threads.

Before:

  $ bloaty /usr/lib/*.so.0.0.0 --source-filter=xxx
  bloaty: /tmp/bloaty2/src/bloaty.cc:1645: void bloaty::Bloaty::ScanAndRollupFile(bloaty::ObjectFile*, bloaty::Rollup*, std::vector<std::basic_string<char> >*) const: Assertion `filesize == file->file_data().data().size()' failed.
  Aborted (core dumped)

After:

  $ /bloaty /usr/lib/*.so.0.0.0 --source-filter=xxx
      FILE SIZE        VM SIZE
   --------------  --------------
   100.0%       0 100.0%       0    TOTAL
  Filtering enabled (source_filter); omitted file = 35.8Mi, vm = 35.6Mi of entries
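
One plausible shape of the corrected consistency check, written with hypothetical names rather than the actual code in src/bloaty.cc: entries removed by the source filter are tallied separately (the "omitted ... of entries" line above), so the per-file totals still reconcile against the on-disk size.

  #include <cassert>
  #include <cstdint>

  // Hypothetical check: the rolled-up total plus the bytes omitted by
  // --source-filter should equal the file size, rather than the rolled-up
  // total alone.
  void CheckFileTotal(int64_t rolled_up, int64_t filtered_out, int64_t filesize) {
    assert(rolled_up + filtered_out == filesize);
  }
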
Instead of mmapping all input files for the duration of the process's
execution, only mmap each input file while that file is being processed.

This avoids running out of virtual address space, especially on 32-bit
systems, when running bloaty on large numbers of input files.
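
A minimal RAII sketch of the idea, assuming hypothetical names rather than bloaty's actual classes: each input file is mapped only for the scope in which it is scanned, so mappings are not held open for every file at once.

  #include <fcntl.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <unistd.h>
  #include <cstddef>

  // Maps one file read-only for the lifetime of the object.
  class ScopedMmap {
   public:
    explicit ScopedMmap(const char* path) {
      int fd = open(path, O_RDONLY);
      if (fd < 0) return;
      struct stat st;
      if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
          data_ = p;
          size_ = static_cast<size_t>(st.st_size);
        }
      }
      close(fd);  // the mapping remains valid after the fd is closed
    }
    ~ScopedMmap() {
      if (data_) munmap(data_, size_);
    }
    const void* data() const { return data_; }
    size_t size() const { return size_; }

   private:
    void* data_ = nullptr;
    size_t size_ = 0;
  };

  // Usage: map, scan, unmap, one file at a time, instead of mapping
  // everything up front and exhausting 32-bit virtual address space.
  //   for (const std::string& path : input_files) {
  //     ScopedMmap file(path.c_str());
  //     ScanFile(file.data(), file.size());   // hypothetical scan function
  //   }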

After this change, I am able to run bloaty on around 4GB of input files
plus 12GB of debug files, without error, whereas previously I would
get either:

  terminate called after throwing an instance of 'std::system_error'
    what():  Resource temporarily unavailable

or:

  terminate called after throwing an instance of 'std::bad_alloc'
    what():  std::bad_alloc
@jchl changed the title from "Fix signed integer overflow due to use of (s)size_t types" to "Fix several issues when using bloaty with large numbers of input files" on May 7, 2020
@haberman (Member) left a comment
Thanks for writing this up, I'm glad you were able to find fixes to your issues.

@haberman (Member) commented May 7, 2020

Thanks for these fixes, this looks great now.

Do you think any of these fixes have reasonably easy ways to verify in unit tests?
