Skip to content

Fix #610: fix memory leaks caused by unfreed compiled regex data #2263

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

asterite3
Copy link

@asterite3 asterite3 commented Feb 6, 2020

This pull request aims to fix 2 memory leaks in ModSecurity v2:

  1. PCRE JIT data produced by pcre_study not freed by msc_pcre_cleanup because free()/pcre_free is used to deallocate pcre_extra structure. The proposed fix is to use pcre_free_study instead of free/pcre_free in msc_pcre_cleanup. This change was already proposed in Memory leak if builded modsecurity with --enable-pcre-study #610.
  2. Regex data not freed (msc_pcre_cleanup not called) for @rx rules which regexes contain variable substitutions. OWASP CRS contains such rules: 920420 and 920480. The proposed fix is to allocate memory for regexes in msr->mp pool instead of rule->ruleset->mp in msre_op_rx_execute. msre_op_rx_execute is called for each request (unlike msre_op_rx_param_init) and creates volatile regexes used for that one request only (which is done when regex contains variable substitution and therefore can't be compiled once and reused for each request), that regex does not seem to be stored anywhere in ruleset data.

These two fixes are related: without the second one, for rules with regex depending on a variable, function msc_pcre_cleanup is never called, so the first fix does not help. The other way around, when PCRE JIT is used, the second fix is of little use for such rules, because only the lesser part of memory is freed by msc_pcre_cleanup. As noted in #610, first leak also occurs when apache with mod_security2 is reloaded (and so all the rules are freed and re-allocated again, but PCRE study data from the first config load remains unfreed for all regexes).

It should be noted that first leak is rather hard to trace as it can't be detected by most automated tools including Valgrind because PCRE uses it's own allocator for allocating memory for JIT, this allocator is not based on malloc and allocates memory by directly calling mmap, and it does not have annotations for Valgrind (VALGRIND_MALLOCLIKE_BLOCK/VALGRIND_MEMPOOL_ALLOC).

Reproducing the leaks

This test program (using ModSecurity 2 built as a standalone module, code is mostly based on standalone/main.cpp) with this minimal config can be used to demonstrate the leaks. (This config with more rules can be used instead, with it memory leaks more). These test configs include rules 920420 and 920480 from OWASP CRS. To test it with PCRE JIT, ModSecurity build was configured with ./configure --with-curl --with-yajl --enable-standalone-module --disable-mlogc --enable-pcre-jit --without-lua. When run without the proposed fixes, it consumes a lot of RAM. Note that applying only one of the fixes does not solve the problem - using only the first fix does not change the situation at all (as msc_pcre_cleanup is not called), using only the second one reduces memory leak only slightly (as most of the memory allocated belongs to PCRE JIT data).

If PCRE JIT is disabled (--enable-pcre-jit configure flag is omitted), memory leaks slower, but leak still occurs. In my tests, after 200000 requests test program reached RSS 145M when ModSecurity was compiled without PCRE JIT (with PCRE JIT RSS reached nearly 869M).

The other way to reproduce the leak is to use the same config with ModSecurity 2 running as Apache module and continuously sending requests to it - without the fixes memory usage of Apache workers can be seen to grow constantly, with the fixes memory usage of workers in my tests stayed nead 8 megabytes.

Fixes

The first leak is fixed by calling pcre_free_study on pcre_extra structure pointer instead of free/pcre_free in msc_pcre_cleanup. Function msc_pregcomp_ex that creates msc_regex_t can allocate pcre_extra "manually" with malloc/pcre_malloc instead of with pcre_study. That's why the logic is added that check that pcre_extra indeed came from pcre_study by checking if PCRE_EXTRA_EXECUTABLE_JIT is set in flags of pcre_study (and pcre_free_study is only called in that case). msc_pregcomp_ex does not set that flag itself, and in fact pcre_free_study works the same way: it checks that flag, if it is not set, it calls pcre_free on pcre_extra pointer (pcre_free defaults for free, but can be set to other function by library user).
Please review this solution and tell if it's not the best one. Alternative options are, for example, to call pcre_free_study unconditionally or add an extra field to msc_regex_t that will indicate whether pe field was obtained by pcre_study.

The second leak is fixed by "attaching" allocated data to msr->mp instead of to rule->ruleset->mp . This seems safe because msc_regex_t created in msre_op_rx_execute seems to be stored nowhere and for subsequent requests a new msc_regex_t will be created with calls to msc_pregcomp_ex. Anyway, the review here would be appreciated too.

Similar leaks in other operators

It seems the same problem exists with validateHash https://github.com/SpiderLabs/ModSecurity/blob/12cefbd70f2aab802e1bff53c50786f3b8b89359/apache2/re_operators.c#L787 and similar case is with gsbLookup https://github.com/SpiderLabs/ModSecurity/blob/12cefbd70f2aab802e1bff53c50786f3b8b89359/apache2/re_operators.c#L1687 and below.
I was not able to find the examples of usage of these operators in OWASP CRS or figure out how they work enough to make my own examples - to make working examples reproducing the leaks there.
Still, I can provide analogous fixes for them in this or separate pull request.

Fixes #610

If regex contains variable substitution, function
msre_op_rx_execute(), which handles execution of @rx operator,
will compile regex on each request (because variable, and this
regex, can be different each time). It does so by calling
msc_pregcomp_ex(), which accepts a mem pool because it does
allocations for regex data. Until this fix, msre_op_rx_execute()
passed rule->ruleset->mp to msc_pregcomp_ex(), which is a ruleset
mempool that persists as long as ruleset lives - which means it
is long-living and will persist until ModSecurity is
stopped/restarted. This created a memory leak because regex data
for non-constand regexes was allocated again to each request and
not freed after request was processed.
Fix that by passing msr->mp mempool to msc_pregcomp_ex() instead,
which is a mempool of record corresponding to a tx being processed
and will be freed as soon as processing is done.
@zimmerle zimmerle added the 2.x Related to ModSecurity version 2.x label Feb 11, 2020
Allocate pcre_extra structure with pcre_study or pcre_malloc
and free it with pcre_free_study(). Pass flag
PCRE_STUDY_EXTRA_NEEDED to pcre_study telling it to always
allocate pcre_extra because this structure is needed anyway to
configure match limits.
Until this change, pcre_extra was allocated with eigther
pcre_malloc or regular malloc (depending on whether VERSION_NGINX
is defined); function msc_pcre_cleanup(), which is responsible
for freeing compiled regex data, used either regular free() or
pcre_free() (depending VERSION_NGINX too) to free pcre_extra
structure (pointer to which is stored in regex->pe).
Freeing it like this was incorrect, structure returned by
pcre_study() should be freed by function pcre_free_study(). In
case PCRE JIT is used, pcre_study() makes some additional
allocations itself (at least for JITed executable code), which
function pcre_free_study() frees. If pcre_free_study() is not
used a memory leak occurs because, while pcre_extra structure
itself might be freed, some additional data referenced by it is
not. Fix that by calling pcre_free_study() (instead of
free()/pcre_free()) on pointer returned by pcre_study().
There also seems to be no reason to allocate pcre_extra with
regular malloc (and de-allocate it with free()) - there is a
function pcre_malloc(), which is a function pcre_study() itself
would use to allocate that memory, and, in default case,
pcre_malloc and pcre_free will be set to regular malloc and free.
Usage of malloc() seems to be a remaining of old code where
manual allocation of pcre_extra was always done with malloc().
So, remove "#if defined(VERSION_NGINX)" branches and always use
pcre_malloc() for pcre_extra allocation in case pcre_study did not
allocate it yet and always free is with pcre_free_study() (btw.
'pcreapi' man page recommends to replace pcre_free() usages to
deallocate pcre_extra with pcre_free_study()).

Fixes owasp-modsecurity#610
@asterite3 asterite3 force-pushed the fix_re_pcre_study_jit_memory_leak branch from c74a295 to 5859417 Compare February 27, 2020 20:47
@marcstern
Copy link

Any chance to have this merged? Thanks

@marcstern
Copy link

Running in prod on 60+ servers for more than 1 year.
It definitely solves a technical problem that could be critical in some environments.
Why isn't it accepted?

Copy link

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running in prod on 60+ servers for more than 1 year.
It definitely solves a technical problem that could be critical in some environments.

@martinhsv
Copy link
Contributor

This PR has become out-of-date.

The change suggested for re_operators.c was made in Jan. 2021 via #2049 . This fix is available in v2.9.4.

That change addressed the more severe of the two issues described. (More severe, in my opinion, simply because macro-expansion patterns are likely to exist on many more installations than installations using pcre jit.)

Do you want to amend this PR (or replace it) to address only the separate JIT issue?

@marcstern
Copy link

@martinhsv: you mean that the fix in re_operators.c is already included now, so you cannot merge this PR anymore. So you need a PR with only msc_pcre.c. Is that correct?

@martinhsv
Copy link
Contributor

Basically, yes.

(Although I'll note that a big part of my suggestion here is for purposes of clarity. Whether the new/revised PR gets merged in the near future or remains open for a more extended period, it's useful for it to be clear what change it represents -- rather than having it muddled with some other change that has been done previously.)

@marcstern
Copy link

Incorporated into #3154

@marcstern marcstern closed this Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to ModSecurity version 2.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants