-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Fix #610: fix memory leaks caused by unfreed compiled regex data #2263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #610: fix memory leaks caused by unfreed compiled regex data #2263
Conversation
If regex contains variable substitution, function msre_op_rx_execute(), which handles execution of @rx operator, will compile regex on each request (because variable, and this regex, can be different each time). It does so by calling msc_pregcomp_ex(), which accepts a mem pool because it does allocations for regex data. Until this fix, msre_op_rx_execute() passed rule->ruleset->mp to msc_pregcomp_ex(), which is a ruleset mempool that persists as long as ruleset lives - which means it is long-living and will persist until ModSecurity is stopped/restarted. This created a memory leak because regex data for non-constand regexes was allocated again to each request and not freed after request was processed. Fix that by passing msr->mp mempool to msc_pregcomp_ex() instead, which is a mempool of record corresponding to a tx being processed and will be freed as soon as processing is done.
Allocate pcre_extra structure with pcre_study or pcre_malloc and free it with pcre_free_study(). Pass flag PCRE_STUDY_EXTRA_NEEDED to pcre_study telling it to always allocate pcre_extra because this structure is needed anyway to configure match limits. Until this change, pcre_extra was allocated with eigther pcre_malloc or regular malloc (depending on whether VERSION_NGINX is defined); function msc_pcre_cleanup(), which is responsible for freeing compiled regex data, used either regular free() or pcre_free() (depending VERSION_NGINX too) to free pcre_extra structure (pointer to which is stored in regex->pe). Freeing it like this was incorrect, structure returned by pcre_study() should be freed by function pcre_free_study(). In case PCRE JIT is used, pcre_study() makes some additional allocations itself (at least for JITed executable code), which function pcre_free_study() frees. If pcre_free_study() is not used a memory leak occurs because, while pcre_extra structure itself might be freed, some additional data referenced by it is not. Fix that by calling pcre_free_study() (instead of free()/pcre_free()) on pointer returned by pcre_study(). There also seems to be no reason to allocate pcre_extra with regular malloc (and de-allocate it with free()) - there is a function pcre_malloc(), which is a function pcre_study() itself would use to allocate that memory, and, in default case, pcre_malloc and pcre_free will be set to regular malloc and free. Usage of malloc() seems to be a remaining of old code where manual allocation of pcre_extra was always done with malloc(). So, remove "#if defined(VERSION_NGINX)" branches and always use pcre_malloc() for pcre_extra allocation in case pcre_study did not allocate it yet and always free is with pcre_free_study() (btw. 'pcreapi' man page recommends to replace pcre_free() usages to deallocate pcre_extra with pcre_free_study()). Fixes owasp-modsecurity#610
c74a295
to
5859417
Compare
Any chance to have this merged? Thanks |
Running in prod on 60+ servers for more than 1 year. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running in prod on 60+ servers for more than 1 year.
It definitely solves a technical problem that could be critical in some environments.
This PR has become out-of-date. The change suggested for re_operators.c was made in Jan. 2021 via #2049 . This fix is available in v2.9.4. That change addressed the more severe of the two issues described. (More severe, in my opinion, simply because macro-expansion patterns are likely to exist on many more installations than installations using pcre jit.) Do you want to amend this PR (or replace it) to address only the separate JIT issue? |
@martinhsv: you mean that the fix in re_operators.c is already included now, so you cannot merge this PR anymore. So you need a PR with only msc_pcre.c. Is that correct? |
Basically, yes. (Although I'll note that a big part of my suggestion here is for purposes of clarity. Whether the new/revised PR gets merged in the near future or remains open for a more extended period, it's useful for it to be clear what change it represents -- rather than having it muddled with some other change that has been done previously.) |
Incorporated into #3154 |
This pull request aims to fix 2 memory leaks in ModSecurity v2:
pcre_study
not freed bymsc_pcre_cleanup
becausefree()
/pcre_free
is used to deallocatepcre_extra
structure. The proposed fix is to usepcre_free_study
instead offree
/pcre_free
inmsc_pcre_cleanup
. This change was already proposed in Memory leak if builded modsecurity with --enable-pcre-study #610.msc_pcre_cleanup
not called) for@rx
rules which regexes contain variable substitutions. OWASP CRS contains such rules: 920420 and 920480. The proposed fix is to allocate memory for regexes inmsr->mp
pool instead ofrule->ruleset->mp
inmsre_op_rx_execute
.msre_op_rx_execute
is called for each request (unlikemsre_op_rx_param_init
) and creates volatile regexes used for that one request only (which is done when regex contains variable substitution and therefore can't be compiled once and reused for each request), that regex does not seem to be stored anywhere in ruleset data.These two fixes are related: without the second one, for rules with regex depending on a variable, function
msc_pcre_cleanup
is never called, so the first fix does not help. The other way around, when PCRE JIT is used, the second fix is of little use for such rules, because only the lesser part of memory is freed bymsc_pcre_cleanup
. As noted in #610, first leak also occurs when apache withmod_security2
is reloaded (and so all the rules are freed and re-allocated again, but PCRE study data from the first config load remains unfreed for all regexes).It should be noted that first leak is rather hard to trace as it can't be detected by most automated tools including Valgrind because PCRE uses it's own allocator for allocating memory for JIT, this allocator is not based on
malloc
and allocates memory by directly callingmmap
, and it does not have annotations for Valgrind (VALGRIND_MALLOCLIKE_BLOCK
/VALGRIND_MEMPOOL_ALLOC
).Reproducing the leaks
This test program (using ModSecurity 2 built as a standalone module, code is mostly based on
standalone/main.cpp
) with this minimal config can be used to demonstrate the leaks. (This config with more rules can be used instead, with it memory leaks more). These test configs include rules 920420 and 920480 from OWASP CRS. To test it with PCRE JIT, ModSecurity build was configured with./configure --with-curl --with-yajl --enable-standalone-module --disable-mlogc --enable-pcre-jit --without-lua
. When run without the proposed fixes, it consumes a lot of RAM. Note that applying only one of the fixes does not solve the problem - using only the first fix does not change the situation at all (asmsc_pcre_cleanup
is not called), using only the second one reduces memory leak only slightly (as most of the memory allocated belongs to PCRE JIT data).If PCRE JIT is disabled (
--enable-pcre-jit
configure flag is omitted), memory leaks slower, but leak still occurs. In my tests, after 200000 requests test program reached RSS 145M when ModSecurity was compiled without PCRE JIT (with PCRE JIT RSS reached nearly 869M).The other way to reproduce the leak is to use the same config with ModSecurity 2 running as Apache module and continuously sending requests to it - without the fixes memory usage of Apache workers can be seen to grow constantly, with the fixes memory usage of workers in my tests stayed nead 8 megabytes.
Fixes
The first leak is fixed by calling
pcre_free_study
onpcre_extra
structure pointer instead offree
/pcre_free
inmsc_pcre_cleanup
. Functionmsc_pregcomp_ex
that createsmsc_regex_t
can allocatepcre_extra
"manually" withmalloc
/pcre_malloc
instead of withpcre_study
. That's why the logic is added that check thatpcre_extra
indeed came frompcre_study
by checking ifPCRE_EXTRA_EXECUTABLE_JIT
is set inflags
ofpcre_study
(andpcre_free_study
is only called in that case).msc_pregcomp_ex
does not set that flag itself, and in factpcre_free_study
works the same way: it checks that flag, if it is not set, it callspcre_free
onpcre_extra
pointer (pcre_free
defaults forfree
, but can be set to other function by library user).Please review this solution and tell if it's not the best one. Alternative options are, for example, to call
pcre_free_study
unconditionally or add an extra field tomsc_regex_t
that will indicate whetherpe
field was obtained bypcre_study
.The second leak is fixed by "attaching" allocated data to
msr->mp
instead of torule->ruleset->mp
. This seems safe becausemsc_regex_t
created inmsre_op_rx_execute
seems to be stored nowhere and for subsequent requests a newmsc_regex_t
will be created with calls tomsc_pregcomp_ex
. Anyway, the review here would be appreciated too.Similar leaks in other operators
It seems the same problem exists with
validateHash
https://github.com/SpiderLabs/ModSecurity/blob/12cefbd70f2aab802e1bff53c50786f3b8b89359/apache2/re_operators.c#L787 and similar case is withgsbLookup
https://github.com/SpiderLabs/ModSecurity/blob/12cefbd70f2aab802e1bff53c50786f3b8b89359/apache2/re_operators.c#L1687 and below.I was not able to find the examples of usage of these operators in OWASP CRS or figure out how they work enough to make my own examples - to make working examples reproducing the leaks there.
Still, I can provide analogous fixes for them in this or separate pull request.
Fixes #610