5.6 - Fixed PS-7346 - Huge memory usage for audit plugin when large queries… #4073

altmannmarcelo · 2020-10-05T19:20:40Z

… are executed

https://jira.percona.com/browse/PS-7346

Function audit_log_general_record can be called twice for each record.
1st time it is called with a 4096k buffer. It will estimate via
query_length what can be the maximum size of incoming query if every
character has to be converted to utf8mb4(basically 4 times its size).
If this size fits the buffer, it will convert event->general_query.str
to utf8mb4, use the current buffer to store the converted query and
advance the return pointer by query_length bytes. After we have the real
query, we escape it.
Otherwise (if the buffer is not big enough to convert the query), we
will fall into the else branch. Here we calculate the size of the
current query after escaping it and multiply by 4 (since we have not
converted the query to utf8mb4) + the maximum size of incoming query if
every character has to be converted to utf8mb4.
We continue by escaping user / host / external_user / ip and db storing
the required bytes into buflen_estimated.
After this point we do a check to validate if the current buffer is big
enough to hold buflen_estimated (the converted query + the full output
of audit log entry). In case we don't have a big enough buffer, we set
outlen to buflen_estimated and return NULL. Then audit_log_notify will
allocate outlen bytes into its buffer and call audit_log_general_record
for the 2nd time, this time with the resized buffer.

Issue:
Before 9fd57d3 we estimated buflen_estimated as twice of the output. If
it was not enough we would increase the size of buffer by 4 times before
calling audit_log_general_record the second time.
After 9fd57d3 the else branch has been enhance to make a better estimate
of the size of the output + size of the query, taking into consideration
utf8mb4, however it missed two points:

buflen_estimated was doubling the size. This is no longer required
since the else branch already take into account escaped utf8mb4 query +
non escaped utf8mb4 query.
When estimating the size of the non escaped utf8mb4 query, it was
wrongly multiplying its size by my_charset_utf8mb4_general_ci.mbmaxlen,
however, this computation has already been performed before the if/else.

… are executed https://jira.percona.com/browse/PS-7346 Function audit_log_general_record can be called twice for each record. 1st time it is called with a 4096k buffer. It will estimate via query_length what can be the maximum size of incoming query if every character has to be converted to utf8mb4(basically 4 times its size). If this size fits the buffer, it will convert event->general_query.str to utf8mb4, use the current buffer to store the converted query and advance the return pointer by query_length bytes. After we have the real query, we escape it. Otherwise (if the buffer is not big enough to convert the query), we will fall into the else branch. Here we calculate the size of the current query after escaping it and multiply by 4 (since we have not converted the query to utf8mb4) + the maximum size of incoming query if every character has to be converted to utf8mb4. We continue by escaping user / host / external_user / ip and db storing the required bytes into buflen_estimated. After this point we do a check to validate if the current buffer is big enough to hold buflen_estimated (the converted query + the full output of audit log entry). In case we don't have a big enough buffer, we set outlen to buflen_estimated and return NULL. Then audit_log_notify will allocate outlen bytes into its buffer and call audit_log_general_record for the 2nd time, this time with the resized buffer. Issue: Before 9fd57d3 we estimated buflen_estimated as twice of the output. If it was not enough we would increase the size of buffer by 4 times before calling audit_log_general_record the second time. After 9fd57d3 the else branch has been enhance to make a better estimate of the size of the output + size of the query, taking into consideration utf8mb4, however it missed two points: 1) buflen_estimated was doubling the size. This is no longer required since the else branch already take into account escaped utf8mb4 query + non escaped utf8mb4 query. 2) When estimating the size of the non escaped utf8mb4 query, it was wrongly multiplying its size by my_charset_utf8mb4_general_ci.mbmaxlen, however, this computation has already been performed before the if/else.

altmannmarcelo · 2020-10-05T19:23:04Z

https://ps56.cd.percona.com/job/percona-server-5.6-param/130/

altmannmarcelo · 2020-10-05T19:25:26Z

5.6 is a bit difficult to have a test case without memory instrumentation on PS. Where is the example of 5.7 https://github.com/altmannmarcelo/percona-server/commit/0db8c9eff5820c8f8a4950111f9e1f05fa7d9ef2 before and after the patch (audit code is exactly the same for 5.6 and 5.7):

Before

*************************** 1. row ***************************
       event_name: memory/sql/THD::variables
    current_count: 4
    current_alloc: 400.00 MiB
current_avg_alloc: 100.00 MiB
       high_count: 6
       high_alloc: 400.00 MiB
   high_avg_alloc: 66.67 MiB

After:

*************************** 1. row ***************************
       event_name: memory/sql/THD::variables
    current_count: 4
    current_alloc: 80.00 MiB
current_avg_alloc: 20.00 MiB
       high_count: 6
       high_alloc: 80.00 MiB
   high_avg_alloc: 13.33 MiB

percona-ysorokin

LGTM

percona-ysorokin · 2020-10-06T12:22:10Z

@altmannmarcelo Please, also double check with ASan and Valgrind. No need to run full set of suites. Just try to do the same measurements you did for regular Debug build under these checkers.

altmannmarcelo · 2020-10-06T14:30:41Z

@percona-ysorokin
Ran all tests related to audit_log (audit_log_buffer_size_overflow audit_log_charset audit_log_csv audit_log_default_db audit_log_filter_commands audit_log_filter_users audit_log_install_bug1435606 audit_log_install audit_log_json audit_log_long_records audit_log_many_connections audit_log_new audit_log_old audit_log_rotate audit_log_startup audit_log_syslog audit_log audit_log_threadpool) :

https://ps56.cd.percona.com/job/percona-server-5.6-asan-param/11/ - ASAN ok
https://ps56.cd.percona.com/job/percona-server-5.6-asan-param/12/ - Valgrind doesn't have any definitely lost

percona-ysorokin · 2020-10-07T07:24:49Z

Good to go then

altmannmarcelo requested a review from percona-ysorokin October 5, 2020 19:20

altmannmarcelo changed the title ~~Fixed PS-7346 - Huge memory usage for audit plugin when large queries…~~ 5.6 - Fixed PS-7346 - Huge memory usage for audit plugin when large queries… Oct 5, 2020

percona-ysorokin approved these changes Oct 6, 2020

View reviewed changes

altmannmarcelo merged commit 180af02 into percona:5.6 Oct 14, 2020

altmannmarcelo deleted the 5.6-PS-7346 branch October 14, 2020 14:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

5.6 - Fixed PS-7346 - Huge memory usage for audit plugin when large queries… #4073

5.6 - Fixed PS-7346 - Huge memory usage for audit plugin when large queries… #4073

altmannmarcelo commented Oct 5, 2020

altmannmarcelo commented Oct 5, 2020

altmannmarcelo commented Oct 5, 2020

percona-ysorokin left a comment

percona-ysorokin commented Oct 6, 2020

altmannmarcelo commented Oct 6, 2020

percona-ysorokin commented Oct 7, 2020

5.6 - Fixed PS-7346 - Huge memory usage for audit plugin when large queries… #4073

5.6 - Fixed PS-7346 - Huge memory usage for audit plugin when large queries… #4073

Conversation

altmannmarcelo commented Oct 5, 2020

altmannmarcelo commented Oct 5, 2020

altmannmarcelo commented Oct 5, 2020

percona-ysorokin left a comment

Choose a reason for hiding this comment

percona-ysorokin commented Oct 6, 2020

altmannmarcelo commented Oct 6, 2020

percona-ysorokin commented Oct 7, 2020