Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPCache wasted memory increases on each request #9917

Closed
icedevelopment opened this issue Jul 19, 2022 · 11 comments · Fixed by #10434
Closed

OPCache wasted memory increases on each request #9917

icedevelopment opened this issue Jul 19, 2022 · 11 comments · Fixed by #10434

Comments

@icedevelopment
Copy link

Bug Report

Q A
BC Break no
Version 2.12.3

Summary

The OPCache wasted memory increases on each request if a paginator is used.

Current behavior

Whenever you make a request on a controller with a Doctrine\ORM\Query that uses Doctrine\ORM\Tools\Pagination\Paginator#getIterator(), the DQL to SQL parser cache is redone each time.
Seems that this is bad a side effect introduced in commit 023e946 which was merged in 2.6.5 so every version between this and 2.12.3 should be impacted.

How to reproduce

  1. Install a Symfony app with Doctrine with the following standard cache configuration:
# config/packages/prod/doctrine.yaml
doctrine:
    orm:
        auto_generate_proxy_classes: false
        metadata_cache_driver:
            type: pool
            pool: doctrine.system_cache_pool
        query_cache_driver:
            type: pool
            pool: doctrine.system_cache_pool
        result_cache_driver:
            type: pool
            pool: doctrine.result_cache_pool

framework:
    cache:
        pools:
            doctrine.result_cache_pool:
                adapter: cache.app
            doctrine.system_cache_pool:
                adapter: cache.system
  1. Create a controller that uses a Doctrine\ORM\Query with a paginator (i.e. knplabs/knp-paginator-bundle).

  2. Set the environment to prod in .env (APP_ENV=prod and APP_DEBUG=0).

  3. Make requests on the controller and watch opcache_get_status()['memory_usage']['wasted_memory'] increase each time until OPCache is full and resets itself.

The expireQueryCache flag is always set to true here:

$whereInQuery->expireQueryCache();

This makes Doctrine reparse the DQL each time:

if (! $this->expireQueryCache && $cacheItem->isHit()) {
$cached = $cacheItem->get();
if ($cached instanceof ParserResult) {
// Cache hit.
$this->parserResult = $cached;
return $this->parserResult;
}
}
// Cache miss.
$parser = new Parser($this);

Then Symfony will recreate the exact same file in its cache in var/cache/prod/pools/, invalidate and recompile it into OPCache:
https://github.com/symfony/cache/blob/7d8415956df68c8dcbc9468e119945e39bacead1/Adapter/PhpFilesAdapter.php#L254-L259

Expected behavior

The OPCache wasted_memory doesn't increase on each request.

@josephLaurent
Copy link

I got exactly the same issue, for me removing the following lines in the symfony config fix the issue:

doctrine:
    orm:
        auto_generate_proxy_classes: false
        metadata_cache_driver:
            type: pool
            pool: doctrine.system_cache_pool
        query_cache_driver:
            type: pool
            pool: doctrine.system_cache_pool
        result_cache_driver:
            type: pool
            pool: doctrine.result_cache_pool

Also, put this in the php.ini file prevent using the opcache and so fix the issue as well:

opcache.file_cache = NULL
opcache.file_cache_only = 0

But this are only quick fix that have impact on performances.
Is there any update on a possible fix on doctrine ?

@greg0ire
Copy link
Member

I would not expect the metadata cache or the query cache to grow indefinitely. I would try removing only the line about the result cache. How do you use that cache inside your app, by the way. Regarding the news, we did not discuss this issue in private or in other threads AFAIK.

@josephLaurent
Copy link

I also use doctrine 2.12.3 with symfony. The case where the cache doesn't stop growing is when I use doctrine to query the database with a paginator in production environment. If fact, I'm facing the same issue than @icedevelopment in exactly the same case scenario.

@greg0ire
Copy link
Member

greg0ire commented Aug 11, 2022

Well let's say your result set is millions of records. Why would you expect the cache not to grow? The result cache is not meant to be used on each and every query you make through Doctrine ORM, and isn't by default (you have to call useResultCache explicitly IIRC. The cache system is not going to magically hold all the data in a table for free.

@icedevelopment
Copy link
Author

@greg0ire The issue here is with the query cache (DQL to SQL transformation), not with the result cache.
For every SQL query used in an HTTP request, Doctrine will always do the DQL to SQL parsing and create a new query cache file with the same contents but a new unique filename everytime. This will make your var/cache/prod/pools/ grow on every HTTP request and wil increase your OPCache wasted memory all the way until it is forced to automatically restart.

@icedevelopment
Copy link
Author

@greg0ire have you been able to have a look at this issue? This looks like a pretty bad bug performance-wise.

@greg0ire
Copy link
Member

greg0ire commented Sep 18, 2022

I was on holidays and I don't use the ORM myself, so no. After looking into this a bit, I think a solution could be to use a special implementation of the query cache driver just for the WHERE IN query, instead of calling expireQueryCache. Such an implementation is provided by symfony/cache, but not by psr/cache, sadly.

Can you try calling $whereInQuery->setQueryCache(new Symfony\Component\Cache\Adapter\NullAdapter()) and let me know if that mitigates the issue? If it does, we will need to figure out what to do (build our own null implementation of CacheItemPoolInterface or reuse one).

@icedevelopment
Copy link
Author

Hey @greg0ire thank you for your help.

I've just tried calling $whereInQuery->setQueryCache(new \Symfony\Component\Cache\Adapter\NullAdapter()) before $whereInQuery->expireQueryCache(); and I can confirm that it does indeed mitigate the issue.

@greg0ire
Copy link
Member

greg0ire commented Oct 3, 2022

Thanks, that's great news! Would you be willing to work on a PR to use this trick instead of the expireQueryCache one?

icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 4, 2022
…expireQueryCache() when using Paginator.

This fixes a bug where Doctrine would recreate the DQL to SQL parsing cache on every request which would fill up OPCache with duplicates.
@see doctrine#9917
icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 4, 2022
This fixes a bug where Doctrine would recreate the DQL to SQL parsing
cache on every request which would fill up OPCache with duplicates.
We now use a "Null" cache implementation instead of calling
Query::expireQueryCache().
@see doctrine#9917
@icedevelopment
Copy link
Author

Thanks, that's great news! Would you be willing to work on a PR to use this trick instead of the expireQueryCache one?

I just opened #10095.
I created a new NullCacheItemPool based on Symfony\Component\Cache\Adapter\NullAdapter.
Let me know what you think.

icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 4, 2022
This fixes a bug where Doctrine would recreate the DQL to SQL parsing
cache on every request which would fill up OPCache with duplicates.
We now use a "Null" cache implementation instead of calling
Query::expireQueryCache().
@see doctrine#9917
icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 4, 2022
This fixes a bug where Doctrine would recreate the DQL to SQL parsing
cache on every request which would fill up OPCache with duplicates.
We now use a "Null" cache implementation instead of calling
Query::expireQueryCache().
@see doctrine#9917
icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 4, 2022
This fixes a bug where Doctrine would recreate the DQL to SQL parsing
cache on every request which would fill up OPCache with duplicates.
We now use a "Null" cache implementation instead of calling
Query::expireQueryCache().
@see doctrine#9917
icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 4, 2022
This fixes a bug where Doctrine would recreate the DQL to SQL parsing
cache on every request which would fill up OPCache with duplicates.
We now use a "Null" cache implementation instead of calling
Query::expireQueryCache().
@see doctrine#9917
icedevelopment pushed a commit to icedevelopment/orm that referenced this issue Oct 5, 2022
This fixes a bug where Doctrine would recreate the DQL to SQL parsing
cache on every request which would fill up OPCache with duplicates.
We now use a "Null" cache implementation instead of calling
Query::expireQueryCache().
@see doctrine#9917
@mpdude
Copy link
Contributor

mpdude commented Jan 21, 2023

A suggested fix is in #10434.

mpdude added a commit to mpdude/doctrine2 that referenced this issue Jan 22, 2023
This PR prevents the Paginator from causing OpCache "wasted memory" to increase _on every request_ when used with Symfony's `PhpFilesAdapter` as the cache implementation for the query cache.

Depending on configured thresholds, wasted memory this will either cause periodic opcache restarts or running out of memory and not being able to cache additional scripts ([Details](https://tideways.com/profiler/blog/fine-tune-your-opcache-configuration-to-avoid-caching-suprises)).

Fixes doctrine#9917, closes doctrine#10095.

There is a long story (doctrine#7820, doctrine#7821, doctrine#7837, doctrine#7865) behind how the Paginator can take care of DBAL type conversions when creating the pagination query. This conversion has to transform identifier values before they will be used as a query parameter, so it has to happen every time the Paginator is used.

For reasons, this conversion happens inside `WhereInWalker`. Tree walkers like this are used only during the DQL parsing/AST processing steps. Having a DQL query in the query cache short-cuts this step by fetching the parsing/processing result from the cache.

So, to make sure the conversion happens also with the query cache being enabled, this line

https://github.com/doctrine/orm/blob/1753d035005c1125c9fb4855c3fa629341e5734d/lib/Doctrine/ORM/Tools/Pagination/Paginator.php#L165

was added in doctrine#7837. It causes `\Doctrine\ORM\Query::parse()` to re-parse the query every time, but will also put the result into the query cache afterwards.

At this point, the setup described in doctrine#9917 – which, to my knowledge, is the default in Symfony + DoctrineBundle projects – will ultimately bring us to this code:

https://github.com/symfony/symfony/blob/4b3391725f2fc4a072e776974f00a992cbc70515/src/Symfony/Component/Cache/Adapter/PhpFilesAdapter.php#L248-L249

When writing a cache item with an already existing key, the driver has to make sure the opcache will honor the changed PHP file. This is what causes _wasted memory_ to increase.

Instead of using `\Doctrine\ORM\Query::expireQueryCache()`, which will force `\Doctrine\ORM\Query::parse()` to parse the query again before putting it into the cache, use `\Doctrine\ORM\Query::useQueryCache(false)`. The subtle difference is the latter will not place the processed query in the cache in the first place.

A test case is added to check that repeated use of the paginator does not call the cache to update existing keys. That should suffice to make sure we're not running into the issue, while at the same time not complicating tests by using the `PhpFilesAdapter` directly.

Note that in order to observe the described issue in tests, you will need to use the `PhpFilesDriver` and also make sure that OpCache is enabled and also activated for `php-cli` (which is running the unit tests).

This particular subquery generated/used by the Paginator is not put into the query cache. The DQL parsing/to-SQL conversion has to happen _every time_ the Paginator is used.

This, however, was already the case before this PR. In other words, this PR only changes that we do not store/update the cached result every time, but instead completely omit caching the query.
mpdude added a commit to mpdude/doctrine2 that referenced this issue Jan 23, 2023
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again.

When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821).

In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available.

The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`.

The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917).

This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`.

`RootTypeWalker` uses a ~hack~  clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache.

With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used.

The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
derrabus pushed a commit that referenced this issue Jan 23, 2023
This PR prevents the Paginator from causing OpCache "wasted memory" to increase _on every request_ when used with Symfony's `PhpFilesAdapter` as the cache implementation for the query cache.

Depending on configured thresholds, wasted memory this will either cause periodic opcache restarts or running out of memory and not being able to cache additional scripts ([Details](https://tideways.com/profiler/blog/fine-tune-your-opcache-configuration-to-avoid-caching-suprises)).

Fixes #9917, closes #10095.

There is a long story (#7820, #7821, #7837, #7865) behind how the Paginator can take care of DBAL type conversions when creating the pagination query. This conversion has to transform identifier values before they will be used as a query parameter, so it has to happen every time the Paginator is used.

For reasons, this conversion happens inside `WhereInWalker`. Tree walkers like this are used only during the DQL parsing/AST processing steps. Having a DQL query in the query cache short-cuts this step by fetching the parsing/processing result from the cache.

So, to make sure the conversion happens also with the query cache being enabled, this line

https://github.com/doctrine/orm/blob/1753d035005c1125c9fb4855c3fa629341e5734d/lib/Doctrine/ORM/Tools/Pagination/Paginator.php#L165

was added in #7837. It causes `\Doctrine\ORM\Query::parse()` to re-parse the query every time, but will also put the result into the query cache afterwards.

At this point, the setup described in #9917 – which, to my knowledge, is the default in Symfony + DoctrineBundle projects – will ultimately bring us to this code:

https://github.com/symfony/symfony/blob/4b3391725f2fc4a072e776974f00a992cbc70515/src/Symfony/Component/Cache/Adapter/PhpFilesAdapter.php#L248-L249

When writing a cache item with an already existing key, the driver has to make sure the opcache will honor the changed PHP file. This is what causes _wasted memory_ to increase.

Instead of using `\Doctrine\ORM\Query::expireQueryCache()`, which will force `\Doctrine\ORM\Query::parse()` to parse the query again before putting it into the cache, use `\Doctrine\ORM\Query::useQueryCache(false)`. The subtle difference is the latter will not place the processed query in the cache in the first place.

A test case is added to check that repeated use of the paginator does not call the cache to update existing keys. That should suffice to make sure we're not running into the issue, while at the same time not complicating tests by using the `PhpFilesAdapter` directly.

Note that in order to observe the described issue in tests, you will need to use the `PhpFilesDriver` and also make sure that OpCache is enabled and also activated for `php-cli` (which is running the unit tests).

This particular subquery generated/used by the Paginator is not put into the query cache. The DQL parsing/to-SQL conversion has to happen _every time_ the Paginator is used.

This, however, was already the case before this PR. In other words, this PR only changes that we do not store/update the cached result every time, but instead completely omit caching the query.
mpdude added a commit to mpdude/doctrine2 that referenced this issue Jan 23, 2023
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again.

When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821).

In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available.

The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`.

The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917).

This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`.

`RootTypeWalker` uses a ~hack~  clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache.

With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used.

The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
mpdude added a commit to mpdude/doctrine2 that referenced this issue Jan 23, 2023
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again.

When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821).

In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available.

The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`.

The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917).

This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`.

`RootTypeWalker` uses a ~hack~  clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache.

With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used.

The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
mpdude added a commit to mpdude/doctrine2 that referenced this issue Jan 23, 2023
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again.

When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821).

In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available.

The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`.

The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917).

This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`.

`RootTypeWalker` uses a ~hack~  clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache.

With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used.

The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
mpdude added a commit to mpdude/doctrine2 that referenced this issue Jan 23, 2023
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again.

When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821).

In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available.

The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`.

The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917).

This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`.

`RootTypeWalker` uses a ~hack~  clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache.

With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used.

The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants