-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue with Query Cache in Doctrine ORM Causing Cache Memory Overflow #11112
Comments
Hey somebody! |
Can you detail the "how to reproduce" section with code? |
Looked into it thanks to your reproducer, and this is where the ORM does the hardcoding (actually, you can see it delegates that job to DBAL): orm/lib/Doctrine/ORM/Query/SqlWalker.php Line 573 in e50ae06
Now we need to figure out why it's done like this. |
Maybe it was just not possible to use parameter binding at the time, at least for some platforms: doctrine/dbal#3459 (comment) |
@beberlei please help |
I think our SECURITY.md should mention https://www.doctrine-project.org/policies/security.html I'm AFK and will be for a week, so can't help much here. The next step would be to look into implementing a DBAL API that allows to use placeholders with limit and offset, and then leverage it in ORM. |
Ooops! "If you think that you have found a security issue in Doctrine, please don't use the issue tracker in GitHub and don't publish it publicly. Instead, all security issues must be sent to..."
So what i should to do? Create issue in doctrine/dbal? |
Yeah, creating an issue in the DBAL seems good. And let me delete some of the last messages |
Its confusing, the firstResult+maxResults are in Could you paste the output of your test.php to the README.md of that repo? then i dont have to execute this myself. |
@beberlei some news? |
One challenge I see is that Maybe one way out would be to add new methods that allow to set parameter names that shall be used in When choosing parameter names in an automated fashion, we need to take care that they are unique in case people do subqueries/subselects, to avoid parameter name collisions. At the same time, the names must not be random, otherwise they'd mess up the query cache as well. So, like predictable, unique parameter names used in queries that may not fully be constructed at the time the name is needed? |
Regarding the question why |
As I've said in the DBAL issue I don't believe this can easily be fixed or even prepared in DBAL, and then we'd also need similar changes here in ORM. So I'd like to suggest another route: Would it be possible to put (through a dedicated, additional interface?) the output walker into a dedicated "mode" where it uses a (random) placeholder string instead of the SQL expression for limit/offset handling? Then keep the placeholder name together with the unfinished SQL statement in the DQL->SQL cache, and fill in platform-specific limit/offset expressions at a later time, after the parserResult has been taken from the cache? Cache key computation would probably need to know if this is going to work and only conditionally include offset/limit. Things like Update: Due to the way |
What cache are we talking about, the query (DQL->SQL) cache or the result cache? |
@mpdude the query (DQL->SQL) cache |
The query cache does not need this theoretically, because we can call modifiyLimit... on the resulting sql coming from the cache. This should not even be a problem with paginator, because that has query hints that change the cache id |
Do we consider the ParserResult class or the AbstractSqlExecutor and subclasses as part of our public API? The SqlWalker for sure is, it is subclassed in a documentation example. |
Hey! how is it going? |
I need to find spare time to think through Benjamin‘s review 😉 |
@mpdude so you are only who is an able to fix it? |
Half year passed. do you plan to fix it? |
I‘d love to fix it since I already put much effort into it and I think it’s feasible. Also I still have this item on my personal to-do list. However, unless being paid for it, I’d have to do this in my spare free time, which is rather limited at the moment 😔. I’ll see what I can do, but no promises given. |
…1188) * Add a test covering the #11112 issue * Add new OutputWalker and SqlFinalizer interfaces * Add a SingleSelectSqlFinalizer that can take care of adding offset/limit as well as locking mode statements to a given SQL query. Add a FinalizedSelectExecutor that executes given, finalized SQL statements. * In SqlWalker, split SQL query generation into the two parts that shall happen before and after the finalization phase. Move the part that generates "pre-finalization" SQL into a dedicated method. Use a side channel in SingleSelectSqlFinalizer to access the "finalization" logic and avoid duplication. * Fix CS violations * Skip the GH11112 test while applying refactorings * Avoid a Psalm complaint due to invalid (?) docblock syntax * Establish alternate code path - queries can obtain the sql executor through the finalizer, parser knows about output walkers yielding finalizers * Remove a possibly premature comment * Re-enable the #11112 test * Fix CS * Make RootTypeWalker inherit from SqlOutputWalker so it becomes finalizer-aware * Update QueryCacheTest, since first/max results no longer need extra cache entries * Fix ParserResultSerializationTest by forcing the parser to produce a ParserResult of the old kind (with the executor already constructed) * Fix WhereInWalkerTest * Update lib/Doctrine/ORM/Query/Exec/PreparedExecutorFinalizer.php Co-authored-by: Grégoire Paris <postmaster@greg0ire.fr> * Fix tests * Fix a Psalm complaint * Fix a test * Fix CS * Make the NullSqlWalker an instance of SqlOutputWalker * Avoid multiple cache entries caused by LimitSubqueryOutputWalker * Fix Psalm complaints * Fix static analysis complaints * Remove experimental code that I committed accidentally * Remove unnecessary baseline entry * Make AddUnknownQueryComponentWalker subclass SqlOutputWalker That way, we have no remaining classes in the codebase subclassing SqlWalker but not SqlOutputWalker * Use more expressive exception classes * Add a deprecation message * Move SqlExecutor creation to ParserResult, to minimize public methods available on it * Avoid keeping the SqlExecutor in the Query, since it must be generated just in time (e. g. in case Query parameters change) * Address PHPStan complaints * Fix tests * Small refactorings * Add an upgrade notice * Small refactorings * Update the Psalm baseline * Add a missing namespace import * Update Psalm baseline * Fix CS * Fix Psalm baseline --------- Co-authored-by: Grégoire Paris <postmaster@greg0ire.fr>
#11188 has just been merged into 2.20.x, and I think it fixes this issue here. |
Bug Report
Summary
I am experiencing an issue with Doctrine ORM's query caching mechanism, leading to Redis memory overflow. This is primarily due to the way 'limit' and 'offset' values are handled in cached queries.
Current Behavior
In my production environment, Doctrine ORM caches each query including 'limit' and 'offset' as literal values in the SQL statement, rather than as parameters. As a result, every paginated request results in a new cache entry, with the only difference being the varying 'limit' and 'offset' values. This leads to an excessive number of cache entries in Redis.
How to Reproduce
See test code here
Expected Behavior
The expected behavior is for Doctrine ORM to treat 'limit' and 'offset' as parameters rather than literals in the query. This approach would allow for a single cache entry to be applicable to multiple paginated requests of the same query, thereby reducing the number of cache entries and preventing cache memory overflow.
Additional Information
Redis version: 7
PHP version: 8.2
DB: MySQL 8
Environment: Production
Addressing this issue is critical for optimizing query caching efficiency in Doctrine ORM, especially for applications that heavily rely on pagination.
The text was updated successfully, but these errors were encountered: