Parse 4: Case-insensitive indexes cause huge performance problems for regex queries #6559
Comments
thanks for the great issue!
Is this something you'd be willing to take on?
@acinader It has been an interesting couple days of learning for us and we definitely wanted to pass on the knowledge to the community. Regarding 1, since the
I don't have an opinion yet, but probably, right? cc: @davimacedo
Cool. We will definitely handle creating the PR once everyone agrees on the correct approach.
@MobileVet do you have a strong opinion? After having thought about it for an hour, I think that adding a hint in the limited number of cases where it'll be important would be a good idea -- especially if you guys think you can bite it off ;).
@acinader I am never a fan of 'special cases' or 'automagical' decisions as it were, but we have had 3 people thinking about this for the last 2 days and that is the best solution we could come up with. One clearly NEVER wants to use the new case insensitive index for a regex... the results are abysmal. Without forcing the There are only 2 keys where this matters so that limits the scope some as you said. My only question, do you think we should still allow a passed in
I agree with you. I definitely think that we should let a passed in hint override. It doesn't introduce a security weakness (that we know of) and there's no other reason not to let the user shoot herself in the foot if wanted (or, more optimistically, there is a good reason, like another index that should be used and has been reasoned about.) Lmk if you disagree.
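The behavior being agreed on here (auto-hint regex queries on the two affected keys, but let a passed-in hint override) can be sketched roughly as below. This is only an illustration under stated assumptions: `chooseHint` and `CASE_SENSITIVE_INDEX` are hypothetical names, not parse-server code, and the index names are the ones quoted in the issue description.

```javascript
// Hypothetical sketch, NOT parse-server's actual implementation.
// Index names are taken from the issue: email_1 and username_1 are the
// case-sensitive indexes that regex queries should prefer.
const CASE_SENSITIVE_INDEX = { email: 'email_1', username: 'username_1' };

function chooseHint(className, where, userHint) {
  // A hint passed in by the caller always wins.
  if (userHint !== undefined && userHint !== null) {
    return userHint;
  }
  // Only the _User class has the case-insensitive indexes discussed here.
  if (className !== '_User' || where === null || typeof where !== 'object') {
    return undefined;
  }
  for (const field of Object.keys(CASE_SENSITIVE_INDEX)) {
    const constraint = where[field];
    const isRegex =
      constraint instanceof RegExp ||
      (constraint !== null &&
        typeof constraint === 'object' &&
        '$regex' in constraint);
    if (isRegex) {
      // Steer the regex query away from the case-insensitive index.
      return CASE_SENSITIVE_INDEX[field];
    }
  }
  // No regex on an affected key: leave index selection to MongoDB.
  return undefined;
}
```

Because the index names are assumed rather than looked up, this sketch issues no extra database queries; whether that assumption is safe is part of what the thread goes on to debate.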
Hi guys, since #6322, it is already possible to pass a hint to the query and avoid the case insensitive index, right? Or am I missing something?
Hi @davimacedo - I think the issue boils down to
In my mind, the only reason NOT to improve parse-server’s handling of regex queries when indexes are involved would be if there were a case where a user would prefer to have the regex query use the case-insensitive index, but the MongoDB docs seem to discourage that pretty clearly.
Also a suggested implementation of this change has been made here:
I understand the problem, but I am not a big fan of the solution either. In https://github.com/parse-community/parse-server/pull/6600/files#diff-271c623e43af83bc3d858c5ad248451aR117 and https://github.com/parse-community/parse-server/pull/6600/files#diff-271c623e43af83bc3d858c5ad248451aR122 we are doing two more db queries just to know if the indexes exist. If I am doing a case insensitive regex, wouldn't the insensitive index perform better? Maybe it would be better to just remove this insensitive index creation from Parse Server (since it is not recommended by MongoDB) and let developers choose which indexes they want to create.
I see the pros and cons of creating this index, I think the issue is that it becomes the index of choice by Mongo when it should not be. Stepping back just a bit, I think it is important to recognize that there is NO situation where using the case insensitive index is the correct solution when doing a @acinader and our team agreed on that and if we need to we should revisit that discussion first. Once we are on the same page regarding the issue, then we can discuss the solution more clearly. I could definitely see the potential for skipping the index queries since we know they exist (as we create them on startup) and we know that using the regular index is ALWAYS preferable to the case insensitive index when doing
I really don't like adding corner cases like this on the code base even more when we already have the
@davimacedo I agree that corner cases are not ideal and we definitely want to make sure that we don't create additional issues when trying to patch this one. In response to your two suggestions:
@pocketcolin thanks for your comments. @dplewis @acinader any additional thoughts here?
Thanks for the continued discussion on this everyone. Our goal is to improve Parse Server and ensure that others don't experience the pain we did when updating from 3.9 to 4.x. That pain was a result of the new index and our use of We would really appreciate further thoughts from everyone, including @acinader who responded extremely quickly to our initial post. Hopefully we can all agree on the appropriate changes soon and get them pulled in ASAP so that the entire community can benefit.
Sorry to have missed out on this for so long... My first thought is: is there a better solution to #5634 that would allow us to remove the case insensitive index (ci) that is causing the problem? Fixing the underlying issue was important to me as it was causing a set of end-user problems. After reading Davi's comments, my biggest remaining concern is how we know we can safely 'auto-hint' to the ci index.
Which, if I am reading it right, will fail if you add anything other than email or username. My uncertain inclination at this point, which contradicts my earlier discussion with Rob, is to document the Query's regex method to explain the problem and suggest the appropriate hint to use rather than introducing the proposed corner-case fix?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Issue Description
TL;DR
parse@2.12.0
be published to npm (the change is already on master)?
TS;WM
As documented, Parse 4 now creates case_insensitive_email and case_insensitive_username indexes in addition to the email_1 and username_1 indexes that it has always created. These case-insensitive indexes significantly help the performance for queries that use the MongoCollection.caseInsensitiveCollation() 👍
However...
The MongoDB docs for regex queries mention that regex queries are not collation-aware and unable to utilize case-insensitive indexes:
At first this doesn't seem to be a problem...because why would we tell the regex query to use the case-insensitive index? Shouldn't the regex query just continue to use the case-sensitive index, or no index at all? Well...sadly, for whatever reason, when both a case-insensitive index and a case-sensitive index exist and a regex query runs, the MongoDB Query Plan seems to select the case-insensitive index, and the resulting query time is significantly worse. We were seeing regex queries running ~4x slower, and then as CPU use and query backlog increased this effectively crashed our server.
We're in conversation with the Mongo community about why this is the case, as it doesn't seem like the intended behavior. One possibility is that the regex query is selecting the most recently created index for the key being queried, regardless of its collation. We attempted to change the selected index by specifying a collation manually on the regex query, but this didn't seem to have any effect. Whether the query optimizer is behaving as expected is still unknown.
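One way to confirm which index the planner picked is to inspect the query's explain() output. The sketch below assumes the classic explain shape, where `queryPlanner.winningPlan` is a tree of stages linked by `inputStage` / `inputStages` and an `IXSCAN` stage carries an `indexName`; other server versions may nest this differently. The helper name `indexesUsed` and the sample document are hypothetical, hand-written for illustration, and not captured from a real server.

```javascript
// Walks an explain() winning plan and collects the names of scanned indexes.
// Assumes the classic plan shape (stage / inputStage / inputStages).
function indexesUsed(plan, found = []) {
  if (plan === null || typeof plan !== 'object') return found;
  if (plan.stage === 'IXSCAN' && typeof plan.indexName === 'string') {
    found.push(plan.indexName);
  }
  if (plan.inputStage) indexesUsed(plan.inputStage, found);
  if (Array.isArray(plan.inputStages)) {
    for (const child of plan.inputStages) indexesUsed(child, found);
  }
  return found;
}

// Abridged, hand-written sample shaped like explain() output (illustrative only).
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: 'FETCH',
      inputStage: { stage: 'IXSCAN', indexName: 'case_insensitive_email' },
    },
  },
};

console.log(indexesUsed(sampleExplain.queryPlanner.winningPlan));
```

If a regex query on email reports case_insensitive_email here rather than email_1, it is hitting the problem described in this issue.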
Some good news...
We do seem to have found a workaround. Using the hint() cursor method, we were able to tell our regex queries to use the email_1 index instead of the case_insensitive_email index. On the parse side, this looks like:
This returned our regex query times back to normal and everything was great 👍
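A minimal sketch of what such a Parse-side workaround might look like, assuming the Parse JS SDK's Parse.Query with its matches() and hint() methods (hint support is the parse@2.12.0 feature mentioned in this issue). The helper name `regexQueryWithHint` is hypothetical, and the query class is passed in as a parameter so the sketch does not depend on the SDK being installed.

```javascript
// Hypothetical helper (not part of the Parse SDK): builds a regex query on
// `field` and pins it to the case-sensitive index via hint().
function regexQueryWithHint(QueryClass, className, field, pattern) {
  // Index names as quoted in this issue: email_1, username_1.
  const indexName = `${field}_1`;
  const query = new QueryClass(className);
  query.matches(field, pattern); // regex constraint
  query.hint(indexName);         // avoid the case_insensitive_* index
  return query;
}

// With the real SDK this would be called roughly as follows
// (assumes parse >= 2.12.0 and an initialized Parse client):
//   const Parse = require('parse/node');
//   const query = regexQueryWithHint(Parse.Query, Parse.User, 'email', /^jane/i);
//   const users = await query.find();
```

The same pattern applies to username by passing 'username', which resolves to the username_1 index.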
So...I'm opening this issue to
hint() within parse-server for regex queries, so that users don't have to do it themselves
Finally, as a corollary to this, we also wanted to ask about the status of parse-server's version of parse being updated to 2.12.0...we see that the dependency has been updated, but it hasn't been published to npm, yet? Where could we find information about when to expect changes merged to master getting published to npm? The reason we ask is that our workaround using hint() can only work with parse@2.12.0 because that is the version that added support for Parse.Query.hint.
Steps to reproduce
find query using regex against the username or email fields.
case_insensitive_email and case_insensitive_username indexes
For us, doing the above led to an almost 4x improvement in query time.
Expected Results
We expected the regex queries to completely ignore the case-insensitive indexes, and to have the same performance as in parse-server@3.9.0
Actual Outcome
The regex queries are attempting to use the case-insensitive indexes, and are performing ~4x worse.
Environment Setup
Server
Database
Logs/Trace
Can add more detail here if there are issues reproducing the issue