Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make diacritic_sensitive parameter optional to support $text operator on Cosmos DB #1089

Merged
merged 3 commits into from
Dec 18, 2024

Conversation

mykolaskrynnyk
Copy link
Contributor

Closes #1088.

This fixes [the issue](BeanieODM#1088) with Azure CosmosDB for MongoDB which does not support diacritic sensitivity.
@mykolaskrynnyk
Copy link
Contributor Author

To use text search with Azure CosmosDB for MongoDB, set diacritic_sensitive to None, e.g.,

import asyncio
import os

from beanie import Document, Indexed, init_beanie
from beanie.operators import Text
from motor.motor_asyncio import AsyncIOMotorClient
from pymongo import TEXT


class Dish(Document):
    name: Indexed(str, index_type=TEXT)


async def example():
    client = AsyncIOMotorClient(os.environ["COSMOSDB_CONNECTION"])
    await init_beanie(database=client["test"], document_models=[Dish])
    names = [
        "Ravioli",
        "Spaghetti al pomodoro",
        "Tagliatelle al pomodoro",
    ]
    for name in names:
        dish = Dish(name=name)
        await dish.insert()
    result = await Dish.find(Text("pomodoro", diacritic_sensitive=None)).to_list()
    print("result", result)


if __name__ == "__main__":
    asyncio.run(example())

Copy link
Member

@adeelsohailahmed adeelsohailahmed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mykolaskrynnyk, thank you for your contribution!

Please update the docstring to explain why diacritic_sensitive would be set to None.

Please update this test to address the case where the diacritic_sensitive would be set to None. You could test this case just below this block.
https://github.com/mykolaskrynnyk/beanie/blob/0773aed151acb30dfcc4448cdbc26326c7366a53/tests/odm/operators/find/test_evaluation.py#L52C1-L59C6

@adeelsohailahmed adeelsohailahmed changed the title Fix(find): allow excluding the diacritic sensitivity parameter Make diacritic_sensitive parameter optional to support $text operator on Cosmos DB Dec 13, 2024
Copy link
Member

@adeelsohailahmed adeelsohailahmed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

@adeelsohailahmed adeelsohailahmed requested a review from a team December 13, 2024 19:46
Copy link
Contributor

@staticxterm staticxterm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mykolaskrynnyk , thank you for your contribution, it looks good.
Just one minor thing, according to MongoDB documentation, $language and $caseSensitive are optional as well (do not need to be provided).
Could you please make these optional as well? Can be a separate PR.
https://www.mongodb.com/docs/manual/reference/operator/query/text/

@staticxterm staticxterm requested a review from a team December 13, 2024 20:31
@mykolaskrynnyk
Copy link
Contributor Author

@staticxterm , it would've been better if all three parameters had been made optional and set to None in the original implementation of Text class. Right now, it is a bit of a legacy and consistency issue. language is already optional and None by default which is fine. diacritic_sensitive is made optional in this PR but both diacritic_sensitive and case_sensitive are False by default. The use case for diacritic_sensitive=None is to enable queries agains version 2 text indices (like Azure CosmosDB for MongoDB) but there is no clear-cut use case for explicitly setting case_sensitive to None except for delegating case sensitivity to server/index defaults, which again should've probably been the default behaviour anyway. To sum up, there are two options:

Option 1

I guess this is what you suggest:

# @@ class Text(BaseFindEvaluationOperator):
        search: str,
        language: Optional[str] = None,
        case_sensitive: Optional[bool] = False,
        diacritic_sensitive: Optional[bool] = False,
    ):

Which is a non-breaking change but it doesn't seem like there is any use-case for that at this point.

Option 2

I think this option would be a lot nicer (and work out of the box for use cases like version 2 text indices) but it does introduce a breaking change:

# @@ class Text(BaseFindEvaluationOperator):
        search: str,
        language: Optional[str] = None,
        case_sensitive: Optional[bool] = None,
        diacritic_sensitive: Optional[bool] = None,
    ):

Any additional thoughts on the topic?

@staticxterm
Copy link
Contributor

Hi @mykolaskrynnyk,
I agree that it would've been better if all 3 parameters were made optional and set to None from the start.
I guess I was proposing the first option then, keeping all the other parameters as they were with their defaults set to False, just making them all optional (and if they are None, exclude them from the query).
Later on, in a next major version of Beanie, we could I guess make this interface a bit better and make all of these parameters optional/not included, leaving only "search" as mandatory.

@mykolaskrynnyk
Copy link
Contributor Author

@staticxterm , I'll make another PR for case sensitivity to keep this one focused on the issue. In the meantime, can someone do the third review to get this out of the door?

Copy link

@Riverfount Riverfount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll review it after all change requests are solved.

@mykolaskrynnyk
Copy link
Contributor Author

@Riverfount , I don't think there are any outstanding change requests. Can you clarify what you mean?

@Riverfount
Copy link

Riverfount commented Dec 16, 2024

@Riverfount , I don't think there are any outstanding change requests. Can you clarify what you mean?

There is a "Review required" in the PR. Do you know if that is a change request to solve, or not?

@mykolaskrynnyk
Copy link
Contributor Author

@Riverfount , if you refer to the change request from @adeelsohailahmed, it has already been solved in subsequent commits and approved accordingly.

Copy link

@Riverfount Riverfount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job

@adeelsohailahmed adeelsohailahmed merged commit a2e6a5a into BeanieODM:main Dec 18, 2024
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Text search in Azure CosmosDB fails due to the lack of diacritic sensitivity support
4 participants