Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#1521: Configure which fields are indexed in the Ditto search index per namespace pattern #1870

Merged
merged 8 commits into from
Jan 24, 2024

Conversation

an1310
Copy link
Contributor

@an1310 an1310 commented Jan 22, 2024

Long time coming.

Resolves: #1521

Copy link
Member

@thjaeckle thjaeckle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @an1310 for that awesome PR.
Looking already really good - I did a quick round of comments to what I found.

What is still missing before getting merged:

  • an "empty" default configuration in the search.conf file
    • containing e.g. commented out example of how the config for a concrete namespace would look like
  • documentation on how to enable the SearchIndexingSignalEnrichmentFacadeProvider, maybe both:
    • in search.conf as commented out alternative to the default DittoCachingSignalEnrichmentFacadeProvider for ditto.extensions.caching-signal-enrichment-facade-provider
    • and as a subsection in the Operating Ditto docs

And one optional wish from me 😁
I just figured out that the entity-creation configuration already supports adding wildcards.
So maybe we could integrate that in the same way, see:

private DefaultCreationRestrictionConfig(final ConfigWithFallback configWithFallback) {
this.resourceTypes = Set.copyOf(configWithFallback.getStringList(
CreationRestrictionConfigValues.RESOURCE_TYPES.getConfigPath()
));
this.namespacePatterns = compile(List.copyOf(configWithFallback.getStringList(
CreationRestrictionConfigValues.NAMESPACES.getConfigPath())
));
this.authSubjectPatterns = compile(List.copyOf(configWithFallback.getStringList(
CreationRestrictionConfigValues.AUTH_SUBJECTS.getConfigPath())
));
}
private static List<Pattern> compile(final List<String> patterns) {
return patterns.stream()
.map(LikeHelper::convertToRegexSyntax)
.filter(Objects::nonNull)
.map(Pattern::compile)
.toList();
}

So basically the getNamespace would return a Pattern instead which is created during the config parsing.

An additional benefit of that would be that is configured commonly as the "Restriction entity creation".

What do you think about that?

@thjaeckle
Copy link
Member

Feature wise it works really great 👍
I just manually tested and only the defined fields were inserted into the search index, awesome 🏅

@thjaeckle thjaeckle added this to the 3.5.0 milestone Jan 23, 2024
@an1310
Copy link
Contributor Author

an1310 commented Jan 23, 2024

Thanks a lot @an1310 for that awesome PR. Looking already really good - I did a quick round of comments to what I found.

What is still missing before getting merged:

  • an "empty" default configuration in the search.conf file

    • containing e.g. commented out example of how the config for a concrete namespace would look like
  • documentation on how to enable the SearchIndexingSignalEnrichmentFacadeProvider, maybe both:

    • in search.conf as commented out alternative to the default DittoCachingSignalEnrichmentFacadeProvider for ditto.extensions.caching-signal-enrichment-facade-provider

    • and as a subsection in the Operating Ditto docs

And one optional wish from me 😁 I just figured out that the entity-creation configuration already supports adding wildcards. So maybe we could integrate that in the same way, see:

private DefaultCreationRestrictionConfig(final ConfigWithFallback configWithFallback) {
this.resourceTypes = Set.copyOf(configWithFallback.getStringList(
CreationRestrictionConfigValues.RESOURCE_TYPES.getConfigPath()
));
this.namespacePatterns = compile(List.copyOf(configWithFallback.getStringList(
CreationRestrictionConfigValues.NAMESPACES.getConfigPath())
));
this.authSubjectPatterns = compile(List.copyOf(configWithFallback.getStringList(
CreationRestrictionConfigValues.AUTH_SUBJECTS.getConfigPath())
));
}
private static List<Pattern> compile(final List<String> patterns) {
return patterns.stream()
.map(LikeHelper::convertToRegexSyntax)
.filter(Objects::nonNull)
.map(Pattern::compile)
.toList();
}

So basically the getNamespace would return a Pattern instead which is created during the config parsing.

An additional benefit of that would be that is configured commonly as the "Restriction entity creation".

What do you think about that?

I think this suggestion is great. If it's okay, can we do this in a separate issue and I can contribute it?

@thjaeckle
Copy link
Member

I think this suggestion is great. If it's okay, can we do this in a separate issue and I can contribute it?

I would like to add it to 3.5.0 still - Because otherwise it would again take ~3 months for the next minor release of Ditto.
Do you think you can make it within the next 2 days? :D

Then it would not really matter if new issue and PR or use this one ..

@thjaeckle
Copy link
Member

One TODO to check (I can also check - but maybe you are quicker):

  • what happens when you search for a field which is not indexed?
    • do you find nothing (I assume)
    • is there an error (there probably could be - and maybe even should be) indicating that this field is not indexed?

@an1310 an1310 requested a review from thjaeckle January 23, 2024 21:58
@thjaeckle thjaeckle changed the title Ditto 1521: Initial submission of scoping search fields for things. #1521: Configure which fields are indexed in the Ditto search index per namespace pattern Jan 24, 2024
Copy link
Member

@thjaeckle thjaeckle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@an1310 I did a review commit, hope you don't mind - providing:

  • added a cache to SearchIndexingSignalEnrichmentFacade in order to only evaluate "patterns" once for a given namespace
  • removed copy&pasted unit tests in SearchIndexingSignalEnrichmentFacadeTest by adding another abstract test class AbstractCachingSignalEnrichmentFacadeTest
  • added a unit test testing selection of JsonFieldSelectors based on different namespaces
  • minor cleanup and formatting
  • enhanced documentation about how to configure the indexed namespaces via system properties

For me, this would now good to be merged 👍

@an1310
Copy link
Contributor Author

an1310 commented Jan 24, 2024

I don't mind in the slightest. Glad to have helped.

* added a cache to SearchIndexingSignalEnrichmentFacade in order to only evaluate "patterns" once for a given namespace
* removed copy&pasted unit tests in SearchIndexingSignalEnrichmentFacadeTest by adding another abstract test class AbstractCachingSignalEnrichmentFacadeTest
* added a unit test testing selection of JsonFieldSelectors based on different namespaces
* minor cleanup and formatting
* enhanced documentation about how to configure the indexed namespaces via system properties

Signed-off-by: Thomas Jäckle <thomas.jaeckle@beyonnex.io>
* consolidated config keys to be more intuitive

Signed-off-by: Thomas Jäckle <thomas.jaeckle@beyonnex.io>
@thjaeckle
Copy link
Member

@an1310 I also just provided configuration via the Helm chart - so that it is easily enabled and configured via values.yaml

Once the build is green (nowadays GitLab runners seem quite busy), I will merge :)

Thanks a lot for this contribution - I guess not only you will benefit from that configuration.

@thjaeckle thjaeckle merged commit a8b6267 into eclipse-ditto:master Jan 24, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Configure on a namespace basis the fields to index in the thing-search
2 participants