Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GEO] Prevent PrefixTree strategies on new geo_shape indexes #37152

Closed
wants to merge 7 commits into from

Conversation

nknize
Copy link
Contributor

@nknize nknize commented Jan 4, 2019

6.6 introduced BKD based indexing for geo_shape field types. For bwc purposes PrefixTree strategies and parameters were deprecated but still allowed on newly created indexes. This PR prohibits using any/all PrefixTrees strategies and parameters for 7.0+ newly created geo_shape indexes.

TODO

  • fix remaining failing tests
  • add explicit tests for exception handling when creating a geo_shape index using PrefixTree parameters
  • update docs to remove all prefix tree parameters (and references to)
  • update breaking changes doc

6.6 introduced BKD based indexing for geo_shape field types. For bwc purposes PrefixTree strategies and parameters were deprecated but still allowed on newly created indexes. This commit prohibits using any/all PrefixTrees strategies and parameters for 7.0+ newly created geo_shape indexes.
@nknize nknize added >breaking :Analytics/Geo Indexing, search aggregations of geo points and shapes :Search Foundations/Mapping Index mappings, including merging and defining field types v7.0.0 labels Jan 4, 2019
@nknize nknize requested review from jpountz and imotov January 4, 2019 18:58
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@nknize nknize added the WIP label Jan 4, 2019
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past we tried to be a bit lenient with changes in mappings due to the fact that mappings may be used in templates that are a bit hard to migrate. Maybe we should issue a deprecation warning and ignore deprecated settings instead of rejecting them. It would make all existing templates keep working and also still give us the ability to remove support for legacy shapes in 8.0 since no 7.x index is going to use them?

@@ -54,21 +54,7 @@ An error will now be thrown when unknown configuration options are provided
to similarities. Such unknown parameters were ignored before.

[float]
==== Changed default `geo_shape` indexing strategy
==== prohibit use of `geo_shape` prefix tree parameters for new indexes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/prohibit/Prohibit/?

@@ -156,9 +148,6 @@ intersects the query geometry.
has nothing in common with the query geometry.
* `WITHIN` - Return all documents whose `geo_shape` field
is within the query geometry.
* `CONTAINS` - Return all documents whose `geo_shape` field
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's maybe keep it in the docs since the plan is to have it back by 7.0?

Copy link
Contributor

@imotov imotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am somewhat concerned about merging this before we achieve parity in BKD implementation, unless me make this as a blocker for 7.0. Otherwise we don't really provide a clear migration path for users who are using the currently missing functionality.

@polyfractal
Copy link
Contributor

polyfractal commented Jan 10, 2019

I agree, this feels like a fairly big regression.

If we throw an exception, all existing templates using non-default parameters will break. If we just log a deprecation, the user is silently opted into functionality that may break their existing queries. Which is arguably a worse situation since they won't know until a CONTAINS/MULTIPOINT/LINESTRING + WITHIN query is used and then it's a runtime break far away from the mapping change.

In both cases, anyone depending on the missing functionality will be prevented from upgrading until whatever 7.x version implements it. :(

This is a bit tangential to this PR, but defaulting to the new geoshapes in 6.6 feels like a big break too? E.g. existing templates (that only use default geoshape params) will opt into the new geoshape and could break queries due to reduced functionality? Is there something I'm missing there?

@nknize
Copy link
Contributor Author

nknize commented Jan 10, 2019

I am somewhat concerned about merging this before we achieve parity in BKD implementation, unless me make this as a blocker for 7.0.

I agree. I wanted to get this opened as a WIP PR to solicit feedback on the migration path forward. Some great feedback here and I do think we should hold on this until:

  1. https://issues.apache.org/jira/browse/LUCENE-8620 from @iverase is added to Lucene (hoping 8.0). That will give feature parity for CONTAINS queries.
  2. Adaptive orientation is integrated w/ Lucene to ensure accurate point in triangle calculation; which is needed for MultiPoint query parity (Add support for MultiPoint Queries to new BKD geo_shape indexing approach #37318).

I don't think we need to wait for linestring + within queries. This is more of an "accidental feature" that really shouldn't be available as its somewhat misleading (its not actually performing a true buffer query based on distance, only exploiting the resolution of the quadcells).

defaulting to the new geoshapes in 6.6 feels like a big break too?

You're not missing anything w.r.t templates. Its something that was discussed and added to the 6.6 breaking changes doc.

@jpountz
Copy link
Contributor

jpountz commented Jan 17, 2019

Adaptive orientation is integrated w/ Lucene to ensure accurate point in triangle calculation; which is needed for MultiPoint query parity

@nknize Can you clarify why this is needed?

@nknize
Copy link
Contributor Author

nknize commented Jan 28, 2019

@nknize Can you clarify why this is needed?

After digging into this for MultiPoint queries I'll retract my statement and say its not needed. So long as we stay in encoded space overflow a non issue (but will likely need to be readdressed for supporting different projections using XYShape). I'm wrapping up testing on the MultiPoint query feature branch and will open a lucene patch shortly. That should help close the gap for feature parity on MultiPoint and CONTAINS.

@blkbltjns
Copy link

blkbltjns commented Dec 30, 2019

I've hit a couple roadblocks trying to upgrade from 6.3 to 7.5 having to do with non feature parity and performance regressions of the new BKD indexing.

The largest roadblock so far has been the inability to spatially intersect with the circle type. Converting the circles to a polygons first is not an option due to my use case involving thousands of circles which kills performance if converted to polygons and intersected from there. I would not need to intersect with the circle type if there was supported capability to filter by distance to an arbitrary geo_shape (point, line, polygon, etc...).

The next largest roadblock is a significant performance regression when intersecting an index containing a few million polygons (land parcels shapes which will often overlap).

If the old indexing method is not carried forward, I will not be able to upgrade to 8.0 without some significant rework in my application.

@rjernst rjernst added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Meta label for search team labels May 4, 2020
@jimczi
Copy link
Contributor

jimczi commented Apr 23, 2021

The PR is outdated, hence closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes :Search Foundations/Mapping Index mappings, including merging and defining field types stalled Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Meta label for search team v8.0.0-alpha1 WIP
Projects
None yet
Development

Successfully merging this pull request may close these issues.