Support spatial fields in field retrieval API. #59821
@@ -36,18 +37,18 @@
  private static final ParseField Y_FIELD = new ParseField("y");
  private static final ParseField Z_FIELD = new ParseField("z");

- protected float x;
- protected float y;
+ protected double x;
Without this change, coordinates could change unexpectedly in a roundtrip like WKT -> `CartesianPoint` -> WKT. It seems common to represent points using doubles even though they're indexed as floats -- for example the `Point` geometry uses doubles.
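The effect described above can be reproduced outside Elasticsearch. The following is a minimal sketch, not code from this PR: `RoundTripDemo` and its methods are hypothetical names, and the class only imitates a text -> stored value -> text round trip with a single coordinate.

```java
// Illustrative only: a stand-in for a point type, not Elasticsearch's CartesianPoint.
final class RoundTripDemo {

    /** Parses a coordinate, stores it as a float, and formats it back to text. */
    static String roundTripViaFloat(String text) {
        float stored = (float) Double.parseDouble(text);
        // Widening the float back to a double exposes the rounding error.
        return Double.toString(stored);
    }

    /** Parses a coordinate, stores it as a double, and formats it back to text. */
    static String roundTripViaDouble(String text) {
        double stored = Double.parseDouble(text);
        return Double.toString(stored);
    }

    public static void main(String[] args) {
        // The float path no longer prints the original text "1.1".
        System.out.println(roundTripViaFloat("1.1"));
        // The double path prints 1.1 unchanged.
        System.out.println(roundTripViaDouble("1.1"));
    }
}
```

Storing the parsed value as a double keeps the formatted output identical to the input, which is the behavior the change from `float` to `double` is after.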
Good `Point`. I do not see a problem with this. Maybe @nknize has an opinion here.
Formatter<Parsed> geometryFormatter = mappedFieldType.geometryFormatter();

Parsed geometry;
try (XContentParser parser = new MapXContentParser(NamedXContentRegistry.EMPTY, LoggingDeprecationHandler.INSTANCE,
This seemed like the easiest way to integrate with the existing logic but is far from ideal:
- We perform parsing even when it's not necessary: even when the geometry is already in the right format, we still parse it to an object and re-serialize it.
- We also awkwardly translate between maps and xContent.
Let me know if you have suggestions around a better approach. Overall I was hoping to keep this PR reasonably scoped, since it is part of a large 'field retrieval' change that already has a few moving parts. But we could plan a larger refactor in a separate follow-up change.
This is pretty yikes, but I do like the idea of keeping the work contained on the branch and doing a larger refactoring after merging. Maybe it'd be cleaner to push the ability to parse these fields into `SourceLookup`. That way we don't need to do the `xcontent -> map -> xcontent` dance.
Would it make sense to specialize between points and shapes? At least for Shapes, we should be able to tell whether `value` is an object or a string, right?
> Maybe it'd be cleaner to push the ability to parse these fields into SourceLookup

I think this would be challenging given the current design of `SourceLookup`, since it operates only at the level of xContent and is completely agnostic to the mappings. It would be nice though if `SourceLookup` could provide direct access to the xContent representation instead of always parsing + returning generic objects. This feels like a bigger change that's better suited for a follow-up though?

> At least for Shapes, we should be able to tell whether value is an object or a string, right?

This would work, I'll look into it further! I'm inclined to not worry about it though if it makes the design more complicated (specifically splitting points vs. shapes handling).
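For illustration, the object-versus-string check suggested for shapes could look roughly like the sketch below. Everything here is an assumption made for the example: `ShapeSourceCheck` and `alreadyInFormat` are hypothetical names, the format names `geojson` and `wkt` are taken from the PR description, and the sketch assumes a shape in `_source` is either a GeoJSON object (a map) or a WKT string. It does not apply to points, which accept more input forms.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch: decides whether a shape value
// read from _source is already in the requested output format.
final class ShapeSourceCheck {

    static boolean alreadyInFormat(Object sourceValue, String format) {
        if ("geojson".equals(format)) {
            // Assumption: the object form of a shape is GeoJSON.
            return sourceValue instanceof Map;
        }
        if ("wkt".equals(format)) {
            // Assumption: the string form of a shape is WKT.
            return sourceValue instanceof String;
        }
        throw new IllegalArgumentException("Unknown format [" + format + "]");
    }

    public static void main(String[] args) {
        Object geoJson = Collections.singletonMap("type", "Point");
        Object wkt = "POINT (1.0 2.0)";
        System.out.println(alreadyInFormat(geoJson, "geojson")); // true: no conversion needed
        System.out.println(alreadyInFormat(wkt, "geojson"));     // false: parse and re-format
    }
}
```

Returning the source value untouched would also skip any validation or normalization that parsing performs, which is one reason this shortcut may not be worth the extra branching.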
5b91623
to
64143b8
Compare
28ddcc3
to
58c5a26
Compare
Pinging @elastic/es-analytics-geo (:Analytics/Geo)
Pinging @elastic/es-search (:Search/Mapping)
@@ -610,4 +617,14 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
  }
  }

+ public static Map<?, ?> toXContentMap(Geometry geometry) throws IOException {
Probably worth javadoc. I was confused by the method name because I usually think of `map` and `xcontent` as mutually exclusive representations.
also, XContent here is just an implementation detail. it is a normal Map, right?
I will clean this up. Here I was using `toXContentMap` to mean 'return the JSON representation of the geometry as a Java map'. But I see how this is confusing given there's already a `toXContent` method ...
Overall LGTM; I would like to make a plan / gain resolution on the "yikes" method that Nik flagged.
@@ -82,6 +87,11 @@
  Parsed parse(XContentParser parser, AbstractGeometryFieldMapper mapper) throws IOException, ParseException;
  }

+ public interface Formatter<Parsed> {
should we add some javadoc for this?
I share the concern about mapping as well. I think we can easily add a separate visitor to generate a map directly, and I can help with this part. I also wonder if we can make it a bit more flexible in terms of formats: instead of introducing a `GeoShapeFormatter` interface, we could extend the `GeometryFormat` interface with a `toGenericObject()` method, which will produce a `String` for WKT and a `Map` for JSON and possibly support other formats in the future. We will have to add an alternative `GeometryParser.geometryFormat()` method that will return the format not only for an `XContentParser` but also for a string.
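A rough sketch of that idea follows. It is not the interface from the Elasticsearch codebase: `FormatSketch`, its nested `Point` and `GeometryFormat` types, and `forName` are hypothetical stand-ins, reduced to points so the example stays self-contained. The map is built directly from the geometry, with no intermediate xContent step.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins for the types discussed in the review.
final class FormatSketch {

    /** Minimal point geometry used only for this example. */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** One format, one generic output: a String for WKT, a Map for GeoJSON. */
    interface GeometryFormat {
        Object toObject(Point point);
    }

    static final GeometryFormat WKT = new GeometryFormat() {
        @Override
        public Object toObject(Point point) {
            return "POINT (" + point.x + " " + point.y + ")";
        }
    };

    static final GeometryFormat GEOJSON = new GeometryFormat() {
        @Override
        public Object toObject(Point point) {
            // Build the map directly from the geometry.
            Map<String, Object> map = new LinkedHashMap<>();
            map.put("type", "Point");
            map.put("coordinates", Arrays.asList(point.x, point.y));
            return map;
        }
    };

    /** Resolves a format from its name; the names are assumptions for this sketch. */
    static GeometryFormat forName(String name) {
        switch (name) {
            case "wkt":
                return WKT;
            case "geojson":
                return GEOJSON;
            default:
                throw new IllegalArgumentException("Unknown format [" + name + "]");
        }
    }

    public static void main(String[] args) {
        Point point = new Point(1.5, 2.0);
        System.out.println(forName("wkt").toObject(point));
        System.out.println(forName("geojson").toObject(point));
    }
}
```

Callers only deal with `Object`, so adding another format later would not change the calling code.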
Thanks everyone for the reviews. Here are my planned next steps:
A heads up that I will rebase
64143b8
to
1de5a51
Compare
Although we accept a variety of formats during indexing, spatial data is returned in a single consistent format. This is GeoJSON by default, but well-known text is also supported by passing the option 'format: wkt'. Note that points (in addition to shapes) are returned in GeoJSON by default. The reasoning is that this gives better consistency, and is the most convenient format for most REST API users.
This gives more predictable values when parsing and formatting a point. It matches the behavior for GeoPoint and the Point geometry.
3bb216a
to
804a295
Compare
Left a couple of really minor comments, otherwise LGTM.
}

@Override
public Map<?, ?> toObject(Geometry geometry) {
Maybe return Object here as well for consistency. I don't think the fact that it returns Map is significant here.
I sometimes like using covariance like this to clarify that a subclass always returns a specific type. I don't feel strongly about it though, happy to change it. We can always change it back later if we find it helpful for unit testing, etc. to have the exact type.
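As a small illustration of the covariant return type being discussed (all names here are hypothetical, not the classes from this PR), an implementation may narrow the declared return type of an interface method. Code that holds the concrete type then gets the exact type without a cast:

```java
import java.util.Collections;
import java.util.Map;

final class CovariantReturnDemo {

    /** General contract: callers only know they get some Object back. */
    interface Format {
        Object toObject(String geometry);
    }

    /** Narrows the return type to Map, which Java allows for overrides. */
    static final class MapFormat implements Format {
        @Override
        public Map<String, Object> toObject(String geometry) {
            return Collections.<String, Object>singletonMap("value", geometry);
        }
    }

    public static void main(String[] args) {
        MapFormat exact = new MapFormat();
        Map<String, Object> asMap = exact.toObject("p"); // no cast needed

        Format general = exact;
        Object asObject = general.toObject("p");         // only Object is known here

        System.out.println(asMap.equals(asObject));      // prints true
    }
}
```

This is mostly a convenience for tests and other code that works with the concrete class; callers going through the interface see no difference either way.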
 *
 * For example, the GeoJson format returns the geometry as a map, while WKT returns a string.
 */
Object toObject(ParsedFormat geometry);
I wonder if somebody can come up with a better name here :) Maybe `toXContentObject()` or `toXContentValue()` or something like this. I feel like `toObject` is misleadingly generic here.
haha, oi. since there are javadocs explaining it now, I drop my naming argument. I'm good with whatever sounds good to you!
I cannot say I love `toXContentAsObject`, but I like it the best compared to all the other versions.
thanks!
@elasticmachine run elasticsearch-ci/default-distro
I checked with @imotov and he is good with merging.
One last update: @imotov plans to raise a PR next week to avoid the geometry -> xContent -> map translation. I've already raised the PR #55363 to merge the feature branch, so that refactor will be done as a follow-up against master. I'll ensure that the feature doesn't ship in 7.10 without this improvement (filing a new 'blocking' issue if needed).