Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] add geo point combined field to CSV import #77117

Merged
merged 21 commits into from
Sep 18, 2020

Conversation

nreese
Copy link
Contributor

@nreese nreese commented Sep 9, 2020

Fixes #76996

This PR updates CSV upload to make it possible to combine two columns into geo_point field.

  • When ImportView is first rendered, a geo_point combined field will be added if there are latitude and longitude columns. This means that users will get geo_point field with no additional configuration.
  • Under the advanced tab, users can click Add combined field to configure a new combined field. In this initial PR, the only option is to add a geo point field but this can be expanded in the future to include other templates.

Whether auto-generated or manually added by users, geo point combined field does the following

  • Adds a new field in mappings with type geo_point
  • Adds a set processor to the ingest pipeline that creates the geo_point field by combining the latitude and longitude fields.

Screen Shot 2020-09-09 at 3 10 27 PM

Screen Shot 2020-09-09 at 3 10 35 PM

Screen Shot 2020-09-09 at 3 10 41 PM

@nreese nreese added release_note:enhancement WIP Work in progress [Deprecated-Use Team:Presentation]Team:Geo Former Team Label for Geo Team. Now use Team:Presentation :ml v8.0.0 Feature:File and Index Data Viz ML file and index data visualizer v7.10.0 labels Sep 9, 2020
@nreese nreese requested a review from a team as a code owner September 9, 2020 21:21
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-gis (Team:Geo)

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@nreese
Copy link
Contributor Author

nreese commented Sep 10, 2020

@elasticmachine merge upstream

@jgowdyelastic
Copy link
Member

I've tested this and it works well.
I like the idea of writing this is a way that further combined field templates can be added later.
It would be good to hear feedback from @peteharverson, @sophiec20 and @droberts195 on this feature.

@peteharverson
Copy link
Contributor

peteharverson commented Sep 15, 2020

I tried giving this a test, but am seeing the following console errors when moving to the Import page. Are changes needed to this PR following the edits made in #77035?

Update: Sorry for the noise. My ES build was out of date and did not include the change for the "properties" layer in find_file_structure mappings.

image

@peteharverson
Copy link
Contributor

Had a test and this is a great enhancement!

Some quick thoughts on the edits to the UI:

  • Whilst we only have the single 'geo point' option for 'combined fields' I wonder if we should just remove the 'combined fields' labelling, and just label it as 'geo point fields' until if or when we add templates for other combined field types. Clicking the 'Add geo point field' would then just directly open the edit popover:
    image
  • There should be a way of deleting geo_point fields (other than by directly deleting them from the mapping JSON editor)
    image
  • There is no attempt to keep the new UI controls in sync with edits made directly to Mappings or Ingest Pipeline editors. Would these new controls be better moved to the Simple tab?
  • Improve the messaging if the user tries to give a new geo_point field the same name as an existing field in the mapping:
    image

@nreese
Copy link
Contributor Author

nreese commented Sep 15, 2020

@elasticmachine merge upstream

@nreese
Copy link
Contributor Author

nreese commented Sep 15, 2020

Whilst we only have the single 'geo point' option for 'combined fields' I wonder if we should just remove the 'combined fields' labelling, and just label it as 'geo point fields' until if or when we add templates for other combined field types

Users appreciate a consistent UI from release to release making it best to provide a stable UI that can grow with future needs. If there are known additional combined field templates that would later cause "geo point fields" to change to "combined fields", then that required UI churn can confusion users. Is there another known use case for combined fields? Should this PR just add a second template to flesh out the concept?

There should be a way of deleting geo_point fields (other than by directly deleting them from the mapping JSON editor)

This has been addressed and there is now a delete button for each combined field that will remove the combined field from the mappings and pipeline

There is no attempt to keep the new UI controls in sync with edits made directly to Mappings or Ingest Pipeline editors. Would these new controls be better moved to the Simple tab?

+1 on moving to the simple tab. I think the feature will be more visible to novice users from the simple tab. I think the feature is hidden on the advance tab since the main focus is on manually editing JSON configurations. I will defer to @@elastic/ml-ui for a final say. Please let me know where it should be placed.

Improve the messaging if the user tries to give a new geo_point field the same name as an existing field in the mapping:

This has been addressed and now name collision checking will look for collisions with other combined fields and any key in mappings.properties.

@nreese
Copy link
Contributor Author

nreese commented Sep 17, 2020

@peteharverson

The Import page is hanging for me with a lot of log files - I guess when it doesn't find any candidates for combined fields?

This error is occurring for files that do not provide column_names in results. FindFileStructureResponse.column_names is typed as string[].

Is the bug that column_names is not provided or is the bug that FindFileStructureResponse.column_names should be typed as column_names?: string[];?

@jgowdyelastic
Copy link
Member

Is the bug that column_names is not provided or is the bug that FindFileStructureResponse.column_names should be typed as column_names?: string[];?

Yes, column_names should be optional.

@nreese
Copy link
Contributor Author

nreese commented Sep 17, 2020

We have discussed this some more, and the proposal is to add a readonly display of the combined fields that have been auto-added to the Simple tab

I like this proposal and have added a readonly view in the simple tab

Screen Shot 2020-09-17 at 2 39 05 PM

@nreese
Copy link
Contributor Author

nreese commented Sep 17, 2020

The Import page is hanging for me with a lot of log files - I guess when it doesn't find any candidates for combined fields?

This has been resolved. FindFileStructureResponse.column_names has been updated to reflect that column_names can be undefined and corresponding uses of column_names have been updated to handle column_names can be

@jgowdyelastic
Copy link
Member

jgowdyelastic commented Sep 18, 2020

I'm able to get this feature into a confused state by adding a custom named combined field and then pressing reset after an import.

  1. Delete the auto added combined field
  2. Add a new one with a custom name e.g."foo"
  3. Import the data
  4. Press reset

foo still exists in the mappings and pipeline JSON but Latitude,Longitude => foo (geo_point) isn't listed in the combined fields list.
Instead the default name is listed (Latitude,Longitude => location (geo_point) but location isn't in the JSON.

@nreese
Copy link
Contributor Author

nreese commented Sep 18, 2020

I'm able to get this feature into a confused state by adding a custom named combined field and then pressing reset after an import.

Thanks for pointing out. I have updated the logic in getDefaultState to use state.combinedFields on reset

@nreese nreese added review and removed WIP Work in progress labels Sep 18, 2020
Copy link
Member

@jgowdyelastic jgowdyelastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice feature.
LGTM

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested latest edits and LGTM.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

@kbn/optimizer bundle module count

id value diff baseline
ml 1409 +12 1397

async chunks size

id value diff baseline
ml 8.1MB +57.2KB 8.1MB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@nreese nreese merged commit f3c9d10 into elastic:master Sep 18, 2020
nreese added a commit to nreese/kibana that referenced this pull request Sep 18, 2020
* [ML] add geo point combined field to CSV import

* remove some geo_point specific logic

* Account for properties layer in find_file_structure mappings

* improve checking of name collision to include combined fields and mappings

* add delete button

* fix function name

* fill in unknowns with defined types

* tslint changes

* get tslint passing

* show readonly combined fields in simple tab

* handle column_names being undefined

* add unit tests for modifying mappings and pipeline

* review feedback

* do not change combinedFields on reset

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
@nreese
Copy link
Contributor Author

nreese commented Sep 18, 2020

7.x/7.10 #77921

nreese added a commit that referenced this pull request Sep 18, 2020
* [ML] add geo point combined field to CSV import

* remove some geo_point specific logic

* Account for properties layer in find_file_structure mappings

* improve checking of name collision to include combined fields and mappings

* add delete button

* fix function name

* fill in unknowns with defined types

* tslint changes

* get tslint passing

* show readonly combined fields in simple tab

* handle column_names being undefined

* add unit tests for modifying mappings and pipeline

* review feedback

* do not change combinedFields on reset

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
@kmartastic
Copy link
Contributor

@nreese @peteharverson @jgowdyelastic

Was there ever any consideration to placing the UI for adding a combined field in the Advanced tab to be aligned with the "Mappings" section?

I relate new fields to mappings. Not sure if that's normal/expected.

@RaniaElKatatny
Copy link

In the geo point form, the dropdown lists of longitude and latitude are returning long fields only. I have longitude and latitude fields of type double but they are not appearing in the dropdown lists.
@nreese

@nreese
Copy link
Contributor Author

nreese commented Feb 28, 2022

In the geo point form, the dropdown lists of longitude and latitude are returning long fields only. I have longitude and latitude fields of type double but they are not appearing in the dropdown lists.

Would you mind moving this conversation to discuss? When you open a topic, please include a sample csv that is not displaying the lat/lon fields. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CSV upload - Indexing geo_point data from latitude and longitude columns
7 participants