Auto detect field type when sorting on unmapped_type fields #4593

Closed
gmoskovicz opened this issue Aug 5, 2015 · 17 comments
Labels
enhancement New value added to drive a business result Feature:Search Querying infrastructure in Kibana impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort

Comments

@gmoskovicz
Contributor

When loading results in Discover, Kibana sets the unmapped_type for the timestamp field (probably the default time field?) to boolean instead of date.

Inspecting the Kibana 4 call, we can find a query like:

{
  "size": 500,
  "sort": [
    {
      "@timestamp": {
        "order": "desc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "highlight": {
    "pre_tags": [
      "@kibana-highlighted-field@"
    ],
    "post_tags": [
      "@/kibana-highlighted-field@"
    ],
    "fields": {
      "*": {}
    },
    "fragment_size": 2147483647
  },
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "30s",
        "pre_zone": "-03:00",
        "pre_zone_adjust_large_interval": true,
        "min_doc_count": 0,
        "extended_bounds": {
          "min": 1438794998995,
          "max": 1438795898996
        }
      }
    }
  },
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": "now-1d",
                  "lte": "now+1d"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "fields": [
    "*",
    "_source"
  ],
  "script_fields": {},
  "fielddata_fields": [
    "@timestamp"
  ]
}

In Kibana, you will see a message like "Discover: An error occurred with your request. Reset your inputs and try again."

In Elasticsearch, you will find an exception like:

[2015-08-05 15:03:40,962][DEBUG][action.search.type       ] [<nodename>] [<indexname>][2]: Failed to execute [org.elasticsearch.action.search.SearchRequest@4598f49] while moving to second phase
java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.lucene.util.BytesRef
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.compareValues(FieldComparator.java:902)
    at org.apache.lucene.search.TopDocs$MergeSortQueue.lessThan(TopDocs.java:172)
    at org.apache.lucene.search.TopDocs$MergeSortQueue.lessThan(TopDocs.java:120)
    at org.apache.lucene.util.PriorityQueue.upHeap(PriorityQueue.java:225)
    at org.apache.lucene.util.PriorityQueue.add(PriorityQueue.java:133)
    at org.apache.lucene.search.TopDocs.merge(TopDocs.java:234)
    at org.elasticsearch.search.controller.SearchPhaseController.sortDocs(SearchPhaseController.java:239)
    at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.moveToSecondPhase(TransportSearchQueryThenFetchAction.java:89)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(TransportSearchTypeAction.java:403)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:202)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:178)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:175)
    at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:568)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Instead of setting unmapped_type to boolean, Kibana should set it to date.

@gmoskovicz gmoskovicz added the bug Fixes for quality problems that affect the customer experience label Aug 5, 2015
@lukasolson
Member

  1. Click on Settings
  2. Click on Advanced
  3. Change the value for sort:options from { "unmapped_type": "boolean" } to { "unmapped_type": "date" }

Hopefully this solves your issue. :) Please let us know if it doesn't.
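
For readers following along, the effect of that setting can be sketched roughly as below. This is a minimal illustration in Python, not Kibana's actual code: `apply_sort_options` is a hypothetical helper, and it assumes the JSON value of sort:options is simply merged into the sort clause for the selected field, with the explicit sort order taking precedence.

```python
import json


def apply_sort_options(field, order, sort_options_json):
    """Build a single-field sort clause from a JSON-encoded options string.

    Hypothetical helper: assumes the options are merged as-is and that the
    explicit order overrides any "order" key found in the options.
    """
    clause = dict(json.loads(sort_options_json))
    clause["order"] = order
    return [{field: clause}]


# The value suggested in the steps above.
sort = apply_sort_options("@timestamp", "desc", '{ "unmapped_type": "date" }')
print(json.dumps(sort, indent=2))
```

With the old value `{ "unmapped_type": "boolean" }`, the same helper would reproduce the sort clause shown in the original report.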

@Inderjeet26

Works for me

@acs

acs commented Sep 29, 2015

For me also!

@gmoskovicz
Contributor Author

@lukasolson should we set the unmapped_type in sort:options to date by default?

@ichandan16

Worked for me too

@lukasolson
Member

@rashidkpc Is there a reason we chose boolean instead of date here?

@rashidkpc
Contributor

I tried to pick the simplest data type possible. The parameter is really only used when a field is missing in some index, so blanket setting it to date would only work for instances in which a date field was the one missing.

See unmapped_type docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_ignoring_unmapped_fields
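
To make the "only used when a field is missing in some index" point concrete, here is a small sketch in Python. It is a simplified model, not Elasticsearch code: `effective_sort_types`, the index names, and the flat field-to-type dictionaries are all made up for illustration.

```python
def effective_sort_types(mappings_by_index, field, unmapped_type):
    """Return the type each index would sort `field` as.

    Indices that map the field use their own mapping; only indices that
    lack the field fall back to `unmapped_type`.
    """
    return {
        index: fields.get(field, unmapped_type)
        for index, fields in mappings_by_index.items()
    }


# Hypothetical indices: one maps @timestamp as a date, the other lacks it.
mappings = {
    "logs-2015.08.05": {"@timestamp": "date", "message": "string"},
    "misc": {"message": "string"},
}

types = effective_sort_types(mappings, "@timestamp", "boolean")
mixed = len(set(types.values())) > 1
print(types, "mixed types:", mixed)
```

In this model, a blanket fallback only lines up with the mapped indices when the missing field happens to have that same type, which is the limitation described above.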

@gmoskovicz
Contributor Author

What about dynamically setting the unmapped type in the call, based on the field that we are trying to aggregate? @lukasolson @rashidkpc

@rashidkpc
Contributor

This doesn't apply to aggregations; it's only for document sorting. We could probably add an option on the field in the index pattern for the type to use if the field is missing, but really that would be a UI hack to a data problem.

The right answer here is to fix your data so you're not trying to sort on things that don't exist, and to update the mapping for the indices that are missing timestamp so they have a mapping for the timestamp field even if there are no values for it. See the put mapping API: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
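
As a rough sketch of that mapping fix, the Python below works out which indices lack the field and builds a date mapping body for them. The helper names and index names are hypothetical, and nothing is sent to a cluster: the exact endpoint and body shape for put mapping differ between Elasticsearch versions, so check the linked documentation for yours.

```python
import json


def indices_missing_field(mappings_by_index, field):
    """Return, in sorted order, the indices that have no mapping for `field`."""
    return sorted(
        index for index, fields in mappings_by_index.items() if field not in fields
    )


def date_mapping_body(field="@timestamp"):
    """Build a mapping fragment declaring `field` as a date."""
    return {"properties": {field: {"type": "date"}}}


# Hypothetical view of which fields each index currently maps.
mappings = {
    "logs-2015.08.05": {"@timestamp": "date"},
    "misc": {"message": "string"},
}

for index in indices_missing_field(mappings, "@timestamp"):
    # A real script would send this body to the put mapping API for `index`.
    print(index, json.dumps(date_mapping_body()))
```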

@gmoskovicz gmoskovicz reopened this Dec 3, 2015
@gmoskovicz
Contributor Author

> We could probably add an option on the field in the index pattern for the type to use if the field is missing, but really that would be a UI hack to a data problem

This could be a good approach. I've pointed many people to this issue by now, so it looks like this is still confusing: people in the community often search and sort across multiple indices on fields that are not mapped in all of them.

Sometimes there is no fix for the data. Fixing the mapping makes sense when sorting by timestamp, but when another field is being sorted it can be better to set the unmapped type at the field level, since changing the unmapped type to a single type for all data doesn't work for every query.
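
A field-level setting like this could look roughly as follows. This is only a sketch in Python of the idea being proposed; `sort_clause`, the `per_field_types` table, and the field names are hypothetical and do not correspond to an existing Kibana option.

```python
DEFAULT_UNMAPPED_TYPE = "boolean"  # the global default discussed in this thread


def sort_clause(field, order, per_field_types, default=DEFAULT_UNMAPPED_TYPE):
    """Build a sort clause, preferring a per-field unmapped type if one is set."""
    unmapped_type = per_field_types.get(field, default)
    return {field: {"order": order, "unmapped_type": unmapped_type}}


# Hypothetical per-field overrides, e.g. stored alongside the index pattern.
overrides = {"@timestamp": "date", "bytes": "long"}

print(sort_clause("@timestamp", "desc", overrides))
print(sort_clause("is_error", "asc", overrides))
```

Fields without an override still fall back to the global value, so existing behavior would be unchanged unless a user opts in for a specific field.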

@smruti-sahoo

Hi,
I am facing a similar issue.
I tried "unmapped_type": "boolean", but the log lines (message) are shown in reverse order and some lines are out of order. Could someone please help me here?
Initially I thought Logstash or Filebeat was causing the issue, but they are not.

Here is an excerpt of the Kibana output and the actual log output.

Kibana Dashboard

@time @message


August 4th 2016, 19:14:17.212 INFO 2016-08-04 13:44:16,844 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DefaultArchiveDeployer:
August 4th 2016, 19:14:17.212 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_79]
August 4th 2016, 19:14:17.211 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_79]
August 4th 2016, 19:14:17.211 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) ~[?:1.7.0_79]
August 4th 2016, 19:14:17.211 at org.mule.module.launcher.DeploymentDirectoryWatcher.redeployModifiedArtifacts(DeploymentDirectoryWatcher.java:549) ~[?:?]
August 4th 2016, 19:14:17.211 at org.mule.module.launcher.DeploymentDirectoryWatcher.redeployModifiedApplications(DeploymentDirectoryWatcher.java:538) ~[?:?]
August 4th 2016, 19:14:17.211 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.7.0_79]
August 4th 2016, 19:14:17.211 at org.mule.module.launcher.DeploymentDirectoryWatcher.run(DeploymentDirectoryWatcher.java:348) ~[?:?]
August 4th 2016, 19:14:17.211 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) ~[?:1.7.0_79]
August 4th 2016, 19:14:17.210 at org.mule.module.launcher.DefaultArchiveDeployer.redeploy(DefaultArchiveDeployer.java:544) ~[?:?]
August 4th 2016, 19:14:17.210 at org.mule.module.launcher.artifact.ArtifactWrapper$3.execute(ArtifactWrapper.java:74) ~[?:?]
August 4th 2016, 19:14:17.210 at org.mule.module.launcher.artifact.ArtifactWrapper.executeWithinArtifactClassLoader(ArtifactWrapper.java:129) ~[?:?]
August 4th 2016, 19:14:17.210 at org.mule.module.launcher.DefaultArtifactDeployer.deploy(DefaultArtifactDeployer.java:24) ~[?:?]
August 4th 2016, 19:14:17.210 at org.mule.module.launcher.application.DefaultMuleApplication.install(DefaultMuleApplication.java:101) ~[?:?]
August 4th 2016, 19:14:17.210 at org.mule.module.launcher.artifact.ArtifactWrapper.install(ArtifactWrapper.java:69) ~[?:?]
August 4th 2016, 19:14:17.209 + Failed to deploy artifact 'hr-pub-global', see below +
August 4th 2016, 19:14:17.209 org.mule.module.launcher.InstallException: Config for app 'xxxxxx' not found: /opt/mule/apps/xxxxxx/loggingsetup.xml
August 4th 2016, 19:14:17.209 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
August 4th 2016, 19:14:17.209 ERROR 2016-08-04 13:44:16,843 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DefaultArchiveDeployer:
August 4th 2016, 19:14:17.209 INFO 2016-08-04 13:44:16,843 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.application.DefaultMuleApplication: App 'xxxxxx' never started, nothing to dispose of

This is the actual output, i.e. what is in the application log file:

Application Log

    + New app 'xxxxxx' +
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    INFO 2016-08-04 13:44:16,843 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.application.DefaultMuleApplication: App 'xxxxxx' never started, nothing to dispose of
    ERROR 2016-08-04 13:44:16,843 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DefaultArchiveDeployer:
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    + Failed to deploy artifact 'xxxxxx', see below +
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    org.mule.module.launcher.InstallException: Config for app 'xxxxxx' not found: /opt/mule/apps/xxxxxx/loggingsetup.xml
    at org.mule.module.launcher.application.DefaultMuleApplication.install(DefaultMuleApplication.java:101) ~[?:?]
    at org.mule.module.launcher.artifact.ArtifactWrapper$3.execute(ArtifactWrapper.java:74) ~[?:?]
    at org.mule.module.launcher.artifact.ArtifactWrapper.executeWithinArtifactClassLoader(ArtifactWrapper.java:129) ~[?:?]
    at org.mule.module.launcher.artifact.ArtifactWrapper.install(ArtifactWrapper.java:69) ~[?:?]
    at org.mule.module.launcher.DefaultArtifactDeployer.deploy(DefaultArtifactDeployer.java:24) ~[?:?]
    at org.mule.module.launcher.DefaultArchiveDeployer.redeploy(DefaultArchiveDeployer.java:544) ~[?:?]
    at org.mule.module.launcher.DeploymentDirectoryWatcher.redeployModifiedArtifacts(DeploymentDirectoryWatcher.java:549) ~[?:?]
    at org.mule.module.launcher.DeploymentDirectoryWatcher.redeployModifiedApplications(DeploymentDirectoryWatcher.java:538) ~[?:?]
    at org.mule.module.launcher.DeploymentDirectoryWatcher.run(DeploymentDirectoryWatcher.java:348) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_79]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) ~[?:1.7.0_79]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) ~[?:1.7.0_79]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.7.0_79]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_79]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_79]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_79]
    INFO 2016-08-04 13:44:16,844 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DefaultArchiveDeployer: ================== Request to Undeploy Artifact: xxxxxx
    INFO 2016-08-04 13:44:16,844 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.application.DefaultMuleApplication: App 'xxxxxx' never started, nothing to dispose of

@tbragin
Contributor

tbragin commented Feb 6, 2017

Removing the discuss label - we will triage this in the context of the Discovery team.

@tbragin tbragin removed the discuss label Feb 6, 2017
@Bargs
Contributor

Bargs commented Feb 6, 2017

I agree we should be able to set this dynamically, assuming the selected field doesn't have conflicting types across indices. The only complication is that we don't maintain the exact field type in our index pattern cache; it's normalized into things like "number" and "string" instead of "long" or "keyword". We'd either need to do an additional query to figure out the exact type to use, or figure out whether there's a safe default to use for each normalized type.
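
The "safe default per normalized type" option might be sketched like this in Python. The table is purely illustrative: the specific defaults (for example "double" for "number" and "keyword" for "string") are assumptions, and whether any of them is actually safe for every concrete type is exactly the open question raised above.

```python
# Hypothetical mapping from a normalized index-pattern type to the
# unmapped_type to send. These defaults are guesses, not verified choices.
ASSUMED_DEFAULTS = {
    "date": "date",
    "number": "double",
    "string": "keyword",
    "boolean": "boolean",
}


def unmapped_type_for(normalized_type, fallback="boolean"):
    """Pick an unmapped_type for a normalized type, or use the fallback."""
    return ASSUMED_DEFAULTS.get(normalized_type, fallback)


for kind in ("date", "number", "geo_point"):
    print(kind, "->", unmapped_type_for(kind))
```

The alternative mentioned above, an additional query for the exact type, would avoid guessing but costs an extra round trip before each sort.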

@tbragin tbragin added the P3 label Feb 8, 2017
@epixa epixa removed the P3 label Apr 25, 2017
@timroes timroes added Feature:Search Querying infrastructure in Kibana Team:Visualizations Visualization editors, elastic-charts and infrastructure and removed :Discovery labels Sep 16, 2018
@stacey-gammon
Contributor

Is this really something we need to implement in Kibana, or is the right solution just to not use an unmapped_type in the index?

Wondering if we should close this as "will not implement". If we do want to keep it around, perhaps re-label as enhancement request and update the title to be a bit more descriptive (e.g. "Allow sorting on unmapped field types by dynamically guessing their type"). But weighing in without much context, it seems like adding this support in Kibana is more trouble than it's worth.

@Bargs
Contributor

Bargs commented Sep 20, 2018

I think unmapped fields are quite common for index patterns with schemas that change often. We really don't want a sort to fail just because a user added a field at some point. However I could see an argument being made for implementing this in ES instead of Kibana. Would be cool if the unmapped_type param had an "auto" option. It would probably use the exact same algorithm we would use, but would have better access to the necessary info to make a good choice.
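
One possible reading of such an "auto" option, sketched in Python: use the field's type if every index that maps it agrees, and otherwise give up. No such option exists in the thread; `auto_unmapped_type` and the sample mappings are hypothetical, and a real implementation would need a policy for conflicts.

```python
def auto_unmapped_type(mappings_by_index, field):
    """Infer an unmapped_type from the indices that do map `field`.

    Returns the shared type when all mapped indices agree, and None when
    the field is unmapped everywhere or mapped with conflicting types.
    """
    types = {
        fields[field] for fields in mappings_by_index.values() if field in fields
    }
    if len(types) == 1:
        return next(iter(types))
    return None


# Hypothetical mappings: two indices agree on @timestamp, one lacks it.
mappings = {
    "logs-a": {"@timestamp": "date", "status": "long"},
    "logs-b": {"@timestamp": "date", "status": "keyword"},
    "misc": {"message": "text"},
}

print(auto_unmapped_type(mappings, "@timestamp"))
print(auto_unmapped_type(mappings, "status"))
```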

@stacey-gammon stacey-gammon changed the title unmapped_type wrongly set for sorting Auto detect field type when sorting on unmapped_type fields Sep 24, 2018
@stacey-gammon stacey-gammon added enhancement New value added to drive a business result and removed bug Fixes for quality problems that affect the customer experience labels Sep 24, 2018
@vzunigav-aeropost

Works for me too

@timroes timroes added :AppArch and removed Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Mar 27, 2019
@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Jun 2, 2021
@lukasolson
Member

I don't believe we have any near-term plans to work on this. This does seem like something that would be better implemented in Elasticsearch.
