
Data Visualizer fails to import data without a timestamp #63526

Open
LeeDr opened this issue Apr 14, 2020 · 8 comments
Labels: bug (Fixes for quality problems that affect the customer experience), Feature:File and Index Data Viz (ML file and index data visualizer), :ml

Comments

@LeeDr

LeeDr commented Apr 14, 2020

Kibana version: 7.7.0 BC6

Elasticsearch version: 7.7.0 BC6

Server OS version: Windows 2012 Server

Browser version: Chrome (also IE11)

Browser OS version: Windows 10

Original install method (e.g. download page, yum, from source, etc.): zip files default distribution

Describe the bug: If the Data Visualizer is supposed to accept files without timestamps, it's not working in this version.

Steps to reproduce:

  1. From Kibana home, click Import a CSV, NDJSON, or log file
  2. try uploading a file without timestamps. I'll attach the one I tried.
    xpack-ascii.txt

Error on screen:

File could not be read
Bad Request: [illegal_argument_exception] Could not find a timestamp in the sample provided

Expected behavior: It should ingest the data

Screenshots (if relevant):
(screenshot of the error attached)

Errors in browser console (if relevant):

DevTools failed to load SourceMap: Could not load content for chrome-extension://hdokiejnpimakedhajhdlcegeplioahd/sourcemaps/onloadwff.js.map: HTTP error: status code 404, net::ERR_UNKNOWN_URL_SCHEME
ml#/filedatavisualizer:342 Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'unsafe-eval' 'self'". Either the 'unsafe-inline' keyword, a hash ('sha256-P5polb1UreUSOe5V/Pv7tc+yeZuJXiOi/3fqhGsU7BE='), or a nonce ('nonce-...') is required to enable inline execution.

bootstrap.js:10 ^ A single error about an inline script not firing due to content security policy is expected!
kbn-ui-shared-deps.js:381 INFO: 2020-04-14T21:04:50Z
  Adding connection to https://localhost:5601/elasticsearch


4.plugin.js:1 overrides undefined
VM469:1 POST https://localhost:5601/api/ml/file_data_visualizer/analyze_file 400 (Bad Request)
(anonymous) @ VM469:1
_callee3$ @ commons.bundle.js:3
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
fetchResponse @ commons.bundle.js:3
_callee$ @ commons.bundle.js:3
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
Promise.then (async)
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
_callee2$ @ commons.bundle.js:3
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
_callee$ @ 1.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 1.plugin.js:1
_next @ 1.plugin.js:1
(anonymous) @ 1.plugin.js:1
(anonymous) @ 1.plugin.js:1
http @ 1.plugin.js:1
analyzeFile @ 1.plugin.js:1
_callee3$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadSettings @ 4.plugin.js:1
_callee2$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
Promise.then (async)
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadFile @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
bo @ kbn-ui-shared-deps.js:342
vo @ kbn-ui-shared-deps.js:342
vl @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
yl @ kbn-ui-shared-deps.js:342
ol @ kbn-ui-shared-deps.js:342
(anonymous) @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
Gi @ kbn-ui-shared-deps.js:342
Yi @ kbn-ui-shared-deps.js:342
ce @ kbn-ui-shared-deps.js:342
Ln @ kbn-ui-shared-deps.js:342
Dn @ kbn-ui-shared-deps.js:342
On @ kbn-ui-shared-deps.js:342
4.plugin.js:1 Error: Bad Request
    at Fetch._callee3$ (commons.bundle.js:3)
    at l (kbn-ui-shared-deps.js:288)
    at Generator._invoke (kbn-ui-shared-deps.js:288)
    at Generator.forEach.e.<computed> [as next] (kbn-ui-shared-deps.js:288)
    at asyncGeneratorStep (commons.bundle.js:3)
    at _next (commons.bundle.js:3)
_callee3$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_throw @ 4.plugin.js:1
Promise.then (async)
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadSettings @ 4.plugin.js:1
_callee2$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
Promise.then (async)
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadFile @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
bo @ kbn-ui-shared-deps.js:342
vo @ kbn-ui-shared-deps.js:342
vl @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
yl @ kbn-ui-shared-deps.js:342
ol @ kbn-ui-shared-deps.js:342
(anonymous) @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
Gi @ kbn-ui-shared-deps.js:342
Yi @ kbn-ui-shared-deps.js:342
ce @ kbn-ui-shared-deps.js:342
Ln @ kbn-ui-shared-deps.js:342
Dn @ kbn-ui-shared-deps.js:342
On @ kbn-ui-shared-deps.js:342

Provide logs and/or server output (if relevant):

Any additional context:

@LeeDr LeeDr added bug Fixes for quality problems that affect the customer experience :ml Feature:File and Index Data Viz ML file and index data visualizer labels Apr 14, 2020
@elasticmachine

Pinging @elastic/ml-ui (:ml)

@droberts195

Only highly structured formats like CSV and NDJSON are accepted without timestamps. The reason is that for semi-structured log files the definition of the first line of each message is the line containing the identified timestamp, so without a timestamp there’s no way to split the file into messages.
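The splitting problem can be sketched like this (a simplification for illustration only, not Elasticsearch's actual structure-finder implementation; the timestamp pattern is a hypothetical single format, whereas the real code tries many):

```python
import re

# Hypothetical timestamp pattern; the real structure finder recognizes
# many timestamp formats, not just ISO-like ones.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")

def split_into_messages(lines):
    """Group lines into messages, starting a new message at each line
    that begins with a recognized timestamp. Continuation lines (e.g.
    stack traces) attach to the preceding timestamped line. Without any
    timestamped line there is no way to decide where messages begin."""
    messages = []
    current = []
    for line in lines:
        if TIMESTAMP_RE.match(line):
            if current:
                messages.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)  # continuation of a multi-line message
    if current:
        messages.append("\n".join(current))
    return messages

log = [
    "2020-04-14 21:04:50 ERROR something failed",
    "  stack frame one",
    "2020-04-14 21:04:51 INFO recovered",
]
print(split_into_messages(log))  # two messages; the stack frame joins the first
```

This is why CSV and NDJSON work without timestamps: their record boundaries come from the format itself (rows, lines of JSON), not from timestamp detection.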

We should probably spell this out more clearly in the docs. Currently it is buried away in https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-find-file-structure.html in the sentence:

For structured file formats, it is not compulsory to have a timestamp in the file.

@droberts195

I think we could make it possible to support import of semi-structured log files without timestamps, by implementing #38868 and elastic/elasticsearch#55219.

@cspielmann

This feature was very useful indeed and was promoted in many of the Elasticsearch/Kibana tutorials and videos. It would be good to have it back, as I am now stuck on such basic functionality. Is there any workaround?

@droberts195

@cspielmann are you complaining that the entire feature has disappeared, or specifically that it doesn't work for semi-structured log files without timestamps?

I believe the whole feature was accidentally made inaccessible on a basic license for one minor release and then fixed in the following patch release. There will be a separate issue for that somewhere if that's the problem you've got.

Only highly structured formats like CSV and NDJSON are accepted without timestamps. That has always been the case. We could do an enhancement for semi-structured log files without timestamps, but that has never been demonstrated in a video as it has never worked. So please be more specific about exactly what doesn't work for you.

@cspielmann

cspielmann commented Dec 3, 2020 via email

@droberts195

I am sorry I didn't want to be mean

No problem, I didn't think you were being mean, it's just that I wasn't completely clear what didn't work for you.

It seems that you explained it in https://discuss.elastic.co/t/upload-csv-file-without-timestamp-to-kibana-with-ml-fails/257376.

What happened is that something about the CSV file you tried to upload meant the file structure finder didn't think it was CSV. As a result, it tried to analyse the file as semi-structured text, and currently that only works when a timestamp can be detected.

So, the next question is, why wasn't your CSV file recognized as a CSV file? There are a few possible reasons:

  1. Maybe there was some extra non-CSV data at the end of the file?
  2. Maybe there were too few fields per row for it to be auto-detected as CSV; see elasticsearch#56325 (comment, "[ML Data Visualizer] Upload failed due to missing timestamp") for more discussion on this
  3. Maybe there were different numbers of fields on a few lines?

If it is reason 2 or 3 then you should upgrade to 7.10 where you will be able to take advantage of elastic/elasticsearch#55735 and #74376. When the initial analysis fails due to one of those reasons you'll be able to go to the overrides flyout and tell it that your file is CSV, and then up to 10% of the rows will be allowed to have a column count that's inconsistent with the header row and it will still be imported as best it can be.

The other benefit of upgrading is that you'll get the explanation of why it wasn't considered to be CSV, for example, "row 375 had 19 columns whereas the header had 17". This is really hard to spot by eye in a big text file (although easier in a spreadsheet program).
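If you want to find such rows yourself before uploading, a small script can compare each row's column count against the header (a quick diagnostic sketch, not what the file structure finder does internally):

```python
import csv
import io

def find_inconsistent_rows(text, delimiter=","):
    """Return (row_number, column_count, expected_count) for every row
    whose column count differs from the header row's."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    problems = []
    # Header is row 1, so data rows start at 2.
    for row_num, row in enumerate(reader, start=2):
        if len(row) != expected:
            problems.append((row_num, len(row), expected))
    return problems

sample = "a,b,c\n1,2,3\n4,5\n6,7,8,9\n"
print(find_inconsistent_rows(sample))  # → [(3, 2, 3), (4, 4, 3)]
```

Running this over a large file immediately points at lines like "row 375 had 19 columns" that are hard to spot by eye.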

@MatthiasScholzTW

I experienced a similar issue. In the end it turned out that the CSV data was corrupt because of the chosen comma separator. Exporting the dataset using a semicolon solved the issue.

The error message just seems a bit misleading: a timestamp is not needed, as already described in the comments above.
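If you suspect a separator problem like this, Python's standard-library `csv.Sniffer` can guess the delimiter from a sample of the file (a sketch for local diagnosis; Kibana does not use this internally):

```python
import csv

def detect_delimiter(sample_text):
    """Guess the delimiter from a text sample. Raises csv.Error
    if no plausible delimiter can be determined."""
    return csv.Sniffer().sniff(sample_text, delimiters=",;\t|").delimiter

sample = "name;city;count\nalice;berlin;3\nbob;paris;5\n"
print(detect_delimiter(sample))  # → ;
```

If the detected delimiter differs from the one your export used, that mismatch is a likely cause of the "not recognized as CSV" fallback described above.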
