Error executing async csv_from_savedobject report #62986

Closed
theY4Kman opened this issue Apr 8, 2020 · 9 comments · Fixed by #71031
Labels
bug, (Deprecated) Feature:Reporting, feedback_needed

Comments

@theY4Kman

Kibana version: 7.6.2

Elasticsearch version: 7.6.2

Server OS version: Docker image

Browser version:

Browser OS version:

Original install method (e.g. download page, yum, from source, etc.): Docker

Describe the bug: After submitting a CSV from Saved Object reporting job with /api/reporting/v1/generate/csv/saved-object/search:xyz, the executor fails with TypeError: Cannot read property 'settings' of undefined

Steps to reproduce:

  1. Create a saved search
  2.  POST http://kibana/api/reporting/v1/generate/csv/saved-object/search:<saved-search-uuid>
     Content-Type: application/json
    
     {}
    
  3. Job status changes to failed, and error visible in server logs.

Expected behavior: Reporting job should be executed successfully, and change its state to completed.

Provide logs and/or server output (if relevant):

log   [16:28:43.835] [debug][basic][plugins][security] Trying to authenticate user request to /api/reporting/v1/generate/csv/saved-object/search:c3960d50-eca6-4dc7-95f2-9bfadabb3fe3.
log   [16:28:43.835] [debug][basic][plugins][security] Trying to authenticate via header.
log   [16:28:43.838] [debug][basic][plugins][security] Request has been authenticated via header.
log   [16:28:43.838] [debug][api-authorization][plugins][security] API endpoint is not marked with "access:" tags, skipping.
log   [16:28:43.868] [debug][esqueue][queue-worker][reporting] k8rjrikz0006f5ccd7d12kcg - Job created in index .reporting-2020.04.05
log   [16:28:43.883] [debug][esqueue][queue-worker][reporting] k8rjrikz0006f5ccd7d12kcg - Job index refreshed .reporting-2020.04.05
log   [16:28:43.883] [info][queue-job][reporting] Successfully queued job: k8rjrikz0006f5ccd7d12kcg
respons [16:28:43.832] [api] POST /api/reporting/v1/generate/csv/saved-object/search:c3960d50-eca6-4dc7-95f2-9bfadabb3fe3 200 52ms - 9.0B
log   [16:28:44.303] [debug][esqueue][queue-worker][reporting] k8riwfo70006f5ccd7b8p183 - 1 outstanding jobs returned
log   [16:28:44.312] [info][esqueue][queue-worker][reporting] k8riwfo70006f5ccd7b8p183 - Job marked as claimed: /.reporting-2020.04.05/_doc/k8rjrikz0006f5ccd7d12kcg
log   [16:28:44.312] [info][esqueue][queue-worker][reporting] k8riwfo70006f5ccd7b8p183 - Starting job
log   [16:28:44.312] [debug][csv_from_savedobject][execute-job][k8rjrikz0006f5ccd7d12kcg][reporting] Execute job generating [search] csv
log   [16:28:44.312] [info][csv_from_savedobject][execute-job][k8rjrikz0006f5ccd7d12kcg][reporting] Executing job async using encrypted headers
log   [16:28:44.320] [error][csv_from_savedobject][execute-job][k8rjrikz0006f5ccd7d12kcg][reporting] Generate CSV Error! TypeError: Cannot read property 'settings' of undefined
log   [16:28:44.321] [error][esqueue][queue-worker][reporting] k8riwfo70006f5ccd7b8p183 - Failure occurred on job k8rjrikz0006f5ccd7d12kcg: TypeError: Cannot read property 'settings' of undefined
  at KibanaRequest.getRouteInfo (/usr/share/kibana/src/core/server/http/router/request.js:113:23)
  at new KibanaRequest (/usr/share/kibana/src/core/server/http/router/request.js:92:46)
  at Function.from (/usr/share/kibana/src/core/server/http/router/request.js:43:12)
  at getKibanaRequest (/usr/share/kibana/x-pack/plugins/security/server/saved_objects/index.js:22:114)
  at SavedObjectsClientProvider.savedObjects.setClientFactory [as _clientFactory] (/usr/share/kibana/x-pack/plugins/security/server/saved_objects/index.js:27:27)
  at SavedObjectsClientProvider.getClient (/usr/share/kibana/src/core/server/saved_objects/service/lib/scoped_client_provider.js:50:25)
  at Object.getScopedSavedObjectsClient (/usr/share/kibana/src/legacy/server/saved_objects/saved_objects_mixin.js:129:56)
  at generateCsvSearch (/usr/share/kibana/x-pack/legacy/plugins/reporting/export_types/csv_from_savedobject/server/lib/generate_csv_search.js:48:43)
  at generateCsv (/usr/share/kibana/x-pack/legacy/plugins/reporting/export_types/csv_from_savedobject/server/lib/generate_csv.js:25:22)
  at executeJob (/usr/share/kibana/x-pack/legacy/plugins/reporting/export_types/csv_from_savedobject/server/execute_job.js:82:37)
log   [16:28:44.322] [debug][queue-worker][reporting] Worker error: (k8rjrikz0006f5ccd7d12kcg)
log   [16:28:44.322] [warning][esqueue][queue-worker][reporting] k8riwfo70006f5ccd7b8p183 - Failing job k8rjrikz0006f5ccd7d12kcg
log   [16:28:44.329] [info][esqueue][queue-worker][reporting] k8riwfo70006f5ccd7b8p183 - Job marked as failed: /.reporting-2020.04.05/_doc/k8rjrikz0006f5ccd7d12kcg

Any additional context:

@theY4Kman changed the title from "Error executing csv_from_savedobject report" to "Error executing async csv_from_savedobject report" on Apr 8, 2020
@joelgriffith added the bug and (Deprecated) Feature:Reporting labels on Apr 8, 2020
@joelgriffith
Contributor

@theY4Kman we can definitely do a better job of this, but IIRC your JSON body needs some options (settings) in order to function properly.

We'll take a look and see if there's a more intuitive way for declaring these runtime errors.

@theY4Kman
Author

theY4Kman commented Apr 8, 2020

The same call worked in 7.4.2. The csv_from_savedobject job type uses the saved object ID passed in the route params to retrieve the saved object and generate all the params needed. Optionally, a body could be passed with { timerange: { min, max } } and/or { state: { query: ... } }.
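
As a rough illustration of the request described above, the sketch below builds (but does not send) the POST for a saved search. The helper name `buildCsvFromSavedObjectRequest`, the `CsvFromSavedObjectBody` type, the value types of `min`/`max`, and the example timestamps are assumptions for illustration; only the route and the optional `timerange` / `state.query` body keys come from this thread. The `kbn-xsrf` header is included because Kibana generally requires it on non-GET API calls.

```typescript
// Optional body keys mentioned in this thread; the value types are assumptions.
interface CsvFromSavedObjectBody {
  timerange?: { min: string | number; max: string | number };
  state?: { query?: unknown };
}

interface BuiltRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the request without performing any network I/O.
function buildCsvFromSavedObjectRequest(
  kibanaBaseUrl: string,
  savedSearchId: string,
  body: CsvFromSavedObjectBody = {}
): BuiltRequest {
  const base = kibanaBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const id = encodeURIComponent(savedSearchId);
  return {
    url: `${base}/api/reporting/v1/generate/csv/saved-object/search:${id}`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'kbn-xsrf': 'true' },
    body: JSON.stringify(body),
  };
}

// Illustrative values only; the accepted timestamp format is an assumption.
const req = buildCsvFromSavedObjectRequest(
  'http://kibana/',
  'c3960d50-eca6-4dc7-95f2-9bfadabb3fe3',
  { timerange: { min: '2020-04-01T00:00:00Z', max: '2020-04-08T00:00:00Z' } }
);
console.log(req.url);
console.log(req.body);
```

The resulting object could be handed to any HTTP client; keeping construction separate from sending makes the request easy to inspect when a job fails.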

I believe this error is related to the dummy request / RequestFacade that's now making its way into the New Platform's (?) new KibanaRequest(). When KibanaRequest.getRouteInfo() is called on this dummy request object, request.route is undefined, so reading request.route.settings throws.

export class KibanaRequest<
  Params = unknown,
  Query = unknown,
  Body = unknown,
  Method extends RouteMethod = any
> {
// ...
  constructor(
    request: Request,
    // ...
  ) {
    this.url = request.url;
    this.headers = deepFreeze({ ...request.headers });

    // ...

    this.route = deepFreeze(this.getRouteInfo(request)); // <---- from here
    this.socket = new KibanaSocket(request.raw.req.socket);
    this.events = this.getEvents(request);
  }

  // ...

  private getRouteInfo(request: Request): KibanaRequestRoute<Method> {
    const method = request.method as Method;
    const { parse, maxBytes, allow, output } = request.route.settings.payload || {};  // <------ source of error here

    const options = ({
      authRequired: request.route.settings.auth !== false,
      tags: request.route.settings.tags || [],
      body: ['get', 'options'].includes(method)
        ? undefined
        : {
            parse,
            maxBytes,
            accepts: allow,
            output: output as typeof validBodyOutput[number], // We do not support all the HAPI-supported outputs and TS complains
          },
    } as unknown) as KibanaRequestRouteOptions<Method>; // TS does not understand this is OK so I'm enforced to do this enforced casting

    // ...
  }
}

Plus, the dummy request won't have the request.raw.req.socket used by new KibanaSocket() at the end of the KibanaRequest constructor.
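
The failure mode described here can be reproduced in isolation. The sketch below is not Kibana code: `RequestLike`, `readRouteInfo`, and both request objects are simplified, hypothetical stand-ins that only mirror the property reads in the excerpt above (`request.route.settings` and `request.raw.req.socket`).

```typescript
// Minimal stand-ins for the Hapi-style request shape that the excerpt reads from.
// These types are simplified assumptions, not Kibana's real definitions.
interface RouteSettings {
  auth?: unknown;
  tags?: string[];
}

interface RequestLike {
  method: string;
  route?: { settings: RouteSettings };
  raw?: { req: { socket: object } };
}

// Mirrors the property accesses in getRouteInfo() and the KibanaSocket line.
function readRouteInfo(request: RequestLike) {
  const settings = request.route!.settings; // throws if request.route is undefined
  const socket = request.raw!.req.socket; // throws if request.raw is undefined
  return {
    authRequired: settings.auth !== false,
    tags: settings.tags || [],
    hasSocket: socket !== undefined,
  };
}

// A bare dummy request with no route or raw socket, as suspected in this comment.
const dummyRequest: RequestLike = { method: 'post' };

// Hypothetical fuller fake: just enough structure for the reads above to succeed.
const fullerFakeRequest: RequestLike = {
  method: 'post',
  route: { settings: {} },
  raw: { req: { socket: {} } },
};

let failure = '';
try {
  readRouteInfo(dummyRequest);
} catch (err) {
  failure = err instanceof TypeError ? 'TypeError' : 'other';
}

console.log(failure); // 'TypeError'
console.log(readRouteInfo(fullerFakeRequest)); // authRequired: true, tags: [], hasSocket: true
```

The non-null assertions (`!`) are deliberate: they let the unsafe reads compile so the runtime TypeError is visible, which is the same class of error reported in the server log.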

@theY4Kman
Author

Oooo, though, it would be hella sweet to have errors like those saved to the reporting job — right now, the only info is either completed 🎉 or failed 😭

@theY4Kman
Author

I looked at the csv export type, and found a more rounded fake request body. That resolved the initial issue when executing the job.

Fixing that uncovered another issue: the executor was unable to find the index pattern referenced in the saved object. While debugging, I found that when executing the job, the saved objects client was not scoped to the user or their space.

I once again looked at the csv export type, and found that basePath is used to carry along the requested space ID. I changed the csv_from_savedobject job creator to save this basePath to the JobDocPayloadPanelCsv (alongside jobParams), and now the jobs succeed.
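
To illustrate why basePath matters here: Kibana's space-scoped URLs carry the space under an `/s/<space-id>` path segment, so a stored base path is enough to recover which space a queued job belongs to. The sketch below is a hypothetical helper, not the code from the branch linked in this comment; `QueuedJobPayload`, `spaceIdFromBasePath`, the `jobParams` keys, and the example paths are assumptions.

```typescript
// Hypothetical job payload: the thread only says basePath is stored alongside jobParams.
interface QueuedJobPayload {
  jobParams: Record<string, unknown>;
  basePath: string; // e.g. '/s/marketing', or '' for the default space
}

// Extract the space ID from a base path of the form '<server-base-path>/s/<space-id>'.
// Falls back to 'default', the ID of Kibana's default space.
function spaceIdFromBasePath(basePath: string): string {
  const match = /\/s\/([a-z0-9_-]+)\/?$/i.exec(basePath);
  return match?.[1] ?? 'default';
}

const payload: QueuedJobPayload = {
  // Illustrative keys; the real jobParams shape is not shown in this thread.
  jobParams: { savedObjectType: 'search', savedObjectId: 'c3960d50-eca6-4dc7-95f2-9bfadabb3fe3' },
  basePath: '/kibana/s/marketing',
};

console.log(spaceIdFromBasePath(payload.basePath)); // 'marketing'
console.log(spaceIdFromBasePath('')); // 'default'
```

Persisting the base path with the job means the executor, which runs later and outside the original HTTP request, still has what it needs to scope the saved objects client to the right space.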

Here are the changes I made:

https://github.com/PerchSecurity/kibana/compare/v7.6.2..fix/7.6.2-csv_from_savedobject

Judging from the previous return type of the executeJobFactory for csv_from_savedobject, which picked up the ImmediateExecuteFn tag over time, I think the ability of csv_from_savedobject to be executed asynchronously was accidentally forgotten. When New Platform overhauled the saved objects client, I think it unwittingly broke these async jobs.

@tsullivan
Member

> Judging from the previous return type of the executeJobFactory for csv_from_savedobject, which picked up the ImmediateExecuteFn tag over time, I think the ability of csv_from_savedobject to be executed asynchronously was accidentally forgotten. When New Platform overhauled the saved objects client, I think it unwittingly broke these async jobs.

Indeed, this ability has gone out of maintenance. Originally, we had automated functional tests to validate this feature, but the test was repeatedly hit by flaky failures and was skipped: #37471. Apologies for this!

One question though: why not use the POST URL that you can get to export a saved search from the Discover app?

@tsullivan
Member

I'm adding the feedback label to this issue. We plan on totally removing the /api/reporting/v1/generate/csv/saved-object/search API since it doesn't have a real purpose.

@tsullivan
Member

tsullivan commented Jul 8, 2020

Hi @theY4Kman just want to see if you have feedback about why you are not using the Discover POST URL for the CSV exports. I'm planning on removing the "async csv_from_savedobject" code and route in #71031

The only reason I can think of for choosing this API over Discover CSV export is that this export doesn't take the search results through the field formatters and instead provides raw data. However, that kind of inconsistency is hard to support, and I believe most users would say that's a bug.

If that is important, perhaps that should be an option for Discover CSV export.

@theY4Kman
Author

theY4Kman commented Jul 9, 2020

My primary motive is to initiate the generation of reports externally. The "async csv_from_savedobject" endpoint was perfect for this: it allowed the reporting job to be initiated without waiting on the response (or failing if a reverse proxy timed out the request); and additionally, it only required the saved object type & ID.

If #71031 allows a job to be initiated and return a job ID, regardless of how long it takes to process the job, it'd likely be adequate for our purposes: creating hands-off, periodic reports.

As for why I'm not using Discover CSV: it's my understanding that the Discover POST URL includes the raw Elasticsearch query; however, the saved search saved object does not include this raw query — it's some other internal format from Kibana. Things like filters, etc. So, in order to submit Discover CSV reports externally, I'd have to duplicate / reverse engineer Kibana's filter translation, which seems brittle, easily outdated, and hard to debug.

Submitting dashboard jobs is hard enough, where we've gotta synthesize a reasonable height to submit with the job. (The print layout is too sparse, and surprising to users who expect what they see when generating reports from within Kibana.)

@tsullivan
Member

@theY4Kman your problem statements echo the same concerns as the Reporting Services team which tries to provide an API to export CSV the same as it would show in a browser. To export this data, we need the same query that the browser uses to search. Historically in Kibana code, the source of generating that query has been in browser-side Angular code that can't be shared with the server. The solution was to pass the entire query in the API to generate a request. Obviously that is not ideal for all the reasons you've pointed out.

The Reporting Service code for generating a CSV export based on Saved Object ID isn't "perfect" and still requires a lot of side-by-side catching up, and bloating of API params over time. We're also handling and passing through JSON whose purpose we don't understand.

The solution we are looking for is already on the @elastic/kibana-app-arch radar here: #65069. As I've been working with that team and asking questions about how to fix things in Reporting, I've been tracking their efforts to un-bundle their logic from Angular and get us to a point where a plugin like Reporting can share their source of data access on the server.

I hope that helps! Sorry that it will still take some time to get to a better API for CSV export.

3 participants