Documents saved with es-archiver are missing source fields #118639

Closed
jportner opened this issue Nov 15, 2021 · 3 comments · Fixed by #118642
Labels: bug, Team:Core

jportner commented Nov 15, 2021

Kibana version: 8.0 (unreleased) and main

Describe the bug:

When es-archiver is used to save an archive, all of the saved documents are missing their source fields.

Steps to reproduce:

  1. Start Elasticsearch and Kibana
  2. Log into Kibana and install sample data from the home page
  3. Use es-archiver to save an archive of the .kibana index:
    node scripts/es_archiver.js --es-url http://elastic:changeme@localhost:9200 --kibana-url http://elastic:changeme@localhost:5601 save ./tmp .kibana --raw
    
  4. Observe that all documents are missing their source fields

Expected behavior:

Saved documents should include their source fields

Any additional context:

I tested this on 7.16 and it still works as expected. This appears to only be a problem in 8.0+, possibly due to the recent upgrade to the 8.0 version of the ES client (#113950).

I traced the problem down to this line:

It appears that this used to be evaluated to a boolean, and now it is being treated as a string. In other words,

  • old behavior: include all source fields
  • new behavior: include only the "true" source field
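The difference between the two behaviors can be sketched with a toy model of source filtering. This is only an illustration under simplifying assumptions: `filterSource` and the sample document are hypothetical, and real Elasticsearch source filtering also supports wildcards, arrays, and include/exclude lists.

```typescript
type Doc = Record<string, unknown>;

// Simplified, hypothetical model of `_source` filtering (not Elasticsearch's code):
// a boolean turns the whole source on or off, while a string is read as the
// name of the single field to keep.
function filterSource(doc: Doc, source: boolean | string): Doc {
  if (typeof source === 'boolean') {
    return source ? { ...doc } : {};
  }
  // A string such as 'true' is a field name, not a flag.
  return Object.prototype.hasOwnProperty.call(doc, source)
    ? { [source]: doc[source] }
    : {};
}

// Made-up document for illustration only.
const sampleDoc: Doc = { type: 'dashboard', originId: 'abc' };

console.log(JSON.stringify(filterSource(sampleDoc, true)));   // {"type":"dashboard","originId":"abc"}
console.log(JSON.stringify(filterSource(sampleDoc, 'true'))); // {}
```

Under this model, a document with no field literally named "true" comes back with an empty source, which matches the empty `_source` objects described in this report.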

Also, the same problem is present in the incrementCounter method of the saved objects repository:

This is not trivial to test because we don't expose an HTTP API for that method. However, you can apply this diff:

diff --git a/src/core/server/saved_objects/service/lib/repository.ts b/src/core/server/saved_objects/service/lib/repository.ts
index d538690fb19..2982376dc95 100644
--- a/src/core/server/saved_objects/service/lib/repository.ts
+++ b/src/core/server/saved_objects/service/lib/repository.ts
@@ -1858,6 +1858,7 @@ export class SavedObjectsRepository {
       },
     });
 
+    console.log(`body.get for ${type}:${id} - ${JSON.stringify(body.get)}`);
     const { originId } = body.get?._source ?? {};
     return {
       id,

Then, while you are using Kibana, you will see many console logs that look like this:

body.get for core-usage-stats:core-usage-stats - {"_seq_no":600,"_primary_term":1,"found":true,"_source":{}}
body.get for application_usage_daily:home:2021-11-15 - {"_seq_no":602,"_primary_term":1,"found":true,"_source":{}}

The logs indicate that the returned object's _source field is empty.

However, changing the cluster call options to use _source: true makes them behave as expected -- the returned docs include all source fields as intended.

jportner added the bug label Nov 15, 2021
botelastic bot added the needs-team label Nov 15, 2021

jportner commented Nov 15, 2021

@mshustov When I was looking at the SOR, I realized you changed it from _source: true to _source: 'true' in your PR (#72289).

I saw you commented here: #72289 (comment)

I'm not 100% sure of the intent behind your comment, do you or @delvedor have any additional context to share?

jportner added the Team:Core label Nov 15, 2021
@elasticmachine

Pinging @elastic/kibana-core (Team:Core)


delvedor commented Nov 16, 2021

This is a very fun side effect.
You can set the _source parameter in both the querystring and the body. Because the querystring serializes booleans as strings, writing 'true' or true there produces the same effect. In the body, however, the two have different meanings: 'true' is a field named "true", while true is the _source configuration.
In v7, the client sent it via the querystring, but the v8 client always prefers the body if there are duplicate parameters.
The fix, as you mentioned, is to update 'true' to true.

I would recommend updating v7 from 'true' to true as well, as it's more semantically correct.
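
To make the querystring-versus-body point concrete, here is a minimal sketch of the two serializations. It does not use the Elasticsearch client; `toQuerystring` and `toBody` are hypothetical helpers that only mimic how a value ends up as URL text or as JSON.

```typescript
type SourceParam = boolean | string;

// In a URL query string every value becomes text, so the type is lost.
// Hypothetical helper, not part of any client library.
function toQuerystring(source: SourceParam): string {
  return `_source=${encodeURIComponent(String(source))}`;
}

// In a JSON body the type is preserved, so true and 'true' stay distinct.
// Hypothetical helper, not part of any client library.
function toBody(source: SourceParam): string {
  return JSON.stringify({ _source: source });
}

console.log(toQuerystring(true));   // _source=true
console.log(toQuerystring('true')); // _source=true
console.log(toBody(true));          // {"_source":true}
console.log(toBody('true'));        // {"_source":"true"}
```

The two querystring forms are byte-for-byte identical, which is why the string value went unnoticed while the parameter travelled in the URL. Once it is sent in the body, the receiver can tell the string from the boolean.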
