Expose _async_search metadata to kibana admin user #57537

Closed
Tracked by #61738
lizozom opened this issue Jun 2, 2020 · 11 comments · Fixed by #62947
Assignees
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team

Comments

@lizozom

lizozom commented Jun 2, 2020

This feature is part of elastic/kibana#61738

When Kibana runs BackgroundSessions, it will need to have a monitoring service that periodically checks for the status of each session. We will use that information to mark sessions that are complete or have errors. It will also be responsible for sending push notifications back to users.

To do that, we need access to each _async_search's metadata using the kibana admin user (rather than the user that created the search). This metadata should include the total number of shards, the number of shards processed, and the is_running and is_partial flags.
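
A minimal sketch of one pass of such a monitoring service, assuming only the `is_running` flag named above. `poll_once` and `fetch_status` are hypothetical names; the status call is injected as a function so the sketch does not presume any particular endpoint.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def poll_once(
    pending: Iterable[str],
    fetch_status: Callable[[str], Dict[str, bool]],
) -> Tuple[List[str], List[str]]:
    """Check every pending async search once.

    Returns (still_running, finished). fetch_status is a stand-in for
    whatever call ends up exposing the metadata.
    """
    still_running: List[str] = []
    finished: List[str] = []
    for search_id in pending:
        status = fetch_status(search_id)
        if status["is_running"]:
            still_running.append(search_id)  # check again on the next pass
        else:
            finished.append(search_id)  # ready to be marked and notified
    return still_running, finished
```

A scheduler would call `poll_once` periodically and feed `still_running` back in as the next `pending` list.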

@lizozom lizozom added >enhancement needs:triage Requires assignment of a team area label labels Jun 2, 2020
@romseygeek romseygeek added the :Search/Search Search-related issues that do not fall into other categories label Jun 3, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jun 3, 2020
@romseygeek romseygeek removed the needs:triage Requires assignment of a team area label label Jun 3, 2020
@jimczi
Contributor

jimczi commented Jun 3, 2020

I don't think we should tie this to the Kibana admin user. IMO the main question is whether we want to expose the metadata associated with an async search to any user. This metadata shouldn't contain any sensitive information, so I don't see why we couldn't. The follow-up question is how it should be exposed; maybe an additional parameter on the get API is enough?

@lizozom
Author

lizozom commented Jun 14, 2020

@jimczi note that when we query for metadata, we will always do so for multiple objects (i.e. take all running background searches and fetch their metadata).

Maybe it's worth adding another route for that altogether?

@lizozom
Author

lizozom commented Jun 15, 2020

@jimczi I implemented the monitoring task I would like to run to track the completion of async requests.
You can see the place where I would like to query that new API here https://github.com/lizozom/kibana/pull/8/files#diff-462f5bdb3bd70cec29416a1967e9119fR44

@mayya-sharipova
Contributor

mayya-sharipova commented Aug 27, 2020

> we will always do so for multiple objects

@lizozom Do you need an API that provides metadata about a particular single async request? Or metadata about all async requests available?

Or perhaps you would like to submit to this API a list of several async requests IDs?


If you just need metadata about a single async request, as @jimczi suggested we can reuse the same GET request with a metadata parameter:

GET /_async_search/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=?metadata=true

The response will be the same as for the regular GET request, but without the data sections (hits, aggs, and suggestions will not be included).

{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_partial" : true, 
  "is_running" : true, 
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986, 
  "response" : {
    "took" : 12144,
    "timed_out" : false,
    "num_reduce_phases" : 46, 
    "_shards" : {
      "total" : 562, 
      "successful" : 188,
      "skipped" : 0,
      "failed" : 0
    }
  }
}
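
To make the proposed trimming concrete, here is a rough client-side sketch of what `metadata=true` would do to a full response. The key names in `DATA_SECTIONS` are an assumption about how the hits, aggs, and suggestions sections are spelled in the response body, and `metadata_only` is a hypothetical helper, not part of any API.

```python
from typing import Any, Dict

# Assumed key names for the data sections inside "response".
DATA_SECTIONS = ("hits", "aggregations", "suggest")


def metadata_only(full: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an async search response without its data sections."""
    trimmed = dict(full)  # shallow copy; the input is left untouched
    inner = full.get("response")
    if isinstance(inner, dict):
        trimmed["response"] = {
            key: value for key, value in inner.items() if key not in DATA_SECTIONS
        }
    return trimmed
```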

For an alternative where it is possible to get metadata about several async requests IDs, we can introduce a new endpoint – metadata:

GET /_async_search/metadata
{
  "async_search_ids" : [
      "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
      "DnF1ZXJ5VGhlbkZldGNoBQAAAAAJFUFEaMkZ5QTVSTZSaVN3WlNFVmtlWHJsdzoxMDc="
    ]
}

and the response will consist of several responses, one per requested ID:

{
  "took": 0,
  "responses": [
    {
      "id": "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
      "is_partial": true,
      "is_running": true,
      "start_time_in_millis": 1583945890986,
      "expiration_time_in_millis": 1584377890986,
      "response": {
        "took": 12144,
        "timed_out": false,
        "num_reduce_phases": 46,
        "_shards": {
          "total": 562,
          "successful": 188,
          "skipped": 0,
          "failed": 0
        }
      }
    },
    {
      "id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAJFUFEaMkZ5QTVSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
      "is_partial": true,
      "is_running": true,
      "start_time_in_millis": 1583945890986,
      "expiration_time_in_millis": 1584377890986,
      "response": {
        "took": 12144,
        "timed_out": false,
        "num_reduce_phases": 46,
        "_shards": {
          "total": 562,
          "successful": 188,
          "skipped": 0,
          "failed": 0
        }
      }
    }
  ]
}
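
As an illustration of how a caller might consume this bulk shape, the sketch below maps each ID to the fraction of shards processed. It treats `successful` plus `failed` as "processed", which is an assumption about how the counters relate; `shard_progress` is a hypothetical helper.

```python
from typing import Any, Dict


def shard_progress(bulk_response: Dict[str, Any]) -> Dict[str, float]:
    """Map each async search ID to the fraction of shards processed so far."""
    progress: Dict[str, float] = {}
    for item in bulk_response["responses"]:
        shards = item["response"]["_shards"]
        # Assumption: successful + failed shards are the ones already processed.
        done = shards["successful"] + shards["failed"]
        total = shards["total"]
        # A search over zero shards has nothing left to do.
        progress[item["id"]] = min(done / total, 1.0) if total else 1.0
    return progress
```

With the example values above (188 successful out of 562 total), each search would report roughly one third of its shards as processed.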

@javanna
Member

javanna commented Aug 27, 2020

Thanks for your thoughts, @mayya-sharipova! I have a preference for a specific API, given that the functionality it offers differs quite a bit from that of the get async search API that is used to retrieve results. It is more of a way to poll for the status of the async search. Also, given that we want this functionality to require different permissions, it may make sense to make it its own endpoint and transport action that requires its own role.

@mayya-sharipova mayya-sharipova self-assigned this Aug 31, 2020
@lizozom
Author

lizozom commented Aug 31, 2020

@mayya-sharipova I think there is no single answer to this question:

From the perspective of Kibana, having a bulk API would be better, as that would allow us to group queries, at least per Background Session, and minimize the number of requests to be made.
From the perspective of simplicity for Elasticsearch, I assume, a per-async-search API is better, but then Kibana would be hitting it a lot more.

So as long as it's efficient enough for Elasticsearch, and as long as we can get the status of each async search somehow, then it's good.
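
The trade-off can be put in numbers with a small sketch that counts the status requests made per polling round under each API shape. The session-to-IDs mapping and the function name are hypothetical.

```python
from typing import Dict, List


def requests_per_round(sessions: Dict[str, List[str]]) -> Dict[str, int]:
    """Count status requests per polling round for both API shapes.

    sessions maps a (hypothetical) Background Session ID to the async
    search IDs that belong to it.
    """
    non_empty = [ids for ids in sessions.values() if ids]
    return {
        # One bulk call per session that still has searches to check.
        "bulk_per_session": len(non_empty),
        # One call per async search.
        "per_async_search": sum(len(ids) for ids in non_empty),
    }
```

For a session holding dozens of searches, the per-search shape multiplies the request count accordingly, which is the cost Kibana would pay for the simpler Elasticsearch API.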

I do think that having the same route have different permissions based on the metadata=true option is a bit weird. Don't you think?

@mayya-sharipova
Contributor

@javanna @lizozom Thank you for your feedback. It makes sense to me.

So, let's have the following API that can provide the status of several async searches. This API will be accessible to any user.

GET /_async_search/status
{
  "async_search_ids" : [
      "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
      "DnF1ZXJ5VGhlbkZldGNoBQAAAAAJFUFEaMkZ5QTVSTZSaVN3WlNFVmtlWHJsdzoxMDc="
    ]
}

@jimczi Do you have any objections/suggestions about this API?

@mayya-sharipova
Contributor

I've chatted with @jimczi and @javanna offline, and we've decided for now on an API to retrieve metadata about an individual async search:

GET /_async_search/status/<id> 

In the future, we will also consider retrieving metadata in bulk.
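
A small sketch of building the request path for this endpoint. The IDs shown in this thread end in `=`, so the sketch percent-encodes the ID; whether that is strictly required depends on the HTTP client, and `status_path` is a hypothetical helper.

```python
from urllib.parse import quote


def status_path(search_id: str) -> str:
    """Build the request path for the status of one async search."""
    # safe="" also encodes "/" so an ID can never add extra path segments.
    return "/_async_search/status/" + quote(search_id, safe="")
```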

@mayya-sharipova
Contributor

I forgot to mention permissions: I am thinking this endpoint should require the "cluster:monitor" privilege, which the kibana_admin user should also have.

mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Sep 28, 2020
Introduce async search status API

GET /_async_search/status/<id>

The API is restricted to the monitoring_user role.

For a running async search, the response is:

```js
{
  "id" : <id>,
  "is_running" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 188,
      "skipped" : 0,
      "failed" : 0
  }
}
```

For a completed async search, the response is:

```js
{
  "id" : <id>,
  "is_running" : false,
  "expiration_time_in_millis" : 1584377890986
}
```

----
Technical details:
We first try to retrieve the status of the async search from tasks.
If this doesn't succeed, we retrieve it from an index: .async-search.
In case of retrieving from the index, we assume that the async search is
completed, and a shorter response for the status is returned.
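
The lookup order described above can be sketched as follows. The two dictionaries are stand-ins for the task list and the .async-search index, and all names are hypothetical; this is not the actual transport action.

```python
from typing import Any, Dict, Optional


def lookup_status(
    search_id: str,
    running_tasks: Dict[str, Dict[str, Any]],
    stored_results: Dict[str, Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Resolve a status from tasks first, then from stored results."""
    task = running_tasks.get(search_id)
    if task is not None:
        # A live task can report timing and shard progress.
        return {
            "id": search_id,
            "is_running": True,
            "start_time_in_millis": task["start_time_in_millis"],
            "expiration_time_in_millis": task["expiration_time_in_millis"],
            "_shards": task["_shards"],
        }
    stored = stored_results.get(search_id)
    if stored is not None:
        # Found only in the index: assume the search has completed.
        return {
            "id": search_id,
            "is_running": False,
            "expiration_time_in_millis": stored["expiration_time_in_millis"],
        }
    return None  # unknown ID
```
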

Closes elastic#57537
@lizozom
Author

lizozom commented Oct 4, 2020

@mayya-sharipova permissions sound ok.
And as a first phase, the individual API is ok.

Thanks 👍

mayya-sharipova added a commit that referenced this issue Nov 3, 2020
Introduce async search status API

GET /_async_search/status/<id>

The API is restricted to the monitoring_user role.

For a running async search, the response is:

```js
{
  "id" : <id>,
  "is_running" : true,
  "is_partial" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 188,
      "skipped" : 0,
      "failed" : 0
  }
}
```

For a completed async search, an additional
`completion_status` field is added.

```js
{
  "id" : <id>,
  "is_running" : false,
  "is_partial" : false,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 562,
      "skipped" : 0,
      "failed" : 0
  },
 "completion_status" : 200
}
```
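
A rough sketch of how a caller might interpret these two response shapes. It assumes `completion_status` carries an HTTP-style status code, as the `200` in the example suggests, and that `is_partial` on a finished search means some results are missing; `classify` is a hypothetical helper.

```python
from typing import Any, Dict


def classify(status: Dict[str, Any]) -> str:
    """Reduce a status response to 'running', 'failed', 'partial', or 'completed'."""
    if status["is_running"]:
        return "running"
    code = status.get("completion_status")
    # Assumption: codes of 400 and above signal an error, as with HTTP.
    if code is not None and code >= 400:
        return "failed"
    if status.get("is_partial"):
        return "partial"
    return "completed"
```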

Closes #57537
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Nov 3, 2020
Introduce async search status API

Closes elastic#57537
Backport for elastic#62947
mayya-sharipova added a commit that referenced this issue Nov 3, 2020
Introduce async search status API

Closes #57537
Backport for #62947
6 participants