Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] adding new _preview endpoint for data frame analytics #69453

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions docs/reference/ml/df-analytics/apis/index.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ include::get-dfanalytics.asciidoc[leveloffset=+2]
include::get-dfanalytics-stats.asciidoc[leveloffset=+2]
include::get-trained-models.asciidoc[leveloffset=+2]
include::get-trained-models-stats.asciidoc[leveloffset=+2]
//PREVIEW
include::preview-dfanalytics.asciidoc[leveloffset=+2]
//SET/START/STOP
include::start-dfanalytics.asciidoc[leveloffset=+2]
include::stop-dfanalytics.asciidoc[leveloffset=+2]
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@

You can use the following APIs to perform {ml} {dfanalytics} activities.

* <<preview-dfanalytics,Preview {dfanalytics}>>
* <<put-dfanalytics,Create {dfanalytics-jobs}>>
* <<update-dfanalytics,Update {dfanalytics-jobs}>>
* <<delete-dfanalytics,Delete {dfanalytics-jobs}>>
Expand Down
107 changes: 107 additions & 0 deletions docs/reference/ml/df-analytics/apis/preview-dfanalytics.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
[role="xpack"]
[testenv="platinum"]
[[preview-dfanalytics]]
= Preview {dfanalytics} API

[subs="attributes"]
++++
<titleabbrev>Preview {dfanalytics}</titleabbrev>
++++

Previews the features used by a {dataframe-analytics-config}.

beta::[]


[[ml-preview-dfanalytics-request]]
== {api-request-title}

`GET _ml/data_frame/analytics/_preview` +

`POST _ml/data_frame/analytics/_preview` +

`GET _ml/data_frame/analytics/<data_frame_analytics_id>/_preview` +

`POST _ml/data_frame/analytics/<data_frame_analytics_id>/_preview`


[[ml-preview-dfanalytics-prereq]]
== {api-prereq-title}

If the {es} {security-features} are enabled, you must have the following
privileges:

* cluster: `monitor_ml`

For more information, see <<security-privileges>> and {ml-docs-setup-privileges}.


[[ml-preview-dfanalytics-desc]]
== {api-description-title}

This API provides preview of the extracted features for a {dataframe-analytics-config}
that either exists already or one that has not been created yet.


[[ml-preview-dfanalytics-path-params]]
== {api-path-parms-title}

`<data_frame_analytics_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=job-id-data-frame-analytics]

[[ml-preview-dfanalytics-request-body]]
== {api-request-body-title}

`config`::
(Optional, object)
A {dataframe-analytics-config} as described in <<put-dfanalytics>>.
Note that `id` and `dest` don't need to be provided in the context of this API.

[role="child_attributes"]
[[ml-preview-dfanalytics-results]]
== {api-response-body-title}

The API returns a response that contains the following:

`feature_values`::
(array)
An array of objects that contain feature name and value pairs. The features have
been processed and indicate what will be sent to the model for training.

[[ml-preview-dfanalytics-example]]
== {api-examples-title}

[source,console]
--------------------------------------------------
POST _ml/data_frame/analytics/_preview
{
"config": {
"source": {
"index": "houses_sold_last_10_yrs"
},
"analysis": {
"regression": {
"dependent_variable": "price"
}
}
}
}
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
"feature_values": [
{
"number_of_bedrooms": "1",
"postcode": "29655",
"price": "140.4"
},
...
]
}
----
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
package org.elasticsearch.xpack.core.ml.action;

import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.ActionRequestValidationException;
import org.elasticsearch.action.ActionResponse;
import org.elasticsearch.action.ActionType;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.xcontent.ConstructingObjectParser;
import org.elasticsearch.common.xcontent.ObjectParser;
import org.elasticsearch.common.xcontent.ToXContentObject;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.xpack.core.ml.dataframe.DataFrameAnalyticsConfig;
import org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Objects;


public class PreviewDataFrameAnalyticsAction extends ActionType<PreviewDataFrameAnalyticsAction.Response> {

public static final PreviewDataFrameAnalyticsAction INSTANCE = new PreviewDataFrameAnalyticsAction();
public static final String NAME = "cluster:admin/xpack/ml/data_frame/analytics/preview";

private PreviewDataFrameAnalyticsAction() {
super(NAME, PreviewDataFrameAnalyticsAction.Response::new);
}

public static class Request extends ActionRequest {
benwtrent marked this conversation as resolved.
Show resolved Hide resolved

public static final ParseField CONFIG = new ParseField("config");

private final DataFrameAnalyticsConfig config;

static final ObjectParser<Builder, Void> PARSER = new ObjectParser<>(
"preview_data_frame_analytics_response",
Request.Builder::new
);
static {
PARSER.declareObject(Request.Builder::setConfig, DataFrameAnalyticsConfig.STRICT_PARSER::apply, CONFIG);
}

public static Request.Builder fromXContent(XContentParser parser) {
return PARSER.apply(parser, null);
}

public Request(DataFrameAnalyticsConfig config) {
this.config = ExceptionsHelper.requireNonNull(config, CONFIG);
}

public Request(StreamInput in) throws IOException {
super(in);
this.config = new DataFrameAnalyticsConfig(in);
}

public DataFrameAnalyticsConfig getConfig() {
return config;
}

@Override
public ActionRequestValidationException validate() {
return null;
}

@Override
public void writeTo(StreamOutput out) throws IOException {
super.writeTo(out);
config.writeTo(out);
}


@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Request request = (Request) o;
return Objects.equals(config, request.config);
}

@Override
public int hashCode() {
return Objects.hash(config);
}

public static class Builder {
private DataFrameAnalyticsConfig config;

private Builder setConfig(DataFrameAnalyticsConfig.Builder config) {
this.config = config.buildForExplain();
return this;
}

public Builder setConfig(DataFrameAnalyticsConfig config) {
this.config = config;
return this;
}

public DataFrameAnalyticsConfig getConfig() {
return config;
}

public Request build() {
return new Request(config);
}
}
}

public static class Response extends ActionResponse implements ToXContentObject {

public static final ParseField TYPE = new ParseField("preview_data_frame_analytics_response");
public static final ParseField FEATURE_VALUES = new ParseField("feature_values");

@SuppressWarnings("unchecked")
static final ConstructingObjectParser<Response, Void> PARSER =
new ConstructingObjectParser<>(
TYPE.getPreferredName(),
args -> new Response((List<Map<String, Object>>) args[0]));

static {
PARSER.declareObjectArray(ConstructingObjectParser.constructorArg(), (p, c) -> p.map(), FEATURE_VALUES);
}

private final List<Map<String, Object>> featureValues;

public Response(List<Map<String, Object>> featureValues) {
this.featureValues = Objects.requireNonNull(featureValues);
}

public Response(StreamInput in) throws IOException {
super(in);
this.featureValues = in.readList(StreamInput::readMap);
}

public List<Map<String, Object>> getFeatureValues() {
return featureValues;
}

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeCollection(featureValues, StreamOutput::writeMap);
}

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startObject();
builder.field(FEATURE_VALUES.getPreferredName(), featureValues);
builder.endObject();
return builder;
}

@Override
public boolean equals(Object other) {
if (this == other) return true;
if (other == null || getClass() != other.getClass()) return false;

Response that = (Response) other;
return Objects.equals(featureValues, that.featureValues);
}

@Override
public int hashCode() {
return Objects.hash(featureValues);
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,9 @@

public class DataFrameAnalyticsConfig implements ToXContentObject, Writeable {

public static final String BLANK_ID = "blank_data_frame_id";
public static final String BLANK_DEST_INDEX = "blank_dest_index";

public static final String TYPE = "data_frame_analytics_config";

public static final ByteSizeValue DEFAULT_MODEL_MEMORY_LIMIT = ByteSizeValue.ofGb(1);
Expand Down Expand Up @@ -456,10 +459,10 @@ public DataFrameAnalyticsConfig build() {
*/
public DataFrameAnalyticsConfig buildForExplain() {
return new DataFrameAnalyticsConfig(
id != null ? id : "dummy",
id != null ? id : BLANK_ID,
description,
source,
dest != null ? dest : new DataFrameAnalyticsDest("dummy", null),
dest != null ? dest : new DataFrameAnalyticsDest(BLANK_DEST_INDEX, null),
analysis,
headers,
modelMemoryLimit,
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

package org.elasticsearch.xpack.core.ml.action;

import org.elasticsearch.common.io.stream.NamedWriteableRegistry;
import org.elasticsearch.common.io.stream.Writeable;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.search.SearchModule;
import org.elasticsearch.test.AbstractWireSerializingTestCase;
import org.elasticsearch.xpack.core.ml.action.PreviewDataFrameAnalyticsAction.Request;
import org.elasticsearch.xpack.core.ml.dataframe.DataFrameAnalyticsConfigTests;
import org.elasticsearch.xpack.core.ml.dataframe.analyses.MlDataFrameAnalysisNamedXContentProvider;
import org.elasticsearch.xpack.core.ml.inference.MlInferenceNamedXContentProvider;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;


public class PreviewDataFrameAnalyticsActionRequestTests extends AbstractWireSerializingTestCase<Request> {

@Override
protected NamedWriteableRegistry getNamedWriteableRegistry() {
List<NamedWriteableRegistry.Entry> namedWriteables = new ArrayList<>();
namedWriteables.addAll(new MlDataFrameAnalysisNamedXContentProvider().getNamedWriteables());
namedWriteables.addAll(new MlInferenceNamedXContentProvider().getNamedWriteables());
namedWriteables.addAll(new SearchModule(Settings.EMPTY, Collections.emptyList()).getNamedWriteables());
return new NamedWriteableRegistry(namedWriteables);
}

@Override
protected Writeable.Reader<Request> instanceReader() {
return Request::new;
}

@Override
protected Request createTestInstance() {
return new Request(DataFrameAnalyticsConfigTests.createRandom(randomAlphaOfLength(10)));
}
}
Loading