Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] move find file structure to a new API endpoint #67123

Conversation

benwtrent
Copy link
Member

This introduces a new text-structure plugin. This is the new home of the find file structure API.

The old REST URL is still available but is deprecated.

The new URL is: _text_structure/find_structure. All parameters and behavior are unchanged.

Changes to the high-level REST client and docs will be in separate commit.

related to: #67001

@benwtrent benwtrent force-pushed the feature/move-file-structure-finder-to-new-plugin branch from 6c64954 to 8288cb2 Compare January 6, 2021 20:04
Copy link
Member Author

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes for reviewer:

  • I have verified that Kibana file upload still works fine. I think the deprecation headers are simply being dropped by the kibana server
  • The VAST majority of the changes are simple file moves + formatting updates (formatting is enforced in the new text-structure plugin).
  • I am planning on high level client changes in an other PR as those could be considered "breaking" (didn't want to mark this PR as breaking, and the churn is big enough already).
  • Doc changes will also come later

@@ -2895,8 +2895,17 @@ public void testFindFileStructure() throws IOException {
FindFileStructureRequest request = new FindFileStructureRequest();
request.setSample(sample.getBytes(StandardCharsets.UTF_8));

FindFileStructureResponse response =
execute(request, machineLearningClient::findFileStructure, machineLearningClient::findFileStructureAsync);
FindFileStructureResponse response = execute(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably be removed once the high level client is updated

@@ -1870,8 +1870,17 @@ public void testFindFileStructure() throws Exception {
// end::find-file-structure-request-options

// tag::find-file-structure-execute
FindFileStructureResponse findFileStructureResponse =
client.machineLearning().findFileStructure(findFileStructureRequest, RequestOptions.DEFAULT);
FindFileStructureResponse findFileStructureResponse = client
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably be removed once the high level client is updated

@@ -1,89 +0,0 @@
/*
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this shows only a delete, it is effectively a file move (it is added later in the diff). The issue is, I think, that the formatting changes (enforced by our spotless settings), caused github to not recognize it as a simple file rename.

@@ -1,5 +1,5 @@
{
"ml.find_file_structure":{
"text_structure.find_structure":{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opted to not keep the old yaml definition (with deprecation) as the API was experimental.

ScalingExecutorBuilder utility = new ScalingExecutorBuilder(
UTILITY_THREAD_POOL_NAME,
1,
256,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number has no significance. I am willing to reduce as it does seem rather high. I am not sure how many of these calls we expect to run in parallel on the same node. Maybe 16?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe this plugin shouldn't have its own thread pool at all, but reuse the "generic" one from core. Several other plugins do this, for example monitoring, ilm, slm, and parts of security. Since we expect this API to be pretty infrequently called I think it's reasonable to do the same here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@droberts195 reusing generic seems OK with me. Will take a look.

@benwtrent benwtrent force-pushed the feature/move-file-structure-finder-to-new-plugin branch from 8288cb2 to 168f930 Compare January 6, 2021 20:29
@benwtrent benwtrent marked this pull request as ready for review January 6, 2021 21:24
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent benwtrent requested a review from droberts195 January 6, 2021 21:24
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you pointed out, the main outstanding decision is which thread pool to get the threads from.

The REST spec still says it's experimental, but this can be changed at the same time as the docs.

ScalingExecutorBuilder utility = new ScalingExecutorBuilder(
UTILITY_THREAD_POOL_NAME,
1,
256,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe this plugin shouldn't have its own thread pool at all, but reuse the "generic" one from core. Several other plugins do this, for example monitoring, ilm, slm, and parts of security. Since we expect this API to be pretty infrequently called I think it's reasonable to do the same here.

@benwtrent benwtrent requested a review from droberts195 January 7, 2021 15:34
@benwtrent
Copy link
Member Author

@elasticmachine update branch

@benwtrent
Copy link
Member Author

run elasticsearch-ci/default-distro

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@benwtrent benwtrent merged commit af179ab into elastic:master Jan 11, 2021
@benwtrent benwtrent deleted the feature/move-file-structure-finder-to-new-plugin branch January 11, 2021 13:56
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jan 11, 2021
This introduces a new `text-structure` plugin. This is the new home of the find file structure API.

The old REST URL is still available but is deprecated.

The new URL is: `_text_structure/find_structure`. All parameters and behavior are unchanged.

Changes to the high-level REST client and docs will be in separate commit.

related to: elastic#67001
benwtrent added a commit that referenced this pull request Jan 11, 2021
…67251)

* [ML] move find file structure to a new API endpoint (#67123)

This introduces a new `text-structure` plugin. This is the new home of the find file structure API.

The old REST URL is still available but is deprecated.

The new URL is: `_text_structure/find_structure`. All parameters and behavior are unchanged.

Changes to the high-level REST client and docs will be in separate commit.

related to: #67001
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants