Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] DataFrame Analytics: Regression results view #49667

Merged

Conversation

alvarezmelissa87
Copy link
Contributor

@alvarezmelissa87 alvarezmelissa87 commented Oct 29, 2019

Summary

Related meta issue: #47890

  • enables View link in Analytics jobs list for regression jobs
  • results view displays mean squared error and r squared
  • results view displays a table containing data from the regression job destination index.
    • filter by training/testing data
    • columns default to actual value of the dependent_variable, the predicted value, and is_training value
    • can select additional columns
    • can change sort order of columns
    • can search the contents of the table - a search fetches another 1000 records

image

image

image

image

Checklist

Use strikethroughs to remove checklist items you don't feel are applicable to this PR.

For maintainers

- [ ] This was checked for breaking API changes and was labeled appropriately
- [ ] This includes a feature addition or change that requires a release note and was labeled appropriately

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@alvarezmelissa87 alvarezmelissa87 mentioned this pull request Oct 29, 2019
46 tasks
@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@sophiec20
Copy link
Contributor

Looks good from screenshot, and have a few comments/questions.

  • I think it might be nice to see the prediction and the actual next to each other as columns. Perhaps we could anchor is_training, prediction, actual on the left.
  • I cannot see the word "Regression" on the screenshot. Is it somewhere else on page?
  • I wonder if we should pre-select first 3 or 4 fields (of supported data types) to display in the table. Seems odd just to have those 3 columns on display. Agree it is hard to know which columns to show, but we could just pick the first few and then people with smaller analysis get the benefit. Not sure about this, but there is a lot of real-estate on page.
  • It is possible to have a missing ml field due to documents containing arrays and/or missing dependent_variable and/or nested fields, what is the behaviour here? I'm sure you have, but a reminder to please test with a dirtier dataset too.

@peteharverson
Copy link
Contributor

peteharverson commented Oct 30, 2019

As well as ensuring the predicted and actual field are anchored next to each other, I wonder if ml.is_training should be anchored directly after the predicted and actual columns, or on the right? Not sure it makes sense to list the one alphabetically with the other fields? Maybe these three fields could always go to the top of the column selector?

image

@peteharverson - having the three main fields at the top of the list makes sense. Added in 3fe1ace

}
};

const search = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to do numeric search with the search bar? If so, might be useful to have an example of the syntax as a placeholder e.g. myField > 0.5

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 3fe1ace

const [clearTable, setClearTable] = useState(false);
const [selectedFields, setSelectedFields] = useState([] as EsFieldName[]);
const [isColumnsPopoverVisible, setColumnsPopoverVisible] = useState(false);
const [searchQuery, setSearchQuery] = useState<SavedSearchQuery>(defaultSearchQuery);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I change the sort on a column, the query seems to be retained (rightly), but on re-render after the sort, the query bar is empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - fixed with: 9f18919

@@ -120,7 +130,6 @@ export const EvaluatePanel: FC<Props> = ({ jobId, index, dependentVariable }) =>
)}
</span>
</EuiTitle>
<EuiSpacer />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's related to this deletion, but if a job is still running, the positioning of the Callouts seems slightly out:

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with 3fe1ace


const url = getResultsUrl(item.id, analysisType, destIndex, dependentVariable);
// Disable 'View' link for regression until results view is complete
const url = getResultsUrl(item.id, analysisType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the View link be disabled until the job has finished?

@peteharverson
Copy link
Contributor

Would be good to display the 'type' of the job, plus the status e.g. 'failed'. above the table. Also applies to Outlier jobs!

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@alvarezmelissa87
Copy link
Contributor Author

Thank you for taking a look and the great comments! 😄
PR has been updated and is ready for another look. cc @peteharverson, @walterra

All comments have been addressed. Regarding the missing 'ml' field @sophiec20 asked about - yes it is possible. The evaluate panel will just show the error callouts (see last screenshot in PR description above) and the table will show any results from the dest index pattern. The failed status will be shown as well.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Found a couple of issues related to sorting.

import { getTaskStateBadge } from '../../../analytics_management/components/analytics_list/columns';
import { DATA_FRAME_TASK_STATE } from '../../../analytics_management/components/analytics_list/common';

import { useExploreData, defaultSearchQuery } from './use_explore_data';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to disable sorting on table columns which are of type text. An example with a (failed) regression job run on the Kibana ecommerce sample data:

image


// Used to sort columns:
// Anchor on the left ml.is_training, <predictedField>, <actual>
export const sortRegressionResultsColumns = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the predicted field getting right alignment because it's a nested field? Wonder if it would look better left aligned like the other numeric columns? Otherwise you end up with a big gap between the first and second columns:

image

let sorting: SortingPropType = false;
let onTableChange;

if (columns.length > 0 && sortField !== '' && sortField !== undefined) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing a strange issue in this example. I enter a search of p3<-1, and table renders as expected:
image

But then when I try to sort on the p3 column, I get no results. Possibly related to it being the prediction field? I only seem to get this when sorting on the field used for prediction:
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The outlier results view might have similar issues, as it’s based on the same code. Along with the other suggested changes that impact both views, I will address that in follow-ups.

})}
</span>
</EuiTitle>
<EuiFlexGroup gutterSize="s">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add the job type and status to the Outliers results page too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 0eabc8d

Copy link
Contributor

@walterra walterra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some minor suggestions.

@@ -68,6 +70,104 @@ export const sortColumns = (obj: EsDocSource, resultsField: string) => (a: strin
return a.localeCompare(b);
};

export const sortRegressionResultsFields = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be a factory similar to sortRegressionResultsColumns, then you could pass it to the sort as .sort(sortRegressionResultsFields(jobConfig)) later on. Both sorting functions here would then have the same format.

Comment on lines +214 to +221
let value = false;
docs.forEach(row => {
const source = row._source;
if (source[k] !== null) {
value = true;
}
});
return value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could make use of some to avoid the need for let, something like

return docs.some(row => row._source[k] !== null);

const [trainingEval, setTrainingEval] = useState<Eval>(defaultEval);
const [generalizationEval, setGeneralizationEval] = useState<Eval>(defaultEval);
const [isLoadingTraining, setIsLoadingTraining] = useState<boolean>(false);
const [isLoadingGeneralization, setIsLoadingGeneralization] = useState<boolean>(false);

const index = idx(jobConfig, _ => _.dest.index) as string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering why idx is necessary here? The type DataFrameAnalyticsConfig specifies .dest.index not as optional.

Comment on lines +21 to +23
<EuiPanel>
<EuiLoadingSpinner className="mlRegressionExploration__evaluateLoadingSpinner" size="xl" />
</EuiPanel>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to change this to so we don't need a custom class:

  <EuiPanel className="eui-textCenter">
    <EuiLoadingSpinner size="xl" />
  </EuiPanel>

useEffect(() => {
(async function() {
setIsLoadingJobConfig(true);
const analyticsConfigs: GetDataFrameAnalyticsResponse = await ml.dataFrameAnalytics.getDataFrameAnalytics(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail silently, suggest to add a try/catch block and display a toast or callout for errors.

const field = predictedFieldSelected ? predictedFieldName : selectedFields[0];
const direction = predictedFieldSelected ? SORT_DIRECTION.DESC : SORT_DIRECTION.ASC;
loadExploreData({ field, direction, searchQuery });
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this return can be removed.

const field = predictedFieldSelected ? predictedFieldName : selectedFields[0];
const direction = predictedFieldSelected ? SORT_DIRECTION.DESC : SORT_DIRECTION.ASC;
loadExploreData({ field, direction, searchQuery });
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this return can be removed.

@alvarezmelissa87
Copy link
Contributor Author

I'll be addressing the sorting issue @peteharverson mentioned and the refactor suggestions @walterra mentioned in a follow-up as those issues also affect the outlier results view.

@alvarezmelissa87
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Happy for the issues which also affect the Outliers page to be done in a follow-up.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@alvarezmelissa87 alvarezmelissa87 merged commit 6f464ad into elastic:master Oct 31, 2019
@alvarezmelissa87 alvarezmelissa87 deleted the ml-regression-results branch October 31, 2019 17:04
alvarezmelissa87 added a commit to alvarezmelissa87/kibana that referenced this pull request Oct 31, 2019
* enable analytics table view link for regression jobs

* add results table component

* can filter for training/testing

* add search functionality to table

* move shared types to analytics types file

* anchor isTraining,predicted,actual columns to left.

* ensure search string persists in search bar input

* show jobStatus badge in results view

* add jobType, status badge to outlier exploration page

* update exploration tests
alvarezmelissa87 added a commit that referenced this pull request Oct 31, 2019
* enable analytics table view link for regression jobs

* add results table component

* can filter for training/testing

* add search functionality to table

* move shared types to analytics types file

* anchor isTraining,predicted,actual columns to left.

* ensure search string persists in search bar input

* show jobStatus badge in results view

* add jobType, status badge to outlier exploration page

* update exploration tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants