
[ML] DF Analytics results: Enhancements to Confusion Matrix #58596

Closed
Winterflower opened this issue Feb 26, 2020 · 20 comments
Assignees
Labels
Feature:Data Frame Analytics ML data frame analytics features :ml v7.12.0

Comments

@Winterflower

Winterflower commented Feb 26, 2020

Describe the feature:
We display a confusion matrix to help users interpret the results of our classification analytics.
Since we split the source index into training and testing sets, there are three possible subsets of the results index on which the values of the confusion matrix can be calculated:

(1) The whole dataset

Screen Shot 2020-02-26 at 11 05 18 AM

(2) The testing dataset

Screen Shot 2020-02-26 at 11 07 58 AM

(3) The training dataset

Screen Shot 2020-02-26 at 2 52 15 PM

You can toggle between the last two options using the Testing/Training buttons a bit further down near the results table.
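For readers unfamiliar with how the split works: the analytics job flags each result document as belonging to the training or testing set (via the `ml.is_training` field in the destination index), so each of the three matrices above is the same count of (actual, predicted) pairs taken over a different subset. A minimal sketch of that computation, with simplified stand-in documents rather than the real index mapping:

```python
from collections import Counter

# Hypothetical results documents mimicking the destination index of a
# classification job: actual label, predicted label, and a training flag
# (the real field is ml.is_training).
docs = [
    {"actual": "a", "predicted": "a", "is_training": True},
    {"actual": "a", "predicted": "b", "is_training": True},
    {"actual": "b", "predicted": "b", "is_training": False},
    {"actual": "a", "predicted": "a", "is_training": False},
]

def confusion_matrix(docs, subset="all"):
    """Count (actual, predicted) pairs over one of the three subsets."""
    if subset == "training":
        docs = [d for d in docs if d["is_training"]]
    elif subset == "testing":
        docs = [d for d in docs if not d["is_training"]]
    return Counter((d["actual"], d["predicted"]) for d in docs)

whole = confusion_matrix(docs, "all")      # option (1)
testing = confusion_matrix(docs, "testing")  # option (2)
training = confusion_matrix(docs, "training")  # option (3)
```

The three matrices can disagree substantially, which is why it matters that the UI says which subset it is showing.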

Clearly indicate which of the three subsets is used for the confusion matrix

When the Results UI is first loaded, a user is automatically shown option 1.
Although we display the document count near the confusion matrix as a hint about which of the three subsets listed above is being used to compute the values, I would argue that this on its own is not enough to tell the user they are viewing the confusion matrix for the whole dataset: not everyone will remember the total number of documents in their index. We should indicate which of the three subsets is used for the computation more clearly, somewhere near the confusion matrix graphic, and not rely solely on the bold type in the Testing/Training toggle on the table.

Displaying a confusion matrix on the whole dataset is not necessarily useful - maybe display the matrix on the test set by default?

When the Results UI first loads, we compute the confusion matrix on the whole dataset by default, ignoring any test/train splitting. In my opinion, there is little value to the user in seeing the confusion matrix on the whole dataset. We would, in general, be most interested in the confusion matrix values on the test set, since this gives an indication of how well the trained model will perform on previously unseen data. (cc: @tveasey and @blaklaybul - would love to hear your opinion on this matter)

@Winterflower Winterflower added the Feature:Data Frame Analytics ML data frame analytics features label Feb 26, 2020
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@sophiec20
Contributor

For page usability, I think it is important that the same in-page filter applies to the table results and the evaluation results. Perhaps this can be made visually clearer on page.

We could potentially have two confusion matrices, one for train and one for test. So, depending on the filter either one or two confusion matrices would be displayed. This would allow easy comparison of test/train evaluation results.

@Winterflower
Author

Winterflower commented Mar 2, 2020

Hi @peteharverson and @alvarezmelissa87 !
I've been thinking a bit about the colour choices in the confusion matrix, i.e. using alternating teal and very light teal, and I'm not really sure what the visual language is trying to tell the user. For example, here, the dark teal highlights class 1, the number of true positives for class 0, and the number of true positives for class 1 - why do we need to visually link these using this colouring scheme? I think it might make more sense to highlight just the true positives and true negatives without necessarily highlighting class 1.
@tveasey @blaklaybul Would appreciate your thoughts on the colouring of the confusion matrix

Screen Shot 2020-03-02 at 4 40 37 PM

@alvarezmelissa87
Contributor

Heya, @Winterflower - just responding to the color choices comment above. That first column is just the predicted label and should definitely not be colored ever. Looks like you found a bug! 😄
I've created an issue for it #59155

@Winterflower
Author

Thanks very much @alvarezmelissa87 !
I just looked at the confusion matrix with another dataset - breast-cancer-recurrence - and the colouring looks normal there, so it must be something to do with class labels that are numbers.
I had a related question - if the status of the job is "Started" - why do we show numbers in the confusion matrix? I'd think that if the job hasn't completed yet, we wouldn't be able to display a confusion matrix since we wouldn't know the results.

Screen Shot 2020-03-04 at 5 27 18 PM

@alvarezmelissa87
Contributor

Adding a note here for an additional enhancement - fetching the job state at the time the results view is opened, to ensure it is up to date. Currently, it is stored in the link to the results view, which is created when the analytics list table is loaded. cc @Winterflower

@Winterflower
Author

Thanks very much @alvarezmelissa87 ! Just adding a note that it might be simplest to show the user some sort of "Loading/In progress" status page when they click "View" and the job hasn't completed yet, or alternatively to disable the "View" button entirely until the job has completed. Not sure what the best solution would be technically, but just putting these out there as possibilities from a user perspective.

@Winterflower
Author

Hey @alvarezmelissa87 ! I see that the fix has already been merged into master, but perhaps not backported to 7.7.0 yet?
I just wanted to note that I'm seeing the same label colouring issue with the seeds dataset in a multiclass context. It's a much smaller dataset and easier to work with in case you need more test cases.
I'm seeing this on build 2227 of 7.7.0.
Screen Shot 2020-03-23 at 11 10 32 AM

@peteharverson
Contributor

@Winterflower yes the fix for the coloring of the label cell has been backported to 7.x - #60421. There has been an issue with the 7.x stack builds, which means the version of Kibana in the build (2227) you were testing was out of date. This is how the confusion matrix looks with the fix:

[image]

@peteharverson
Contributor

Related to the comments raised by @Winterflower in #58596 (comment), I'm thinking it might be useful to have a control which specifies whether the filter used for the results table should also apply to the confusion matrix. For example, on these results:

[image]

When I apply a filter of doc.district_name:Eixample on the results, I would be interested in seeing the full confusion matrix row for Eixample and not just the row as it is currently filtered:

[image]

@alvarezmelissa87
Contributor

alvarezmelissa87 commented Mar 25, 2020

First I'll quickly summarize the current behavior of the results page:

  • A filter applied to the results table automatically applies to the evaluate confusion matrix as well (including testing/training)
  • Defaults to the entire dataset being used for the confusion matrix (updates when filter(s) applied to table)

Summary of proposed enhancements:

For 7.7:

For 7.8

  • Default confusion matrix to display data for testing subset with the option to also view a separate confusion matrix for training data
  • Add a control to specify whether the filter used for the results table should also apply to the confusion matrix.

This would mean that the training/testing filter from the results table would have no effect on the confusion matrix displays, since we'd have two separate confusion matrices for training/testing.
Other filters applied to the table would then also apply to both matrices, unless the control was set to specify that the filter should not apply to the confusion matrix.
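The filter composition proposed above could be sketched roughly like this (a hypothetical helper, not actual Kibana code; the Elasticsearch bool-query shape and the `ml.is_training` field are assumptions made for illustration):

```python
def matrix_query(table_filters, apply_to_matrix, subset_filter=None):
    """Compose the query clauses used for a confusion matrix.

    The train/test subset filter always applies (one matrix per subset);
    other filters from the results table apply only when the proposed
    control says they should.
    """
    filters = list(table_filters) if apply_to_matrix else []
    if subset_filter is not None:
        filters.append(subset_filter)
    return {"bool": {"filter": filters}}

# Example: a table filter on a field, excluded from the testing matrix.
query = matrix_query(
    table_filters=[{"term": {"doc.district_name": "Eixample"}}],
    apply_to_matrix=False,
    subset_filter={"term": {"ml.is_training": False}},
)
```

With `apply_to_matrix=False` the matrix query keeps only the subset filter, which matches the "see the full row for Eixample" behavior requested earlier in the thread.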

Thoughts on whether all/some of these make sense? cc @peteharverson, @Winterflower, @sophiec20

@peteharverson
Contributor

@alvarezmelissa87 the suggested approach in #58596 (comment) sounds good, although I wonder how two confusion matrices would look visually in the evaluate section of the page, especially now that the confusion matrix can have many rows/columns with the addition of multi-class? It might be simpler to keep the single matrix, but have a control for toggling between test / training / entire data set, plus a control for specifying whether the filter for the table should also apply to the confusion matrix.

Also looking at the suggestion from @Winterflower, I wonder if we should show the data for the test set by default when the page opens, rather than the entire data set?

@alvarezmelissa87
Contributor

From the latest comment - #58596 (comment) - it sounds like the behavior we want moving forward is to

  • keep the single matrix with a control for toggling between test / training / entire data set.
  • have the matrix default to displaying test data
  • have a control for specifying whether the filter for the table should apply to the confusion matrix

@peteharverson, @Winterflower - do you agree?

@peteharverson
Contributor

@alvarezmelissa87 responding to #58596 (comment), yes I would agree with:

  • keep the single matrix with a control for toggling between test / training / entire data set.
  • have a control for specifying whether the filter for the table should apply to the confusion matrix

I would defer to @Winterflower on whether to default to the test data set when the page opens. If we do, we should check whether the training percent was 100%, and in that case switch to the training data on opening.

@Winterflower
Author

> I would defer to @Winterflower on whether to default to the test data set when the page opens. If so, we should check if the training percent was 100%, and if so, switch on opening to the training data.

My personal opinion is to have the matrix default to the test dataset when the page first opens/loads, because this is the option that people would generally care most about - seeing how the model would perform on previously unseen data.

However, if the user has used 100% of the data for training, then of course we would default to showing the training set (which in this case is also the full dataset).
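The default-subset rule described above is small enough to state as code. This is a hypothetical helper written for illustration, not the actual Kibana implementation:

```python
def default_subset(training_percent):
    """Pick the default confusion-matrix subset when the results page opens.

    Default to the testing subset, since it reflects performance on
    previously unseen data; fall back to the training subset when there
    is no test split (training_percent == 100, i.e. training data is
    the whole dataset).
    """
    return "training" if training_percent >= 100 else "testing"
```

So `default_subset(80)` would open the page on the testing matrix, while `default_subset(100)` would open it on the training matrix.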

@alvarezmelissa87
Contributor

Related work that can be added while making these updates: https://github.com/elastic/ml-team/issues/393#issuecomment-675342503

@alvarezmelissa87
Contributor

@peteharverson, @Winterflower just bumping this conversation up for 7.11.

With the updated results view for DFA - do the prior tasks we had decided on before still apply? #58596 (comment)

With the new training/testing quick search tools to the right of the search bar - I think duplicate controls in the confusion matrix section are now unnecessary.

For 7.11 it seems the only update we should make is defaulting the entire results view to reflect just the testing data (assuming training is not set to 100%). Happy to hear thoughts!

@Winterflower
Author

Hey @alvarezmelissa87 ! I haven't looked at the confusion matrix in detail in a while. I'll run some test jobs on recent versions and come back with comments.

@Winterflower
Author

Hey @alvarezmelissa87 and @peteharverson !
I've been looking at the confusion matrix and I think the only outstanding item that still applies from the comments above is that the confusion matrix filter does not default to the Testing set, but instead to the whole dataset. However, I do not think this is a major issue and we can revisit later if needed.

@alvarezmelissa87
Contributor

Closing this issue off for now. Please feel free to reopen, or create a new issue, if this behavior needs to be revisited in the future.
