
[ML] DF Analytics results: Enhancements to Confusion Matrix #58596

Closed
Winterflower opened this issue Feb 26, 2020 · 20 comments
Assignees
Labels
Feature:Data Frame Analytics ML data frame analytics features :ml v7.12.0

Comments

@Winterflower

Winterflower commented Feb 26, 2020

Describe the feature:
We display a confusion matrix to help users interpret the results of our classification analytics.
Since we split the source index into training and testing sets, there are three possible subsets of the results index on which the values of the confusion matrix can be calculated:

(1) The whole dataset

Screen Shot 2020-02-26 at 11 05 18 AM

(2) The testing dataset

Screen Shot 2020-02-26 at 11 07 58 AM

(3) The training dataset

Screen Shot 2020-02-26 at 2 52 15 PM

You can toggle between the last two options using the Testing/Training buttons a bit further down near the results table.
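For readers unfamiliar with how the split works: the analytics job flags each result document as belonging to the training or testing set (via the `ml.is_training` field in the destination index), so each of the three matrices above is the same count of (actual, predicted) pairs taken over a different subset. A minimal sketch of that computation, with simplified stand-in documents rather than the real index mapping:

```python
from collections import Counter

# Hypothetical results documents mimicking the destination index of a
# classification job: actual label, predicted label, and a training flag
# (the real field is ml.is_training).
docs = [
    {"actual": "a", "predicted": "a", "is_training": True},
    {"actual": "a", "predicted": "b", "is_training": True},
    {"actual": "b", "predicted": "b", "is_training": False},
    {"actual": "a", "predicted": "a", "is_training": False},
]

def confusion_matrix(docs, subset="all"):
    """Count (actual, predicted) pairs over one of the three subsets."""
    if subset == "training":
        docs = [d for d in docs if d["is_training"]]
    elif subset == "testing":
        docs = [d for d in docs if not d["is_training"]]
    return Counter((d["actual"], d["predicted"]) for d in docs)

whole = confusion_matrix(docs, "all")      # option (1)
testing = confusion_matrix(docs, "testing")  # option (2)
training = confusion_matrix(docs, "training")  # option (3)
```

The three matrices can disagree substantially, which is why it matters that the UI says which subset it is showing.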

Clearly indicate which of the three subsets is used for the confusion matrix

When the Results UI is first loaded, a user is automatically shown option 1.
Although we display the document count near the confusion matrix as a hint about which of the three subsets listed above is being used to compute the values, I would argue that this on its own is not enough to tell the user they are viewing the confusion matrix for the whole dataset: not everyone will remember the total number of documents in their index. We should indicate which of the three subsets is used for the computation more clearly, somewhere near the confusion matrix graphic, and not rely solely on the bold type in the Testing/Training toggle on the table.

Displaying a confusion matrix on the whole dataset is not necessarily useful - maybe display the matrix on the test set by default?

When the Results UI first loads, we compute the confusion matrix on the whole dataset by default, ignoring any test/train splitting. In my opinion, there is little value to the user in seeing the confusion matrix on the whole dataset. We would, in general, be most interested in the confusion matrix values on the test set, since this gives an indication of how well the trained model will perform on previously unseen data. (cc: @tveasey and @blaklaybul - would love to hear your opinion on this matter)

@Winterflower Winterflower added the Feature:Data Frame Analytics ML data frame analytics features label Feb 26, 2020
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@sophiec20
Contributor

For page usability, I think it is important that the same in-page filter applies to the table results and the evaluation results. Perhaps this can be made visually clearer on page.

We could potentially have two confusion matrices, one for train and one for test. So, depending on the filter either one or two confusion matrices would be displayed. This would allow easy comparison of test/train evaluation results.

@Winterflower
Author

Winterflower commented Mar 2, 2020

Hi @peteharverson and @alvarezmelissa87 !
I've been thinking a bit about the colour choices in the confusion matrix, i.e. using alternating teal and very light teal, and I'm not really sure what the visual language is trying to tell the user. For example, here, the dark teal highlights class 1, the number of true positives for class 0, and the number of true positives for class 1 - why do we need to visually link these using this colouring scheme? I think it might make more sense to highlight just the true positives and true negatives without necessarily highlighting class 1.
@tveasey @blaklaybul Would appreciate your thoughts on the colouring of the confusion matrix

Screen Shot 2020-03-02 at 4 40 37 PM

@alvarezmelissa87
Contributor

Heya, @Winterflower - just responding to the color choices comment above. That first column is just the predicted label and should definitely not be colored ever. Looks like you found a bug! 😄
I've created an issue for it #59155

@Winterflower
Author

Thanks very much @alvarezmelissa87 !
I just looked at the confusion matrix with another dataset - breast-cancer-recurrence - and the colouring looks normal there, so it must be something to do with class labels that are numbers.
I had a related question - if the status of the job is "Started" - why do we show numbers in the confusion matrix? I'd think that if the job hasn't completed yet, we wouldn't be able to display a confusion matrix since we wouldn't know the results.

Screen Shot 2020-03-04 at 5 27 18 PM

@alvarezmelissa87
Contributor

Adding a note here for an additional enhancement - fetching the job state at the time the results view is opened, to ensure it is up to date. Currently, it is stored in the link to the results view, which is created when the analytics list table is loaded. cc @Winterflower

@Winterflower
Author

Thanks very much @alvarezmelissa87 ! Just adding a note that it might be simplest to show the user some sort of "Loading/In progress" status page when they click "View" and the job hasn't completed yet, or alternatively to disable the "View" button entirely until the job has completed. Not sure what the best solution would be technically, but just putting these out there as possibilities from a user perspective.

@Winterflower
Author

Hey @alvarezmelissa87 ! I see that the fix has already been merged into master, but perhaps not backported to 7.7.0 yet?
I just wanted to note that I'm seeing the same label colouring issue with the seeds dataset in a multiclass context. It's a much smaller dataset and easier to work with in case you need more test cases.
I'm seeing this on build 2227 of 7.7.0.
Screen Shot 2020-03-23 at 11 10 32 AM

@peteharverson
Contributor

@Winterflower yes the fix for the coloring of the label cell has been backported to 7.x - #60421. There has been an issue with the 7.x stack builds, which means the version of Kibana in the build (2227) you were testing was out of date. This is how the confusion matrix looks with the fix:

[image]

@peteharverson
Contributor

Related to the comments raised by @Winterflower in #58596 (comment), I'm thinking it might be useful to have a control which specifies whether the filter used for the results table should also apply to the confusion matrix. For example, on these results:

[image]

When I apply a filter of doc.district_name:Eixample on the results, I would be interested in seeing the full confusion matrix row for Eixample and not just the row as it is currently filtered:

[image]

@alvarezmelissa87
Contributor

alvarezmelissa87 commented Mar 25, 2020

First I'll quickly summarize the current behavior of the results page:

  • A filter applied to the results table automatically applies to the evaluate confusion matrix as well (including testing/training)
  • Defaults to the entire dataset being used for the confusion matrix (updates when filter(s) applied to table)

Summary of proposed enhancements:

For 7.7:

For 7.8

  • Default confusion matrix to display data for testing subset with the option to also view a separate confusion matrix for training data
  • Add a control to specify whether the filter used for the results table should also apply to the confusion matrix.

This would mean that the training/testing filter from the results table would have no effect on the confusion matrix displays, since we'd have two separate confusion matrices for training/testing.
Other filters applied to the table would then also apply to both matrices, unless the control was set to specify that the filter should not apply to the confusion matrix.
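The filter composition proposed above could be sketched roughly like this (a hypothetical helper, not actual Kibana code; the Elasticsearch bool-query shape and the `ml.is_training` field are assumptions made for illustration):

```python
def matrix_query(table_filters, apply_to_matrix, subset_filter=None):
    """Compose the query clauses used for a confusion matrix.

    The train/test subset filter always applies (one matrix per subset);
    other filters from the results table apply only when the proposed
    control says they should.
    """
    filters = list(table_filters) if apply_to_matrix else []
    if subset_filter is not None:
        filters.append(subset_filter)
    return {"bool": {"filter": filters}}

# Example: a table filter on a field, excluded from the testing matrix.
query = matrix_query(
    table_filters=[{"term": {"doc.district_name": "Eixample"}}],
    apply_to_matrix=False,
    subset_filter={"term": {"ml.is_training": False}},
)
```

With `apply_to_matrix=False` the matrix query keeps only the subset filter, which matches the "see the full row for Eixample" behavior requested earlier in the thread.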

Thoughts on whether all/some of these make sense? cc @peteharverson, @Winterflower, @sophiec20

@peteharverson
Contributor

@alvarezmelissa87 the suggested approach in #58596 (comment) sounds good, although I wonder how two confusion matrices would look visually in the evaluate section of the page, especially now that the confusion matrix can have many rows/columns with the addition of multi-class? It might be simpler to keep the single matrix, but have a control for toggling between test / training / entire data set, plus a control for specifying whether the filter for the table should also apply to the confusion matrix.

Also looking at the suggestion from @Winterflower, I wonder if we should show the data for the test set by default when the page opens, rather than the entire data set?

@alvarezmelissa87
Contributor

From the latest comment - #58596 (comment) - it sounds like the behavior we want moving forward is to

  • keep the single matrix with a control for toggling between test / training / entire data set.
  • have the matrix default to displaying test data
  • have a control for specifying whether the filter for the table should apply to the confusion matrix

@peteharverson, @Winterflower - do you agree?

@peteharverson
Contributor

@alvarezmelissa87 responding to #58596 (comment), yes I would agree with:

  • keep the single matrix with a control for toggling between test / training / entire data set.
  • have a control for specifying whether the filter for the table should apply to the confusion matrix

I would defer to @Winterflower on whether to default to the test data set when the page opens. If we do, we should check whether the training percent was 100%, and in that case switch to the training data on opening.

@Winterflower
Author

> I would defer to @Winterflower on whether to default to the test data set when the page opens. If so, we should check if the training percent was 100%, and if so, switch on opening to the training data.

My personal opinion is to have the matrix default to the test dataset when the page first opens/loads, because this is the option that people would generally care most about - seeing how the model would perform on previously unseen data.

However, if the user has used 100% of the data for training, then of course we would default to showing the training set (which in this case is also the full dataset).
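The default-subset rule described above is small enough to state as code. This is a hypothetical helper written for illustration, not the actual Kibana implementation:

```python
def default_subset(training_percent):
    """Pick the default confusion-matrix subset when the results page opens.

    Default to the testing subset, since it reflects performance on
    previously unseen data; fall back to the training subset when there
    is no test split (training_percent == 100, i.e. training data is
    the whole dataset).
    """
    return "training" if training_percent >= 100 else "testing"
```

So `default_subset(80)` would open the page on the testing matrix, while `default_subset(100)` would open it on the training matrix.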

@alvarezmelissa87
Contributor

Related work that can be added while making these updates: https://github.com/elastic/ml-team/issues/393#issuecomment-675342503

@alvarezmelissa87
Contributor

@peteharverson, @Winterflower just bumping this conversation up for 7.11.

With the updated results view for DFA - do the prior tasks we had decided on before still apply? #58596 (comment)

With the new training/testing quick search tools to the right of the search bar - I think duplicate controls in the confusion matrix section are now unnecessary.

For 7.11 it seems the only update we should make is defaulting the entire results view to reflect just the testing data (assuming training is not set to 100%). Happy to hear thoughts!

@Winterflower
Author

Hey @alvarezmelissa87 ! I haven't looked at the confusion matrix in detail in a while. I'll run some test jobs on recent versions and come back with comments.

@Winterflower
Author

Hey @alvarezmelissa87 and @peteharverson !
I've been looking at the confusion matrix and I think the only outstanding item that still applies from the comments above is that the confusion matrix filter does not default to the Testing set, but instead to the whole dataset. However, I do not think this is a major issue and we can revisit later if needed.

@alvarezmelissa87
Contributor

Closing this issue off for now. Please feel free to reopen, or create a new issue, if this behavior needs to be revisited in the future.
