
[ENH] Replace prts metrics #2400

Open · wants to merge 29 commits into base: main

Conversation

@aryanpola (Contributor) commented Nov 26, 2024

Reference Issues/PRs

Addresses #2066

What does this implement/fix? Explain your changes.

Implementation of Precision, Recall and F1-score metrics.

PR checklist

For all contributions
  • I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you.
  • The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.

@aeon-actions-bot added the labels benchmarking (Benchmarking package) and enhancement (New feature, improvement request or other non-bug code enhancement) — Nov 26, 2024
@aeon-actions-bot (Contributor)

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ enhancement ].
I have added the following labels to this PR based on the changes made: [ benchmarking ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks

@aryanpola marked this pull request as draft November 26, 2024 05:43
@MatthewMiddlehurst changed the title from "Recall [ENH]" to "[ENH] Recall" — Nov 26, 2024
@MatthewMiddlehurst (Member) left a comment

Some of these functions look like they should be private/protected. In the tests, please run both the new and current functions to ensure the output is the same (using pytest).

@MatthewMiddlehurst changed the title from "[ENH] Recall" to "[ENH] Replace prts metrics" — Dec 27, 2024
@aryanpola (Contributor, Author)

Changes made since the previous commit:

  1. Added a function for flattening lists of lists into a single list (see the sketch below).
  2. Removed overlapping functions.
  3. Changed the metric calculation from per-range averaging (w.r.t. cardinality) to a global calculation (w.r.t. overlapping positions) over the real and predicted ranges.
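A minimal sketch of such a flattening helper (illustrative only, not the PR's actual code):

```python
def flatten(list_of_lists):
    """Flatten a list of lists (e.g. nested lists of (start, end) ranges) into a flat list."""
    return [item for sublist in list_of_lists for item in sublist]


print(flatten([[(1, 5), (7, 9)], [(12, 15)]]))  # [(1, 5), (7, 9), (12, 15)]
```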

@aryanpola marked this pull request as ready for review December 30, 2024 18:41
@aryanpola (Contributor, Author)

@MatthewMiddlehurst Please let me know if the test cases are fine; I can separate them into different functions if necessary.

@TonyBagnall (Contributor)

Thanks for this, it's really helpful. We will look again next week.

@aryanpola (Contributor, Author)

> lgtm, just make the functions protected and its good to go imo

Done. Thanks a bunch for reviewing!

@aryanpola requested a review from TonyBagnall January 14, 2025 10:27
@SebastianSchmidl (Member) left a comment

Thank you for your valuable contribution! The code looks good, and I especially appreciate your effort to compare the results of your implementation to the existing one. Which code did you use to generate the fixtures?

However, I am not 100% happy with the current state:

  • The code lacks information and details in the documentation, e.g.
    • There is no reference to the original paper: http://papers.nips.cc/paper/7462-precision-and-recall-for-time-series.pdf
    • There is no description of how ts_precision or ts_recall are calculated. What is "Global Precision"?
    • It is unclear how the parameters influence the calculations. E.g., for ts_precision, the combination bias_type="udf_gamma" && udf_gamma!=0 (I guess this is how one would set the existence reward weight) is not a good idea. Because precision by definition emphasizes prediction quality, there is no need for an existence reward, and the gamma should always be set to 0.
    • From the documentation, it is unclear how the inputs y_pred and y_real should look. How are the "real (actual) ranges" defined? In this case, we require a list of tuples with the begin and end indices. Is the end index inclusive or exclusive? How does the nested list work?
  • The other AD performance metrics use bit masks as input. The new functions are incompatible. I guess we can merge this without properly integrating the functions yet. Then we can look at how to transform the bit masks to index ranges in a separate PR, when we remove the prts package (a rough sketch of such a conversion is shown after this list).
  • The tests all use the same parameter configuration. We should also test other parameters to see if the code is compatible with the original authors' implementation.
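A rough sketch of that bit-mask-to-ranges conversion (illustrative only; the function name and the inclusive end-index convention are assumptions, not the PR's code):

```python
import numpy as np


def mask_to_ranges(mask):
    """Convert a binary anomaly mask into a list of (start, end) index ranges.

    The end index is inclusive in this sketch, e.g. [0, 1, 1, 0, 1] -> [(1, 2), (4, 4)].
    """
    mask = np.asarray(mask, dtype=bool).astype(int)
    # Positions where the mask switches between 0 and 1 mark range boundaries.
    diffs = np.diff(np.concatenate(([0], mask, [0])))
    starts = np.where(diffs == 1)[0]
    ends = np.where(diffs == -1)[0] - 1
    return list(zip(starts.tolist(), ends.tolist()))


print(mask_to_ranges([0, 1, 1, 0, 1]))  # [(1, 2), (4, 4)]
```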

@MatthewMiddlehurst (Member)

I would like someone to independently verify that the output of the old and new functions is the same, given that the expected results are hard-coded.

Alternatively, if you could provide a code snippet running both functions to help us verify, that would be helpful (this would not go in the repo).

@aryanpola (Contributor, Author)

The current implementation is a classical (point-based) metric. But the issue is to remove the prts package, which provides range-based metrics; my previous commits cover the code for range-based metrics.
It was a fault on my end while trying to run TSAD-Evaluator, which provides both classical and range-based metrics.
TSAD-Evaluator: https://github.com/IntelLabs/TSAD-Evaluator

Apologies for any confusion caused.

@aryanpola (Contributor, Author)

@SebastianSchmidl @MatthewMiddlehurst Do you want me to add the point-based metrics in a different file? I'll change the code in this file to the range-based metrics.

@SebastianSchmidl (Member)

Hi @aryanpola,

I do not get what you mean by "point-based metrics". We are interested in the metrics described in the paper by Tatbul et al., "Precision and Recall for Time Series" (NeurIPS 2018), linked in the issue. The paper introduces "range-based recall" in Sect. 4.1 and "range-based precision" in Sect. 4.2. Those are the metrics I want to have in aeon. They can then also be combined into a "range-based F1" score.
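
For reference, a compact sketch of those definitions from Sect. 4 of the paper (my paraphrase of the notation: R_i are the real anomaly ranges, P_j the predicted ranges, α the existence-reward weight, ω the positional-overlap function with bias δ, and CardinalityFactor the penalty for fragmented overlaps; see the paper for the exact formulation):

$$
\mathrm{Recall}_T(R, P) = \frac{1}{N_r} \sum_{i=1}^{N_r} \Bigl[ \alpha \cdot \mathrm{ExistenceReward}(R_i, P) + (1 - \alpha) \cdot \mathrm{OverlapReward}(R_i, P) \Bigr]
$$

$$
\mathrm{Precision}_T(R, P) = \frac{1}{N_p} \sum_{j=1}^{N_p} \mathrm{CardinalityFactor}(P_j, R) \sum_{i=1}^{N_r} \omega\bigl(P_j,\, P_j \cap R_i,\, \delta\bigr)
$$

$$
F_1 = \frac{2 \cdot \mathrm{Precision}_T \cdot \mathrm{Recall}_T}{\mathrm{Precision}_T + \mathrm{Recall}_T}
$$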

As I understand it, the repository https://github.com/IntelLabs/TSAD-Evaluator was created by the paper's authors and also links to the paper. So I assume that it contains the implementation of the metrics. But I have not looked into the code in detail, so I cannot tell you which portion corresponds to what they describe in the paper. Excuse any confusion caused by me not explaining my view on the code earlier.

Can you identify the relevant code for "range-based recall" and "range-based precision" in https://github.com/IntelLabs/TSAD-Evaluator?

@aryanpola (Contributor, Author)

> Can you identify the relevant code for "range-based recall" and "range-based precision" in https://github.com/IntelLabs/TSAD-Evaluator?

Yes, it's just that I got carried away trying to implement classic metrics (obviously not what the issue asked for).
I've updated the code with the necessary changes and added more test cases, which should cover everything.

@aryanpola (Contributor, Author)

Also, a ValueError is raised when gamma = "udf_gamma" in precision, since udf_gamma would not affect the precision calculation.

@MatthewMiddlehurst (Member) left a comment

It's fine to use bits of documentation from the other functions when applicable. I'll leave the input-type call to Sebastian, but it seems a bit odd to change it IMO.

Again, I would like the output of the new functions verified against the current ones, or the current functions used in the tests for now, just to verify that the output is the same.

@aryanpola (Contributor, Author)

@MatthewMiddlehurst There is repetition of documentation in the public functions, so changing them to point towards other functions would be a bad idea.
Let me know if you want me to change them.

@MatthewMiddlehurst (Member)

I'm not quite sure what you mean. In case there is confusion about the original issue, the end goal is to eventually remove the other version of these functions.

@aryanpola (Contributor, Author)

> I'm not quite sure what you mean. In case there is confusion about the original issue, the end goal is to eventually remove the other version of these functions.

The implementation supports the range-based metrics (ts_precision, ts_recall, ts_fscore) and removes the dependency on the prts package.

@aryanpola (Contributor, Author) commented Jan 23, 2025

Also, there are some changes to be made:

  1. Separate the parameters in ts_fscore: split alpha and bias into p_alpha, r_alpha, p_bias, and r_bias (see the sketch below).
  2. Remove udf_gamma from precision. Currently, it only raises errors.

I'll update it in some time.
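For context, the F-score itself is just the standard F-beta combination of the two range-based scores. A minimal, self-contained sketch of that combination; the commented ts_fscore usage with p_alpha/r_alpha/p_bias/r_bias is paraphrased from the list above and is an assumption, not the PR's actual API:

```python
def combine_fbeta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta combination of a (range-based) precision and recall value."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


# With the split parameters, ts_fscore would roughly do (pseudocode, not the actual API):
#   precision = ts_precision(..., alpha=p_alpha, bias=p_bias)
#   recall    = ts_recall(..., alpha=r_alpha, bias=r_bias)
#   return combine_fbeta(precision, recall)
print(combine_fbeta(0.6, 0.75))  # ≈ 0.667
```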

@aryanpola (Contributor, Author) commented Jan 24, 2025

@MatthewMiddlehurst @SebastianSchmidl @TonyBagnall Changes from my side are done; please review when free and suggest changes if any :)

@MatthewMiddlehurst (Member) commented Jan 24, 2025

> The implementation supports the range-based metrics (ts_precision, ts_recall, ts_fscore) and removes the dependency on the prts package.

I may just be missing something here. Why would there be repetition in documentation if you use the documentation from the functions we are removing and replacing? The prts functions are still present and prts is still a dependency of the package. The replacement functions do not take the same input. I would currently disagree with removing the old ones in this PR, given that the ones you have implemented are different and unverified.

Could you please:

@aryanpola (Contributor, Author) commented Jan 25, 2025

> I may just be missing something here. Why would there be repetition in documentation if you use the documentation from the functions we are removing and replacing?

I meant the precision and recall functions in the newer implementation; they have the same docstrings.

@aryanpola (Contributor, Author) commented Jan 25, 2025

The existing implementation (_binary.py) takes binary values (arrays) as input. The one we are replacing it with takes a list of tuples as input. So the change would be to convert the list of tuples to binary arrays for compatibility with the existing functions?

Thanks for pointing out the conversion; I hadn't checked the input data type for _binary.py.
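A minimal sketch of that direction of the conversion, ranges to a binary mask (illustrative only; the function name, the series_length parameter, and the inclusive end index are assumptions):

```python
import numpy as np


def ranges_to_mask(ranges, series_length):
    """Convert a list of (start, end) index ranges into a binary anomaly mask.

    The end index is treated as inclusive here, e.g.
    ranges_to_mask([(1, 2), (4, 4)], 5) -> [0, 1, 1, 0, 1].
    """
    mask = np.zeros(series_length, dtype=int)
    for start, end in ranges:
        mask[start : end + 1] = 1
    return mask


print(ranges_to_mask([(1, 2), (4, 4)], 5))  # [0 1 1 0 1]
```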
