
Fix #998: Speed up stumpi and aampi #1001

Merged · 24 commits · Sep 13, 2024

Conversation

@NimaSarajpoor commented Jul 8, 2024

See #998 .

  • Speed up stumpi, _update_egress method
  • Speed up stumpi, _update method
  • Speed up aampi, _update_egress method
  • Speed up aampi, _update method
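
The common pattern behind all four items, per the review discussion below, is to refactor each update loop into a helper that can be compiled with numba's `njit`. The sketch below illustrates that pattern only; `_update_profile` and its signature are hypothetical, not STUMPY's actual internals, and a fallback decorator is included so the sketch runs even without numba installed:

```python
import numpy as np

try:
    from numba import njit  # STUMPY compiles its kernels with numba
except ImportError:  # fallback so this sketch runs without numba
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func

@njit
def _update_profile(D, P, I, new_idx):
    # Hypothetical kernel: given the distances `D` between the newest
    # subsequence and all existing subsequences, refresh the matrix
    # profile `P` and profile indices `I` in a single compiled pass.
    for i in range(D.shape[0]):
        if D[i] < P[i]:
            P[i] = D[i]
            I[i] = new_idx

P = np.array([1.0, 2.0, 3.0])
I = np.array([5, 6, 7])
D = np.array([0.5, 5.0, 1.0])  # distances to the newest subsequence
_update_profile(D, P, I, 8)
```

Moving this loop out of interpreted Python and into a compiled kernel is what produces the per-`update` speedups reported below.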


codecov bot commented Jul 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.33%. Comparing base (fb9a125) to head (8fc35ea).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1001      +/-   ##
==========================================
+ Coverage   97.32%   97.33%   +0.01%     
==========================================
  Files          89       89              
  Lines       14964    15027      +63     
==========================================
+ Hits        14563    14626      +63     
  Misses        401      401              


@NimaSarajpoor changed the title from "[WIP] Fix #998: Speed up stumpi and aampi" to "Fix #998: Speed up stumpi and aampi" on Jul 28, 2024
@NimaSarajpoor

@seanlaw
I think it is ready. What do you think?


seanlaw commented Jul 28, 2024

@NimaSarajpoor Please allow me some time to review it


seanlaw commented Jul 29, 2024

@NimaSarajpoor For completeness, are you able to provide some timings for the speedup (before and after your code changes) here in the comments? I trust that the code is indeed faster but at least we can document it here.

@NimaSarajpoor

@seanlaw
I checked the performance for time series of lengths 1000, 10_000, and 100_000.

```python
import time

import numpy as np

import stumpy


def get_running_time(n, m=50):
    seed = 0
    np.random.seed(seed)
    T = np.random.rand(n)

    n_iter = 100
    vals = np.random.rand(n_iter)

    # try stumpy.stumpi / stumpy.aampi, with egress=True / False
    obj = stumpy.stumpi(T, m, egress=True)
    t_lst = []
    for val in vals:
        start = time.time()
        obj.update(val)
        t_lst.append(time.time() - start)

    # exclude the first update, which acts as a warm-up
    return np.mean(t_lst[1:]), np.std(t_lst[1:])


if __name__ == "__main__":
    n = 1000  # try 1000 / 10_000 / 100_000
    out = get_running_time(n)
    print(f"mean: {out[0]}, std: {out[1]}")
```

In the following tables, n is the length of the time series. The results were obtained by running the script above on an Apple M1 with 8 GB of memory. Each value in the running-time columns is the average running time of 100 `.update()` calls, excluding the first one. The speedup percentage is given in the right-most column.

**n = 1000**

|  | running time (current version) | running time (PR's version) | Speedup (%) |
| --- | --- | --- | --- |
| stumpi(egress=True) | 0.00035 | 0.00014 | 60.0 |
| stumpi(egress=False) | 0.00033 | 0.00015 | 54.5 |
| aampi(egress=True) | 0.00026 | 0.00004 | 84.6 |
| aampi(egress=False) | 0.00022 | 0.00006 | 72.7 |

**n = 10_000**

|  | running time (current version) | running time (PR's version) | Speedup (%) |
| --- | --- | --- | --- |
| stumpi(egress=True) | 0.00138 | 0.00022 | 84.1 |
| stumpi(egress=False) | 0.00114 | 0.00023 | 79.8 |
| aampi(egress=True) | 0.00147 | 0.00032 | 78.2 |
| aampi(egress=False) | 0.00121 | 0.00032 | 73.6 |

**n = 100_000**

|  | running time (current version) | running time (PR's version) | Speedup (%) |
| --- | --- | --- | --- |
| stumpi(egress=True) | 0.01162 | 0.00079 | 93.2 |
| stumpi(egress=False) | 0.00943 | 0.00088 | 90.7 |
| aampi(egress=True) | 0.01387 | 0.00282 | 79.7 |
| aampi(egress=False) | 0.01154 | 0.00285 | 75.3 |


seanlaw commented Aug 1, 2024

@NimaSarajpoor Can you tell me how you are computing "Speedup %"? The numbers don't look right to me. I think perhaps the wording should be "Percent Reduction" (i.e., 100 * (new-old)/old).

I think Percent Speedup would be 100 * (old-new)/new OR you can say "X times faster" by simply doing old/new. I prefer "X times faster" for our comparison
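
For concreteness, here is how the three measures compare on the stumpi(egress=True), n = 100_000 timings from the tables above:

```python
old, new = 0.01162, 0.00079  # current vs. PR running time (seconds)

percent_reduction = 100 * (new - old) / old  # ~ -93.2, i.e. a 93.2% reduction
percent_speedup = 100 * (old - new) / new    # ~ 1370.9, i.e. ~1371% faster
times_faster = old / new                     # ~ 14.7, i.e. "14.7x faster"
```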

@NimaSarajpoor

@seanlaw

I think perhaps the wording should be "Percent Reduction" (i.e., 100 * (new-old)/old).

Right.... that is how I calculated the numbers.

OR you can say "X times faster" by simply doing old/new. I prefer "X times faster" for our comparison.

Noted. I like this more as it is clearer. To avoid confusion for future readers who follow the comments, I am going to provide the tables with the new numbers below:

**n = 1000**

|  | running time (current version) | running time (PR's version) | Times faster |
| --- | --- | --- | --- |
| stumpi(egress=True) | 0.00035 | 0.00014 | 2.5 |
| stumpi(egress=False) | 0.00033 | 0.00015 | 2.2 |
| aampi(egress=True) | 0.00026 | 0.00004 | 6.5 |
| aampi(egress=False) | 0.00022 | 0.00006 | 3.7 |

**n = 10_000**

|  | running time (current version) | running time (PR's version) | Times faster |
| --- | --- | --- | --- |
| stumpi(egress=True) | 0.00138 | 0.00022 | 6.27 |
| stumpi(egress=False) | 0.00114 | 0.00023 | 4.9 |
| aampi(egress=True) | 0.00147 | 0.00032 | 4.6 |
| aampi(egress=False) | 0.00121 | 0.00032 | 3.8 |

**n = 100_000**

|  | running time (current version) | running time (PR's version) | Times faster |
| --- | --- | --- | --- |
| stumpi(egress=True) | 0.01162 | 0.00079 | 14.7 |
| stumpi(egress=False) | 0.00943 | 0.00088 | 10.7 |
| aampi(egress=True) | 0.01387 | 0.00282 | 4.9 |
| aampi(egress=False) | 0.01154 | 0.00285 | 4.0 |


seanlaw commented Aug 1, 2024

@NimaSarajpoor Considering that all of the existing tests are passing and the performance is improved, I feel pretty good about merging this. Do you think it's ready? Was there anything that you had doubts about? It looks like there's a refactor of the code and then njit-ing of that code.

@NimaSarajpoor

@seanlaw

It looks like there's a refactor of the code and then njit-ing of that code.

Right. That's it!

Was there anything that you had doubts about?

My first concern is whether the added test function is clear. My second concern is regarding the comment I added for case 2 in the test function, i.e.

    # case 2: For a given time series `T`, obtain the matrix profile `P` and
    # matrix profile indices `I` of `T[1:]` based on the matrix profile and
    # matrix profile indices of `T[:-1]`.
    # In the following test: n_appended = 1

I think the comment above is slightly wrong. I think I need to make it clear that the updated profile is different from just doing `stumpy.stump(T[1:], m)`. So, I think it should have been something like this:

    # case 2: For a given time series `T`, obtain the matrix profile `P` and
    # matrix profile indices `I` of `T[1:]` based on the matrix profile and
    # matrix profile indices of `T[:-1]`, WITHOUT DISREGARDING THE NEAREST
    # NEIGHBOURS IN THE PROFILE THAT REFERS TO ALREADY-REMOVED DATA.
    # In the following test: n_appended = 1
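
To make that statement concrete, here is a toy illustration in plain NumPy (not STUMPY's implementation; the series, the `nn` helper, and the exclusion-zone handling are all simplified for the example): after the oldest point is egressed, the newest subsequence's already-computed profile entry still points at a removed motif, so it legitimately differs from a from-scratch `stumpy.stump(T[1:], m)`.

```python
import numpy as np

def nn(T, q, m, candidates):
    # nearest-neighbour distance and start index of subsequence `q`
    # among the given candidate start positions in `T`
    d = [np.linalg.norm(T[i:i + m] - q) for i in candidates]
    j = int(np.argmin(d))
    return d[j], candidates[j]

m = 4
excl = m // 4 + 1  # half-width of a simple trivial-match exclusion zone

# a motif at the very start, its near-copy at the end, noise in between
T = np.array([0., 1., 0., 1., 5., 9., 2., 7., 0., 1., 0., 1.01])
last = len(T) - m          # start index of the newest subsequence
q = T[last:last + m]

# profile entry for the newest subsequence over the full series:
# its nearest neighbour is the motif at index 0 (distance 0.01)
cand = [i for i in range(last + 1) if abs(i - last) > excl]
d_kept, i_kept = nn(T, q, m, cand)

# recomputing from scratch on T[1:] removes that motif entirely, so the
# same subsequence ends up with a much farther "nearest neighbour"
T2, last2 = T[1:], len(T) - 1 - m
cand2 = [i for i in range(last2 + 1) if abs(i - last2) > excl]
d_scratch, _ = nn(T2, q, m, cand2)

# an egressing update keeps the already-computed entry (d_kept ~ 0.01)
# even though its neighbour refers to removed data, so the two differ
assert d_kept < d_scratch
```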


seanlaw commented Aug 4, 2024

My first concern is whether the added test function is clear. My second concern is regarding the comment I added for case 2 in the test function

Okay, I will take a closer look.

[Update]

@NimaSarajpoor For the most part, I think case 1 is fine, as it looks like it is simply updating things by adding a single new data point. Having said that, I can't understand case 2. There seems to be too much happening, and your intent isn't clear even with the comment(s). Also, you refer to n_appended, but it would be nice if you could leave a note to remind people why it is important or what the relevance of that variable is (it's been a long time; I can guess at it, but I too have forgotten).

WITHOUT DISREGARDING THE NEAREST NEIGHBOURS IN THE PROFILE THAT REFERS TO ALREADY-REMOVED DATA.

I think this is probably the most important thing to highlight. It sounds like this test is trying to make sure that your newly added function respects this point and does not ignore it. Is that right? And all of this is associated with n_appended?

I think it would make sense to split case 1 and case 2 into two separate tests with more specific names (i.e., the name of the first case can be kept but it seems like you are testing something more nuanced in the second case).


NimaSarajpoor commented Aug 6, 2024

@seanlaw

I can't understand case 2. There seems to be too much happening and your intent isn't clear even with the comment(s)

I have the same feeling regarding case 2, which represents the egress=True case.

WITHOUT DISREGARDING THE NEAREST NEIGHBOURS IN THE PROFILE THAT REFERS TO ALREADY-REMOVED DATA.

I think this is probably the most important thing to highlight. It sounds like this test is trying to make sure that your newly added function respects this point and does not ignore it. Is that right? And all of this is associated with n_appended?

Right.

My main point was to help future me remember why I used a particular approach for calculating P_ref and I_ref. Now that you mention it, I think it would be good to have a test to just check that specific statement.

I think it would make sense to split case 1 and case 2 into two separate tests with more specific names (i.e., the name of the first case can be kept but it seems like you are testing something more nuanced in the second case).

Noted. Please allow me to separate the cases, and revise the code.


seanlaw commented Aug 6, 2024

My main point was to help future me remember why I used a particular approach for calculating P_ref and I_ref. Now that you mention it, I think it would be good to have a test to just check that specific statement.

Exactly! Thank you for persisting

@NimaSarajpoor

@seanlaw
I am making some minor changes. I will let you know once I am done so that you can provide me with your comments.

@NimaSarajpoor

@seanlaw
I think it is ready for your review. You may want to pay closer attention to the docstring of the new function in core.py and the test functions, particularly `test_update_incremental_PI_egressTrue_MemoryCheck`. This test function still looks a bit odd, but let's see what you think.


seanlaw commented Sep 11, 2024

@NimaSarajpoor I will take a look

seanlaw left a comment


@NimaSarajpoor I think the docstrings are fine. Do you think we are ready to merge?

@NimaSarajpoor

I replaced `random` with `np.random`.

@seanlaw

I think the docstrings are fine. Do you think we are ready to merge?

Thanks for checking that! Feel free to merge it once all tests pass.

@seanlaw merged commit 692c99c into TDAmeritrade:main on Sep 13, 2024
33 checks passed

seanlaw commented Sep 13, 2024

@NimaSarajpoor Thanks again for the wonderful contribution!
