Fix precision #668

Closed
wants to merge 25 commits into from
eeb257a
add test function for input with identical subseqs
NimaSarajpoor Sep 6, 2022
60bc457
Recalculate var if it is less than a threshold
NimaSarajpoor Sep 6, 2022
f343fac
Add test function for identical subseq with different scales
NimaSarajpoor Sep 7, 2022
0463654
recalculare pearson if it exceeds a threshold
NimaSarajpoor Sep 7, 2022
9ceaf11
Add new test function
NimaSarajpoor Sep 7, 2022
5ecaf36
Add volatile test function
NimaSarajpoor Sep 7, 2022
f278a1d
add test function to investigate issue
NimaSarajpoor Sep 7, 2022
2714e72
Modify test functions
NimaSarajpoor Sep 7, 2022
a5a9f17
modify test function
NimaSarajpoor Sep 7, 2022
031a173
modify test function and add spacer np.inf
NimaSarajpoor Sep 7, 2022
21aacf6
modify input for test function
NimaSarajpoor Sep 10, 2022
9d55342
merge from main
NimaSarajpoor Sep 14, 2022
6eea65b
change std threshold
NimaSarajpoor Sep 15, 2022
2862c52
commenting the bypassing of test function
NimaSarajpoor Sep 15, 2022
36560d7
Rename var name and change its default value
NimaSarajpoor Oct 2, 2022
897b448
Add new seed to test function
NimaSarajpoor Oct 4, 2022
e98a0b8
return std itself instead of its inverse
NimaSarajpoor Oct 4, 2022
4de61e5
recalculate cov if its denom becomes too small
NimaSarajpoor Oct 4, 2022
09a9af2
replace hard coded value with config variable
NimaSarajpoor Oct 4, 2022
f6989fa
Fix format
NimaSarajpoor Oct 4, 2022
353336a
Update test function
NimaSarajpoor Oct 4, 2022
1328d68
update code and docstring
NimaSarajpoor Oct 4, 2022
f3e9f23
Add precision calculation notebook
NimaSarajpoor Oct 8, 2022
49651ff
update notebook
NimaSarajpoor Oct 9, 2022
2adafb0
Add unit tests to capture loss of precision
NimaSarajpoor Oct 11, 2022
463 changes: 463 additions & 0 deletions docs/Precision_Calculation.ipynb

Large diffs are not rendered by default.

5 changes: 4 additions & 1 deletion stumpy/config.py
@@ -8,9 +8,12 @@
STUMPY_MEAN_STD_NUM_CHUNKS = 1
STUMPY_MEAN_STD_MAX_ITER = 10
STUMPY_DENOM_THRESHOLD = 1e-14
STUMPY_STDDEV_THRESHOLD = 1e-7
STUMPY_STDDEV_THRESHOLD = 1e-20
Contributor

I don't think we should change this?

Collaborator Author

@NimaSarajpoor NimaSarajpoor Sep 20, 2022

> I don't think we should change this?

So, I think to answer this question, we need to answer the following (philosophical!) question first:

Let's say my data is:

import numpy as np

np.random.seed(0)  # actually apply the seed
m = 8
T = np.random.rand(64)
identical = T[:m].copy()
T[:m] = identical * 1e-7
T[-m:] = identical

Note that the z-normalized versions of T[:8] and T[-8:] are the same. Of the two statements below, which one is correct?

  • The z-norm Euclidean distance between T[:8] and T[-8:] is zero because their z-norms are the same.
  • The z-norm Euclidean distance between T[:8] and T[-8:] is $\sqrt{m}$, because T[:8] should be treated as a constant subsequence, and we know the distance between a constant subsequence and a non-constant subsequence is $\sqrt{m}$.

If we go with the latter statement, then STUMPY_STDDEV_THRESHOLD = 1e-7 is fine (if you do mp = stumpy.stump(T, m), you will see that mp[0, 0] is $\sqrt{8}$).

If we choose the former statement, then I think we should do: STUMPY_STDDEV_THRESHOLD = 1e-100
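The two readings can be reproduced outside STUMPY with a few lines of NumPy. This is only a minimal sketch, not STUMPY's implementation: the helper `z_norm_with_threshold` is a hypothetical name, and it mimics just the idea of treating a subsequence as constant when its standard deviation falls below a threshold.

```python
import numpy as np


def z_norm_with_threshold(a, threshold):
    # Hypothetical helper: a std below `threshold` is replaced by 1.0,
    # i.e. the subsequence is treated as (near-)constant.
    std = np.std(a)
    if std < threshold:
        std = 1.0
    return (a - np.mean(a)) / std


rng = np.random.default_rng(0)
m = 8
base = rng.random(m)   # values in [0, 1)
scaled = base * 1e-7   # same shape, much smaller scale

# Threshold 1e-7: `scaled` is flagged as constant
d_loose = np.linalg.norm(
    z_norm_with_threshold(scaled, 1e-7) - z_norm_with_threshold(base, 1e-7)
)

# Threshold 1e-20: both subsequences are z-normalized normally
d_strict = np.linalg.norm(
    z_norm_with_threshold(scaled, 1e-20) - z_norm_with_threshold(base, 1e-20)
)

print(d_loose, np.sqrt(m))  # approximately equal
print(d_strict)             # approximately zero
```

With the looser threshold the scaled copy collapses to (almost) a zero vector, so the distance is close to $\sqrt{m}$; with the stricter threshold the two z-normalized subsequences coincide up to floating-point error.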

Contributor

@seanlaw seanlaw Sep 20, 2022

Going purely by my instincts, I think it should be the former (the distance should be zero because their z-norms are identical).

So, it looks like we only use STUMPY_STDDEV_THRESHOLD in three functions:

stumpy/stumpy/core.py

Lines 923 to 935 in 9a2e2ec

if σ_Q < config.STUMPY_STDDEV_THRESHOLD or Σ_T < config.STUMPY_STDDEV_THRESHOLD:
    D_squared = m
else:
    denom = m * σ_Q * Σ_T
    if np.abs(denom) < config.STUMPY_DENOM_THRESHOLD:  # pragma nocover
        denom = config.STUMPY_DENOM_THRESHOLD
    D_squared = np.abs(2 * m * (1.0 - (QT - m * μ_Q * M_T) / denom))

if (
    σ_Q < config.STUMPY_STDDEV_THRESHOLD
    and Σ_T < config.STUMPY_STDDEV_THRESHOLD
) or D_squared < config.STUMPY_P_NORM_THRESHOLD:
    D_squared = 0

AND

stumpy/stumpy/gpu_stump.py

Lines 140 to 155 in 9a2e2ec

if (
    σ_Q[i] < config.STUMPY_STDDEV_THRESHOLD
    or Σ_T[j] < config.STUMPY_STDDEV_THRESHOLD
):
    p_norm = m
else:
    denom = m * σ_Q[i] * Σ_T[j]
    if math.fabs(denom) < config.STUMPY_DENOM_THRESHOLD:  # pragma nocover
        denom = config.STUMPY_DENOM_THRESHOLD
    p_norm = abs(2 * m * (1.0 - (QT_out[j] - m * μ_Q[i] * M_T[j]) / denom))

if (
    σ_Q[i] < config.STUMPY_STDDEV_THRESHOLD
    and Σ_T[j] < config.STUMPY_STDDEV_THRESHOLD
) or p_norm < config.STUMPY_P_NORM_THRESHOLD:
    p_norm = 0

AND

T_subseq_isconstant = Σ_T < config.STUMPY_STDDEV_THRESHOLD

In the first two cases, it feels like we could/should replace the use of that threshold with T_subseq_isconstant instead? However, in the third case, we are using STUMPY_STDDEV_THRESHOLD to set T_subseq_isconstant! Hmmm, this is nasty. 😢 I don't know what to do.

I guess in all cases, we are trying to allow for some small leniency in defining a constant subsequence. Now I see your point as to why STUMPY_STDDEV_THRESHOLD should be smaller than 1e-7.

Contributor

@seanlaw seanlaw Sep 20, 2022

Fundamentally, since all of this relates to identifying a constant subsequence, is there a different/better way to identify a constant subsequence besides using the STDDEV? Initially, I thought it was a simple/cute approach but perhaps it is flawed?

Just to state it out loud, it appears that it is not possible/trivial to distinguish whether a subsequence is constant OR whether it simply has very small values to start with OR both.
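One scale-independent alternative to a standard-deviation threshold is to call a window constant only when its minimum equals its maximum. The sketch below is an illustration of that idea, not something STUMPY does in this PR; `rolling_isconstant_exact` is a hypothetical name, and the approach deliberately offers no leniency for "nearly constant" windows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_isconstant_exact(T, m):
    # Hypothetical helper: a window is constant iff all of its values are
    # exactly equal, which is equivalent to max == min. Unlike a std
    # threshold, this does not depend on the scale of the data.
    windows = sliding_window_view(np.asarray(T, dtype=np.float64), m)
    return windows.max(axis=1) == windows.min(axis=1)


T = np.array([1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0])
print(rolling_isconstant_exact(T, 3))

# Tiny but non-constant values are NOT flagged as constant
tiny = np.array([1e-9, 2e-9, 3e-9, 4e-9])
print(rolling_isconstant_exact(tiny, 3))
```

The trade-off is that a window containing any floating-point noise is never treated as constant, so this matches only the strictest definition.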

Contributor

Okay, I think we are struggling with this because we are trying to make assumptions for the user (i.e., what should be considered "constant"). The purest/strictest definition is when the STDDEV is exactly equal to zero, and I doubt that anyone would argue with this, right? Everything else is an approximation.

I am starting to wonder whether we should add T_A_subseq_isconstant=None and T_B_subseq_isconstant=None to our top-level API, which would let the user define the truth so that we stop guessing. Any thoughts?

Collaborator Author

@NimaSarajpoor NimaSarajpoor Sep 20, 2022

I think we'd simply do:

# `stumpy.stump`

if T_A_subseq_isconstant is None:
    T_A_subseq_isconstant = σ_Q < config.STUMPY_STDDEV_THRESHOLD

if T_B_subseq_isconstant is None:
    T_B_subseq_isconstant = Σ_T < config.STUMPY_STDDEV_THRESHOLD

and then we'd use T_A_subseq_isconstant and T_B_subseq_isconstant to check whether the subsequence is constant. And then we'd leave it as config.STUMPY_STDDEV_THRESHOLD = 1e-7 as the default and if the user doesn't like it then they either choose to change config.STUMPY_STDDEV_THRESHOLD to whatever they want OR pass in T_A_subseq_isconstant and/or T_B_subseq_isconstant directly.

What's really interesting is that if you allow the user to specify T_A_subseq_isconstant then they can choose to apply different constant rules to different subsequences. Imagine your example above:

np.random.seed(0)  # actually apply the seed
m = 8
T = np.random.rand(64)
identical = T[:m].copy()
T[:m] = identical * 1e-7
T[-m:] = identical

The user can actually choose to treat T[:8] and T[-8:] completely independently:

T_subseq_isconstant = np.full(64 - m + 1, False)
T_subseq_isconstant[0] = False
T_subseq_isconstant[-1] = True

or any permutation of True/False for the first and last subsequences (or any other subsequence for that matter).

I can understand your point of view. However, I think this might make things a little more complicated. I mean... it would not be easy for a user to set T_subseq_isconstant, right? They would probably need to calculate the rolling std and then use it to compute T_subseq_isconstant, and this may be used by only a few users. I know you are usually interested in solving issues for a general audience. But if you think this is fine, then I think your proposed solution is good :)

I just realized that we still have config.STUMPY_STDDEV_THRESHOLD, so it should be good :) We can think of the T_subseq_isconstant parameter as being for slightly more advanced users.

Contributor

@seanlaw seanlaw Sep 21, 2022

> I mean...it would not be easy for a user to set T_subseq_isconstant. Right? Because, they probably need to calculate the rolling std and then use it to compute T_subseq_isconstant. And, this may be used by only a few users. I know you are usually interested in solving the issues for general audience. But, if you think this is fine, then I think your proposed solution is good :)

You raise a valid point. So, first things first, does setting T_A_subseq_isconstant = σ_Q < config.STUMPY_STDDEV_THRESHOLD and T_B_subseq_isconstant = Σ_T < config.STUMPY_STDDEV_THRESHOLD help with our constant subsequence precision issue?

Then, separately, do we expose T_A_subseq_isconstant and T_B_subseq_isconstant to the user? It might make sense to take a moment and dig through our past issues to see what issues users were having when dealing with constant subsequences or near constant subsequences.

Again, I think the first part is more important while this second part is "nice to have" but, to your point, it's really for advanced users (because they'd need to know how to call core.rolling_window, which is not particularly hard but also not obvious)!
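To give a sense of what an advanced user would have to do, here is a rough sketch of building such a mask by hand. It uses NumPy's `sliding_window_view` rather than STUMPY's `core.rolling_window`, and `subseq_isconstant_from_std` is a hypothetical helper; the `T_A_subseq_isconstant` / `T_B_subseq_isconstant` parameters are only a proposal at this point in the thread, so no STUMPY call is shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def subseq_isconstant_from_std(T, m, threshold=1e-7):
    # Hypothetical helper: flag every length-m subsequence whose rolling
    # standard deviation falls below `threshold`.
    T = np.asarray(T, dtype=np.float64)
    rolling_std = np.std(sliding_window_view(T, m), axis=1)
    return rolling_std < threshold


rng = np.random.default_rng(0)
m = 8
T = rng.random(64)
T[:m] = T[-m:] * 1e-7  # first subsequence is a scaled copy of the last

# Default rule: the tiny-scale first subsequence is flagged as constant
default_mask = subseq_isconstant_from_std(T, m)

# User override: declare that the first subsequence is NOT constant
user_mask = default_mask.copy()
user_mask[0] = False

print(default_mask[0], default_mask[-1])
print(user_mask[0], user_mask[-1])
```

The mask has one entry per subsequence (`len(T) - m + 1`), so individual subsequences can be overridden independently of the global threshold.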

Collaborator Author

@NimaSarajpoor NimaSarajpoor Sep 21, 2022

> does setting T_A_subseq_isconstant = σ_Q < config.STUMPY_STDDEV_THRESHOLD and T_B_subseq_isconstant = Σ_T < config.STUMPY_STDDEV_THRESHOLD help with our constant subsequence precision issue?

I think we are (indirectly) doing it already:

  • In stumpy/stump.py: we use T_A_subseq_isconstant, which is one of the outputs returned by core.preprocess_diagonal, in which we do T_subseq_isconstant = Σ_T < config.STUMPY_STDDEV_THRESHOLD.
  • In core._calculate_squared_distance, we do the same. For instance, let's see lines 923-924 in core.py:

# In `core._calculate_squared_distance` (line 923)
if σ_Q < config.STUMPY_STDDEV_THRESHOLD or Σ_T < config.STUMPY_STDDEV_THRESHOLD:
    D_squared = m

which is equivalent to:

# in `stumpy/stump.py` (line 185)
if T_B_subseq_isconstant[i + k] or T_A_subseq_isconstant[i]:
    pearson = 0.5  # NOTE: this is equivalent to `D_squared = m`
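The equivalence follows from the relation between the Pearson correlation and the squared z-normalized distance that appears in the `D_squared` expression quoted earlier, namely D² = 2m(1 − ρ). A small sketch (the function name `pearson_to_znorm_distance` is hypothetical, not a STUMPY function):

```python
import numpy as np


def pearson_to_znorm_distance(pearson, m):
    # D^2 = 2 * m * (1 - pearson); clamp at zero to guard against tiny
    # negative values caused by floating-point round-off.
    d_squared = 2.0 * m * (1.0 - pearson)
    return np.sqrt(max(d_squared, 0.0))


m = 8
d_one_constant = pearson_to_znorm_distance(0.5, m)   # D^2 = m
d_both_constant = pearson_to_znorm_distance(1.0, m)  # D^2 = 0

print(d_one_constant, np.sqrt(m))
print(d_both_constant)
```

So `pearson = 0.5` encodes a distance of $\sqrt{m}$ (exactly one subsequence constant), while `pearson = 1.0` encodes a distance of zero (both constant, or identical after z-normalization).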

By the way, if I remember correctly, setting config.STUMPY_STDDEV_THRESHOLD to a lower value helped me resolve imprecision in one case. I cannot remember that particular case, but I do remember that a test failed because of the imprecision. The naive version gave 0 as the distance between the two identical subsequences, but the performant version treated one of them as constant and gave $\sqrt{m}$ as their distance (similar to the example I provided earlier). That was the main reason behind setting config.STUMPY_STDDEV_THRESHOLD=1e-20. We can set it even lower, like 1e-100, but I am not sure whether that makes sense.


> do we expose T_A_subseq_isconstant and T_B_subseq_isconstant to the user?

Do you mean whether it is reasonable to allow the user to set these two parameters?


> It might make sense to take a moment and dig through our past issues to see what issues users were having when dealing with constant subsequences or near constant subsequences.

Yeah... that's a good idea! I will go and check out the previous issues.

> I think the first part is more important while this second part is "nice to have"

You are right! So, to wrap up, we may do the following:

  • use T_subseq_isconstant in core._calculate_squared_distance to make it consistent with stumpy/stump.py
  • check out the previous issues regarding identical subseqs
  • try setting config.STUMPY_STDDEV_THRESHOLD to a lower value

What do you think?

Contributor

Yes, that summary sounds good. There is one small note as to why we chose to use the STDDEV to infer a constant region (rather than adding T_subseq_isconstant) and that was to save on memory.

Collaborator Author

@seanlaw
I provided a recap on the main page of the PR. But I will leave this conversation open for now, just in case you want to mention something.

STUMPY_P_NORM_THRESHOLD = 1e-14
STUMPY_TEST_PRECISION = 5
STUMPY_MAX_P_NORM_DISTANCE = np.finfo(np.float64).max
STUMPY_MAX_DISTANCE = np.sqrt(STUMPY_MAX_P_NORM_DISTANCE)
STUMPY_EXCL_ZONE_DENOM = 4
STUMPY_MIN_VAR = 1.0
STUMPY_MIN_STD_AB = 1.0 # denom in equation: pearson_AB = cov / (std_A * std_B)
NimaSarajpoor marked this conversation as resolved.
STUMPY_CORRELATION_THRESHOLD = 0.99999999 # 1 - 1e-08
10 changes: 6 additions & 4 deletions stumpy/core.py
@@ -577,6 +577,8 @@ def _welford_nanvar(a, w, a_subseq_isfinite):
* (a[last_idx] - curr_mean + a[prev_start_idx] - prev_mean)
/ w
)
if curr_var < config.STUMPY_MIN_VAR:
curr_var = np.nanvar(a[start_idx:stop_idx])

all_variances[start_idx] = curr_var
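The idea in this hunk, falling back to a direct variance computation whenever the incrementally updated value drops below a minimum, can be sketched in isolation. This is an illustrative simplification and not `_welford_nanvar` itself: it ignores NaN handling, and `rolling_var_with_fallback` and `min_var` are hypothetical names standing in for the function and for `config.STUMPY_MIN_VAR`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_var_with_fallback(a, w, min_var=1.0):
    # Rolling (population) variance with an incremental update. When the
    # updated value drops below `min_var`, it is recomputed directly from
    # the window, since accumulated round-off dominates small variances.
    a = np.asarray(a, dtype=np.float64)
    n = a.shape[0] - w + 1
    out = np.empty(n, dtype=np.float64)

    mean = np.mean(a[:w])
    var = np.var(a[:w])
    out[0] = var

    for start in range(1, n):
        x_old = a[start - 1]      # value leaving the window
        x_new = a[start + w - 1]  # value entering the window
        prev_mean = mean
        mean = prev_mean + (x_new - x_old) / w
        var = var + (x_new - x_old) * (x_new - mean + x_old - prev_mean) / w
        if var < min_var:
            var = np.var(a[start : start + w])
        out[start] = var

    return out


# Large values followed by tiny ones: the incremental update alone would
# carry round-off from the large region into the tiny-variance windows.
a = np.concatenate([1e6 * np.arange(1.0, 17.0), 1e-3 * np.sin(np.arange(16.0))])
w = 4
ref = np.var(sliding_window_view(a, w), axis=1)
comp = rolling_var_with_fallback(a, w)
print(np.allclose(comp, ref, rtol=1e-9, atol=0.0))
```

The comparison uses a purely relative tolerance on purpose: an absolute tolerance would hide exactly the errors in small variances that the fallback is meant to remove.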

@@ -1738,8 +1740,8 @@ def preprocess_diagonal(T, m):
M_T : numpy.ndarray
Rolling mean with a subsequence length of `m`

Σ_T_inverse : numpy.ndarray
Inverted rolling standard deviation
Σ_T : numpy.ndarray
Rolling standard deviation

M_T_m_1 : numpy.ndarray
Rolling mean with a subsequence length of `m-1`
@@ -1753,12 +1755,12 @@ def preprocess_diagonal(T, m):
"""
T, T_subseq_isfinite = preprocess_non_normalized(T, m)
M_T, Σ_T = compute_mean_std(T, m)

T_subseq_isconstant = Σ_T < config.STUMPY_STDDEV_THRESHOLD
Σ_T[T_subseq_isconstant] = 1.0 # Avoid divide by zero in next inversion step
Σ_T_inverse = 1.0 / Σ_T
M_T_m_1, _ = compute_mean_std(T, m - 1)

return T, M_T, Σ_T_inverse, M_T_m_1, T_subseq_isfinite, T_subseq_isconstant
return T, M_T, Σ_T, M_T_m_1, T_subseq_isfinite, T_subseq_isconstant


def replace_distance(D, search_val, replace_val, epsilon=0.0):
8 changes: 4 additions & 4 deletions stumpy/scrump.py
@@ -529,7 +529,7 @@ def __init__(
(
self._T_A,
self._μ_Q,
self._σ_Q_inverse,
self._σ_Q,
self._μ_Q_m_1,
self._T_A_subseq_isfinite,
self._T_A_subseq_isconstant,
@@ -538,7 +538,7 @@ def __init__(
(
self._T_B,
self._M_T,
self._Σ_T_inverse,
self._Σ_T,
self._M_T_m_1,
self._T_B_subseq_isfinite,
self._T_B_subseq_isconstant,
@@ -639,8 +639,8 @@ def update(self):
self._m,
self._M_T,
self._μ_Q,
self._Σ_T_inverse,
self._σ_Q_inverse,
self._Σ_T,
self._σ_Q,
self._M_T_m_1,
self._μ_Q_m_1,
self._T_A_subseq_isfinite,
68 changes: 45 additions & 23 deletions stumpy/stump.py
@@ -25,8 +25,8 @@ def _compute_diagonal(
m,
M_T,
μ_Q,
Σ_T_inverse,
σ_Q_inverse,
Σ_T,
σ_Q,
cov_a,
cov_b,
cov_c,
@@ -66,11 +66,11 @@ def _compute_diagonal(
μ_Q : numpy.ndarray
Mean of the query sequence, `Q`, relative to the current sliding window

Σ_T_inverse : numpy.ndarray
Inverse sliding standard deviation of time series, `T`
Σ_T : numpy.ndarray
Sliding standard deviation of time series, `T`

σ_Q_inverse : numpy.ndarray
Inverse standard deviation of the query sequence, `Q`, relative to the current
σ_Q : numpy.ndarray
Standard deviation of the query sequence, `Q`, relative to the current
sliding window

cov_a : numpy.ndarray
@@ -182,13 +182,35 @@ def _compute_diagonal(

if T_B_subseq_isfinite[i + k] and T_A_subseq_isfinite[i]:
# Neither subsequence contains NaNs
if T_B_subseq_isconstant[i + k] or T_A_subseq_isconstant[i]:
pearson = 0.5
else:
pearson = cov * Σ_T_inverse[i + k] * σ_Q_inverse[i]

if T_B_subseq_isconstant[i + k] and T_A_subseq_isconstant[i]:
pearson = 1.0
elif T_B_subseq_isconstant[i + k] or T_A_subseq_isconstant[i]:
pearson = 0.5
else:
denom = Σ_T[i + k] * σ_Q[i]
Contributor

This:

                    denom = Σ_T[i + k] * σ_Q[i]
                    if denom < config.STUMPY_MIN_STD_AB:
                        cov = (
                            np.dot(
                                (T_B[i + k : i + k + m] - M_T[i + k]),
                                (T_A[i : i + m] - μ_Q[i]),
                            )
                            * m_inverse
                        )

Should not be here. I think we want to move this higher up:

        for i in iter_range:
            denom = Σ_T[i + k] * σ_Q[i]
            if i == 0 or (k < 0 and i == -k) or denom < config.STUMPY_MIN_STDDEVS:
                cov = (
                    np.dot(
                        (T_B[i + k : i + k + m] - M_T[i + k]), (T_A[i : i + m] - μ_Q[i])
                    )
                    * m_inverse
                )
            else:
                # The next lines are equivalent and left for reference
                # cov = cov + constant * (
                #     (T_B[i + k + m - 1] - M_T_m_1[i + k])
                #     * (T_A[i + m - 1] - μ_Q_m_1[i])
                #     - (T_B[i + k - 1] - M_T_m_1[i + k]) * (T_A[i - 1] - μ_Q_m_1[i])
                # )
                cov = cov + constant * (
                    cov_a[i + k] * cov_b[i] - cov_c[i + k] * cov_d[i]
                )

This would make things much cleaner and avoid repeating the np.dot code in multiple places. Note that the inverse stddevs were used because they avoid the slightly more costly division operation (compared with simple multiplication operations). By using the inverse, the inverse (division) of each subsequence's stddev is computed only once and then used in multiplications. Here, we are basically re-computing the division every single time each subsequence's stddev is compared with another. There is no reason why you couldn't do something like:

stddevs_inverse = Σ_T_inverse[i + k] * σ_Q_inverse[i]
if i == 0 or (k < 0 and i == -k) or stddevs_inverse > config.STUMPY_MIN_STDDEVS_INVERSE:
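The point about inverses can be checked with a small standalone sketch. The arrays and the name `min_std_product` below are made up for illustration (they stand in for the rolling standard deviations and for the threshold discussed here); the snippet only shows that multiplying by precomputed inverses gives the same correlation, and that a "product too small" test can be phrased as an "inverse product too large" test.

```python
import numpy as np

# Illustrative values only (paired element-wise)
std_T = np.array([0.5, 2.0, 4.0])
std_Q = np.array([0.2, 1.5, 8.0])
cov = np.array([0.05, 1.2, 20.0])

# Division performed for every pair
pearson_div = cov / (std_T * std_Q)

# Inverses computed once per subsequence, then only multiplications
std_T_inverse = 1.0 / std_T
std_Q_inverse = 1.0 / std_Q
pearson_mul = cov * std_T_inverse * std_Q_inverse

# Hypothetical threshold: "denominator too small" is the same condition
# as "product of inverses too large"
min_std_product = 1.0
small_denom = (std_T * std_Q) < min_std_product
large_inverse = (std_T_inverse * std_Q_inverse) > 1.0 / min_std_product

print(np.allclose(pearson_div, pearson_mul))
print(small_denom, large_inverse)
```

The two conditions can disagree only for values sitting exactly at the threshold, where round-off in the inverses decides the comparison.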

Collaborator Author

> This would make things much cleaner and without repeating the np.dot code in multiple places

Yeah...right! Thanks for the feedback!

> Note that the inverse stddevs were used because they would avoid the slightly more costly division operation (compared with simply multiplication operations). By using the inverse, the inverse (division) of each subsequence's stddev is only computed once and then used in multiplications. Here, we are basically re-computing the division every single time each subsequence's stddev is compared with another. There is no reason why you couldn't do something like:

Yeah...I remember your explanation. But I thought it was not possible to use the inverse here. I was wrong! config.STUMPY_MIN_STDDEVS_INVERSE = 1.0 (default) should work.

if denom < config.STUMPY_MIN_STD_AB:
cov = (
np.dot(
(T_B[i + k : i + k + m] - M_T[i + k]),
(T_A[i : i + m] - μ_Q[i]),
)
* m_inverse
)

pearson = cov / denom
if pearson > 1.0:
pearson = 1.0

# if config.STUMPY_CORRELATION_THRESHOLD <= pearson < 1.0:
# cov = (
# np.dot(
# (T_B[i + k : i + k + m] - M_T[i + k]),
# (T_A[i : i + m] - μ_Q[i]),
# )
# * m_inverse
# )

# pearson = cov * Σ_T_inverse[i + k] * σ_Q_inverse[i]

if pearson > ρ[thread_idx, i, 0]:
ρ[thread_idx, i, 0] = pearson
@@ -225,8 +247,8 @@ def _stump(
m,
M_T,
μ_Q,
Σ_T_inverse,
σ_Q_inverse,
Σ_T,
σ_Q,
M_T_m_1,
μ_Q_m_1,
T_A_subseq_isfinite,
@@ -259,11 +281,11 @@ def _stump(
μ_Q : numpy.ndarray
Mean of the query sequence, `Q`, relative to the current sliding window

Σ_T_inverse : numpy.ndarray
Inverse sliding standard deviation of time series, `T`
Σ_T : numpy.ndarray
Sliding standard deviation of time series, `T`

σ_Q_inverse : numpy.ndarray
Inverse standard deviation of the query sequence, `Q`, relative to the current
σ_Q : numpy.ndarray
Standard deviation of the query sequence, `Q`, relative to the current
sliding window

M_T_m_1 : numpy.ndarray
@@ -384,8 +406,8 @@ def _stump(
m,
M_T,
μ_Q,
Σ_T_inverse,
σ_Q_inverse,
Σ_T,
σ_Q,
cov_a,
cov_b,
cov_c,
@@ -545,7 +567,7 @@ def stump(T_A, m, T_B=None, ignore_trivial=True, normalize=True, p=2.0):
(
T_A,
μ_Q,
σ_Q_inverse,
σ_Q,
μ_Q_m_1,
T_A_subseq_isfinite,
T_A_subseq_isconstant,
@@ -554,7 +576,7 @@ def stump(T_A, m, T_B=None, ignore_trivial=True, normalize=True, p=2.0):
(
T_B,
M_T,
Σ_T_inverse,
Σ_T,
M_T_m_1,
T_B_subseq_isfinite,
T_B_subseq_isconstant,
@@ -600,8 +622,8 @@ def stump(T_A, m, T_B=None, ignore_trivial=True, normalize=True, p=2.0):
m,
M_T,
μ_Q,
Σ_T_inverse,
σ_Q_inverse,
Σ_T,
σ_Q,
M_T_m_1,
μ_Q_m_1,
T_A_subseq_isfinite,
12 changes: 6 additions & 6 deletions stumpy/stumped.py
@@ -141,7 +141,7 @@ def stumped(dask_client, T_A, m, T_B=None, ignore_trivial=True, normalize=True,
(
T_A,
μ_Q,
σ_Q_inverse,
σ_Q,
μ_Q_m_1,
T_A_subseq_isfinite,
T_A_subseq_isconstant,
@@ -150,7 +150,7 @@ def stumped(dask_client, T_A, m, T_B=None, ignore_trivial=True, normalize=True,
(
T_B,
M_T,
Σ_T_inverse,
Σ_T,
M_T_m_1,
T_B_subseq_isfinite,
T_B_subseq_isconstant,
@@ -202,8 +202,8 @@ def stumped(dask_client, T_A, m, T_B=None, ignore_trivial=True, normalize=True,
T_B_future = dask_client.scatter(T_B, broadcast=True, hash=False)
M_T_future = dask_client.scatter(M_T, broadcast=True, hash=False)
μ_Q_future = dask_client.scatter(μ_Q, broadcast=True, hash=False)
Σ_T_inverse_future = dask_client.scatter(Σ_T_inverse, broadcast=True, hash=False)
σ_Q_inverse_future = dask_client.scatter(σ_Q_inverse, broadcast=True, hash=False)
Σ_T_future = dask_client.scatter(Σ_T, broadcast=True, hash=False)
σ_Q_future = dask_client.scatter(σ_Q, broadcast=True, hash=False)
M_T_m_1_future = dask_client.scatter(M_T_m_1, broadcast=True, hash=False)
μ_Q_m_1_future = dask_client.scatter(μ_Q_m_1, broadcast=True, hash=False)
T_A_subseq_isfinite_future = dask_client.scatter(
@@ -238,8 +238,8 @@ def stumped(dask_client, T_A, m, T_B=None, ignore_trivial=True, normalize=True,
m,
M_T_future,
μ_Q_future,
Σ_T_inverse_future,
σ_Q_inverse_future,
Σ_T_future,
σ_Q_future,
M_T_m_1_future,
μ_Q_m_1_future,
T_A_subseq_isfinite_future,
2 changes: 1 addition & 1 deletion tests/naive.py
@@ -6,7 +6,7 @@
from stumpy import core, config


def z_norm(a, axis=0, threshold=1e-7):
def z_norm(a, axis=0, threshold=config.STUMPY_STDDEV_THRESHOLD):
std = np.std(a, axis, keepdims=True)
std[np.less(std, threshold, where=~np.isnan(std))] = 1.0

9 changes: 4 additions & 5 deletions tests/test_core.py
@@ -792,36 +792,35 @@ def test_preprocess_diagonal():

ref_T = np.array([0, 0, 2, 3, 4, 5, 6, 7, 0, 9], dtype=float)
ref_M, ref_Σ = naive.compute_mean_std(ref_T, m)
ref_Σ_inverse = 1.0 / ref_Σ
ref_M_m_1, _ = naive.compute_mean_std(ref_T, m - 1)

(
comp_T,
comp_M,
comp_Σ_inverse,
comp_Σ,
comp_M_m_1,
comp_T_subseq_isfinite,
comp_T_subseq_isconstant,
) = core.preprocess_diagonal(T, m)

npt.assert_almost_equal(ref_T, comp_T)
npt.assert_almost_equal(ref_M, comp_M)
npt.assert_almost_equal(ref_Σ_inverse, comp_Σ_inverse)
npt.assert_almost_equal(ref_Σ, comp_Σ)
npt.assert_almost_equal(ref_M_m_1, comp_M_m_1)

T = pd.Series(T)
(
comp_T,
comp_M,
comp_Σ_inverse,
comp_Σ,
comp_M_m_1,
comp_T_subseq_isfinite,
comp_T_subseq_isconstant,
) = core.preprocess_diagonal(T, m)

npt.assert_almost_equal(ref_T, comp_T)
npt.assert_almost_equal(ref_M, comp_M)
npt.assert_almost_equal(ref_Σ_inverse, comp_Σ_inverse)
npt.assert_almost_equal(ref_Σ, comp_Σ)
npt.assert_almost_equal(ref_M_m_1, comp_M_m_1)

