Description not matching function name #724

Closed
emanuelef opened this issue Jun 25, 2020 · 7 comments

Comments

@emanuelef
Contributor

Hi,
I just noticed a possible mismatch in the documentation. Is it possible that the descriptions for
percentage_of_reoccurring_datapoints_to_all_datapoints and
percentage_of_reoccurring_values_to_all_values
have been swapped?

def percentage_of_reoccurring_datapoints_to_all_datapoints(x):

def percentage_of_reoccurring_values_to_all_values(x):

@nils-braun
Collaborator

Hi @emanuelef !
Thanks for the issue :-) You are right, there is something unclear there.
In my opinion, the descriptions match the code:

   ...
        len(different values occurring more than once) / len(different values)
    This means the percentage is normalized to the number of unique values,
    ...
np.sum(counts > 1) / float(counts.shape[0])

and

    ...
        # of data points occurring more than once / # of all data points
    This means the ratio is normalized to the number of data points in the time series
    ...
reoccuring_values / x.size

However, one could argue that the function names themselves are swapped!
Would you like to do a pull request to change that?
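
To make the difference between the two ratios concrete, here is a minimal sketch that computes both on the series from the test, [1, 1, 2, 3, 4]. The function names are hypothetical and chosen to avoid the naming question discussed in this issue; this is not the tsfresh implementation, only an illustration of the two formulas quoted above.

```python
import numpy as np


def ratio_normalized_to_unique_values(x):
    """len(different values occurring more than once) / len(different values).

    Hypothetical helper name, used only for illustration.
    """
    x = np.asarray(x)
    if x.size == 0:
        return np.nan
    _, counts = np.unique(x, return_counts=True)
    # Count the distinct values that appear at least twice.
    return np.sum(counts > 1) / float(counts.shape[0])


def ratio_normalized_to_data_points(x):
    """# of data points occurring more than once / # of all data points.

    Hypothetical helper name, used only for illustration.
    """
    x = np.asarray(x)
    if x.size == 0:
        return np.nan
    _, counts = np.unique(x, return_counts=True)
    # Sum the occurrences of every value that appears at least twice.
    return np.sum(counts[counts > 1]) / float(x.size)


series = [1, 1, 2, 3, 4]
print(ratio_normalized_to_unique_values(series))  # 1 of 4 distinct values repeats
print(ratio_normalized_to_data_points(series))    # 2 of 5 data points are repeats
```

For this series the first ratio is 1/4 and the second is 2/5, so the two features really do measure different things and the names matter.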

@emanuelef
Contributor Author

Sure. I guess the tests also need to be fixed then:

self.assertAlmostEqualOnAllArrayTypes(percentage_of_reoccurring_datapoints_to_all_datapoints, [1, 1, 2, 3, 4],

I also wanted to ask why one function has two decorators:
@set_property("fctype", "simple")
@set_property("input", "pd.Series")
while the other one is missing "input".

Thanks

@emanuelef
Contributor Author

emanuelef commented Jun 26, 2020

Another question is about the name of the test for percentage_of_reoccurring_values_to_all_values,
which is currently test_ratio_of_doubled_values:

def test_ratio_of_doubled_values(self):

@emanuelef
Contributor Author

My last question is about creating the PR: I cannot create a branch here, so is it OK to create the PR from a fork?

@nils-braun
Collaborator

nils-braun commented Jun 27, 2020

Let me try to answer:

  • Yes, forking is the usual way (not only for tsfresh, but basically for every project on GitHub)
  • Feel free to change every naming that makes sense to you! We can discuss the details on the pull request :-)
  • The input parameter controls the type conversion. I need to look up the details, but it looks like it is not needed. You can leave it as it is for now

@nils-braun
Collaborator

Sorry, quick update: the input parameter is fine. The second function uses the "value_counts" method of the pandas Series, whereas the first one uses numpy functionality.
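
As a rough illustration of why only one function declares "input": a calculator that calls value_counts needs a pandas Series, while a numpy-based one works on a plain array. The sketch below uses a simplified stand-in for set_property and a hypothetical dispatcher, call_with_declared_input; both are assumptions for illustration and may differ from what tsfresh actually does.

```python
import numpy as np
import pandas as pd


def set_property(key, value):
    """Simplified stand-in: attach an attribute to the decorated function."""
    def decorate(func):
        setattr(func, key, value)
        return func
    return decorate


@set_property("fctype", "simple")
@set_property("input", "pd.Series")
def ratio_via_value_counts(x):
    # Relies on pandas: x must be a pd.Series for value_counts to exist.
    counts = x.value_counts()
    return counts[counts > 1].sum() / x.size


@set_property("fctype", "simple")
def ratio_via_numpy(x):
    # Pure numpy: works on any array-like input.
    _, counts = np.unique(x, return_counts=True)
    return np.sum(counts > 1) / float(counts.shape[0])


def call_with_declared_input(func, data):
    """Hypothetical dispatcher: convert data to the type the function declares."""
    if getattr(func, "input", None) == "pd.Series":
        data = pd.Series(data)
    else:
        data = np.asarray(data)
    return func(data)


print(call_with_declared_input(ratio_via_value_counts, [1, 1, 2, 3, 4]))
print(call_with_declared_input(ratio_via_numpy, [1, 1, 2, 3, 4]))
```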

@nils-braun
Collaborator

Fixed in referencing PR

earthgecko added a commit to earthgecko/tsfresh that referenced this issue Dec 31, 2020
IssueID #3924: v0.17.9

- Readded baseline unit tests
- Revert to the original sum_of_reoccurring_values v0.4.0 method which was
  changed and the new feature called sum_of_reoccurring_data_points was
  added which results in the same value as the original v0.4.0
  sum_of_reoccurring_values method. The new sum_of_reoccurring_values method
  introduced results in different results as per:
  NOT in baseline   :: [['value__sum_of_reoccurring_values', '49922.0']]
  NOT in calculated :: [['value__sum_of_reoccurring_values', '109822.0']]
- Disable estimate_friedrich_coefficients feature added in v0.6.0
- Disable friedrich_coefficients feature added in v0.6.0
- Disabled max_langevin_fixed_point added in v0.6.0
- Disabled friedrich_coefficients and max_langevin_fixed_point in settings added
  in v0.6.0
- Updated very minor precision changes in the following features which changed
  in v0.6.0
  value__autocorrelation__lag_6 old: 0.5124801685138611, new: 0.5124801685138614, diff: -0.00000000000000022204
  value__autocorrelation__lag_8 old: 0.3600822542968588, new: 0.3600822542968586, diff: 0.00000000000000022204
  value__autocorrelation__lag_5 old: 0.46463952576506423, new: 0.46463952576506445, diff: -0.00000000000000022204
  value__autocorrelation__lag_1 old: 0.5154799442499527, new: 0.5154799442499526, diff: 0.00000000000000011102
  value__autocorrelation__lag_7 old: 0.6538534951469427, new: 0.6538534951469428, diff: -0.00000000000000011102
  value__autocorrelation__lag_2 old: 0.36765813197781533, new: 0.36765813197781516, diff: 0.00000000000000016653
  value__autocorrelation__lag_9 old: 0.21748400096837436, new: 0.21748400096837414, diff: 0.00000000000000022204
  value__augmented_dickey_fuller old: -0.8041220342033505, new: -0.8041220342033477, diff: -0.00000000000000277556
  value__mean_autocorrelation old: 1.1720475293977406, new: 1.1720475293977404, diff: 0.00000000000000022204
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_2" old: -40.265846960764975, new: -40.26584696076512, diff: 0.00000000000014210855
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_2" old: 5485.741180131765, new: 5485.741180131762, diff: 0.00000000000272848411
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_2__w_2" old: 7535.022844459651, new: 7535.02284445965, diff: 0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_3__w_2" old: 6017.192007927548, new: 6017.192007927546, diff: 0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_2" old: 3308.4304014332156, new: 3308.4304014332133, diff: 0.00000000000227373675
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_2" old: 1295.7433671924819, new: 1295.7433671924832, diff: -0.00000000000136424205
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_2" old: 39.916767258584514, new: 39.91676725858371, diff: 0.00000000000080291329
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_2" old: 17.955485691823014, new: 17.95548569182395, diff: -0.00000000000093436370
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_2" old: 50.259030087877306, new: 50.25903008787768, diff: -0.00000000000037658765
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_2" old: 35.90470247450105, new: 35.90470247450137, diff: -0.00000000000031974423
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_2" old: -24.14602386100944, new: -24.14602386100941, diff: -0.00000000000002842171
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_2" old: -61.88712524130847, new: -61.88712524130824, diff: -0.00000000000022737368
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_2" old: -33.668504325219715, new: -33.66850432521918, diff: -0.00000000000053290705
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_2" old: 24.20883821024688, new: 24.2088382102474, diff: -0.00000000000051869620
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_5" old: -20.257597134272146, new: -20.25759713427192, diff: -0.00000000000022737368
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_5" old: 3771.325441515319, new: 3771.32544151532, diff: -0.00000000000090949470
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_2__w_5" old: 7120.960920890311, new: 7120.960920890312, diff: -0.00000000000090949470
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_5" old: 11207.92940647991, new: 11207.929406479912, diff: -0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_5" old: 11696.157551031656, new: 11696.157551031654, diff: 0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_6__w_5" old: 11253.943680982826, new: 11253.943680982822, diff: 0.00000000000363797881
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_5" old: 10110.89944351567, new: 10110.899443515671, diff: -0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_5" old: 8545.47382821769, new: 8545.473828217693, diff: -0.00000000000363797881
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_5" old: 6826.238621617836, new: 6826.238621617837, diff: -0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_5" old: 5169.353887616803, new: 5169.353887616802, diff: 0.00000000000090949470
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_5" old: 3717.969303101324, new: 3717.9693031013257, diff: -0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_5" old: 2542.0196875693546, new: 2542.019687569354, diff: 0.00000000000045474735
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_5" old: 1652.101855511854, new: 1652.1018555118546, diff: -0.00000000000068212103
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_5" old: 1019.5707851504084, new: 1019.5707851504081, diff: 0.00000000000022737368
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_10" old: 836.6419785398183, new: 836.6419785398173, diff: 0.00000000000102318154
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_10" old: 3543.0796763032777, new: 3543.079676303278, diff: -0.00000000000045474735
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_3__w_10" old: 8634.724847532967, new: 8634.724847532969, diff: -0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_4__w_10" old: 10876.523736377072, new: 10876.52373637707, diff: 0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_10" old: 12835.398940237148, new: 12835.39894023715, diff: -0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_6__w_10" old: 14466.10948981898, new: 14466.109489818979, diff: 0.00000000000181898940
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_10" old: 15737.72244365614, new: 15737.722443656134, diff: 0.00000000000545696821
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_10" old: 17169.076640994837, new: 17169.07664099483, diff: 0.00000000000727595761
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_10" old: 17183.302683017104, new: 17183.302683017107, diff: -0.00000000000363797881
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_10" old: 15154.905872253841, new: 15154.905872253847, diff: -0.00000000000545696821
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_0__w_20" old: 18718.957258866503, new: 18718.957258866507, diff: -0.00000000000363797881
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_1__w_20" old: 20645.63503140842, new: 20645.635031408423, diff: -0.00000000000363797881
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_5__w_20" old: 28065.04062099347, new: 28065.040620993466, diff: 0.00000000000363797881
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_7__w_20" old: 31428.519814904776, new: 31428.519814904783, diff: -0.00000000000727595761
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_8__w_20" old: 32985.81511950059, new: 32985.8151195006, diff: -0.00000000000727595761
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_9__w_20" old: 34437.5408408601, new: 34437.54084086011, diff: -0.00000000000727595761
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_10__w_20" old: 35770.92323199827, new: 35770.923231998284, diff: -0.00000000001455191523
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_11__w_20" old: 36992.814788488264, new: 36992.81478848827, diff: -0.00000000000727595761
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_12__w_20" old: 38098.193912726434, new: 38098.19391272645, diff: -0.00000000001455191523
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_13__w_20" old: 39076.9898057395, new: 39076.98980573952, diff: -0.00000000002182787284
  "value__cwt_coefficients__widths_(2, 5, 10, 20)__coeff_14__w_20" old: 39919.05725014527, new: 39919.05725014526, diff: 0.00000000000727595761
  value__spkt_welch_density__coeff_2 old: 1843.821171807498, new: 1843.8211718074986, diff: -0.00000000000045474735
  value__spkt_welch_density__coeff_8 old: 2536.9954700088933, new: 2536.9954700088906, diff: 0.00000000000272848411
  value__ar_coefficient__k_10__coeff_0 old: 904.439185079118, new: 904.4391850794491, diff: -0.00000000033105607145
  value__ar_coefficient__k_10__coeff_1 old: 0.16357894811580564, new: 0.1635789481157781, diff: 0.00000000000002753353
  value__ar_coefficient__k_10__coeff_2 old: -0.04324700014744565, new: -0.0432470001474492, diff: 0.00000000000000355271
  value__ar_coefficient__k_10__coeff_3 old: -0.06654237068303814, new: -0.06654237068301239, diff: -0.00000000000002575717
  value__ar_coefficient__k_10__coeff_4 old: 0.2836853193919353, new: 0.2836853193919273, diff: 0.00000000000000799361
  value__fft_coefficient__coeff_1 old: -0.8045103874789135, new: -0.8045103874789561, diff: 0.00000000000004263256
  value__fft_coefficient__coeff_2 old: -53.13286168327596, new: -53.13286168327602, diff: 0.00000000000005684342
  value__fft_coefficient__coeff_3 old: -338.00000000000006, new: -338.0, diff: -0.00000000000005684342
  value__fft_coefficient__coeff_4 old: 122.44503935479224, new: 122.44503935479203, diff: 0.00000000000021316282
  value__fft_coefficient__coeff_5 old: -58.930796134231116, new: -58.930796134230846, diff: -0.00000000000027000624
  value__fft_coefficient__coeff_6 old: 13.000000000000057, new: 13.0, diff: 0.00000000000005684342
  value__fft_coefficient__coeff_7 old: 112.23530652170982, new: 112.23530652170984, diff: -0.00000000000002842171
  value__fft_coefficient__coeff_8 old: 118.18782232848393, new: 118.18782232848395, diff: -0.00000000000001421085
- Readded baseline unit tests removed in v0.7.0
- Readded large_number_of_peaks removed in v0.9.0
- Readded mean_autocorrelation removed in v0.9.0
- Reverted to original augmented_dickey_fuller that was changed in v0.9.0
- Reverted to original fft_coefficient that was changed in v0.9.0
- Readded mean_abs_change_quantiles that was removed in v0.9.0
- Readded the original time_reversal_asymmetry_statistic that was in use pre
  v0.9.0 - blue-yonder#198
- Readded original autocorrelation that was removed in v0.9.0
- Disabled partial_autocorrelation added in v0.10.0
- Disabled cid_ce added in v0.11.1
- Disabled fft_aggregated added in v0.11.0
- Disabled Fix agg change made to agg_autocorrelation added in v0.11.1
blue-yonder@a53fb6a
- Changed to new value_count and range_count method added in v0.11.1
- Hardcoded TSFRESH_BASELINE_VERSION = '0.9.1' in tests
- Disabled linear_trend_timewise added in v0.12.0
- Readded tsfresh/examples/test_tsfresh_baseline_dataset.py which was removed
  in v0.12.0
- Use v0.11.01 value_count and range_count method not as per v0.13.0
- Disabled count_above and count_below features that were added in v0.15.0
- Readded the original percentage_of_reoccurring_datapoints_to_all_datapoints
  before the feature name change to percentage_of_reoccurring_values_to_all_values
  implemented in v0.17.0 (feature names should be immutable)
  blue-yonder#725
  blue-yonder@6f9c795
  blue-yonder#724
- Rename the new feature percentage_of_reoccurring_values_to_all_values to
  v0170_percentage_of_reoccurring_values_to_all_values and disabled
- Readded the original percentage_of_reoccurring_values_to_all_values
  before the feature name change to percentage_of_reoccurring_datapoints_to_all_datapoints
  implemented in v0.17.0 (feature names should be immutable)
- Rename the new feature percentage_of_reoccurring_datapoints_to_all_datapoints
  to v0170_percentage_of_reoccurring_datapoints_to_all_datapoints and disabled
- Disabled lempel_ziv_complexity,fourier_entropy and permutation_entropy
  features that were added in v0.17.0
- Revert to the original cwt_coefficients feature names changed in v0.16.0
- Renamed the new sample_entropy introduced in v0.16.0 to v0160_sample_entropy
  and readded sample_entropy from v0.15.1 as this is a breaking change as per:
  blue-yonder#681 and
  blue-yonder@ce493e5
- Configured settings for pre v0.9.0 features
- Hardcoded TSFRESH_BASELINE_VERSION = '0.17.9' in tests

Added:
tests/baseline/tsfresh-0.1.2.py2.data.json.features.transposed.csv
tests/baseline/tsfresh-0.3.0.py2.data.json.features.transposed.csv
tests/baseline/tsfresh-0.3.0.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.3.1.py2.data.json.features.transposed.csv
tests/baseline/tsfresh-0.3.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.4.0.py2.data.json.features.transposed.csv
tests/baseline/tsfresh-0.4.0.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.5.0.py2.data.json.features.transposed.csv
tests/baseline/tsfresh-0.5.0.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.5.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.6.0.py2.data.json.features.transposed.csv
tests/baseline/tsfresh-0.6.0.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.6.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.7.2.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.8.2.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.9.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.10.2.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.11.3.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.12.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.13.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.14.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.15.2.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.16.1.py3.data.json.features.transposed.csv
tests/baseline/tsfresh-0.17.9.py3.data.json.features.transposed.csv
tests/baseline/tsfresh_features_test.py
Modified:
CHANGES.rst
README.md
tsfresh/feature_extraction/feature_calculators.py
tsfresh/feature_extraction/settings.py