This repository was archived by the owner on Jun 5, 2023. It is now read-only.

Enhance whitelist #308

Merged: 7 commits, Aug 6, 2019


detobel36
Contributor

Remove the loop in the whitelist test. Use the in operator for literal whitelist entries and filter for regex entries.

Tests were adapted.
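
To illustrate the change described above, here is a minimal sketch of replacing an explicit loop with the in operator (literals) and filter (regex). It is not the code from this pull request: the function names and the flat list of strings used as input are hypothetical and chosen only for the example.

```python
import re


def literal_whitelisted_loop(literal, values):
    """Loop-based check: compare the literal with every value in turn."""
    for value in values:
        if value == literal:
            return True
    return False


def literal_whitelisted_in(literal, values):
    """Same result as the loop, delegated to the built-in membership test."""
    return literal in values


def regex_whitelisted_filter(pattern, values):
    """Return True if at least one value matches the regex from its start.

    filter() is lazy, so next() stops at the first match instead of
    scanning the remaining values.
    """
    compiled = re.compile(pattern, re.IGNORECASE)
    return next(filter(compiled.match, values), None) is not None


# Hypothetical document values, for illustration only.
values = ["C:\\Windows\\System32\\cmd.exe", "powershell.exe", "svchost.exe"]

print(literal_whitelisted_loop("svchost.exe", values))
print(literal_whitelisted_in("svchost.exe", values))
print(regex_whitelisted_filter(r"^power.*\.exe$", values))
print(regex_whitelisted_filter(r"^notepad", values))
```

The membership test and filter run their iteration inside the interpreter's built-ins rather than in Python bytecode, which is one plausible reason for the lower own time reported in the profiles.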

@detobel36
Contributor Author

Some stats. Note that these measurements were made with py-spy on a run that has to delete 91,221 outliers from ES and then process 129,160 events.

Result with the current development code (py-spy columns: %Own, %Total, OwnTime, TotalTime, function and file):

0.00% 100.00%   0.000s    136.6s   run_interactive_mode (outliers.py)
0.00% 100.00%   0.000s    136.6s   run_outliers (outliers.py)
0.00% 100.00%   0.000s    112.5s   perform_analysis (outliers.py)
0.00%  57.00%   0.250s    53.68s   process_outlier (helpers/analyzer.py)
0.00%  57.00%   0.290s    53.43s   process_outlier (helpers/es.py)
0.00%  55.00%   0.300s    51.10s   save_outlier (helpers/es.py)
0.00%  53.00%   0.090s    47.66s   add_update_bulk_action (helpers/es.py)
0.00%  63.00%   0.100s    47.61s   flush_bulk_actions (helpers/es.py)
0.00%  53.00%   0.060s    47.57s   add_bulk_action (helpers/es.py)
0.00%  20.00%   0.020s    38.34s   _evaluate_batch_for_outliers (analyzers/terms.py)
0.00%  20.00%   0.040s    38.32s   _evaluate_aggregator_for_outliers_within (analyzers/terms.py)
0.00%  20.00%   0.560s    38.13s   _evaluate_each_aggregator_for_outliers (analyzers/terms.py)
0.00%   0.00%   0.020s    24.09s   remove_all_outliers (helpers/es.py)
2.00%  11.00%   0.900s    23.25s   _create_outlier (analyzers/terms.py)
0.00%   9.00%   0.520s    20.57s   create_outlier (helpers/analyzer.py)
0.00%   7.00%   0.780s    17.03s   _prepare_outlier_parameters (helpers/analyzer.py)
2.00%   7.00%    1.60s    15.54s   extract_outlier_asset_information (helpers/utils.py)
7.00%   7.00%   14.34s    14.34s   get_dotkey_value (helpers/utils.py)
0.00%   9.00%   0.270s    13.82s   is_whitelisted (helpers/outlier.py)
4.00%   9.00%    2.92s    13.55s   is_whitelisted_doc (helpers/outlier.py)
1.00%   5.00%    2.55s    11.70s   dict_contains_dotkey (helpers/utils.py)
0.00%   3.00%   0.260s     7.30s   _compute_aggregator_and_target_value (analyzers/terms.py)
0.00%   3.00%    1.36s     6.92s   flatten_fields_into_sentences (helpers/utils.py)
3.00%   3.00%    4.71s     4.71s   dictionary_matches_specific_whitelist_item_regexp (helpers/outlier.py)

Result with the code from this pull request (same py-spy columns: %Own, %Total, OwnTime, TotalTime, function and file):

0.00% 100.00%   0.000s    139.0s   <module> (outliers.py)
0.00% 100.00%   0.000s    135.7s   run_interactive_mode (outliers.py)
0.00% 100.00%   0.000s    135.7s   run_outliers (outliers.py)
0.00% 100.00%   0.000s    114.0s   perform_analysis (outliers.py)
0.00%  89.00%   0.690s    113.9s   evaluate_model (analyzers/terms.py)
0.00%  66.00%   0.260s    59.47s   process_outlier (helpers/analyzer.py)
0.00%  66.00%   0.420s    59.21s   process_outlier (helpers/es.py)
0.00%  64.00%   0.210s    56.59s   save_outlier (helpers/es.py)
0.00%  63.00%   0.150s    52.73s   add_update_bulk_action (helpers/es.py)
0.00%  63.00%   0.150s    52.58s   add_bulk_action (helpers/es.py)
0.00%  74.00%   0.050s    52.54s   flush_bulk_actions (helpers/es.py)
0.00%  17.00%   0.030s    34.05s   _evaluate_batch_for_outliers (analyzers/terms.py)
0.00%  17.00%   0.050s    34.02s   _evaluate_aggregator_for_outliers_within (analyzers/terms.py)
0.00%  17.00%   0.750s    33.82s   _evaluate_each_aggregator_for_outliers (analyzers/terms.py)
1.00%  13.00%   0.840s    22.01s   _create_outlier (analyzers/terms.py)
0.00%   0.00%   0.030s    21.70s   remove_all_outliers (helpers/es.py)
0.00%  11.00%   0.650s    19.72s   create_outlier (helpers/analyzer.py)
0.00%   9.00%   0.710s    16.30s   _prepare_outlier_parameters (helpers/analyzer.py)
1.00%   8.00%    1.58s    14.79s   extract_outlier_asset_information (helpers/utils.py)
6.00%   6.00%   14.15s    14.15s   get_dotkey_value (helpers/utils.py)
1.00%   7.00%    2.30s    11.32s   dict_contains_dotkey (helpers/utils.py)
0.00%   4.00%   0.330s    10.83s   is_whitelisted (helpers/outlier.py)
2.00%   4.00%    4.17s    10.50s   is_whitelisted_doc (helpers/outlier.py)
0.00%   0.00%   0.330s     7.09s   _compute_aggregator_and_target_value (analyzers/terms.py)
0.00%   0.00%    1.15s     6.67s   flatten_fields_into_sentences (helpers/utils.py)
0.00%   0.00%    2.81s     4.31s   items (configparser.py)

Concretely, total time went from 13.82s to 10.83s for is_whitelisted and from 13.55s to 10.50s for is_whitelisted_doc.
Note also that dictionary_matches_specific_whitelist_item_regexp no longer appears in the list. The last listed entry takes 4.31s, so we can conclude that it now takes less than about 4.3s.
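
For reference, the relative gain implied by the figures above can be worked out as follows. This is only arithmetic on the two quoted py-spy totals; the helper name is made up for the example, and since each profile is a single run, the percentages should be read as indicative.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# (before, after) total times in seconds, copied from the two profiles.
timings = {
    "is_whitelisted": (13.82, 10.83),
    "is_whitelisted_doc": (13.55, 10.50),
}

for name, (before, after) in timings.items():
    saved = before - after
    print(f"{name}: {saved:.2f}s saved ({relative_reduction(before, after):.1f}% lower)")

# Output:
# is_whitelisted: 2.99s saved (21.6% lower)
# is_whitelisted_doc: 3.05s saved (22.5% lower)
```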
