From a699cce6037ff038f036452193312681f16c3739 Mon Sep 17 00:00:00 2001
From: Stephen0620 <41546633+Stephen0620@users.noreply.github.com>
Date: Sun, 30 Jun 2019 18:12:23 -0700
Subject: [PATCH] Updated the documentation for time series (#158)
* Update readme with latest feedback (#39)
Updating readme with latest feedback.
* Add THIRD-PARTY-NOTICES.txt and move CONTRIBUTING.md to root. (#40)
* Initial checkin
* Move to Hosted Mac pool
* Update README.md
* Manually copied naming changes over from master.
* Revert "Merge remote-tracking branch 'upstream/temp/docs'"
This reverts commit 93c73476e42e687c48889b58eb678b826dcbc41e, reversing
changes made to 23500695a07b587f4b15420c874514940b42c74b.
* Improve documentation regarding contributors.
* Fix email address.
* Create CODE_OF_CONDUCT.md
* Update issue templates
* Create PULL_REQUEST_TEMPLATE.md
* Update issue templates
* Update issue templates
* Update issue templates
* Fixing link in CONTRIBUTING.md (#44)
* Update contributing.md link. (#43)
* Initial checkin for ML.NET 0.7 upgrade
* fix tests
* put back columndropper
* fix tests
* Update scikit-learn links to use https instead of http
* restart dotnetcore2 package work
* fix build
* fix mac & linux
* fix build
* fix build
* dbg build
* fix build
* fix build
* handle py 2.7
* handle py27
* fix py27
* fix build
* fix build
* fix build
* ensure dependencies
* ignore exceptions from ensure dependencies
* up version
* Update cv.py
add case for X is data frame
* Update cv.py
add a space
* add a test for cv with data frame
* set DOTNET_SYSTEM_GLOBALIZATION_INVARIANT to true to fix app domain error
* fix build
* up version
* Add instructions for editing docstrings. (#51)
* Add instructions for editing docstrings.
* Add footnote giving more information.
* Fix build failures caused by dotnetcore2 module. (#67)
* Fix importing of the dotnetcore2 module because it has inconsistent folder naming.
* Fix file check for unix platforms.
* Fix indentation levels.
* Reduce number of build legs for PR validations and add nightly build definition with more robust build matrix. (#69)
* Increase version to 0.6.5. (#71)
* Update clr helper function to search multiple folders for clr binaries. (#72)
* Update clr helper function to search multiple folders for clr binaries.
* Moved responsibility for Python version checking to utility functions.
* Add clarifying comments.
* Fix call to get_nimbusml_libs()
* fix drop column param name
* Remove restricted permissions on build.sh script.
* Fix lightgbm test failures by updating runtime dependencies.
* fix TensorFlowScorer model_location parameter name
* Fix build.sh defaults so that it detects when running on a mac.
* Since OneHotHashVectorizer is broken for output kind Key in ML.NET 0.7, use ToKey() for unit tests
* fix tests
* fix pyproj test
* fix win 3.6 build
* fix comments
* expose "parallel" to the fit/fit_transform function by including **param to the argument
* add a test for the parallel
* update parallel thread
* fix tests comparison
* Update thread, retry build
* modify tests
* specify pytest-cov version
* update pytest-cov version in build command for linux
* for windows use the latest pytest-cov
* Enabled strong naming for DotNetBridge.dll (to be used for InternalsVisibleTo in ML.NET)
* Changed the keys to be the same as other internal repos
* Changed the key filename
* Update to ML.NET 0.10.preview (#77)
* Updating ML.NET nugets to latest 0.9 preview.
* --generate_entrypoints phase 1
* Fixed Models.CrossValidator
* Updated all entrypoints
* New manifest.json, picked from Monte's branch
* Updated API codegen
* Replace ISchema and SchemaImpl with Schema and SchemaBuilder.
* Revert "Replace ISchema and SchemaImpl with Schema and SchemaBuilder."
This reverts commit dcd749d6a7d13c8768a62c4b8db377b3b8d62eaf.
* Refactor IRowCursor to RowCursor.
* Update ML.NET version in build.csproj.
* Update manifest.json to ml.net commit 92e762686989215ddf45d9db3f0a1c989ee54d11
* Updated RunGraph.cs to ml.net 0.10
* Refactor Vbuffer
* Added override to RowCursor methods
* Update to NimbusML-privileged nugets from ML.NET.
* Update to Microsoft.ML namespace without Runtime.
* Schema and VBuffer fixes in NativeDataInterop.
* API fixes for IRandom and IsText in RmlEnvironment and NativeDataView.
* Work on getting VBuffer pointers from Spans.
* Some VBuffer fixes
* fix some class names
* Fix Register Assembly names.
* Remove ML.PipelineInference
* fixed more classes
* Add back columndropper for backward compatibility.
* Register Entrypoints assembly in environment.
* Fix homebrew update problem on VS Hosted Mac images.
* Updated all the nuget versions to be the same.
* Attempt to fix the dataframe unit tests
* Fixed test_pyproj
* Optimized VBuffer changes
* Changed bridge version value to 0.10
* Addressed PR comments
* Simplify by using six.string_types (#89)
* Simplify by using six.string_types
* Force a retest
* Removed ISchema from DotNetBridge (#90)
* Removed ISchema
* Fixed the tests
* Addressed PR comments
* Addressed Wei-Sheng's comments about documenting the purpose of Column.DetachedColumn.
* add configuration for python 3.7 (#101)
* add configuration for python 3.7
* fix broken unit test
* Update build.sh
* fix build for Windows
* Linux py3.7 build
* fix pytest version
* upgrade pytest
* fix pytest-cov version
* fix isinstance(., int) for python 2.7
* build urls for Mac
* final fixes
* fix libomp
* Removing 3.7 for now as it's not in PyPI
* Upgrade to ML.NET version 1.0.0 (#100)
* ref v0.10 ML.NET
* fix build
* hook up to v0.11.0 ML.NET
* fix build errors
* fix build
* include Microsoft.Data.DataView.dll in build
* typo
* remove protobuf dll
* Regenerate code due to manifest changes
* fix missing ep
* Update to ML.NET 1.0.0-preview
* fix .net build
* update nuget for ML.NET
* remove Data namespace dll
* rollback nuget changes
* move to final RC ML.NET
* Regenerate classes as per updated manifest
* fix maximum_number_of_iterations param name
* fix parameter names
* fix names
* reference official v1.0 of ML.NET
* fix tests
* fix label column
* Fix tests
* fix lightgbm tests
* fix OLS
* fix tests
* fix more tests
* fix more tests
* fix weight column name
* more tests
* fix normalized metrics
* more errors
* Fix CV
* rename feature_column to feature_column_name
* fix cv ranker
* Fix lightgbm tests
* fix changes due to upgrade of NGramFeaturizer
* fix ngram featurizer
* fix FactorizationMachine assert error
* disable test which is not working now due to change in LightGbm version
* fix model name
* typo
* handle nan in arrays
* fix tests
* fix tests
* fix more tests
* fix data type
* fix AUC exception
* kick the build
* fix tests due to data change
* fix ngram test
* fix mutual info tests
* copy libiomp lib
* fix mac build
* disable SymSgdNative for now
* disable SymSgdBinary classifier tests for Linux
* fix linux tests
* fix linux tests
* try linux
* fix linux
* skip SymSgdBinaryClassifier checks
* fix entrypoint compiler
* fix entry point generation
* fix example tests run
* fix typo
* fix documentation regression
* fix parameter name
* fix examples
* fix examples
* fix tests
* fix tests
* fix linux
* kick build
* Fix code_fixer
* fix skip take filters
* fix estimator checks
* Fix latest Windows build issues. (#105)
* Fix build issue on Windows when VS2019 is installed.
Note: The -version option could not be added directly
to the FOR command due to a command script parsing issue.
* Add missing arguments to fix build issue with latest version of autoflake.
* Fixes #50 - summary() fails if called a second time. (#107)
* Fixes #50 - summary() fails if called a second time.
* Fixes #99. Do not use hardcoded file separator. (#108)
Fixes #99. Do not use hard coded file separator.
* Delete the cached summaries when refitting a pipeline or a predictor. (#109)
* Fix build issue on Windows when VS2019 is installed.
Note: The -version option could not be added directly
to the FOR command due to a command script parsing issue.
* Add missing arguments to fix build issue with latest version of autoflake.
* Delete the cached summaries when refitting a pipeline or a predictor.
Fixes #106
* Simplify the code that deletes cached summaries when calling fit.
* Fix signature import error when using latest version of scikit-learn. (#116)
* Fix signature import error when using latest version of scikit-learn.
Fixes #111
* Move the conditional import of the signature method in to the utils package.
* Package System.Drawing.Common.dll as it's missing in dotnetcore2 (#120)
* package System.Drawing.Common.dll as it's missing in dotnetcore2
* typo
* Add png for Image examples
* try linux fix
* rollback scikit learn version
* test
* debug
* rollback test
* rollback
* fix fontconfig err
* fix tests
* print platform
* get os names
* test
* test
* fix linux
* Upgrade the pytest-remotedata package to fix missing attribute error. (#121)
* Upgrade the pytest-remotedata package to fix missing attribute error.
Fixes #117
* Remove the RlsMacPy3.6 configuration from .vsts-ci.yml.
* Upgrade version (#122)
* package System.Drawing.Common.dll as it's missing in dotnetcore2
* typo
* Add png for Image examples
* try linux fix
* rollback scikit learn version
* test
* debug
* rollback test
* rollback
* fix fontconfig err
* fix tests
* print platform
* get os names
* test
* test
* fix linux
* Upgrade version
* Support quoted strings by default (#124)
* upgrade to ML.NET 1.1 (#126)
* upgrade to ML.NET 1.1
* by default quote is +
* assert changes due to quote
* fix tensor flow example
* Put long running tests in to their own folder to shorten build times. (#136)
* Temporarily remove the dataframe examples from the test run
to see how much that affects the test length.
* Remove all examples from the tests to see how it impacts the CI run.
* Put long running tests in to their own folder to shorten build times.
* Update nimbusml.pyproj to reflect the newly moved test files.
Forgot to save the nimbusml.pyproj in visual studio.
* Expose ML.NET SSA & IID spike & changepoint detectors. (#135)
* Initial creation of the IidSpikeDetector files to see what works and
what doesn't.
* Import the Microsoft.ML.TimeSeries assembly in to the project.
* Use 'PassAs' in manifest.json to fix the source parameter name.
* Use float32 for data dtype in IidSpikeDetector example.
* Convert IidSpikeDetector to a standard transform. Add examples and tests.
* Add pre-transform to IidSpikeDetector to fix incompatible data types.
* Fix issues with the test_estimator_checks IidSpikeDetector tests.
* Remove unnecessary TypeConverter import in IidSpikeDetector example.
* Initial implementation of IidChangePointDetector.
* Initial implementation of SsaSpikeDetector.
* Initial implementation of SsaChangePointDetector.
* Fix incorrect SsaSpikeDetector instance in test_estimator_checks.
* Fix a few minor issues with time series unit tests and examples. (#139)
* Skip Image.py and Image_df.py tests for Ubuntu 14 (#149)
* Fixed the script for generating the documentation (#144)
* Moved _static to ci_script to solve an error while using sphinx
* Removed amek_md.bat and merged its commands into make_yaml.bat
* Moved metrics.rst to concepts
* Rename time_series package to timeseries. (#150)
* Initial checkin
* Move to Hosted Mac pool
* Manually copied naming changes over from master.
* merge master to temp/docs for updating the documentation (#134)
* merge master to documentation branch
* fixed the ModuleNotFoundError for WordEmbedding_df.py
* Merge branch 'documentation' into temp/docs (#143)
* merge master to documentation branch
* fixed the ModuleNotFoundError for WordEmbedding_df.py
* Fixed the issue when generating the documentation guide and concepts
* Moved _static to the right folder, and change PY36 to PY37 now
* Made it work with Python3.6
* Put long running tests in to their own folder to shorten build times. (#136)
* Temporarily remove the dataframe examples from the test run
to see how much that affects the test length.
* Remove all examples from the tests to see how it impacts the CI run.
* Put long running tests in to their own folder to shorten build times.
* Update nimbusml.pyproj to reflect the newly moved test files.
Forgot to save the nimbusml.pyproj in visual studio.
* Added underscores to the time series file names
---
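Note on the SSA parameters: the docstrings added in this patch state that
seasonal_window_size (L) must satisfy 2 < L < N/2, where N is
training_window_size. The check below is a minimal sketch of that rule only.
The helper name is hypothetical and is not part of nimbusml or ML.NET; how
the compute engine itself reports a violation is not shown here.

```python
def ssa_window_sizes_are_valid(training_window_size=100,
                               seasonal_window_size=10):
    """Return True when the documented constraint 2 < L < N/2 holds.

    Hypothetical helper for illustration only (not a nimbusml API).
    N is training_window_size, L is seasonal_window_size.
    """
    n_train = training_window_size
    window = seasonal_window_size
    return 2 < window < n_train / 2


# The defaults used by SsaSpikeDetector and SsaChangePointDetector in this
# patch (N=100, L=10) satisfy the constraint.
defaults_ok = ssa_window_sizes_are_valid()

# L=50 is not strictly below N/2 = 50, so it is rejected.
too_large = ssa_window_sizes_are_valid(100, 50)
```

Running such a check before building a pipeline is one way to catch an
invalid window pair early, under the assumption that the documented
inequality is the only constraint.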
src/python/nimbusml.pyproj | 10 +-
.../timeseries/_iidchangepointdetector.py | 107 +++++++++++++
.../core/timeseries/_iidspikedetector.py | 91 +++++++++++
.../timeseries/_ssachangepointdetector.py | 138 ++++++++++++++++
.../core/timeseries/_ssaspikedetector.py | 129 +++++++++++++++
src/python/nimbusml/timeseries/__init__.py | 10 +-
.../timeseries/_iidchangepointdetector.py | 119 ++++++++++++++
.../nimbusml/timeseries/_iidspikedetector.py | 101 ++++++++++++
.../timeseries/_ssachangepointdetector.py | 147 ++++++++++++++++++
.../nimbusml/timeseries/_ssaspikedetector.py | 136 ++++++++++++++++
10 files changed, 978 insertions(+), 10 deletions(-)
create mode 100644 src/python/nimbusml/internal/core/timeseries/_iidchangepointdetector.py
create mode 100644 src/python/nimbusml/internal/core/timeseries/_iidspikedetector.py
create mode 100644 src/python/nimbusml/internal/core/timeseries/_ssachangepointdetector.py
create mode 100644 src/python/nimbusml/internal/core/timeseries/_ssaspikedetector.py
create mode 100644 src/python/nimbusml/timeseries/_iidchangepointdetector.py
create mode 100644 src/python/nimbusml/timeseries/_iidspikedetector.py
create mode 100644 src/python/nimbusml/timeseries/_ssachangepointdetector.py
create mode 100644 src/python/nimbusml/timeseries/_ssaspikedetector.py
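
Note on the martingale score: the change point docstrings below say the
score "starts increasing significantly when a sequence of small p-values are
detected in a row". The sketch that follows illustrates that behaviour with
the standard power betting function epsilon * p**(epsilon - 1), summed in log
space over a sliding window. It is an illustration under stated assumptions,
not the ML.NET implementation: the function name, the log-space windowing,
and the clamping of p-values are all choices made for this example.

```python
import math
from collections import deque


def windowed_power_martingale_log(p_values, epsilon=0.1,
                                  change_history_length=20):
    """Log of a power martingale over a sliding window of p-values.

    Hypothetical illustration only. Each p-value contributes
    log(epsilon) + (epsilon - 1) * log(p); the score is the sum of the
    contributions of the last `change_history_length` p-values.
    """
    window = deque(maxlen=change_history_length)
    scores = []
    for p in p_values:
        # Clamp to (0, 1] so that log(p) is always defined.
        p = min(max(p, 1e-12), 1.0)
        window.append(math.log(epsilon) + (epsilon - 1.0) * math.log(p))
        scores.append(sum(window))
    return scores


# 30 unremarkable p-values followed by 5 very small ones.
scores = windowed_power_martingale_log([0.5] * 30 + [0.001] * 5)
```

With unremarkable p-values each term is negative and the score stays low;
once small p-values arrive, each new term is large and positive, so the
windowed score climbs with every step.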
diff --git a/src/python/nimbusml.pyproj b/src/python/nimbusml.pyproj
index 6c3542a6..47809618 100644
--- a/src/python/nimbusml.pyproj
+++ b/src/python/nimbusml.pyproj
@@ -592,11 +592,11 @@
-
-
-
-
-
+
+
+
+
+
diff --git a/src/python/nimbusml/internal/core/timeseries/_iidchangepointdetector.py b/src/python/nimbusml/internal/core/timeseries/_iidchangepointdetector.py
new file mode 100644
index 00000000..ae874a1c
--- /dev/null
+++ b/src/python/nimbusml/internal/core/timeseries/_iidchangepointdetector.py
@@ -0,0 +1,107 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+IidChangePointDetector
+"""
+
+__all__ = ["IidChangePointDetector"]
+
+
+from ...entrypoints.timeseriesprocessingentrypoints_iidchangepointdetector import \
+    timeseriesprocessingentrypoints_iidchangepointdetector
+from ...utils.utils import trace
+from ..base_pipeline_item import BasePipelineItem, DefaultSignature
+
+
+class IidChangePointDetector(BasePipelineItem, DefaultSignature):
+    """
+
+    This transform detects the change-points in an i.i.d. sequence using
+    adaptive kernel density estimation and martingales.
+
+    .. remarks::
+        ``IIDChangePointDetector`` assumes a sequence of data points that are
+        independently sampled from one
+        stationary distribution. `Adaptive kernel density estimation
+        `_
+        is used to model the distribution.
+
+        This transform detects
+        change points by calculating the martingale score for the sliding
+        window based on the estimated distribution.
+        The idea is based on the `Exchangeability
+        Martingales `_ that
+        detects a change of distribution over a stream of i.i.d. values. In
+        short, the value of the
+        martingale score starts increasing significantly when a sequence of
+        small p-values are detected in a row; this
+        indicates the change of the distribution of the underlying data
+        generation process.
+
+    :param confidence: The confidence for change point detection in the range
+        [0, 100]. Used to set the threshold of the martingale score for
+        triggering alert.
+
+    :param change_history_length: The length of the sliding window on p-value
+        for computing the martingale score.
+
+    :param martingale: The type of martingale betting function used for
+        computing the martingale score. Available options are {``Power``,
+        ``Mixture``}.
+
+    :param power_martingale_epsilon: The epsilon parameter for the Power
+        martingale if martingale is set to ``Power``.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IIDSpikeDetector
+        `,
+        :py:func:`SsaSpikeDetector
+        `,
+        :py:func:`SsaChangePointDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+       .. literalinclude::
+           /../nimbusml/examples/IidSpikeChangePointDetector.py
+           :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            confidence=95.0,
+            change_history_length=20,
+            martingale='Power',
+            power_martingale_epsilon=0.1,
+            **params):
+        BasePipelineItem.__init__(
+            self, type='transform', **params)
+
+        self.confidence = confidence
+        self.change_history_length = change_history_length
+        self.martingale = martingale
+        self.power_martingale_epsilon = power_martingale_epsilon
+
+    @property
+    def _entrypoint(self):
+        return timeseriesprocessingentrypoints_iidchangepointdetector
+
+    @trace
+    def _get_node(self, **all_args):
+        algo_args = dict(
+            source=self.source,
+            name=self._name_or_source,
+            confidence=self.confidence,
+            change_history_length=self.change_history_length,
+            martingale=self.martingale,
+            power_martingale_epsilon=self.power_martingale_epsilon)
+
+        all_args.update(algo_args)
+        return self._entrypoint(**all_args)
diff --git a/src/python/nimbusml/internal/core/timeseries/_iidspikedetector.py b/src/python/nimbusml/internal/core/timeseries/_iidspikedetector.py
new file mode 100644
index 00000000..00712d77
--- /dev/null
+++ b/src/python/nimbusml/internal/core/timeseries/_iidspikedetector.py
@@ -0,0 +1,91 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+IidSpikeDetector
+"""
+
+__all__ = ["IidSpikeDetector"]
+
+
+from ...entrypoints.timeseriesprocessingentrypoints_iidspikedetector import \
+    timeseriesprocessingentrypoints_iidspikedetector
+from ...utils.utils import trace
+from ..base_pipeline_item import BasePipelineItem, DefaultSignature
+
+
+class IidSpikeDetector(BasePipelineItem, DefaultSignature):
+    """
+
+    This transform detects the spikes in an i.i.d. sequence using adaptive
+    kernel density estimation.
+
+    .. remarks::
+        ``IIDSpikeDetector`` assumes a sequence of data points that are
+        independently sampled from one stationary
+        distribution. `Adaptive kernel density estimation
+        `_
+        is used to model the distribution.
+        The p-value score
+        indicates the likelihood of the current observation according to
+        the estimated distribution. The lower its value, the more likely the
+        current point is an outlier.
+
+    :param confidence: The confidence for spike detection in the range [0,
+        100].
+
+    :param side: The argument that determines whether to detect positive or
+        negative anomalies, or both. Available options are {``Positive``,
+        ``Negative``, ``TwoSided``}.
+
+    :param pvalue_history_length: The size of the sliding window for computing
+        the p-value.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IIDChangePointDetector
+        `,
+        :py:func:`SsaSpikeDetector
+        `,
+        :py:func:`SsaChangePointDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+       .. literalinclude:: /../nimbusml/examples/IidSpikePointDetector.py
+           :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            confidence=99.0,
+            side='TwoSided',
+            pvalue_history_length=100,
+            **params):
+        BasePipelineItem.__init__(
+            self, type='transform', **params)
+
+        self.confidence = confidence
+        self.side = side
+        self.pvalue_history_length = pvalue_history_length
+
+    @property
+    def _entrypoint(self):
+        return timeseriesprocessingentrypoints_iidspikedetector
+
+    @trace
+    def _get_node(self, **all_args):
+        algo_args = dict(
+            source=self.source,
+            name=self._name_or_source,
+            confidence=self.confidence,
+            side=self.side,
+            pvalue_history_length=self.pvalue_history_length)
+
+        all_args.update(algo_args)
+        return self._entrypoint(**all_args)
diff --git a/src/python/nimbusml/internal/core/timeseries/_ssachangepointdetector.py b/src/python/nimbusml/internal/core/timeseries/_ssachangepointdetector.py
new file mode 100644
index 00000000..297fae42
--- /dev/null
+++ b/src/python/nimbusml/internal/core/timeseries/_ssachangepointdetector.py
@@ -0,0 +1,138 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+SsaChangePointDetector
+"""
+
+__all__ = ["SsaChangePointDetector"]
+
+
+from ...entrypoints.timeseriesprocessingentrypoints_ssachangepointdetector import \
+    timeseriesprocessingentrypoints_ssachangepointdetector
+from ...utils.utils import trace
+from ..base_pipeline_item import BasePipelineItem, DefaultSignature
+
+
+class SsaChangePointDetector(BasePipelineItem, DefaultSignature):
+    """
+
+    This transform detects the change-points in a seasonal time-series
+    using Singular Spectrum Analysis (SSA).
+
+    .. remarks::
+        `Singular Spectrum Analysis (SSA)
+        `_ is a
+        powerful framework for decomposing the time-series into trend,
+        seasonality and noise components as well as forecasting the future
+        values of the time-series. In order to remove the
+        effect of such components on anomaly detection, this transform adds
+        SSA as a time-series modeler component in the detection pipeline.
+
+        The SSA component will be trained and it predicts the next expected
+        value on the time-series under normal conditions; this expected value
+        is
+        further used to calculate the amount of deviation from the normal
+        behavior at that timestamp.
+        The distribution of this deviation is then modeled using `Adaptive
+        kernel density estimation
+        `_.
+
+        This transform detects
+        change points by calculating the martingale score for the sliding
+        window based on the estimated distribution of deviations.
+        The idea is based on the `Exchangeability
+        Martingales `_ that
+        detects a change of distribution over a stream of i.i.d. values. In
+        short, the value of the
+        martingale score starts increasing significantly when a sequence of
+        small p-values are detected in a row; this
+        indicates the change of the distribution of the underlying data
+        generation process.
+
+    :param training_window_size: The number of points, N, from the beginning
+        of the sequence used to train the SSA model.
+
+    :param confidence: The confidence for change point detection in the range
+        [0, 100].
+
+    :param seasonal_window_size: An upper bound, L, on the largest relevant
+        seasonality in the input time-series, which also
+        determines the order of the autoregression of SSA. It must satisfy 2
+        < L < N/2.
+
+    :param change_history_length: The length of the sliding window on p-value
+        for computing the martingale score.
+
+    :param error_function: The function used to compute the error between the
+        expected and the observed value. Possible values are:
+        {``SignedDifference``, ``AbsoluteDifference``, ``SignedProportion``,
+        ``AbsoluteProportion``, ``SquaredDifference``}.
+
+    :param martingale: The type of martingale betting function used for
+        computing the martingale score. Available options are {``Power``,
+        ``Mixture``}.
+
+    :param power_martingale_epsilon: The epsilon parameter for the Power
+        martingale if martingale is set to ``Power``.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IIDChangePointDetector
+        `,
+        :py:func:`IIDSpikeDetector
+        `,
+        :py:func:`SsaSpikeDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+       .. literalinclude:: /../nimbusml/examples/SsaChangePointDetector.py
+           :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            training_window_size=100,
+            confidence=95.0,
+            seasonal_window_size=10,
+            change_history_length=20,
+            error_function='SignedDifference',
+            martingale='Power',
+            power_martingale_epsilon=0.1,
+            **params):
+        BasePipelineItem.__init__(
+            self, type='transform', **params)
+
+        self.training_window_size = training_window_size
+        self.confidence = confidence
+        self.seasonal_window_size = seasonal_window_size
+        self.change_history_length = change_history_length
+        self.error_function = error_function
+        self.martingale = martingale
+        self.power_martingale_epsilon = power_martingale_epsilon
+
+    @property
+    def _entrypoint(self):
+        return timeseriesprocessingentrypoints_ssachangepointdetector
+
+    @trace
+    def _get_node(self, **all_args):
+        algo_args = dict(
+            source=self.source,
+            name=self._name_or_source,
+            training_window_size=self.training_window_size,
+            confidence=self.confidence,
+            seasonal_window_size=self.seasonal_window_size,
+            change_history_length=self.change_history_length,
+            error_function=self.error_function,
+            martingale=self.martingale,
+            power_martingale_epsilon=self.power_martingale_epsilon)
+
+        all_args.update(algo_args)
+        return self._entrypoint(**all_args)
diff --git a/src/python/nimbusml/internal/core/timeseries/_ssaspikedetector.py b/src/python/nimbusml/internal/core/timeseries/_ssaspikedetector.py
new file mode 100644
index 00000000..6a1097f8
--- /dev/null
+++ b/src/python/nimbusml/internal/core/timeseries/_ssaspikedetector.py
@@ -0,0 +1,129 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+SsaSpikeDetector
+"""
+
+__all__ = ["SsaSpikeDetector"]
+
+
+from ...entrypoints.timeseriesprocessingentrypoints_ssaspikedetector import \
+    timeseriesprocessingentrypoints_ssaspikedetector
+from ...utils.utils import trace
+from ..base_pipeline_item import BasePipelineItem, DefaultSignature
+
+
+class SsaSpikeDetector(BasePipelineItem, DefaultSignature):
+    """
+
+    This transform detects the spikes in a seasonal time-series using
+    Singular Spectrum Analysis (SSA).
+
+    .. remarks::
+        `Singular Spectrum Analysis (SSA)
+        `_ is a
+        powerful
+        framework for decomposing the time-series into trend, seasonality and
+        noise components as well as forecasting
+        the future values of the time-series. In order to remove the effect
+        of such components on anomaly detection,
+        this transform adds SSA as a time-series modeler component in the
+        detection pipeline.
+
+        The SSA component will be trained and it predicts the next expected
+        value on the time-series under normal conditions; this expected value
+        is
+        further used to calculate the amount of deviation from the normal
+        (predicted) behavior at that timestamp.
+        The distribution of this deviation is then modeled using `Adaptive
+        kernel density estimation
+        `_.
+
+        The p-value score for the
+        current deviation is calculated based on the
+        estimated distribution. The lower its value, the more likely the
+        current point is an outlier.
+
+    :param training_window_size: The number of points, N, from the beginning
+        of the sequence used to train the SSA
+        model.
+
+    :param confidence: The confidence for spike detection in the range [0,
+        100].
+
+    :param seasonal_window_size: An upper bound, L, on the largest relevant
+        seasonality in the input time-series, which
+        also determines the order of the autoregression of SSA. It must
+        satisfy 2 < L < N/2.
+
+    :param side: The argument that determines whether to detect positive or
+        negative anomalies, or both. Available
+        options are {``Positive``, ``Negative``, ``TwoSided``}.
+
+    :param pvalue_history_length: The size of the sliding window for computing
+        the p-value.
+
+    :param error_function: The function used to compute the error between the
+        expected and the observed value. Possible
+        values are {``SignedDifference``, ``AbsoluteDifference``,
+        ``SignedProportion``, ``AbsoluteProportion``,
+        ``SquaredDifference``}.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IIDChangePointDetector
+        `,
+        :py:func:`IIDSpikeDetector
+        `,
+        :py:func:`SsaChangePointDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+       .. literalinclude:: /../nimbusml/examples/SsaSpikeDetector.py
+           :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            training_window_size=100,
+            confidence=99.0,
+            seasonal_window_size=10,
+            side='TwoSided',
+            pvalue_history_length=100,
+            error_function='SignedDifference',
+            **params):
+        BasePipelineItem.__init__(
+            self, type='transform', **params)
+
+        self.training_window_size = training_window_size
+        self.confidence = confidence
+        self.seasonal_window_size = seasonal_window_size
+        self.side = side
+        self.pvalue_history_length = pvalue_history_length
+        self.error_function = error_function
+
+    @property
+    def _entrypoint(self):
+        return timeseriesprocessingentrypoints_ssaspikedetector
+
+    @trace
+    def _get_node(self, **all_args):
+        algo_args = dict(
+            source=self.source,
+            name=self._name_or_source,
+            training_window_size=self.training_window_size,
+            confidence=self.confidence,
+            seasonal_window_size=self.seasonal_window_size,
+            side=self.side,
+            pvalue_history_length=self.pvalue_history_length,
+            error_function=self.error_function)
+
+        all_args.update(algo_args)
+        return self._entrypoint(**all_args)
diff --git a/src/python/nimbusml/timeseries/__init__.py b/src/python/nimbusml/timeseries/__init__.py
index 64e66add..626bcbc3 100644
--- a/src/python/nimbusml/timeseries/__init__.py
+++ b/src/python/nimbusml/timeseries/__init__.py
@@ -1,8 +1,8 @@
-from .iidspikedetector import IidSpikeDetector
-from .iidchangepointdetector import IidChangePointDetector
-from .ssaspikedetector import SsaSpikeDetector
-from .ssachangepointdetector import SsaChangePointDetector
-from .ssaforecaster import SsaForecaster
+from ._iidspikedetector import IidSpikeDetector
+from ._iidchangepointdetector import IidChangePointDetector
+from ._ssaspikedetector import SsaSpikeDetector
+from ._ssachangepointdetector import SsaChangePointDetector
+from ._ssaforecaster import SsaForecaster
__all__ = [
'IidSpikeDetector',
diff --git a/src/python/nimbusml/timeseries/_iidchangepointdetector.py b/src/python/nimbusml/timeseries/_iidchangepointdetector.py
new file mode 100644
index 00000000..0df53ba7
--- /dev/null
+++ b/src/python/nimbusml/timeseries/_iidchangepointdetector.py
@@ -0,0 +1,119 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+IidChangePointDetector
+"""
+
+__all__ = ["IidChangePointDetector"]
+
+
+from sklearn.base import TransformerMixin
+
+from ..base_transform import BaseTransform
+from ..internal.core.timeseries._iidchangepointdetector import \
+    IidChangePointDetector as core
+from ..internal.utils.utils import trace
+
+
+class IidChangePointDetector(
+        core,
+        BaseTransform,
+        TransformerMixin):
+    """
+
+    This transform detects the change-points in an i.i.d. sequence using
+    adaptive kernel density estimation and martingales.
+
+    .. remarks::
+        ``IidChangePointDetector`` assumes a sequence of data points that are
+        independently sampled from one
+        stationary distribution. `Adaptive kernel density estimation
+        `_
+        is used to model the distribution.
+
+        This transform detects
+        change points by calculating the martingale score for the sliding
+        window based on the estimated distribution.
+        The idea is based on the `Exchangeability
+        Martingales `_ that
+        detect a change of distribution over a stream of i.i.d. values. In
+        short, the value of the
+        martingale score starts increasing significantly when a sequence of
+        small p-values is detected in a row; this
+        indicates a change in the distribution of the underlying data
+        generation process.
+
+    :param columns: see `Columns `_.
+
+    :param confidence: The confidence for change point detection in the range
+        [0, 100]. Used to set the threshold of the martingale score for
+        triggering an alert.
+
+    :param change_history_length: The length of the sliding window on p-value
+        for computing the martingale score.
+
+    :param martingale: The type of martingale betting function used for
+        computing the martingale score. Available options are {``Power``,
+        ``Mixture``}.
+
+    :param power_martingale_epsilon: The epsilon parameter for the Power
+        martingale if martingale is set to ``Power``.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IidSpikeDetector
+        `,
+        :py:func:`SsaSpikeDetector
+        `,
+        :py:func:`SsaChangePointDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+        .. literalinclude::
+            /../nimbusml/examples/IidSpikeChangePointDetector.py
+            :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            confidence=95.0,
+            change_history_length=20,
+            martingale='Power',
+            power_martingale_epsilon=0.1,
+            columns=None,
+            **params):
+
+        if columns:
+            params['columns'] = columns
+        BaseTransform.__init__(self, **params)
+        core.__init__(
+            self,
+            confidence=confidence,
+            change_history_length=change_history_length,
+            martingale=martingale,
+            power_martingale_epsilon=power_martingale_epsilon,
+            **params)
+        self._columns = columns
+
+    def get_params(self, deep=False):
+        """
+        Get the parameters for this operator.
+        """
+        return core.get_params(self)
+
+    def _nodes_with_presteps(self):
+        """
+        Inserts preprocessing before this one.
+        """
+        from ..preprocessing.schema import TypeConverter
+        return [
+            TypeConverter(
+                result_type='R4')._steal_io(self),
+            self]
diff --git a/src/python/nimbusml/timeseries/_iidspikedetector.py b/src/python/nimbusml/timeseries/_iidspikedetector.py
new file mode 100644
index 00000000..51582ae8
--- /dev/null
+++ b/src/python/nimbusml/timeseries/_iidspikedetector.py
@@ -0,0 +1,101 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+IidSpikeDetector
+"""
+
+__all__ = ["IidSpikeDetector"]
+
+
+from sklearn.base import TransformerMixin
+
+from ..base_transform import BaseTransform
+from ..internal.core.timeseries._iidspikedetector import \
+    IidSpikeDetector as core
+from ..internal.utils.utils import trace
+
+
+class IidSpikeDetector(core, BaseTransform, TransformerMixin):
+    """
+
+    This transform detects the spikes in an i.i.d. sequence using adaptive
+    kernel density estimation.
+
+    .. remarks::
+        ``IidSpikeDetector`` assumes a sequence of data points that are
+        independently sampled from one stationary
+        distribution. `Adaptive kernel density estimation
+        `_
+        is used to model the distribution.
+        The p-value score
+        indicates the likelihood of the current observation according to
+        the estimated distribution. The lower its value, the more likely the
+        current point is an outlier.
+
+    :param columns: see `Columns `_.
+
+    :param confidence: The confidence for spike detection in the range [0,
+        100].
+
+    :param side: The argument that determines whether to detect positive or
+        negative anomalies, or both. Available options are {``Positive``,
+        ``Negative``, ``TwoSided``}.
+
+    :param pvalue_history_length: The size of the sliding window for computing
+        the p-value.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IidChangePointDetector
+        `,
+        :py:func:`SsaSpikeDetector
+        `,
+        :py:func:`SsaChangePointDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+        .. literalinclude:: /../nimbusml/examples/IidSpikePointDetector.py
+            :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            confidence=99.0,
+            side='TwoSided',
+            pvalue_history_length=100,
+            columns=None,
+            **params):
+
+        if columns:
+            params['columns'] = columns
+        BaseTransform.__init__(self, **params)
+        core.__init__(
+            self,
+            confidence=confidence,
+            side=side,
+            pvalue_history_length=pvalue_history_length,
+            **params)
+        self._columns = columns
+
+    def get_params(self, deep=False):
+        """
+        Get the parameters for this operator.
+        """
+        return core.get_params(self)
+
+    def _nodes_with_presteps(self):
+        """
+        Inserts preprocessing before this one.
+        """
+        from ..preprocessing.schema import TypeConverter
+        return [
+            TypeConverter(
+                result_type='R4')._steal_io(self),
+            self]
diff --git a/src/python/nimbusml/timeseries/_ssachangepointdetector.py b/src/python/nimbusml/timeseries/_ssachangepointdetector.py
new file mode 100644
index 00000000..3b02d49e
--- /dev/null
+++ b/src/python/nimbusml/timeseries/_ssachangepointdetector.py
@@ -0,0 +1,147 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+SsaChangePointDetector
+"""
+
+__all__ = ["SsaChangePointDetector"]
+
+
+from sklearn.base import TransformerMixin
+
+from ..base_transform import BaseTransform
+from ..internal.core.timeseries._ssachangepointdetector import \
+    SsaChangePointDetector as core
+from ..internal.utils.utils import trace
+
+
+class SsaChangePointDetector(
+        core,
+        BaseTransform,
+        TransformerMixin):
+    """
+
+    This transform detects the change-points in a seasonal time-series
+    using Singular Spectrum Analysis (SSA).
+
+    .. remarks::
+        `Singular Spectrum Analysis (SSA)
+        `_ is a
+        powerful framework for decomposing the time-series into trend,
+        seasonality and noise components as well as forecasting the future
+        values of the time-series. In order to remove the
+        effect of such components on anomaly detection, this transform adds
+        SSA as a time-series modeler component in the detection pipeline.
+
+        The SSA component will be trained and it predicts the next expected
+        value on the time-series under normal conditions; this expected value
+        is
+        further used to calculate the amount of deviation from the normal
+        behavior at that timestamp.
+        The distribution of this deviation is then modeled using `Adaptive
+        kernel density estimation
+        `_.
+
+        This transform detects
+        change points by calculating the martingale score for the sliding
+        window based on the estimated distribution of deviations.
+        The idea is based on the `Exchangeability
+        Martingales `_ that
+        detect a change of distribution over a stream of i.i.d. values. In
+        short, the value of the
+        martingale score starts increasing significantly when a sequence of
+        small p-values is detected in a row; this
+        indicates a change in the distribution of the underlying data
+        generation process.
+
+    :param columns: see `Columns `_.
+
+    :param training_window_size: The number of points, N, from the beginning
+        of the sequence used to train the SSA model.
+
+    :param confidence: The confidence for change point detection in the range
+        [0, 100].
+
+    :param seasonal_window_size: An upper bound, L, on the largest relevant
+        seasonality in the input time-series, which also
+        determines the order of the autoregression of SSA. It must satisfy 2
+        < L < N/2.
+
+    :param change_history_length: The length of the sliding window on p-value
+        for computing the martingale score.
+
+    :param error_function: The function used to compute the error between the
+        expected and the observed value. Possible values are:
+        {``SignedDifference``, ``AbsoluteDifference``, ``SignedProportion``,
+        ``AbsoluteProportion``, ``SquaredDifference``}.
+
+    :param martingale: The type of martingale betting function used for
+        computing the martingale score. Available options are {``Power``,
+        ``Mixture``}.
+
+    :param power_martingale_epsilon: The epsilon parameter for the Power
+        martingale if martingale is set to ``Power``.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IidChangePointDetector
+        `,
+        :py:func:`IidSpikeDetector
+        `,
+        :py:func:`SsaSpikeDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+        .. literalinclude:: /../nimbusml/examples/SsaChangePointDetector.py
+            :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            training_window_size=100,
+            confidence=95.0,
+            seasonal_window_size=10,
+            change_history_length=20,
+            error_function='SignedDifference',
+            martingale='Power',
+            power_martingale_epsilon=0.1,
+            columns=None,
+            **params):
+
+        if columns:
+            params['columns'] = columns
+        BaseTransform.__init__(self, **params)
+        core.__init__(
+            self,
+            training_window_size=training_window_size,
+            confidence=confidence,
+            seasonal_window_size=seasonal_window_size,
+            change_history_length=change_history_length,
+            error_function=error_function,
+            martingale=martingale,
+            power_martingale_epsilon=power_martingale_epsilon,
+            **params)
+        self._columns = columns
+
+    def get_params(self, deep=False):
+        """
+        Get the parameters for this operator.
+        """
+        return core.get_params(self)
+
+    def _nodes_with_presteps(self):
+        """
+        Inserts preprocessing before this one.
+        """
+        from ..preprocessing.schema import TypeConverter
+        return [
+            TypeConverter(
+                result_type='R4')._steal_io(self),
+            self]
diff --git a/src/python/nimbusml/timeseries/_ssaspikedetector.py b/src/python/nimbusml/timeseries/_ssaspikedetector.py
new file mode 100644
index 00000000..ad831a15
--- /dev/null
+++ b/src/python/nimbusml/timeseries/_ssaspikedetector.py
@@ -0,0 +1,136 @@
+# --------------------------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------------------------
+# - Generated by tools/entrypoint_compiler.py: do not edit by hand
+"""
+SsaSpikeDetector
+"""
+
+__all__ = ["SsaSpikeDetector"]
+
+
+from sklearn.base import TransformerMixin
+
+from ..base_transform import BaseTransform
+from ..internal.core.timeseries._ssaspikedetector import \
+    SsaSpikeDetector as core
+from ..internal.utils.utils import trace
+
+
+class SsaSpikeDetector(core, BaseTransform, TransformerMixin):
+    """
+
+    This transform detects the spikes in a seasonal time-series using
+    Singular Spectrum Analysis (SSA).
+
+    .. remarks::
+        `Singular Spectrum Analysis (SSA)
+        `_ is a
+        powerful
+        framework for decomposing the time-series into trend, seasonality and
+        noise components as well as forecasting
+        the future values of the time-series. In order to remove the effect
+        of such components on anomaly detection,
+        this transform adds SSA as a time-series modeler component in the
+        detection pipeline.
+
+        The SSA component will be trained and it predicts the next expected
+        value on the time-series under normal conditions; this expected value
+        is
+        further used to calculate the amount of deviation from the normal
+        (predicted) behavior at that timestamp.
+        The distribution of this deviation is then modeled using `Adaptive
+        kernel density estimation
+        `_.
+
+        The p-value score for the
+        current deviation is calculated based on the
+        estimated distribution. The lower its value, the more likely the
+        current point is an outlier.
+
+    :param columns: see `Columns `_.
+
+    :param training_window_size: The number of points, N, from the beginning
+        of the sequence used to train the SSA
+        model.
+
+    :param confidence: The confidence for spike detection in the range [0,
+        100].
+
+    :param seasonal_window_size: An upper bound, L, on the largest relevant
+        seasonality in the input time-series, which
+        also determines the order of the autoregression of SSA. It must
+        satisfy 2 < L < N/2.
+
+    :param side: The argument that determines whether to detect positive or
+        negative anomalies, or both. Available
+        options are {``Positive``, ``Negative``, ``TwoSided``}.
+
+    :param pvalue_history_length: The size of the sliding window for computing
+        the p-value.
+
+    :param error_function: The function used to compute the error between the
+        expected and the observed value. Possible
+        values are {``SignedDifference``, ``AbsoluteDifference``,
+        ``SignedProportion``, ``AbsoluteProportion``,
+        ``SquaredDifference``}.
+
+    :param params: Additional arguments sent to compute engine.
+
+    .. seealso::
+        :py:func:`IidChangePointDetector
+        `,
+        :py:func:`IidSpikeDetector
+        `,
+        :py:func:`SsaChangePointDetector
+        `.
+
+    .. index:: models, timeseries, transform
+
+    Example:
+        .. literalinclude:: /../nimbusml/examples/SsaSpikeDetector.py
+            :language: python
+    """
+
+    @trace
+    def __init__(
+            self,
+            training_window_size=100,
+            confidence=99.0,
+            seasonal_window_size=10,
+            side='TwoSided',
+            pvalue_history_length=100,
+            error_function='SignedDifference',
+            columns=None,
+            **params):
+
+        if columns:
+            params['columns'] = columns
+        BaseTransform.__init__(self, **params)
+        core.__init__(
+            self,
+            training_window_size=training_window_size,
+            confidence=confidence,
+            seasonal_window_size=seasonal_window_size,
+            side=side,
+            pvalue_history_length=pvalue_history_length,
+            error_function=error_function,
+            **params)
+        self._columns = columns
+
+    def get_params(self, deep=False):
+        """
+        Get the parameters for this operator.
+        """
+        return core.get_params(self)
+
+    def _nodes_with_presteps(self):
+        """
+        Inserts preprocessing before this one.
+        """
+        from ..preprocessing.schema import TypeConverter
+        return [
+            TypeConverter(
+                result_type='R4')._steal_io(self),
+            self]