Fix minor typos in the docs
alex-kennedy committed Jan 27, 2021
1 parent 1559aef commit 3342199
Showing 13 changed files with 25 additions and 24 deletions.
4 changes: 2 additions & 2 deletions docs/text/faq.rst
@@ -12,8 +12,8 @@ FAQ

2. **Is it possible to extract features from rolling/shifted time series?**

-Yes, the :func:`tsfresh.dataframe_functions.roll_time_series` function allows to conviniently create a rolled
-time series datframe from your data. You just have to transform your data into one of the supported tsfresh
+Yes, the :func:`tsfresh.dataframe_functions.roll_time_series` function allows to conveniently create a rolled
+time series dataframe from your data. You just have to transform your data into one of the supported tsfresh
:ref:`data-formats-label`.
Then, the :func:`tsfresh.dataframe_functions.roll_time_series` give you a DataFrame with the rolled time series,
that you can pass to tsfresh.
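The rolling behaviour that this FAQ snippet documents can be sketched without tsfresh. The helper name `roll_time_series_sketch`, the `max_timeshift` parameter, and the list-of-windows return shape below are illustrative assumptions, not tsfresh's actual API (which returns a rolled DataFrame):

```python
def roll_time_series_sketch(values, max_timeshift):
    # For every point in the series, emit the window of up to
    # max_timeshift trailing values ending at that point; each
    # window corresponds to one sub-time-series in a rolled frame.
    windows = []
    for end in range(1, len(values) + 1):
        start = max(0, end - max_timeshift)
        windows.append(values[start:end])
    return windows
```

For example, `roll_time_series_sketch([1, 2, 3], 2)` yields the three windows `[[1], [1, 2], [2, 3]]`, each of which tsfresh would then treat as its own time series during feature extraction.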
4 changes: 2 additions & 2 deletions docs/text/feature_extraction_settings.rst
@@ -111,12 +111,12 @@ By using feature selection algorithms you find out that only a subgroup of featu
Then, we provide the :func:`tsfresh.feature_extraction.settings.from_columns` method that constructs the `kind_to_fc_parameters`
dictionary from the column names of this filtered feature matrix to make sure that only relevant features are extracted.

-This can save a huge amount of time because you prevent the calculation of uncessary features.
+This can save a huge amount of time because you prevent the calculation of unnecessary features.
Let's illustrate that with an example:

.. code:: python
-# X_tsfresh containes the extracted tsfresh features
+# X_tsfresh contains the extracted tsfresh features
X_tsfresh = extract_features(...)
# which are now filtered to only contain relevant features
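The column-name convention behind `from_columns` can be illustrated with a simplified parser. The helper name and the flattened `{kind: {feature: params}}` return shape are assumptions for illustration only; tsfresh's real `from_columns` resolves parameter types and grouping differently:

```python
def kind_to_fc_parameters_sketch(columns):
    # tsfresh feature columns follow the pattern
    # "<kind>__<feature>__<param>_<value>". In this sketch,
    # parameterless features map to None and parameter values
    # are kept as raw strings.
    result = {}
    for col in columns:
        parts = col.split("__")
        kind, feature = parts[0], parts[1]
        params = None
        if len(parts) > 2:
            params = {}
            for chunk in parts[2:]:
                name, _, value = chunk.rpartition("_")
                params[name] = value
        result.setdefault(kind, {})[feature] = params
    return result
```

Passing the columns of a filtered feature matrix through such a parser is what lets a second extraction run compute only the features that survived selection.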
2 changes: 1 addition & 1 deletion docs/text/how_to_contribute.rst
@@ -78,7 +78,7 @@ or build the documentation with
The finished documentation can be found in the docs/_build/html folder.

On Github we use a Travis CI Folder that runs our test suite every time a commit or pull request is sent. The
-configuration of Travi is controlled by the
+configuration of Travis is controlled by the
`.travis.yml <https://github.com/blue-yonder/tsfresh/blob/main/.travis.yml>`_ file.


5 changes: 3 additions & 2 deletions docs/text/large_data.rst
@@ -4,7 +4,8 @@ Large Input Data
================

If you are dealing with large time series data, you are facing multiple problems.
-Thw two most important ones are
+The two most important ones are

* long execution times for feature extraction
* large memory consumptions, even beyond what a single machine can handle

@@ -79,6 +80,6 @@ No pivoting will be performed in this case.
PySpark
-------

-Similar to dask, it is also possible to ass the feature extraction into a Spark
+Similar to dask, it is also possible to pass the feature extraction into a Spark
computation graph.
You can find more information in the documentation of :func:`tsfresh.convenience.bindings.spark_feature_extraction_on_chunk`.
2 changes: 1 addition & 1 deletion docs/text/tsfresh_on_a_cluster.rst
@@ -150,7 +150,7 @@ The only thing that you will need to run *tsfresh* on a Dask cluster is the ip a
`dask-scheduler <http://distributed.readthedocs.io/en/latest/setup.html>`_.

Lets say that your dask scheduler is running at ``192.168.0.1:8786``, then we can easily construct a
-:class:`~sfresh.utilities.distribution.ClusterDaskDistributor` that connects to the sceduler and distributes the
+:class:`~sfresh.utilities.distribution.ClusterDaskDistributor` that connects to the scheduler and distributes the
time series data and the calculation to a cluster:

.. code:: python
2 changes: 1 addition & 1 deletion tsfresh/convenience/relevant_extraction.py
@@ -134,7 +134,7 @@ def extract_relevant_features(timeseries_container, y, X=None,
:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
-If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
+If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str
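The dtype rule this docstring states (repeated verbatim across several of the files in this diff) is simple enough to sketch. `infer_ml_task` is an illustrative helper, not tsfresh's internal function:

```python
import pandas as pd


def infer_ml_task(y):
    # Per the docstring: boolean, integer or object dtype means
    # classification; anything else (e.g. float) means regression.
    if (pd.api.types.is_bool_dtype(y)
            or pd.api.types.is_integer_dtype(y)
            or y.dtype == object):
        return "classification"
    return "regression"
```

So a float-valued target like `pd.Series([0.1, 0.2])` would be treated as a regression target, while integer labels fall on the classification side.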
16 changes: 8 additions & 8 deletions tsfresh/feature_extraction/feature_calculators.py
@@ -135,9 +135,9 @@ def _estimate_friedrich_coefficients(x, m, r):
:param x: the time series to calculate the feature of
:type x: numpy.ndarray
-:param m: order of polynom to fit for estimating fixed points of dynamics
+:param m: order of polynomial to fit for estimating fixed points of dynamics
:type m: int
-:param r: number of quantils to use for averaging
+:param r: number of quantiles to use for averaging
:type r: float
:return: coefficients of polynomial of deterministic dynamics
@@ -1283,7 +1283,7 @@ def cwt_coefficients(x, param):
where :math:`a` is the width parameter of the wavelet function.
-This feature calculator takes three different parameter: widths, coeff and w. The feature calculater takes all the
+This feature calculator takes three different parameter: widths, coeff and w. The feature calculator takes all the
different widths arrays and then calculates the cwt one time for each different width array. Then the values for the
different coefficient for coeff and width w are returned. (For each dic in param one feature is returned)
@@ -1948,16 +1948,16 @@ def friedrich_coefficients(x, param):
:param x: the time series to calculate the feature of
:type x: numpy.ndarray
:param param: contains dictionaries {"m": x, "r": y, "coeff": z} with x being positive integer,
-the order of polynom to fit for estimating fixed points of
-dynamics, y positive float, the number of quantils to use for averaging and finally z,
+the order of polynomial to fit for estimating fixed points of
+dynamics, y positive float, the number of quantiles to use for averaging and finally z,
a positive integer corresponding to the returned coefficient
:type param: list
:return: the different feature values
:return type: pandas.Series
"""
# calculated is dictionary storing the calculated coefficients {m: {r: friedrich_coefficients}}
calculated = defaultdict(dict)
-# res is a dictionary containg the results {"m_10__r_2__coeff_3": 15.43}
+# res is a dictionary containing the results {"m_10__r_2__coeff_3": 15.43}
res = {}

for parameter_combination in param:
@@ -1996,9 +1996,9 @@ def max_langevin_fixed_point(x, r, m):
:param x: the time series to calculate the feature of
:type x: numpy.ndarray
-:param m: order of polynom to fit for estimating fixed points of dynamics
+:param m: order of polynomial to fit for estimating fixed points of dynamics
:type m: int
-:param r: number of quantils to use for averaging
+:param r: number of quantiles to use for averaging
:type r: float
:return: Largest fixed point of deterministic dynamics
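The `m`/`r` parameters these docstrings describe amount to a quantile-binned polynomial fit. The following numpy sketch shows the idea under stated assumptions: the function name, binning of one-step increments by quantile, and averaging within bins are illustrative, not tsfresh's exact implementation:

```python
import numpy as np


def estimate_friedrich_sketch(x, m, r):
    # Bin the signal values into r quantile bins, average the
    # one-step increment dx within each bin, then fit a polynomial
    # of order m to the per-bin means. The polynomial approximates
    # the deterministic dynamics whose roots are the fixed points.
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    x_mid = x[:-1]
    edges = np.quantile(x_mid, np.linspace(0.0, 1.0, r + 1))
    bin_centers, bin_means = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x_mid >= lo) & (x_mid <= hi)
        if mask.any():
            bin_centers.append(x_mid[mask].mean())
            bin_means.append(dx[mask].mean())
    return np.polyfit(bin_centers, bin_means, deg=m)
```

A fit of order `m` returns `m + 1` coefficients; `max_langevin_fixed_point` would then look for the largest root of that polynomial.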
2 changes: 1 addition & 1 deletion tsfresh/feature_selection/relevance.py
@@ -93,7 +93,7 @@ def calculate_relevance_table(
:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
-If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
+If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str
2 changes: 1 addition & 1 deletion tsfresh/feature_selection/selection.py
@@ -125,7 +125,7 @@ def select_features(
:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
-If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
+If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str
4 changes: 2 additions & 2 deletions tsfresh/transformers/feature_selector.py
@@ -106,7 +106,7 @@ def __init__(
:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
-If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
+If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str
@@ -150,7 +150,7 @@ def __init__(

def fit(self, X, y):
"""
-Extract the information, which of the features are relevent using the given target.
+Extract the information, which of the features are relevant using the given target.
For more information, please see the :func:`~tsfresh.festure_selection.festure_selector.check_fs_sig_bh`
function. All columns in the input data sample are treated as feature. The index of all
2 changes: 1 addition & 1 deletion tsfresh/transformers/relevant_feature_augmenter.py
@@ -191,7 +191,7 @@ def __init__(
:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
-If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
+If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str
2 changes: 1 addition & 1 deletion tsfresh/utilities/dataframe_functions.py
@@ -584,7 +584,7 @@ def add_sub_time_series_index(df_or_dict, sub_length, column_id=None, column_sor
- if column_id is None: for each kind (or if column_kind is None for the full dataframe) a new index built by
"sub-packaging" the data in packages of length "sub_length". For example if you have data with the
length of 11 and sub_length is 2, you will get 6 new packages: 0, 0; 1, 1; 2, 2; 3, 3; 4, 4; 5.
-- if column_id is not None: the same as before, just for each id seperately. The old column_id values are added
+- if column_id is not None: the same as before, just for each id separately. The old column_id values are added
to the new "id" column after a comma
You can use this functions to turn a long measurement into sub-packages, where you want to extract features on.
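The docstring's length-11, sub_length-2 example follows a simple rule that can be sketched in plain Python. `sub_time_series_index` is an illustrative helper, not the tsfresh function, which additionally handles ids and kinds:

```python
def sub_time_series_index(n, sub_length):
    # Consecutive chunks of sub_length rows share one index; a
    # shorter trailing chunk still gets its own index, so n = 11
    # with sub_length = 2 yields 0,0,1,1,2,2,3,3,4,4,5.
    return [i // sub_length for i in range(n)]
```

This is how a single long measurement gets turned into sub-packages on which features can then be extracted independently.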
2 changes: 1 addition & 1 deletion tsfresh/utilities/distribution.py
@@ -346,7 +346,7 @@ class ClusterDaskDistributor(IterableDistributorBaseClass):

def __init__(self, address):
"""
-Sets up a distributor that connects to a Dask Scheduler to distribute the calculaton of the features
+Sets up a distributor that connects to a Dask Scheduler to distribute the calculation of the features
:param address: the ip address and port number of the Dask Scheduler
:type address: str
