
Fix minor typos in the docs #802

Merged (1 commit) on Jan 28, 2021
4 changes: 2 additions & 2 deletions docs/text/faq.rst
Original file line number Diff line number Diff line change
@@ -12,8 +12,8 @@ FAQ

2. **Is it possible to extract features from rolling/shifted time series?**

Yes, the :func:`tsfresh.dataframe_functions.roll_time_series` function allows to conviniently create a rolled
time series datframe from your data. You just have to transform your data into one of the supported tsfresh
Yes, the :func:`tsfresh.dataframe_functions.roll_time_series` function allows to conveniently create a rolled
time series dataframe from your data. You just have to transform your data into one of the supported tsfresh
:ref:`data-formats-label`.
Then, :func:`tsfresh.dataframe_functions.roll_time_series` gives you a DataFrame with the rolled time series
that you can pass to tsfresh.
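A rough pandas sketch of the rolling idea described above (a conceptual stand-in, not the tsfresh API itself; the column names and the `(id, t)` tagging are illustrative assumptions):

```python
import pandas as pd

# Flat time series: one id, three time steps.
df = pd.DataFrame({"id": [1, 1, 1], "time": [1, 2, 3], "x": [10.0, 11.0, 12.0]})

# Rolling turns every prefix up to time t into its own sub-series,
# tagged with a new id (original_id, t) -- similar in spirit to what
# roll_time_series produces for a rolled dataframe.
rolled = pd.concat(
    [df[df["time"] <= t].assign(id=[(1, t)] * t) for t in df["time"]],
    ignore_index=True,
)
```

The 1 + 2 + 3 = 6 resulting rows form three sub-series, one per prefix, ready for ordinary feature extraction.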
4 changes: 2 additions & 2 deletions docs/text/feature_extraction_settings.rst
@@ -111,12 +111,12 @@ By using feature selection algorithms you find out that only a subgroup of featu
Then, we provide the :func:`tsfresh.feature_extraction.settings.from_columns` method that constructs the `kind_to_fc_parameters`
dictionary from the column names of this filtered feature matrix to make sure that only relevant features are extracted.

This can save a huge amount of time because you prevent the calculation of uncessary features.
This can save a huge amount of time because you prevent the calculation of unnecessary features.
Let's illustrate that with an example:

.. code:: python

# X_tsfresh containes the extracted tsfresh features
# X_tsfresh contains the extracted tsfresh features
X_tsfresh = extract_features(...)

# which are now filtered to only contain relevant features
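For intuition, tsfresh feature columns follow a `kind__feature[__param_value]` naming scheme, and `from_columns` rebuilds the parameter dictionary from those names. A deliberately simplified stand-in for that reconstruction (the real parser also decodes parameter values, which this sketch ignores):

```python
# Simplified sketch of the column-name -> kind_to_fc_parameters mapping.
# Real tsfresh columns also encode feature parameters after the feature name.
def to_fc_parameters(columns):
    mapping = {}
    for col in columns:
        kind, feature = col.split("__")[:2]
        mapping.setdefault(kind, {})[feature] = None
    return mapping

kind_to_fc_parameters = to_fc_parameters(["x__mean", "x__maximum", "y__mean"])
```

Passing such a dictionary to the extraction restricts it to exactly the listed features per kind.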
2 changes: 1 addition & 1 deletion docs/text/how_to_contribute.rst
@@ -78,7 +78,7 @@ or build the documentation with
The finished documentation can be found in the docs/_build/html folder.

On Github we use a Travis CI Folder that runs our test suite every time a commit or pull request is sent. The
configuration of Travi is controlled by the
configuration of Travis is controlled by the
`.travis.yml <https://github.com/blue-yonder/tsfresh/blob/main/.travis.yml>`_ file.


5 changes: 3 additions & 2 deletions docs/text/large_data.rst
@@ -4,7 +4,8 @@ Large Input Data
================

If you are dealing with large time series data, you are facing multiple problems.
Thw two most important ones are
The two most important ones are

* long execution times for feature extraction
* large memory consumption, even beyond what a single machine can handle

@@ -79,6 +80,6 @@ No pivoting will be performed in this case.
PySpark
-------

Similar to dask, it is also possible to ass the feature extraction into a Spark
Similar to dask, it is also possible to pass the feature extraction into a Spark
computation graph.
You can find more information in the documentation of :func:`tsfresh.convenience.bindings.spark_feature_extraction_on_chunk`.
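The chunk bindings apply the extraction one group at a time; a plain-pandas stand-in for that "one group in, one feature block out" idea (not the Spark or Dask API, and the column names here are made up):

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 1, 2, 2], "x": [1.0, 2.0, 3.0, 5.0]})

# One feature row per id chunk, mirroring the shape of the
# *_feature_extraction_on_chunk bindings' output.
features = df.groupby("id")["x"].agg(["mean", "max"])
```

In Spark or dask, the same grouped shape lets the engine schedule each chunk independently across the cluster.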
2 changes: 1 addition & 1 deletion docs/text/tsfresh_on_a_cluster.rst
@@ -150,7 +150,7 @@ The only thing that you will need to run *tsfresh* on a Dask cluster is the ip a
`dask-scheduler <http://distributed.readthedocs.io/en/latest/setup.html>`_.

Let's say that your dask scheduler is running at ``192.168.0.1:8786``; then we can easily construct a
:class:`~sfresh.utilities.distribution.ClusterDaskDistributor` that connects to the sceduler and distributes the
:class:`~sfresh.utilities.distribution.ClusterDaskDistributor` that connects to the scheduler and distributes the
time series data and the calculation to a cluster:

.. code:: python
2 changes: 1 addition & 1 deletion tsfresh/convenience/relevant_extraction.py
@@ -134,7 +134,7 @@ def extract_relevant_features(timeseries_container, y, X=None,

:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str
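The dtype rule above can be sketched as a hypothetical helper (it mirrors the documented behaviour but is not tsfresh's actual implementation):

```python
import pandas as pd

def infer_ml_task(y):
    # boolean ('b'), integer ('i'/'u') or object ('O') dtype -> classification,
    # everything else (e.g. floats) -> regression.
    if y.dtype.kind in ("b", "i", "u", "O"):
        return "classification"
    return "regression"
```

Passing `ml_task='auto'` triggers exactly this kind of inference on the target `y`.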

16 changes: 8 additions & 8 deletions tsfresh/feature_extraction/feature_calculators.py
@@ -135,9 +135,9 @@ def _estimate_friedrich_coefficients(x, m, r):

:param x: the time series to calculate the feature of
:type x: numpy.ndarray
:param m: order of polynom to fit for estimating fixed points of dynamics
:param m: order of polynomial to fit for estimating fixed points of dynamics
:type m: int
:param r: number of quantils to use for averaging
:param r: number of quantiles to use for averaging
:type r: float

:return: coefficients of polynomial of deterministic dynamics
@@ -1283,7 +1283,7 @@ def cwt_coefficients(x, param):

where :math:`a` is the width parameter of the wavelet function.

This feature calculator takes three different parameter: widths, coeff and w. The feature calculater takes all the
This feature calculator takes three different parameter: widths, coeff and w. The feature calculator takes all the
different width arrays and then calculates the CWT once for each width array. Then the values for the
given coefficient coeff and width w are returned. (For each dict in param, one feature is returned.)

@@ -1948,16 +1948,16 @@ def friedrich_coefficients(x, param):
:param x: the time series to calculate the feature of
:type x: numpy.ndarray
:param param: contains dictionaries {"m": x, "r": y, "coeff": z} with x being positive integer,
the order of polynom to fit for estimating fixed points of
dynamics, y positive float, the number of quantils to use for averaging and finally z,
the order of polynomial to fit for estimating fixed points of
dynamics, y positive float, the number of quantiles to use for averaging and finally z,
a positive integer corresponding to the returned coefficient
:type param: list
:return: the different feature values
:return type: pandas.Series
"""
# calculated is a dictionary storing the calculated coefficients {m: {r: friedrich_coefficients}}
calculated = defaultdict(dict)
# res is a dictionary containg the results {"m_10__r_2__coeff_3": 15.43}
# res is a dictionary containing the results {"m_10__r_2__coeff_3": 15.43}
res = {}

for parameter_combination in param:
@@ -1996,9 +1996,9 @@ def max_langevin_fixed_point(x, r, m):

:param x: the time series to calculate the feature of
:type x: numpy.ndarray
:param m: order of polynom to fit for estimating fixed points of dynamics
:param m: order of polynomial to fit for estimating fixed points of dynamics
:type m: int
:param r: number of quantils to use for averaging
:param r: number of quantiles to use for averaging
:type r: float

:return: Largest fixed point of deterministic dynamics
2 changes: 1 addition & 1 deletion tsfresh/feature_selection/relevance.py
@@ -93,7 +93,7 @@ def calculate_relevance_table(

:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str

2 changes: 1 addition & 1 deletion tsfresh/feature_selection/selection.py
@@ -125,7 +125,7 @@ def select_features(

:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str

4 changes: 2 additions & 2 deletions tsfresh/transformers/feature_selector.py
@@ -106,7 +106,7 @@ def __init__(

:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str

@@ -150,7 +150,7 @@ def __init__(

def fit(self, X, y):
"""
Extract the information, which of the features are relevent using the given target.
Extract the information, which of the features are relevant using the given target.

For more information, please see the :func:`~tsfresh.feature_selection.feature_selector.check_fs_sig_bh`
function. All columns in the input data sample are treated as features. The index of all
2 changes: 1 addition & 1 deletion tsfresh/transformers/relevant_feature_augmenter.py
@@ -191,7 +191,7 @@ def __init__(

:param ml_task: The intended machine learning task. Either `'classification'`, `'regression'` or `'auto'`.
Defaults to `'auto'`, meaning the intended task is inferred from `y`.
If `y` has a boolean, integer or object dtype, the task is assumend to be classification,
If `y` has a boolean, integer or object dtype, the task is assumed to be classification,
else regression.
:type ml_task: str

2 changes: 1 addition & 1 deletion tsfresh/utilities/dataframe_functions.py
@@ -584,7 +584,7 @@ def add_sub_time_series_index(df_or_dict, sub_length, column_id=None, column_sor
- if column_id is None: for each kind (or, if column_kind is None, for the full dataframe) a new index is built
  by "sub-packaging" the data in packages of length "sub_length". For example, if you have data of length 11
  and sub_length is 2, you will get 6 new packages: 0, 0; 1, 1; 2, 2; 3, 3; 4, 4; 5.
- if column_id is not None: the same as before, just for each id seperately. The old column_id values are added
- if column_id is not None: the same as before, just for each id separately. The old column_id values are added
to the new "id" column after a comma

You can use this function to turn a long measurement into sub-packages on which you want to extract features.
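The 11-point example from the docstring can be checked directly; a small numpy sketch of the sub-packaging rule:

```python
import numpy as np

# Package index = position // sub_length, so 11 points with sub_length 2
# fall into packages 0,0,1,1,2,2,3,3,4,4,5 -- six packages in total,
# the last one containing a single leftover point.
sub_ids = np.arange(11) // 2
```

The same integer-division rule generalises to any series length and sub_length.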
2 changes: 1 addition & 1 deletion tsfresh/utilities/distribution.py
@@ -346,7 +346,7 @@ class ClusterDaskDistributor(IterableDistributorBaseClass):

def __init__(self, address):
"""
Sets up a distributor that connects to a Dask Scheduler to distribute the calculaton of the features
Sets up a distributor that connects to a Dask Scheduler to distribute the calculation of the features

:param address: the ip address and port number of the Dask Scheduler
:type address: str