This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Update scikit-learn website links to use https instead of http #56

Merged 1 commit on Nov 20, 2018
Update scikit-learn links to use https instead of http
Gal Oshri committed Nov 19, 2018
commit 10bd895ed4621cc29d11769c1dad7ef44603d596
4 changes: 2 additions & 2 deletions README.md
@@ -4,7 +4,7 @@

ML.NET was originally developed in Microsoft Research and is used across many product groups in Microsoft like Windows, Bing, PowerPoint, Excel and others. `nimbusml` was built to enable data science teams that are more familiar with Python to take advantage of ML.NET's functionality and performance.

-This package enables training ML.NET pipelines or integrating ML.NET components directly into Scikit-Learn pipelines (it supports `numpy.ndarray`, `scipy.sparse_cst`, and `pandas.DataFrame` as inputs).
+This package enables training ML.NET pipelines or integrating ML.NET components directly into [scikit-learn](https://scikit-learn.org/stable/) pipelines (it supports `numpy.ndarray`, `scipy.sparse_cst`, and `pandas.DataFrame` as inputs).

Documentation can be found [here](https://docs.microsoft.com/en-us/NimbusML/overview) and additional notebook samples can be found [here](https://github.com/Microsoft/NimbusML-Samples).

@@ -48,7 +48,7 @@ pipeline.fit(train_data)
results = pipeline.predict(test_data)
```

-Instead of creating an `nimbusml` pipeline, you can also integrate components into Scikit-Learn pipelines:
+Instead of creating an `nimbusml` pipeline, you can also integrate components into scikit-learn pipelines:

```python
from sklearn.pipeline import Pipeline
2 changes: 1 addition & 1 deletion src/python/docs/sphinx/concepts/datasources.rst
@@ -122,7 +122,7 @@ Output Data Types of Transforms

The return type of all of the transforms is a ``pandas.DataFrame``, when they
are used inside a `sklearn.pipeline.Pipeline
-<http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_
+<https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_
or when they are used individually.

However, when used inside a :py:class:`nimbusml.Pipeline`, the outputs are often stored in
10 changes: 5 additions & 5 deletions src/python/docs/sphinx/concepts/experimentvspipeline.rst
@@ -9,15 +9,15 @@ nimbusml.Pipeline() versus sklearn.Pipeline()
.. contents::
:local:

-This sections highlights the differences between using a `sklearn.Pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_
+This sections highlights the differences between using a `sklearn.Pipeline <https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_
and :py:class:`nimbusml.Pipeline` to compose a sequence of transformers and/or trainers.


sklearn.Pipeline
----------------

``nimbusml`` transforms and trainers are designed to be compatible with
-`sklearn.Pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_.
+`sklearn.Pipeline <https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_.
For fully optimized performance and added functionality, it is recommended to use
:py:class:`nimbusml.Pipeline`. See below for more details.

@@ -38,15 +38,15 @@ files that are too large to fit into memory, there is no easy way to train estim
streaming the examples one at a time.

The :py:class:`nimbusml.Pipeline` module accepts inputs X and y similarly to
-`sklearn.Pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_, but also
+`sklearn.Pipeline <https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_, but also
inputs of type :py:class:`nimbusml.FileDataStream`, which is an optimized streaming file
reader class. This is highly recommended for large datasets. See [Data Sources](datasources.md#data-from-a-filedatastream) for an
example of using Pipeline with FileDataStream to read data in files.

Select which Columns to Transform
"""""""""""""""""""""""""""""""""

-When using `sklearn.Pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_
+When using `sklearn.Pipeline <https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_
the data columns of X and y (of type``numpy.array`` or ``scipy.sparse_csr``)
are anonymous and cannot be referenced by name. Operations and transformations are
therefore performed on all columns of the data.
@@ -66,7 +66,7 @@ Optimized Chaining of Trainers/Transforms

Using NimbusML, trainers and transforms within a :py:class:`nimbusml.Pipeline` will
generally result in better performance compared to using them in a
-`sklearn.Pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_.
+`sklearn.Pipeline <https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_.
Data copying is minimized when processing is limited to within the C# libraries, and if all
components are in the same pipeline, data copies between C# and Python is reduced.

2 changes: 1 addition & 1 deletion src/python/docs/sphinx/concepts/types.rst
@@ -61,7 +61,7 @@ dataframe and therefore the column_name can still be used to refer to the Vector
efficiently without any conversion to a dataframe. Since the ``column_name`` of the vector is
also preserved, it is possible to refer to it by downstream transforms by name. However, when
transforms are used inside a `sklearn.pipeline.Pipeline()
-<http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_, the output
+<https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_, the output
of every transform is converted to a ``pandas.DataFrame`` first where the names of ``slots`` are
preserved, but the ``column_name`` of the vector is dropped.

2 changes: 1 addition & 1 deletion src/python/docs/sphinx/metrics.rst
@@ -58,7 +58,7 @@ This corresponds to evaltype='binary'.
in `ML.NET <https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet>`_).
This expression is asymptotically equivalent to the area under the curve
which is what
-`scikit-learn <http://scikit-learn.org/stable/modules/generated/sklearn.metrics.auc.html>`_ computation.
+`scikit-learn <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.auc.html>`_ computation.
computes
(see `auc <https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/metrics/ranking.py#L101>`_).
That explains discrepencies on small test sets.
2 changes: 1 addition & 1 deletion src/python/nimbusml/datasets/datasets.py
@@ -75,7 +75,7 @@ def as_df(self):

class DataSetIris(DataSet):
"""
-`Iris dataset <http://scikit-learn.org/stable/auto_examples/datasets
+`Iris dataset <https://scikit-learn.org/stable/auto_examples/datasets
/plot_iris_dataset.html>`_ dataset.
"""
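
For illustration only, and not part of this commit: the substitution applied across the files above could be sketched as a small Python helper. The function name `upgrade_sklearn_links` and the sample string are assumptions made for this example; the commit itself does not show how the links were edited.

```python
import re

# Hypothetical helper (not from the PR): matches only plain-http links
# that point at the scikit-learn.org host.
_HTTP_SKLEARN = re.compile(r"http://scikit-learn\.org")


def upgrade_sklearn_links(text: str) -> str:
    """Return `text` with http://scikit-learn.org links rewritten to https."""
    return _HTTP_SKLEARN.sub("https://scikit-learn.org", text)


# Sample line modeled on the reStructuredText links changed in this diff.
sample = (
    "`sklearn.Pipeline <http://scikit-learn.org/stable/modules/"
    "generated/sklearn.pipeline.Pipeline.html>`_"
)
print(upgrade_sklearn_links(sample))
```

Links that already use https, and links to other hosts, are left untouched because the pattern requires the literal `http://scikit-learn.org` prefix.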