
user warning in test: tests/tests_algorithms/test_nn.py::MultiClassNeuralNetworkAlgorithmTest::test_fit_predict #749

Closed
a-szulc opened this issue Aug 23, 2024 · 2 comments

a-szulc (Contributor) commented Aug 23, 2024

=================================== FAILURES ===================================
____________ MultiClassNeuralNetworkAlgorithmTest.test_fit_predict _____________

self = <tests.tests_algorithms.test_nn.MultiClassNeuralNetworkAlgorithmTest testMethod=test_fit_predict>

    def test_fit_predict(self):
        metric = Metric({"name": "logloss"})
        nn = MLPAlgorithm(self.params)
        nn.fit(self.X, self.y)
        y_predicted = nn.predict(self.X)
>       loss = metric(self.y, y_predicted)

tests/tests_algorithms/test_nn.py:153: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
supervised/utils/metric.py:408: in __call__
    return self.metric(y_true, y_predicted, sample_weight=sample_weight)
supervised/utils/metric.py:24: in logloss
    ll = log_loss(y_true, y_predicted.astype(np.float32), sample_weight=sample_weight)
venv/lib/python3.12/site-packages/sklearn/utils/_param_validation.py:213: in wrapper
    return func(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

y_true = array([[1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1,...[0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1]])
y_pred = array([[0.51940924, 0.31767577, 0.33492753],
       [0.4552488 , 0.32350045, 0.32806817],
       [0.35579014, 0.316237... 0.32272235],
       [0.31430134, 0.18764104, 0.24486749],
       [0.28606585, 0.3815756 , 0.3253422 ]], dtype=float32)

    @validate_params(
        {
            "y_true": ["array-like"],
            "y_pred": ["array-like"],
            "normalize": ["boolean"],
            "sample_weight": ["array-like", None],
            "labels": ["array-like", None],
        },
        prefer_skip_nested_validation=True,
    )
    def log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None):
        r"""Log loss, aka logistic loss or cross-entropy loss.
    
        This is the loss function used in (multinomial) logistic regression
        and extensions of it such as neural networks, defined as the negative
        log-likelihood of a logistic model that returns ``y_pred`` probabilities
        for its training data ``y_true``.
        The log loss is only defined for two or more labels.
        For a single sample with true label :math:`y \in \{0,1\}` and
        a probability estimate :math:`p = \operatorname{Pr}(y = 1)`, the log
        loss is:
    
        .. math::
            L_{\log}(y, p) = -(y \log (p) + (1 - y) \log (1 - p))
    
        Read more in the :ref:`User Guide <log_loss>`.
    
        Parameters
        ----------
        y_true : array-like or label indicator matrix
            Ground truth (correct) labels for n_samples samples.
    
        y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)
            Predicted probabilities, as returned by a classifier's
            predict_proba method. If ``y_pred.shape = (n_samples,)``
            the probabilities provided are assumed to be that of the
            positive class. The labels in ``y_pred`` are assumed to be
            ordered alphabetically, as done by
            :class:`~sklearn.preprocessing.LabelBinarizer`.
    
            `y_pred` values are clipped to `[eps, 1-eps]` where `eps` is the machine
            precision for `y_pred`'s dtype.
    
        normalize : bool, default=True
            If true, return the mean loss per sample.
            Otherwise, return the sum of the per-sample losses.
    
        sample_weight : array-like of shape (n_samples,), default=None
            Sample weights.
    
        labels : array-like, default=None
            If not provided, labels will be inferred from y_true. If ``labels``
            is ``None`` and ``y_pred`` has shape (n_samples,) the labels are
            assumed to be binary and are inferred from ``y_true``.
    
            .. versionadded:: 0.18
    
        Returns
        -------
        loss : float
            Log loss, aka logistic loss or cross-entropy loss.
    
        Notes
        -----
        The logarithm used is the natural logarithm (base-e).
    
        References
        ----------
        C.M. Bishop (2006). Pattern Recognition and Machine Learning. Springer,
        p. 209.
    
        Examples
        --------
        >>> from sklearn.metrics import log_loss
        >>> log_loss(["spam", "ham", "ham", "spam"],
        ...          [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
        0.21616...
        """
        y_pred = check_array(
            y_pred, ensure_2d=False, dtype=[np.float64, np.float32, np.float16]
        )
    
        check_consistent_length(y_pred, y_true, sample_weight)
        lb = LabelBinarizer()
    
        if labels is not None:
            lb.fit(labels)
        else:
            lb.fit(y_true)
    
        if len(lb.classes_) == 1:
            if labels is None:
                raise ValueError(
                    "y_true contains only one label ({0}). Please "
                    "provide the true labels explicitly through the "
                    "labels argument.".format(lb.classes_[0])
                )
            else:
                raise ValueError(
                    "The labels array needs to contain at least two "
                    "labels for log_loss, "
                    "got {0}.".format(lb.classes_)
                )
    
        transformed_labels = lb.transform(y_true)
    
        if transformed_labels.shape[1] == 1:
            transformed_labels = np.append(
                1 - transformed_labels, transformed_labels, axis=1
            )
    
        # If y_pred is of single dimension, assume y_true to be binary
        # and then check.
        if y_pred.ndim == 1:
            y_pred = y_pred[:, np.newaxis]
        if y_pred.shape[1] == 1:
            y_pred = np.append(1 - y_pred, y_pred, axis=1)
    
        eps = np.finfo(y_pred.dtype).eps
    
        # Make sure y_pred is normalized
        y_pred_sum = y_pred.sum(axis=1)
        if not np.allclose(y_pred_sum, 1, rtol=np.sqrt(eps)):
>           warnings.warn(
                "The y_pred values do not sum to one. Make sure to pass probabilities.",
                UserWarning,
            )
E           UserWarning: The y_pred values do not sum to one. Make sure to pass probabilities.

venv/lib/python3.12/site-packages/sklearn/metrics/_classification.py:2956: UserWarning
=========================== short test summary info ============================
FAILED tests/tests_algorithms/test_nn.py::MultiClassNeuralNetworkAlgorithmTest::test_fit_predict
============================== 1 failed in 1.99s ===============================
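The warning originates from the normalization check near the bottom of the traceback: scikit-learn's log_loss verifies that each row of y_pred sums to one (np.allclose(y_pred.sum(axis=1), 1, rtol=np.sqrt(eps))) and emits a UserWarning otherwise; the test then fails because pytest is presumably configured to promote warnings into errors. A minimal, self-contained reproduction, assuming a scikit-learn version that still warns rather than raises on unnormalized probabilities:

import warnings

import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([0, 1, 2, 0, 1, 2])
# Raw multi-class scores whose rows do not sum to one, mirroring the
# y_pred values shown in the traceback above.
y_pred = np.array(
    [
        [0.52, 0.32, 0.33],
        [0.46, 0.32, 0.33],
        [0.36, 0.32, 0.32],
        [0.31, 0.19, 0.24],
        [0.29, 0.38, 0.33],
        [0.30, 0.35, 0.34],
    ],
    dtype=np.float32,
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    log_loss(y_true, y_pred)

print(caught[0].message)
# The y_pred values do not sum to one. Make sure to pass probabilities.

# Renormalizing each row so it sums to one silences the warning.
log_loss(y_true, y_pred / y_pred.sum(axis=1, keepdims=True))  # no warning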
pplonski (Contributor) commented

Could you please take a look at it, @Marchlak?

pplonski (Contributor) commented

fixed
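
The patch itself is not quoted in this thread. For illustration only (a sketch, not the actual fix), the usual remedy for this warning is to make the model's predict output proper probability distributions, e.g. by passing the network's raw scores through a row-wise softmax before they reach the metric:

import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    # Shift by the row max for numerical stability, then normalize
    # so every row sums to exactly one.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

raw = np.array([[0.52, 0.32, 0.33], [0.29, 0.38, 0.33]], dtype=np.float32)
probs = softmax(raw)
assert np.allclose(probs.sum(axis=1), 1.0)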
