
Dimensionality reduction #8590

Merged
merged 22 commits into from
Apr 16, 2023

Conversation

@Diegomangasco (Contributor)

Describe your change:

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

@algorithms-keeper algorithms-keeper bot added awaiting reviews This PR is ready to be reviewed require tests Tests [doctest/unittest/pytest] are required labels Mar 31, 2023
@algorithms-keeper (bot) left a comment


🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.

@algorithms-keeper algorithms-keeper bot added the tests are failing Do not merge until tests pass label Mar 31, 2023
@rohan472000
Copy link
Contributor

rohan472000 commented Mar 31, 2023

@Diegomangasco , run ruff . to rectify the failure of ruff


return covariance_sum / features.shape[1]


def covariance_between_classes(


As there is no test file in this pull request nor any test function or class in the file machine_learning/dimensionality_reduction.py, please provide doctest for the function covariance_between_classes

Contributor:

"""

features = np.array([[1, 2, 3], [4, 5, 6]])
labels = np.array([0, 1, 0])
covariance_between_classes(features, labels, 2)
output : array([[-1.5, -1.5],[-1.5, -1.5]])

"""

@cclauss (Member) commented Mar 31, 2023

Pytest discovery is not finding/running these

See the GitHub Actions output.

machine_learning/data_transformations.py ..                              [ 54%]
machine_learning/decision_tree.py .                                      [ 54%]
machine_learning/k_means_clust.py .                                      [ 54%]
machine_learning/k_nearest_neighbours.py ..                              [ 55%]
machine_learning/linear_discriminant_analysis.py .......                 [ 55%]
machine_learning/multilayer_perceptron_classifier.py .                   [ 55%]
machine_learning/scoring_functions.py .....                              [ 56%]
machine_learning/self_organizing_map.py ..                               [ 56%]
machine_learning/similarity_search.py ...                                [ 56%]
machine_learning/support_vector_machines.py ...                          [ 56%]
machine_learning/word_frequency_functions.py ....                        [ 57%]
machine_learning/xgboost_classifier.py ..                                [ 57%]
machine_learning/xgboost_regressor.py ...                                [ 57%]
machine_learning/forecasting/run.py .....                                [ 57%]
machine_learning/local_weighted_learning/local_weighted_learning.py .... [ 58%]

Contributor:

Why is pytest not running these?

@TheAlgorithms TheAlgorithms deleted a comment from algorithms-keeper bot Mar 31, 2023
@TheAlgorithms TheAlgorithms deleted a comment from Diegomangasco Mar 31, 2023
@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Mar 31, 2023
@rohan472000 (Contributor)

@Diegomangasco, I also tried, but I'm getting a negative sign and somewhat different outputs every time.

AssertionError: Expected [[ 6.92820323 8.66025404 10.39230485]
[ 3. 3. 3. ]],
but got [[ -6.92820323 -8.66025404 -10.39230485]
[ -2.2719232 -2.2719232 -2.2719232 ]]

@Diegomangasco (Contributor, Author)

> @Diegomangasco , I also tried but getting -ve sign and some different outputs everytime.
>
> AssertionError: Expected [[ 6.92820323 8.66025404 10.39230485]
> [ 3. 3. 3. ]],
> but got [[ -6.92820323 -8.66025404 -10.39230485]
> [ -2.2719232 -2.2719232 -2.2719232 ]]

Yes, because the projection can point in either direction (plus or minus sign); what matters are the magnitudes of the values.
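This sign ambiguity is inherent to eigendecomposition: if v is an eigenvector, so is -v, and a library is free to return either. A minimal standalone demonstration (plain NumPy, independent of the PR's code), including a sign-insensitive comparison:

```python
import numpy as np

# A toy covariance matrix with eigenvalues 1 and 3.
cov = np.array([[2.0, 1.0], [1.0, 2.0]])
_, eigenvectors = np.linalg.eigh(cov)
v = eigenvectors[:, -1]  # eigenvector for the largest eigenvalue, 3

# Both v and -v satisfy the eigenvector equation equally well,
# so a library may legitimately return either one.
for candidate in (v, -v):
    assert np.allclose(cov @ candidate, 3.0 * candidate)

# Consequently, projected data should be compared up to sign, e.g.:
a = np.array([6.92820323, 8.66025404, 10.39230485])
b = -a
assert np.allclose(np.abs(a), np.abs(b))
```

Scikit-learn sidesteps this by applying a fixed sign convention (its internal `svd_flip` helper) after the decomposition.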

@Diegomangasco (Contributor, Author)

I think the rounding differences are due to machine precision, because all the operations I wrote are deterministic mathematics.

@rohan472000 (Contributor)

Yes, but the values we are getting are also different, not just the signs.

@Diegomangasco (Contributor, Author)

> yes but values that we are getting are also different.

Can you provide the input you used? It seems strange that deterministic matrix operations, like dot products between matrices, would give different results.

@rohan472000 (Contributor) commented Apr 2, 2023

import numpy as np

# assumes principal_component_analysis from the PR's
# machine_learning/dimensionality_reduction.py is in scope

def test_pca():
    features = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    dimensions = 2
    expected_output = np.array([[6.92820323, 8.66025404, 10.39230485], [3., 3., 3.]])
    output = principal_component_analysis(features, dimensions)
    assert np.allclose(expected_output, output), f"Expected {expected_output}, but got {output}"

test_pca()

Error:

AssertionError: Expected [[ 6.92820323 8.66025404 10.39230485]
[ 3. 3. 3. ]], but got [[ -6.92820323 -8.66025404 -10.39230485]
[ -2.2719232 -2.2719232 -2.2719232 ]]
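For what it's worth, a minimal PCA sketch (my own, hypothetical, not the PR's code) reproduces both symptoms with this input. The first projected row matches the expected values up to an overall sign. The second row, however, cannot be pinned down at all: the covariance of this toy matrix has rank 1, so the second eigenvalue is zero and any vector in its nullspace is a valid second component, which may explain why different environments produce different second rows.

```python
import numpy as np


def principal_component_analysis(features: np.ndarray, dimensions: int) -> np.ndarray:
    """Hypothetical minimal PCA: eigendecompose the covariance of the rows
    and project onto the eigenvectors with the largest eigenvalues."""
    centered = features - features.mean(axis=1, keepdims=True)
    _, eigenvectors = np.linalg.eigh(np.cov(centered))
    top = eigenvectors[:, ::-1][:, :dimensions]  # largest eigenvalues first
    # Project the raw features, to match the expected values in the test above.
    return top.T @ features


features = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
output = principal_component_analysis(features, 2)

# Row 0 agrees with the expected output up to an overall sign.
assert np.allclose(np.abs(output[0]), [6.92820323, 8.66025404, 10.39230485])

# The covariance of this input has rank 1: only one eigenvalue is
# non-zero, so the second projected row is not uniquely determined.
centered = features - features.mean(axis=1, keepdims=True)
eigenvalues = np.linalg.eigvalsh(np.cov(centered))
assert np.sum(eigenvalues > 1e-9) == 1
```

A more robust test would either use an input whose covariance has full rank, or compare only the magnitudes of the leading components.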

@Diegomangasco (Contributor, Author)

@rohan472000 @cclauss
Here are some trials from my local machine:

[screenshot of local test runs]

The results look deterministic, so maybe there is some problem with the doctests?

@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Apr 2, 2023
@Diegomangasco (Contributor, Author)

@rohan472000 @cclauss
I switched from a doctest to a homemade test for PCA; now all build tests pass.

@rohan472000 (Contributor) left a comment

I'm not sure whether using a separate function to test each case counts as good practice for this repo; other than that, everything looks fine to me.


@Shaquum commented Apr 13, 2023

    projected_data = linear_discriminant_analysis(features, labels, classes, 3)
except AssertionError:
    pass
else:
    raise AssertionError("Did not raise AssertionError for dimensions > features")


def test_principal_component_analysis() -> None:
    features = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    dimensions = 2
    expected_output = np.array([[6.92820323, 8.66025404, 10.39230485], [3.0, 3.0, 3.0]])
    output = principal_component_analysis(features, dimensions)
    assert np.allclose(
        expected_output, output
    ), f"Expected {expected_output}, but got {output}"


if __name__ == "__main__":
    import doctest

    doctest.testmod()

@Shaquum left a review comment repeating the test snippet above.

@algorithms-keeper algorithms-keeper bot added the tests are failing Do not merge until tests pass label Apr 15, 2023
@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Apr 15, 2023
@Diegomangasco Diegomangasco requested a review from cclauss April 15, 2023 13:50
@ChrisO345 ChrisO345 merged commit 54dedf8 into TheAlgorithms:master Apr 16, 2023
@algorithms-keeper algorithms-keeper bot removed the awaiting reviews This PR is ready to be reviewed label Apr 16, 2023
tianyizheng02 pushed a commit to tianyizheng02/Python that referenced this pull request May 29, 2023
sedatguzelsemme pushed a commit to sedatguzelsemme/Python that referenced this pull request Sep 15, 2024
@isidroas isidroas mentioned this pull request Jan 25, 2025