
Dimensionality reduction #8590

Merged
merged 22 commits into from
Apr 16, 2023

Conversation

@Diegomangasco (Contributor)

Describe your change:

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

@algorithms-keeper algorithms-keeper bot added awaiting reviews This PR is ready to be reviewed require tests Tests [doctest/unittest/pytest] are required labels Mar 31, 2023
@algorithms-keeper (bot) left a comment


🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.

@algorithms-keeper algorithms-keeper bot added the tests are failing Do not merge until tests pass label Mar 31, 2023
@rohan472000
Copy link
Contributor

rohan472000 commented Mar 31, 2023

@Diegomangasco , run ruff . to rectify the failure of ruff


return covariance_sum / features.shape[1]


def covariance_between_classes(


As there is no test file in this pull request nor any test function or class in the file machine_learning/dimensionality_reduction.py, please provide doctest for the function covariance_between_classes

Contributor:

"""

features = np.array([[1, 2, 3], [4, 5, 6]])
labels = np.array([0, 1, 0])
covariance_between_classes(features, labels, 2)
output : array([[-1.5, -1.5],[-1.5, -1.5]])

"""

@cclauss (Member) commented Mar 31, 2023

Pytest discovery is not finding/running these

See the GitHub Actions output.

machine_learning/data_transformations.py ..                              [ 54%]
machine_learning/decision_tree.py .                                      [ 54%]
machine_learning/k_means_clust.py .                                      [ 54%]
machine_learning/k_nearest_neighbours.py ..                              [ 55%]
machine_learning/linear_discriminant_analysis.py .......                 [ 55%]
machine_learning/multilayer_perceptron_classifier.py .                   [ 55%]
machine_learning/scoring_functions.py .....                              [ 56%]
machine_learning/self_organizing_map.py ..                               [ 56%]
machine_learning/similarity_search.py ...                                [ 56%]
machine_learning/support_vector_machines.py ...                          [ 56%]
machine_learning/word_frequency_functions.py ....                        [ 57%]
machine_learning/xgboost_classifier.py ..                                [ 57%]
machine_learning/xgboost_regressor.py ...                                [ 57%]
machine_learning/forecasting/run.py .....                                [ 57%]
machine_learning/local_weighted_learning/local_weighted_learning.py .... [ 58%]

Contributor:

Why is pytest not running these?

@TheAlgorithms TheAlgorithms deleted a comment from algorithms-keeper bot Mar 31, 2023
@TheAlgorithms TheAlgorithms deleted a comment from Diegomangasco Mar 31, 2023
@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Mar 31, 2023
@rohan472000 (Contributor)

@Diegomangasco, I also tried, but I'm getting a negative sign and somewhat different outputs every time.

AssertionError: Expected [[ 6.92820323 8.66025404 10.39230485]
[ 3. 3. 3. ]],
but got [[ -6.92820323 -8.66025404 -10.39230485]
[ -2.2719232 -2.2719232 -2.2719232 ]]

@Diegomangasco (Contributor, Author)

> @Diegomangasco , I also tried but getting -ve sign and some different outputs everytime.
>
> AssertionError: Expected [[ 6.92820323 8.66025404 10.39230485]
> [ 3. 3. 3. ]],
> but got [[ -6.92820323 -8.66025404 -10.39230485]
> [ -2.2719232 -2.2719232 -2.2719232 ]]

Yes, because the projection can point in either direction (plus or minus sign); what matters are the magnitudes of the values.
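This sign ambiguity is inherent to eigendecomposition: if v is an eigenvector, so is -v, and a library is free to return either. A minimal standalone demonstration (plain NumPy, independent of the PR's code), including a sign-insensitive comparison:

```python
import numpy as np

# A toy covariance matrix with eigenvalues 1 and 3.
cov = np.array([[2.0, 1.0], [1.0, 2.0]])
_, eigenvectors = np.linalg.eigh(cov)
v = eigenvectors[:, -1]  # eigenvector for the largest eigenvalue, 3

# Both v and -v satisfy the eigenvector equation equally well,
# so a library may legitimately return either one.
for candidate in (v, -v):
    assert np.allclose(cov @ candidate, 3.0 * candidate)

# Consequently, projected data should be compared up to sign, e.g.:
a = np.array([6.92820323, 8.66025404, 10.39230485])
b = -a
assert np.allclose(np.abs(a), np.abs(b))
```

Scikit-learn sidesteps this by applying a fixed sign convention (its internal `svd_flip` helper) after the decomposition.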

@Diegomangasco (Contributor, Author)

I think the rounding differences are due to machine precision, because all the operations I wrote are deterministic mathematics.

@rohan472000 (Contributor)

Yes, but the values we are getting are also different, not just the signs.

@Diegomangasco (Contributor, Author)

> yes but values that we are getting are also different.

Can you provide the input you used? It seems strange that deterministic matrix operations, like dot products between matrices, would give different results.

@rohan472000 (Contributor) commented Apr 2, 2023

import numpy as np

# assumes principal_component_analysis from the PR's
# machine_learning/dimensionality_reduction.py is in scope

def test_pca():
    features = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    dimensions = 2
    expected_output = np.array([[6.92820323, 8.66025404, 10.39230485], [3., 3., 3.]])
    output = principal_component_analysis(features, dimensions)
    assert np.allclose(expected_output, output), f"Expected {expected_output}, but got {output}"

test_pca()

Error:

AssertionError: Expected [[ 6.92820323 8.66025404 10.39230485]
[ 3. 3. 3. ]], but got [[ -6.92820323 -8.66025404 -10.39230485]
[ -2.2719232 -2.2719232 -2.2719232 ]]
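For what it's worth, a minimal PCA sketch (my own, hypothetical, not the PR's code) reproduces both symptoms with this input. The first projected row matches the expected values up to an overall sign. The second row, however, cannot be pinned down at all: the covariance of this toy matrix has rank 1, so the second eigenvalue is zero and any vector in its nullspace is a valid second component, which may explain why different environments produce different second rows.

```python
import numpy as np


def principal_component_analysis(features: np.ndarray, dimensions: int) -> np.ndarray:
    """Hypothetical minimal PCA: eigendecompose the covariance of the rows
    and project onto the eigenvectors with the largest eigenvalues."""
    centered = features - features.mean(axis=1, keepdims=True)
    _, eigenvectors = np.linalg.eigh(np.cov(centered))
    top = eigenvectors[:, ::-1][:, :dimensions]  # largest eigenvalues first
    # Project the raw features, to match the expected values in the test above.
    return top.T @ features


features = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
output = principal_component_analysis(features, 2)

# Row 0 agrees with the expected output up to an overall sign.
assert np.allclose(np.abs(output[0]), [6.92820323, 8.66025404, 10.39230485])

# The covariance of this input has rank 1: only one eigenvalue is
# non-zero, so the second projected row is not uniquely determined.
centered = features - features.mean(axis=1, keepdims=True)
eigenvalues = np.linalg.eigvalsh(np.cov(centered))
assert np.sum(eigenvalues > 1e-9) == 1
```

A more robust test would either use an input whose covariance has full rank, or compare only the magnitudes of the leading components.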

@Diegomangasco (Contributor, Author)

@rohan472000 @cclauss
Here are some trials from my local machine:

[screenshot of local test runs]

The results look deterministic, so maybe there is some problem with the doctests?

@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Apr 2, 2023
@Diegomangasco (Contributor, Author)

@rohan472000 @cclauss
I switched from a doctest to a homemade test for PCA; now all build tests pass.

@rohan472000 (Contributor) left a comment

I'm not sure whether using a separate function to test each case counts as good practice for this repo; other than that, everything looks fine to me.


@Shaquum commented Apr 13, 2023

    projected_data = linear_discriminant_analysis(features, labels, classes, 3)
except AssertionError:
    pass
else:
    raise AssertionError("Did not raise AssertionError for dimensions > features")


def test_principal_component_analysis() -> None:
    features = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    dimensions = 2
    expected_output = np.array([[6.92820323, 8.66025404, 10.39230485], [3.0, 3.0, 3.0]])
    output = principal_component_analysis(features, dimensions)
    assert np.allclose(
        expected_output, output
    ), f"Expected {expected_output}, but got {output}"


if __name__ == "__main__":
    import doctest

    doctest.testmod()

@Shaquum left a review comment repeating the test snippet above.

@algorithms-keeper algorithms-keeper bot added the tests are failing Do not merge until tests pass label Apr 15, 2023
@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Apr 15, 2023
@Diegomangasco Diegomangasco requested a review from cclauss April 15, 2023 13:50
@ChrisO345 ChrisO345 merged commit 54dedf8 into TheAlgorithms:master Apr 16, 2023
@algorithms-keeper algorithms-keeper bot removed the awaiting reviews This PR is ready to be reviewed label Apr 16, 2023
tianyizheng02 pushed a commit to tianyizheng02/Python that referenced this pull request May 29, 2023
sedatguzelsemme pushed a commit to sedatguzelsemme/Python that referenced this pull request Sep 15, 2024
@isidroas isidroas mentioned this pull request Jan 25, 2025