Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

94 feat generalized transform norm and distance fct #95

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

younesStrittmatter
Copy link
Contributor

Description

Add generalized transform and metrics for multidementional entries in cells of pd.Dataframes.

Remarks (Optional)

Allthough this makes life a bit harder for collaboraters (they are discouraged to use np metrics directly since theses don't work on nested arrays), I can not think of a better way to do this if we want to keep our package modular and support other formats then single number cell entries

Copy link
Contributor

@musslick musslick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I only have a few minor comments regarding naming and the doc string.


def norms(arr: np.ndarray) -> np.ndarray:
"""
Calculate the norms along the first axis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to explain here in the doc string: The norm of what exactly? E.g., "norm of an nd array). Also, is it norm (singular) or norms (plural)?

return np.array([np.linalg.norm(np.ravel(row)) for row in arr])


def distances(arr_1: np.ndarray, arr_2: np.ndarray) -> np.ndarray:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps consider renaming this function to reflect this is a euclidean distance? E.g., numpy does it the following way:

dst = distance.euclidean(a, b)

import numpy as np


def norms(arr: np.ndarray) -> np.ndarray:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps consider renaming this to "norm" for singlular, also to mirror the naming convention in numpy (numpy.linalg.norm)

return np.array(_lst)


def to_array_flatten(arr: Union[pd.DataFrame, pd.Series, np.ndarray]) -> np.ndarray:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we could either consider having a separate function flatten which can be combined with the to_array function. Alternatively, we could call it to_flat_array?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: generalized transform, norm, and distance fct
2 participants