Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

85 review docstrings #86

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

85 review docstrings #86

wants to merge 15 commits into from

Conversation

atravitz
Copy link
Contributor

Some typo and grammar fixes, as well as some questions.

@atravitz atravitz linked an issue Oct 25, 2024 that may be closed by this pull request
Copy link

codecov bot commented Oct 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.81%. Comparing base (8c62d86) to head (b66f0e1).

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #86   +/-   ##
=======================================
  Coverage   86.81%   86.81%           
=======================================
  Files          49       49           
  Lines        1577     1577           
=======================================
  Hits         1369     1369           
  Misses        208      208           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@atravitz atravitz marked this pull request as ready for review October 28, 2024 17:57
@atravitz atravitz requested a review from RiesBen October 28, 2024 18:00
@@ -40,7 +40,7 @@ def __init__(
clusterer: ComponentsDiversityClusterer = ComponentsDiversityClusterer(
featurize=RDKitFingerprintTransformer(), cluster=KMeans(n_clusters=3)
),
mappers: Union[AtomMapper, list[AtomMapper]] = None,
mappers: Union[AtomMapper, list[AtomMapper]] = None, # include None in this union?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confess, I really don't know what the correct type annotation is if there is a default None. :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simply do Union[AtomMapper,list[AtomMapper],None]

However, it's not optional in the parent class, so I don't think it should be optional here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I don't understand if there are cases where you would want to allow not passing through an AtomMapper. Also the score, line below, is optional, while not optional in the parent class. Lines 98 and 106 would be the cases where you don't need the AtomMapper and scorer, but I don't understand yet what those would do. I have to look into this in more detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, it actually seems like mappers are not required in the parent's parent class, NetworkPlanner. I opened an issue to resolve this, since this is no longer a docstrings edit: #93

@@ -49,15 +49,15 @@ def __init__(
scorer=scorer,
network_generator=MstNetworkAlgorithm(),
n_processes=n_processes,
_initial_edge_lister=None,
_initial_edge_lister=None, ## TODO: should this be _initial_edge_lister?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if i understand this comment correctly, as before and after looks the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_initial_edge_lister=None, ## TODO: should this be _initial_edge_lister?
_initial_edge_lister=_initial_edge_lister,

I'm asking if _initial_edge_lister should be passed through here, right now this is forcing _initial_edge_lister=None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that in the parent class, the _initial_edge_lister=None, however, in the doc string it says

_initial_edge_lister: LigandNetworkPlanner, optional
            this LigandNetworkPlanner is used to give the initial set of edges.
             For standard usage, the Maximal NetworPlanner is used.
            However in large scale approaches, it might be interesting to use
             the heuristicMaximalNetworkPlanner. (default: None)

In this class MSTConcatenator it is listed as _initial_edge_lister: NetworkConcatenator = None,, while I think it should be a LigandNetworkPlanner? Or are those two connected?

In lines 85-89, all possible edges between the networks are created, therefore I would think that the initial_edge_lister is not needed, so None could be ok?!

Copy link
Contributor Author

@atravitz atravitz Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is valid for it to be None, but the issue here is that if _initial_edge_lister is defined in the MstConcatenator class, it never gets passed to the super()__init__() call.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should we go about this, should we raise an issue and address this in a separate PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @atravitz - if _initial_edge_lister is part of the signature it should be passed through instead of hard-coding to None. The argument should also be listed in the class docstring.

Furthermore, we should avoid "private" kwargs in a user-facing class method (i.e. anything starting with a "_" that's bad form imho).

Fixing in a separate issue/PR combo is fine, especially since it seems like we'll need to rename the variable.

@atravitz atravitz changed the title Draft: 85 review docstrings 85 review docstrings Nov 8, 2024
Copy link
Contributor Author

@atravitz atravitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hannahbaumann could you take a look at these last two open conversations?

@@ -49,15 +49,15 @@ def __init__(
scorer=scorer,
network_generator=MstNetworkAlgorithm(),
n_processes=n_processes,
_initial_edge_lister=None,
_initial_edge_lister=None, ## TODO: should this be _initial_edge_lister?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_initial_edge_lister=None, ## TODO: should this be _initial_edge_lister?
_initial_edge_lister=_initial_edge_lister,

I'm asking if _initial_edge_lister should be passed through here, right now this is forcing _initial_edge_lister=None

@@ -40,7 +40,7 @@ def __init__(
clusterer: ComponentsDiversityClusterer = ComponentsDiversityClusterer(
featurize=RDKitFingerprintTransformer(), cluster=KMeans(n_clusters=3)
),
mappers: Union[AtomMapper, list[AtomMapper]] = None,
mappers: Union[AtomMapper, list[AtomMapper]] = None, # include None in this union?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simply do Union[AtomMapper,list[AtomMapper],None]

However, it's not optional in the parent class, so I don't think it should be optional here.

Copy link

@hannahbaumann hannahbaumann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here my comments on the two open questions, will go through the rest later. Maybe we can also meet and discuss @atravitz , I'm also not fully sure yet.

@@ -49,15 +49,15 @@ def __init__(
scorer=scorer,
network_generator=MstNetworkAlgorithm(),
n_processes=n_processes,
_initial_edge_lister=None,
_initial_edge_lister=None, ## TODO: should this be _initial_edge_lister?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that in the parent class, the _initial_edge_lister=None, however, in the doc string it says

_initial_edge_lister: LigandNetworkPlanner, optional
            this LigandNetworkPlanner is used to give the initial set of edges.
             For standard usage, the Maximal NetworPlanner is used.
            However in large scale approaches, it might be interesting to use
             the heuristicMaximalNetworkPlanner. (default: None)

In this class MSTConcatenator it is listed as _initial_edge_lister: NetworkConcatenator = None,, while I think it should be a LigandNetworkPlanner? Or are those two connected?

In lines 85-89, all possible edges between the networks are created, therefore I would think that the initial_edge_lister is not needed, so None could be ok?!

@@ -39,7 +39,7 @@ def __init__(
scoring function evaluating an atom mapping, and giving a score
between [0,1].
n_connecting_edges: int, optional
number of connecting edges. (default: 2)
maximum number of connecting edges. (default: 2)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there cases where not exactly n_connecting_edges are being generated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Kruskal's algorithm will not create a cycle, so it iteratively adds edges up until either its criteria is met, or it reaches the user-provided number of connecting edges. I think we should rename this to max_edges or similar, but that's out of the scope of this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can rename pre-1.0, so please raise an issue for this!

the AtomMappers to use to propose mappings. At least 1 required,
but many can be given, in which case all will be tried to find the
lowest score edges
scorer : AtomMappingScorer
any callable which takes a AtomMapping and returns a float
Any callable which takes a AtomMapping and returns a float
network_generator:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add the explanation here? (I'm not yet sure what the AbstractNetworkAlgorithm is, will have to check).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this naming convention is confusing, I think we should rename this as a follow-up

@hannahbaumann hannahbaumann self-assigned this Nov 20, 2024
Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to approve as "this is an improvement", but there are follow-up work items that need to be dealt with. For example there's a ton of old references to Transformation that are likely innapropriate.

@@ -39,7 +39,7 @@ def __init__(
scoring function evaluating an atom mapping, and giving a score
between [0,1].
n_connecting_edges: int, optional
number of connecting edges. (default: 2)
maximum number of connecting edges. (default: 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can rename pre-1.0, so please raise an issue for this!

@@ -49,15 +49,15 @@ def __init__(
scorer=scorer,
network_generator=MstNetworkAlgorithm(),
n_processes=n_processes,
_initial_edge_lister=None,
_initial_edge_lister=None, ## TODO: should this be _initial_edge_lister?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @atravitz - if _initial_edge_lister is part of the signature it should be passed through instead of hard-coding to None. The argument should also be listed in the class docstring.

Furthermore, we should avoid "private" kwargs in a user-facing class method (i.e. anything starting with a "_" that's bad form imho).

Fixing in a separate issue/PR combo is fine, especially since it seems like we'll need to rename the variable.

@@ -24,32 +24,34 @@ def __init__(
self,
mappers: Union[AtomMapper, list[AtomMapper]],
scorer,
## TODO: rename this to network_algorithm?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd have to consider what this means to the class names everywhere else, but I agree that it would make sense.

@@ -24,32 +24,34 @@ def __init__(
self,
mappers: Union[AtomMapper, list[AtomMapper]],
scorer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should type hint these with Callable?

The `Transformation` s of this graph are realized as `AtomMapping` s of pairwise `Component` s.
If not all mappings can be created, it will ignore the mapping failure and return a nearly fully connected graph.

Note: This approach is not recommended for Free Energy calculations in application cases, as it is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using the ... note:: directive here?

> **Note**: the `MaximalNetworkGenerator` is parallelized and the number of CPUs can be given with `n_processes`.
The `MaximalNetworkGenerator` builds a fully connected graph for given set of `Component` s.
It assumes each `Component` can be connected to every other `Component`.
The `Transformation` s of this graph are realized as `AtomMapping` s of pairwise `Component` s.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a separate issue, but we really need to avoid talking about Transformation especially capitalized.

This effectively refers to a gufe.Transformation, which Konnektor should have no knowledge of.

then remove edges to achieve the desired design.

This class is recommended as an `initial_edge_lister` for other approaches.
> **Note**: the `MaximalNetworkGenerator` is parallelized and the number of CPUs can be given with `n_processes`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this note differently formatted than the other note?

The `MinimalSpanningTreeNetworkGenerator`, builds an minimal spanning tree (MST) network for a given set of `Component` s. The `Transformation` s of the Network,
are represented by an `AtomMapping` s, which are scored by a `AtomMappingScorer`.
The `MinimalSpanningTreeNetworkGenerator`, builds an minimal spanning tree (MST) network for a given set of `Component` s.\
The `Transformation` s of the Network are represented by an `AtomMapping` s, which are scored by a `AtomMappingScorer`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above re: Transformation references.

@@ -25,11 +25,11 @@ def __init__(
Parameters
----------
mapper: AtomMapper
the atom mapper is required, to define the connection between two ligands.
Defines the connection between two ligands.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalization isn't consistent - here you capitalize, other places you dont.

Personally I prefer it if these are capitalized, that matches the NumPy docstyle we aim to use everywhere. I'm not sure if the non-capitalized approach folllows some other docstring style so maybe I'm missing something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Review docstrings
4 participants