Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX match only dense dataframes #145

Merged
merged 4 commits into from
Mar 18, 2024
Merged

FIX match only dense dataframes #145

merged 4 commits into from
Mar 18, 2024

Conversation

GeorgWa
Copy link
Collaborator

@GeorgWa GeorgWa commented Mar 16, 2024

SpecLibBase.available_fragment_dfs() would return ['_fragment_df', '_fragment_intensity_df', '_fragment_mz_df'] on a SpecLibFlat and fail on .remove_unused_fragments().

This makes sure it's really matching _fragment_[attribute_name]_df.

@GeorgWa GeorgWa requested review from jalew188 and mo-sameh March 16, 2024 13:17
@jalew188
Copy link
Collaborator

Looks nice

@GeorgWa
Copy link
Collaborator Author

GeorgWa commented Mar 18, 2024

I discussed the topic a bit with @mo-sameh and we reasoned that there is currently some ambiguity between a SpecLibBase and SpecLibFlat. As the SpecLibFlat is inherited and produced by a SpecLibBase it can still have dense representations (fragment_mz_df, fragment_intensity_df etc.) and inherited functions like remove_unused_fragments which don't operate on the flat representation. This is confusing as one would expect they affect your flat library.

Therefore I made some small changes to this PR:

  1. Renaming available_fragment_dfs to available_dense_fragment_dfs to make clear this only returns dense fragments matrices.
  2. Returning an empty list on SpecLibFlat.available_dense_fragment_dfs to make clear that dense representations are not supported as part of a SpecLibFlat.
  3. Get rid of dense representations when SpecLibBase.parse_base_library is called. If keep_original_frag_dfs is set to true, a deprecation warning is issued.
  4. Raise a not implemented error if SpecLibFlat.remove_unused_fragments is called.

I think to handle complexity we have to keep the different library types better separated.

Looking forward to hear your thoughts

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@jalew188
Copy link
Collaborator

We should use SpecLibBase as the composite instead of the parent class.

@mo-sameh
Copy link
Collaborator

The save/load_hdf functions in flat.py are still using the dense DataFrames.
I agree that there's a need for refactoring the class structures and their relationships. Apart from that, for this PR everything looks good to me.

@GeorgWa
Copy link
Collaborator Author

GeorgWa commented Mar 18, 2024

I agree, or we should have a base class which only handles precursor operations and have this as parent class for SpecLibBase and SpecLibFlat. I think this would be a separate bigger refactoring involving Magnus.

@jalew188
Copy link
Collaborator

Yes, let's keep as it is now. You can merge I think

@@ -135,9 +136,9 @@ def fragment_intensity_df(self)->pd.DataFrame:
return self._fragment_intensity_df


def available_fragment_dfs(self)->list:
def available_dense_fragment_dfs(self)->list:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about move this into flat_lib? As a base lib must have dense frag dfs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only applicable to the dense lib. The flat lib will always return an empty list.

The issue is that it's inherited.

@GeorgWa GeorgWa merged commit 0c16aa7 into development Mar 18, 2024
8 checks passed
@GeorgWa GeorgWa deleted the available-dfs branch March 18, 2024 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants