Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add automatic generation of metadata accessors #23

Merged
merged 9 commits into from
Jan 18, 2021

Conversation

mobiusklein
Copy link
Contributor

Implements the metaclass-based property generator and collection folding logic.

There are almost enough properties generated to justify the amount of extra metaprogramming code added.

@levitsky
Copy link
Owner

Wow, this is a lot more (and sooner) than I expected! Thank you!

This is not easy to soak in, to be honest, but I'll do my best to work through this code to appreciate the beauty of it. It would definitely help if you could briefly demonstrate all of the magic the parser does in terms of (meta)data access, though, especially with path parsing. I want to document it so others can enjoy it as well.

One question that I've already had since the previous discussion is this: do we want to collect lists of metadata from all base classes? That way no subclasses can ever remove any metadata, is that right? I do realize that having to repeat the list in subclasses would be undesirable.

@levitsky
Copy link
Owner

levitsky commented Jan 15, 2021

Thank you for extending the documentation. Apparently I misunderstood the code yesterday and thought that extract_path and related code was supposed to handle some user-supplied queries, while in fact it serves for collapsing of metadata into groups of collections.

I'm ready to merge as is, but I'm curious about one thing: if we traverse all base classes to collect metadata specifications, does it mean that there is no straightforward way to remove any metadata in subclasses? This is not a practical concern, though.

@mobiusklein
Copy link
Contributor Author

I've added a bit more documentation to the metaclass and added explicit override support. To prevent a metadata property from being added, you can explicitly set the name to some value or define a method/property with the same name.

class MyMzTab(mztab.MzTab):
      title = "foo"

This will not get a title property automatically generated. However, this removal isn't heritable. You can also turn off metadata property inheritance completely by setting __inherit_metadata_properties__ = False within the class.

The path parsing is something I'm not quite satisfied with since I implemented it to deal with complex collections in table headings e.g. search_engine_score[1]_ms_run[48] which after gathering would be Group(["search_engine_score", Group([(1, <x>), ..., (48, <y>), ...])]), so that the top-level group could say g.get_path("search_engine_score[1]_ms_run[48]") and get the same answer as saying g['search_engine_score'][1]['ms_run'][48], but it's doesn't mesh well with the current interface.

Furthermore, outside of using the tables in "raw" mode, nobody will get that behavior, just the existing dict or DataFrame interface on the individual tables. I've been pretty dismissive of mzTab in the past because it flattens the structural hierarchy unevenly and inconsistently. This was partially an exercise to reconstruct that hierarchy, but I lack a good example where it's compelling. I would still need to add "column processors", a system by which specific columns register as being parsed by a special function that interprets something specific about their format e.g. PSM's modifications: '6-CHEMMOD:15.9949146221, 8-CHEMMOD:15.9949146221' this column value is not really useful until it's split by lambda x: [(self._cast_value(y[0]), y[1]) for y in x.split(", ")]).

@levitsky
Copy link
Owner

Indeed, categorizing some data as lists and perhaps even some rich parsing, like for modifications, would be a significant enhancement. I don't have any specific applications for this, though. I guess we can keep it in mind as a direction for future development.

Thank you again for the work. Should I merge now?

@mobiusklein
Copy link
Contributor Author

mobiusklein commented Jan 18, 2021

I was distracted and didn't document the new raft of properties. This has been added by automatically setting the __doc__ attribute of the properties (not very detailed, but no source to read this from). By the power of autodoc these get noted automatically in the Sphinx documentation.

I wasn't sure how to add this to CHANGELOG, just start a new section?

All set to merge. Probably should have made this a draft.

@levitsky
Copy link
Owner

The no-members directive didn't seem to work, so class definitions were duplicated in Sphinx. I took the liberty of moving all of the meaningful content into the module docstring, leaving only the boilerplate code in RST. I also added a note in the changelog.

@mobiusklein
Copy link
Contributor Author

I thought I had tested that option with Sphinx 3.2.1 and it worked locally. It doesn't matter though. Thank you for adding the changelog entry. All set to merge.

@levitsky levitsky merged commit dc0cafb into levitsky:master Jan 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants