Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alignedmapping repr & delete methods for anndata attrs #242

Merged
merged 4 commits into from
Nov 13, 2019

Conversation

ivirshup
Copy link
Member

@ivirshup ivirshup commented Nov 5, 2019

Previously, the aligned mapping repr would throw an error if called since it starting a recursion loop. This has been fixed and tested. To make sure reprs worked, I added some tests, but found out it'd be easier to do this if I could remove attributes. This led me to implement deleter methods for the main attributes (except raw).

Now you can do:

>>> adata.layers
Layers with keys: counts, log_norm

>>> del adata.layers

>>> adata.layers
Layers with keys:

@flying-sheep
Copy link
Member

Hmm, you know the plan is to unify layers and X. Should del adata.layers therefore delete X?

@ivirshup
Copy link
Member Author

ivirshup commented Nov 5, 2019

I know that's something you're looking at doing, but I don't think I have an idea of what exactly that means from an API stand-point. My inclination would be not to do that yet.

Any idea what's up with that black error on travis? I don't get that locally.

@flying-sheep
Copy link
Member

I don't think I have an idea of what exactly that means from an API stand-point.

  • X will be stored in the layers object instead of directly on AnnData
  • Iterating over layers will first yield key: None, value: X

The idea is that mostly people want to include X when iterating over layers, rarely not, and in any case:

for key, val in ad.layers.items():
    if key is None: continue
    ...
# or
for key, val in {k, v in ad.layers.items() if k is not None}:
    ...

is easier to get right and come up with than:

for key, val in ChainMap({None: ad.X}, ad.layers).items():
    ...
# or
for key, val in chain([(None, ad.X)], ad.layers.items()):
    ...

It’s relevant for this PR as everything cementing “X isn’t a real layer” makes migration to the new way harder.

Any idea what's up with that black error on travis? I don't get that locally.

It’s psf/black#1139, circumvented on anndata master.

@ivirshup
Copy link
Member Author

ivirshup commented Nov 6, 2019

Issue with black

👍

Behaviour of del adata.layers

I don't think it's intuitive for del adata.layers to remove values from adata.X, at least with my current mental model for AnnData objects. For what I've implemented, del adata.layers; adata.layers returns an empty aligned mapping object. I'm not sure if there's an appropriate "empty" value for .X.

In the context of this PR, I think the behavior defined here is good for now.

X as a layer

Could you open up a new issue for this idea? I have a few follow up questions, but I think this discussion deserves a dedicated thread. Some initial questions:

Iteration over layers

I don't think I've tried to do this before. What's your use case?

Also, if brevity is what you're looking for, what about this:

for k, v in {None: adata.X, **adata.layers}.items():
    ...

Disk representation

On disk, would X now be stored in layers? What would the on disk key be for this? I don't think None will be an option here.

Raw

Does this mean raw now needs layers? I don't think the behaviour/ structure of the two objects should diverge much, as it's already a support burden. I was thinking Raw and AnnData should potentially share a base class to reduce that burden, and simplify the implementation.

obs and var

In my mind, obs is to obsm as X is to layers. Is this part of your proposal?

None as a key

I don't like None as a key. For one, doesn't this make the argument layers=None ambiguous is scanpy functions? Now it's unclear if a user provided an option or not.

Are there other options for the key? "X"?

Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I think we can increase maintainability a bit:

Comment on lines 1016 to 1017
if self.isview:
self._init_as_actual(self.copy())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see those lines repeated a lot. How about a decorator actualizes_view or so?

@layers.deleter
@actualizes_view
def layers(self):
    self._layers = Layers(self, vals=dict())

Copy link
Member Author

@ivirshup ivirshup Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that while I wrote this. I would prefer a private method, not a decorator, so you can validate the input before calling _init_as_actual. Many of the setter have examples of this.

The only other thing is that it might be better to modify the copy, then call _init_as_actual – just to make sure all possible checks occurred before modifying the object. This is harder to put in a method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on this?

If it were to be a method, how about this being the no-argument version of self._init_as_actual()? The only thing I'm worried about there is whether's it's obvious that it would be a no-op if the object isn't a view.

Copy link
Member

@flying-sheep flying-sheep Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make sense! Just make _init_as_actual work without arguments and make it do nothing if self is actualized. Then it would be

self._init_as_actual()

instead of

if self.isview:
    self._init_as_actual(self.copy())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do feel bad every time I make the _init_as_* methods more complicated. Someday they'll be refactored. Someday...

@flying-sheep
Copy link
Member

OK, so I rebased this using

git rebase master
# collapse “Simplify deleters” into “Add deleter methods …”
git rebase -i $first_hash
# Make sure every commit is formatted
git filter-branch --tree-filter 'black .' -- $first_hash..HEAD

The last step got rid of the “Formatting” commit.

I think a procedure like this is simple enough to do so we can avoid a messy commit hierarchy.

@flying-sheep flying-sheep merged commit 31659e6 into scverse:master Nov 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants