
Update CellariumAnnDataDataModule and CellariumModule #125

Merged: 5 commits merged into main on Mar 13, 2024
Conversation

@ordabayevy (Contributor) commented Feb 6, 2024

  • Add a warning filter for transforming loaded anndata files
  • Save hyper-parameters by using Lightning's save_hyperparameters (see the sketch after this list)
    • save_hyperparameters pulls the init args from locals(), so we can uninitialize torch.nn.Modules before calling save_hyperparameters.
    • Initialize torch.nn.Modules in the __init__ method if they were provided as class_path and init_args.
    • Remove the config arg from CellariumModule that was used to store hyper-parameters
  • Refactor CellariumAnnDataDataModule so that either a DistributedAnnDataCollection or an AnnData can be passed
    • Remove a duplicated example in the docstring
  • Remove the default lr (0.001) and default optimizer (Adam); for models like IncrementalPCA or OnePassMeanVarStd an optimizer is not needed.
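
As a rough sketch of the save_hyperparameters pattern described above (illustrative only, not the actual CellariumModule; the uninitialize_object and initialize_object helpers below are simplified stand-ins for the ones added in this PR):

import lightning.pytorch as pl
import torch


def uninitialize_object(obj):
    # Simplified stand-in: represent an initialized object as a dict of
    # class_path and init_args so it can be stored as a lightweight hparam.
    if isinstance(obj, dict) or obj is None:
        return obj
    return {"class_path": f"{type(obj).__module__}.{type(obj).__qualname__}", "init_args": {}}


def initialize_object(obj):
    # Simplified stand-in: instantiate from a class_path/init_args dict, or
    # pass an already-initialized object through unchanged.
    if isinstance(obj, dict):
        module_path, _, cls_name = obj["class_path"].rpartition(".")
        cls = getattr(__import__(module_path, fromlist=[cls_name]), cls_name)
        return cls(**obj["init_args"])
    return obj


class ExampleModule(pl.LightningModule):
    def __init__(self, model: torch.nn.Module | dict, lr: float | None = None) -> None:
        super().__init__()
        # save_hyperparameters reads the init args from locals(), so rebinding
        # `model` here makes it record the lightweight spec instead of a full
        # torch.nn.Module.
        _model = model
        model = uninitialize_object(_model)
        self.save_hyperparameters(logger=False)
        # Re-initialize (or pass through) the actual torch.nn.Module for use.
        self.model = initialize_object(_model)
        self.lr = lr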

@ordabayevy ordabayevy changed the title Update CellariumAnnDataDataModule Update CellariumAnnDataDataModule and CellariumModule Feb 6, 2024
@ordabayevy ordabayevy force-pushed the module branch 2 times, most recently from 605f261 to 32cca79 Compare February 6, 2024 14:55
@ordabayevy ordabayevy requested a review from mbabadi February 6, 2024 14:55
@ordabayevy ordabayevy force-pushed the module branch 8 times, most recently from 95a2bca to 11c767f Compare February 13, 2024 19:31

_transforms, _model = transforms, model

transforms = [uninitialize_object(transform) for transform in _transforms] if _transforms is not None else None

Member:
Can you explain this to me?

Member:

Perhaps also write two lines of comment about why you're uninitializing and reinitializing?

@ordabayevy (Contributor, Author):

Added a comment above explaining the logic behind un-initializing and initializing.

Member:

model and transforms are not attributes of self (yet); does uninitializing them have any effect on the behavior of self.save_hyperparameters?

@mbabadi (Member) left a comment:

Good work, nothing strikes my eyes :) Small request for comment on a magical-looking part of the code.

@ordabayevy ordabayevy requested a review from mbabadi February 27, 2024 14:54
# In order to achieve this, we temporarily re-assign `dadc` to its un-initialized state
# and then call `save_hyperparameters` which will save these values as hparams.
# Then, we re-assign `dadc` back to its initialized state.
# `initialize_object` handles the case when the object was passed as a dictionary of class path and init args.
_dadc = dadc
dadc = uninitialize_object(_dadc)
self.save_hyperparameters(logger=False)

Member:

can this be done more neatly with a context manager construct? Also, dadc is not an attribute of self yet; how would uninitializing it on line 93 have an effect on the self.save_hyperparameters call on line 94?

@ordabayevy (Contributor, Author) commented Feb 27, 2024:

Having a context manager is a good idea, let me think about it more.
save_hyperparameters looks up each variable in the locals() of the calling frame, using the argument names from the function signature.
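
For illustration, a toy version of that lookup mechanism (not Lightning's actual implementation) shows why rebinding the local name before the call changes what gets saved:

import inspect


def capture_init_args() -> dict:
    # Toy stand-in for save_hyperparameters: read the *current* local bindings
    # of the calling __init__ frame, keyed by its signature's argument names.
    frame = inspect.currentframe().f_back
    args, _, _, local_vars = inspect.getargvalues(frame)
    return {name: local_vars[name] for name in args if name != "self"}


class Example:
    def __init__(self, dadc, batch_size=128):
        _dadc = dadc
        dadc = f"uninitialized({_dadc})"  # rebind the local before capturing
        self.hparams = capture_init_args()  # sees the rebound value
        dadc = _dadc  # restore the initialized object for actual use
        self.dadc = dadc


example = Example("my_dadc")
print(example.hparams)  # {'dadc': 'uninitialized(my_dadc)', 'batch_size': 128}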

@ordabayevy (Contributor, Author) commented Feb 27, 2024:

Another approach is to have the model and transforms args of CellariumModule be dicts of class name and init args, and delegate their initialization to CellariumModule instead of LightningCLI. In fact, there is a configure_model method of LightningModule that is used to initialize large models with the FSDP and DeepSpeed strategies (https://lightning.ai/docs/pytorch/stable/advanced/model_init.html#model-parallel-training-fsdp-and-deepspeed). We can just always use it to initialize all our models. WDYT?
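
A minimal sketch of that alternative (assuming the model is passed as a dict of class_path and init_args; this is illustrative only, not the code that ended up in the PR):

import lightning.pytorch as pl
import torch


class LazyInitModule(pl.LightningModule):
    def __init__(self, model: dict) -> None:
        super().__init__()
        # Only the lightweight spec is stored; no torch.nn.Module exists yet.
        self.save_hyperparameters(logger=False)
        self.model: torch.nn.Module | None = None

    def configure_model(self) -> None:
        # Called by Lightning before fitting; with FSDP/DeepSpeed strategies the
        # instantiation happens in a memory-friendly (e.g. meta device) context.
        # Guard against repeated calls, since the hook may run more than once.
        if self.model is not None:
            return
        spec = self.hparams.model
        module_path, _, cls_name = spec["class_path"].rpartition(".")
        cls = getattr(__import__(module_path, fromlist=[cls_name]), cls_name)
        self.model = cls(**spec["init_args"])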

@ordabayevy ordabayevy force-pushed the module branch 4 times, most recently from 6f0fbf9 to 7cba8f2 Compare March 4, 2024 04:52
@ordabayevy ordabayevy requested a review from mbabadi March 4, 2024 15:11

@mbabadi (Member) left a comment:

Good stuff all around! I like initialization in meta device context + configure_model.

@ordabayevy ordabayevy merged commit 20b7bf8 into main Mar 13, 2024
5 checks passed
@ordabayevy ordabayevy deleted the module branch March 13, 2024 05:36