Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📝 Explain architecture and reason for monkey patching the Record classes in lamindb #459

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Koncopd
Copy link
Member

@Koncopd Koncopd commented Nov 26, 2024

No description provided.

@Koncopd Koncopd requested a review from falexwolf November 26, 2024 17:18
@falexwolf falexwolf changed the title 📝 Developers not on monkey pathing and inheritance 📝 Explain architecture and reason for monkey patching the Record classes in lamindb Nov 26, 2024
@Koncopd
Copy link
Member Author

Koncopd commented Nov 27, 2024

@falexwolf is it good to merge?

README.md Outdated

This package implements models but leaves concrete implementations of non-ORM methods (such as `Artifact.cache`) empty. The methods are implemented in `lamindb` and then the model classes are monkey-patched there as well.

### Why monkey pathing?
Copy link
Member

@falexwolf falexwolf Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: patching

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

README.md Outdated

## Developers' Note

This package implements models but leaves concrete implementations of non-ORM methods (such as `Artifact.cache`) empty. The methods are implemented in `lamindb` and then the model classes are monkey-patched there as well.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package implements models but leaves concrete implementations of non-ORM methods (such as Artifact.cache) empty.

Can we say:

This package defines ORMs ...

Then it doesn't say "model" in one part and "ORM" in the other. In general I think "ORM" is superior because "lamindb.Record" is an ORM, and "django.db.models.Model" is an ORM and the SQLAlchemy etc. variants are also ORMs etc.

Re:

The methods are implemented in lamindb and then the model classes are monkey-patched there as well.

I'd rather do

The methods are implemented in lamindb and the implementations are added to the ORM classes dynamically ("monkey-patching").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edited


### Why monkey pathing?

We want to avoid implementing anything non-ORM related in this package and just have models here, so that concrete implementations are all in `lamindb`. This provides better code structure and clarity compared to having imports from `lamindb` needed if the implementations are done here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, the first sentence doesn't provide a reason. The second seems subjective. We need to be clearer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason is auto-connect and the difficulties of setting up Django.

You cannot import Django models before the django apps are setup, which is a dynamic process in which Django ensures the integrity of all schema and auto-creates the dynamic fields; like the reverse many-to-many accessors.

This is simply something that can't be done statically, and so it makes sense that this step exists.

Now I'm thinking: What if we had .models.py and wouldn't import it upon importing lamindb but only in the places in which import lnschema_core? Maybe there is a design to get it all to work without lnschema-core. 🤔

Of course then you have this weird lamindb.models submodule which makes no sense at all but it could remain as a technical low-level submodule that's clearer marked for "developers only".

In that case we could solve the whole situation by eliminating lnschema-core. 🤔

The migration that would do this would be as painful as the migration that eliminated lnschema-bionty but it's doable.

I think we should consider this. It'd be great to get rid of lnschema-core if possible.

Copy link
Member Author

@Koncopd Koncopd Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason is auto-connect and the difficulties of setting up Django.

I don't think it is a reason. Because we actually can import lamindb inside methods of lnschema_core models and do the implementations in lnschema_core. And we had this some time ago. But the code structure and clarity was much worse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but this means we could eliminate lnschema-core and just have everything in lamindb and work with lamindb itself as the most basic Django app.

There'll be some weirdness about lamindb-setup then depending on lamindb instead of lnschema-core and we should likely then also move lamindb-setup into lamindb; but it seems possible. 🤔

Of course, all of this would be considerable work in particular for creating the CI architecture of a mono-repo, but also that'd be possible.

Finally it'd be great to integrate lamin-cli and I don't yet see how that's possible without introducing latency but I'm pretty sure we could also overcome this. So, a lamindb mono-repo vision is conceivable. :D

All of this would need to be done during a Christmas break or so because of the disruptiveness of the update process. But once it's done we'd never want to change it again, I imagine.

Copy link
Member Author

@Koncopd Koncopd Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but this means we could eliminate lnschema-core and just have everything in lamindb and work with lamindb itself as the most basic Django app.

Maybe it is possible, but requires some additional pretty complicated checks on lamindb import. But i don't really see how it follows from my comment actually. What i was talking about is importing things from lamindb inside lnchema_core, we had this, this is a pretty uncomplicated settings because at the point of import inside a methods everything was already setup. I am not sure it is easily adaptable to the single repo variant.


### Why not simple inheritance?

It is possible to replace monkey patching with simple inheritance for some model classes (like `Artifact`), but it is hard to do for all due to the situation when a model class inherits another model class within `lnschema-core`. This is true, for example, for `Artifact` inheriting `Record`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not true for Artifact due to its inheritance from Record, right? I'm confused reading this.

Copy link
Member Author

@Koncopd Koncopd Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Record is still monkey-patched, it is easy to replace monkey-patching of Artifact with simple inheritance. otherwise not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what i mean here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And why don't we want that? 🤔

Copy link
Member Author

@Koncopd Koncopd Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed this. You said you didn't want also a mix of monkey patching and simple inheritance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! 🤔


It is possible to replace monkey patching with simple inheritance for some model classes (like `Artifact`), but it is hard to do for all due to the situation when a model class inherits another model class within `lnschema-core`. This is true, for example, for `Artifact` inheriting `Record`.

If we want simple inheritance for `Artifact` in `lamindb`, we will also have to implement simple inheritance for `Record` there and make the `Artifact` child class inherit this `Record` child class, duplicating the inheritance structure and complicating the code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is somewhat an evident corrolary of the first paragraph. It's also not very clearly articulated and would need work. I suggest to remove this paragraph instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it actually clarifies the previous one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants