-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
📝 Explain architecture and reason for monkey patching the Record classes in lamindb
#459
base: main
Are you sure you want to change the base?
Conversation
lamindb
@falexwolf is it good to merge? |
README.md
Outdated
|
||
This package implements models but leaves concrete implementations of non-ORM methods (such as `Artifact.cache`) empty. The methods are implemented in `lamindb` and then the model classes are monkey-patched there as well. | ||
|
||
### Why monkey pathing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: patching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
README.md
Outdated
|
||
## Developers' Note | ||
|
||
This package implements models but leaves concrete implementations of non-ORM methods (such as `Artifact.cache`) empty. The methods are implemented in `lamindb` and then the model classes are monkey-patched there as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This package implements models but leaves concrete implementations of non-ORM methods (such as
Artifact.cache
) empty.
Can we say:
This package defines ORMs ...
Then it doesn't say "model" in one part and "ORM" in the other. In general I think "ORM" is superior because "lamindb.Record" is an ORM, and "django.db.models.Model" is an ORM and the SQLAlchemy etc. variants are also ORMs etc.
Re:
The methods are implemented in
lamindb
and then the model classes are monkey-patched there as well.
I'd rather do
The methods are implemented in
lamindb
and the implementations are added to the ORM classes dynamically ("monkey-patching").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edited
|
||
### Why monkey pathing? | ||
|
||
We want to avoid implementing anything non-ORM related in this package and just have models here, so that concrete implementations are all in `lamindb`. This provides better code structure and clarity compared to having imports from `lamindb` needed if the implementations are done here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, the first sentence doesn't provide a reason. The second seems subjective. We need to be clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One reason is auto-connect
and the difficulties of setting up Django.
You cannot import Django models before the django apps are setup, which is a dynamic process in which Django ensures the integrity of all schema and auto-creates the dynamic fields; like the reverse many-to-many accessors.
This is simply something that can't be done statically, and so it makes sense that this step exists.
Now I'm thinking: What if we had .models.py
and wouldn't import it upon importing lamindb but only in the places in which import lnschema_core
? Maybe there is a design to get it all to work without lnschema-core. 🤔
Of course then you have this weird lamindb.models
submodule which makes no sense at all but it could remain as a technical low-level submodule that's clearer marked for "developers only".
In that case we could solve the whole situation by eliminating lnschema-core
. 🤔
The migration that would do this would be as painful as the migration that eliminated lnschema-bionty
but it's doable.
I think we should consider this. It'd be great to get rid of lnschema-core if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One reason is auto-connect and the difficulties of setting up Django.
I don't think it is a reason. Because we actually can import lamindb
inside methods of lnschema_core
models and do the implementations in lnschema_core
. And we had this some time ago. But the code structure and clarity was much worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but this means we could eliminate lnschema-core and just have everything in lamindb and work with lamindb itself as the most basic Django app.
There'll be some weirdness about lamindb-setup then depending on lamindb instead of lnschema-core and we should likely then also move lamindb-setup into lamindb; but it seems possible. 🤔
Of course, all of this would be considerable work in particular for creating the CI architecture of a mono-repo, but also that'd be possible.
Finally it'd be great to integrate lamin-cli and I don't yet see how that's possible without introducing latency but I'm pretty sure we could also overcome this. So, a lamindb
mono-repo vision is conceivable. :D
All of this would need to be done during a Christmas break or so because of the disruptiveness of the update process. But once it's done we'd never want to change it again, I imagine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but this means we could eliminate lnschema-core and just have everything in lamindb and work with lamindb itself as the most basic Django app.
Maybe it is possible, but requires some additional pretty complicated checks on lamindb
import. But i don't really see how it follows from my comment actually. What i was talking about is importing things from lamindb
inside lnchema_core
, we had this, this is a pretty uncomplicated settings because at the point of import inside a methods everything was already setup. I am not sure it is easily adaptable to the single repo variant.
|
||
### Why not simple inheritance? | ||
|
||
It is possible to replace monkey patching with simple inheritance for some model classes (like `Artifact`), but it is hard to do for all due to the situation when a model class inherits another model class within `lnschema-core`. This is true, for example, for `Artifact` inheriting `Record`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not true for Artifact
due to its inheritance from Record
, right? I'm confused reading this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Record
is still monkey-patched, it is easy to replace monkey-patching of Artifact
with simple inheritance. otherwise not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what i mean here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And why don't we want that? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we discussed this. You said you didn't want also a mix of monkey patching and simple inheritance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok! 🤔
|
||
It is possible to replace monkey patching with simple inheritance for some model classes (like `Artifact`), but it is hard to do for all due to the situation when a model class inherits another model class within `lnschema-core`. This is true, for example, for `Artifact` inheriting `Record`. | ||
|
||
If we want simple inheritance for `Artifact` in `lamindb`, we will also have to implement simple inheritance for `Record` there and make the `Artifact` child class inherit this `Record` child class, duplicating the inheritance structure and complicating the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is somewhat an evident corrolary of the first paragraph. It's also not very clearly articulated and would need work. I suggest to remove this paragraph instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it actually clarifies the previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. :D
No description provided.