-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make common DB-table components Mixins #82
Conversation
* Covers: * run__id, data, name, uniqueness of name together with run__id * Adapts tests since default order of columns changes
c3a271b
to
d2a4fef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I think that the mix-ins really make the code a lot easier to read!
Hm so if we really want to do this then I would have a couple of nitpicks: The "HasSomething" naming convention is straight from the sqlalchemy docs page on mixins: https://docs.sqlalchemy.org/en/20/orm/declarative_mixins.html "OptimizationDataMixin" should probably be in It also seems like at least "IndexSet" (also "Scalar"?) has now lost it's audit info, and that is kind of what I see as the main problem with this design pattern. In this case (other than with the repositories) there is no abstract base class or protocol for the model preventing you from forgetting to implement things. So while the code now might be "easier to read" it is only so when you are not trying to read code hidden in some mixin in some other file somewhere... |
I also generally feel that all the mixins comprised only of a single added column are better represented by customizing the class AClass(...):
run__id: types.RunId or class AClass(...):
run__id: types.Integer = db.run__id_fk But then again, I would not do this at all except for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to leave a comment indicating the requested changes.
I see your point about potentially making the code harder to read because one has to visit the location of mixin definitions to see what they actually contain. That's why I kept all of them in |
Yeah, doesnt really matter if they are in the same file or not you have to jump between files in any case...
Yes, imo thats a good thing. I would omit "Mixin" at the end for the things that are simply data-related, but a matter of taste i guess.
I guess so, yeah. But I still think the functionality to set these fields should be in the Mixin as well. Not only are we seperating functionality from its data but right now we are even associating it with /unrelated/ data.
Sorry I was mistaken I overlooked the inheritance in base.py. But this is also exactly my problem with this change... |
Okay, please help me understand what's going on here. I've been trying to adhere to the DRY principle with this PR and I think the proposed solution works well. True, you have to look at multiple files to fully understand what's going on, but I'm not sure this is a problem or that I know a similarly DRY solution without this. After all, my goal here is to make some common attributes more abstract so that they can elegantly be included in several locations. In our current implementation, you already have to jump between two files to fully understand our table definitions: the table's file and As for the audit information, I'll have to look at the code more closely again. Your latest comment sounds like we already have the functionality for setting these fields, in which case it should of course be included with the Mixin. If we already have it, I'll try to make it happen; if we don't already have it, please clarify: are you saying we can't have the functionality in the Mixin? |
Yes and I have replied by suggesting other ways to obtain the same result. Is there really so much of a difference from a DRY perspective between always inheriting some class or always having a declaration in the class body? (Other than that the former requires more refactoring for a change?)
Yes thats true, but that one field is required for all models so the Repository functionalties/api paradigms work. It also literally applies to all of the models so in essence its a ground truth and not something like a mixin. I imagine a world where all models inherit from "HasId" and I dont like it.
Hmm, maybe... It does kind of seem like the approach super market planners use to get people to visit every single aisle even though they only need a few things ^^
Yes the |
* Make Name, UniqueName and RunID part of the type annotation map * Make UniqueRunID and OptimizationData common base classes
Thank you @meksor, I'm afraid I will need a bit more clarification. I see your point about However, I think I don't fully understand what you mean by making At the moment, I have left the |
Of course.
Yes, I agree, I think this is not a good idea and we should not abstract this in that way, the code you proposed has the exact same problem ("Essentially, as they are now, the optimization BaseModels are named after the smallest common denominator."). I sent you this site under the explicit pretense that all these things are not to be done without compromise. I think there is no /programming bible/ and there never will be. We always have to keep the pragmatics in mind imo. I also think that's what's going on here: we are doing this because we want to DRY under any circumstance, are we not? class RegionRepository(
base.Creator[Region],
base.Deleter[Region],
base.Retriever[Region],
base.Enumerator[Region],
abstract.RegionRepository,
):
... Here this approach makes sense in my opinion. Though it is still not what people usually mean by composition, and if you remember: we identified the mixin pattern as described as a "dodge" in that text. Why does this "composing classes together on the top level" make sense here? Want to add bulk insert functionality? Add the So why not have So what would composition look like?
We also use actual composition already sometimes: Alright, I felt like it needed this tangent to get us on the same page. In the former proposed code we can see that all of the sudden, functionality from the part of the program that was explicitly put in the class UniqueRunIDBaseModel/Mixin(BaseModel):
@declared_attr.directive
def __table_args__(cls) -> tuple:
return (
# Mypy recognizes run__id as int, while it should be a MappedColumn[int]
db.UniqueConstraint(cls.name, cls.run__id), # type: ignore
) We also see that now the normal And now that I notice it: Please keep the current casing conventions for abbreviations
I honestly don't know how to specify this further here. May be best to talk in person if there is still open questions after this?
We've had this before. I think
I'm not sure if that link is wrong or I'm not seeing what you mean? This only tells us how to do what I proposed above with the
Yes, as said, I think the audit/creation info is the only thing that kind of makes sense as a mixin. It is used everywhere in the code base, so it can go on the top level. It has some potential for encapsulation, these fields need to always be set together and that can be done with a method in the mixin, seperating concerns further from the repository class that set them before: class RegionRepository(...):
...
def add(self, name: str, hierarchy: str) -> Region:
region = Region(name=name, hierarchy=hierarchy, **self.get_creation_info())
self.session.add(region)
return region becomes: class RegionRepository(...):
...
def add(self, name: str, hierarchy: str) -> Region:
region = Region(name=name, hierarchy=hierarchy)
region.set_creation_info(self.backend.auth_context)
self.session.add(region)
return region |
PS: More typing and type system concepts that might be interesting or relevant at your leisure with some grains of salt. |
Thanks for your patience and sharing your knowledge! I tried to address all your recommendations now and the tests are passing locally. Maybe to highlight some changes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, lgtm once tests pass
* Make creation information a mixin * Adapts tests since default order of columns changes * Enable ruff formatting for some import-heavy files * Make Name, UniqueName and RunID part of the type annotation map * Make UniqueRunID and OptimizationData common base classes * Remove UniqueRunIDBaseModel * Rename RunId for consistency * Expand TimestampMixin as suggested and use test-wise on Region * Favor composition over inheritance for OptimizationData * Use ixmp4.data.types module for all model types * Refactor static methods to utils module * Replace mixed function with single-purpose one --------- Co-authored-by: Max Wolschlager <rxtx@ungut.at>
As noted by @danielhuppmann in #79 (comment), several table-classes in our DB layer have common columns/functionality. This will become especially obvious with
optimization.Table
andoptimization.Parameter
.However, I think it's better not to have
Parameter
inherit fromTable
because later changes toTable
might then cause non-obvious changes toParameter
. Instead, I've opted for a mixin approach with this PR: I have extracted several common columns to...Mixin
classes indata/db/base.py
and whichever subclass needs them can simply inherit from these mixins. This even works for@validates
andrelationship
s, at least in theory :)In practice, I couldn't quite extract the relationships between
Table
andColumn
elegantly. Please see the TODO marker and if you have a good idea for that, let me know.Also, these changes changed the order in which some columns in the result of
list()
operations appear. See e.g. core/test_table but also api/test_model. However, the former is not an elegant solution either way and the second is hardcoded with magic numbers, which we probably wouldn't recommend for production code, either. So I think these changes are fine.