-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop Django Backend #4985
Comments
strongly agree Additional commentNo discussion, no reasoning? I like this game 😁 |
weakly agree (after some deliberation I changed my initial response from "strongly agree" to "weakly agree") |
weakly agree. Additional commentTo be clear, I'm assuming the question to be: do you support dropping the django backend before the v2 release if it can be done without much effort and with minimal impact for most users? "Minimal impact for most users" to me means an automatic migration from django to sqla (that would need to be demonstrated on a large django database). |
To go slightly against my rule of no discussion 😬 (but a key point I want to highlight):
To a first approximation there is literally no difference with the actual postgres DB, and so no real migration needed (except syncing the schema version value), i.e. I don't see any difference in the tables/fields of the ORM classes, and also I tried:
(I would encourage you to try with your own database) |
I'm not sure that's the correct way to discuss issues - you give yourself the space to argue for yourself (multiple times) and not the others? :-) Anyway, I see you say "we can discuss further". Since already the preliminary answers indicate we might need further discussion, I'll go in discussion mode. After some consideration, I might be OK with the option of dropping Django, but with some caveats (that I'd like to discuss in person in a brief meeting, but I summarise below). Here's my reasoning (also to bring some historical perspective on why this was done, and avoid to fall back in the same pitfalls):
Two important questions for me remain:
In summary:
|
@giovannipizzi I am bit disappointed that you have chosen to specifically ignore my request; to get an initial consensus from the group on the issue, before diving in to technical discussion, but I will answer some of your points:
I disagree. I did not argue, I made a statement to gauge initial support (and made that clear). It is, in my experience, fruitless to go into these long, drawn out discussions, if people already have entrenched viewpoints.
This is exactly what I already mentioned in the initial comment. I completely understand there were historical reasons, but now it is just hindering aiida rather than helping it.
I never said to remove the backend concept
and so this is a false equivalence
Except the code is not well modularised, which is why we have to maintain two entirely separate implementations for database migrations and archive exports (for what is now exactly the same database), amongst others things, which is a large impediment for anyone trying to make any notable changes/improvements to aiida
As well, as
This is premature generalisation. If you want any of the features you mention, then I would suggest you should be writing them in an AEP and adding them to a roadmap.
Again, I am not asking for the backend concept to be removed, but we have now gone to the opposite extreme of having multiple layers and implementations of abstraction the point of backend abstraction should be to allow for plugging in multiple types of database, not having two semi-hardcoded implementations of accessing exactly the same database, with exactly the same schema |
Cross-posting a comment here from aiidateam/aiida-restapi#26 (comment), which I think is relevant to the backend abstraction/modularization discussion
basically, IMO, the |
Another possible thing to bear in mind when considering databases as a plugin, is how much should they be tied to the object store? As well as the node repository file structure now being stored in the DB, from 87ab7e3, the repository is now directly initialised within the database migrations. So presumably a new DB backend would need at least one "faux" migration to initialise the repository. |
Linking #2303 |
From a meeting with the team, the approach may be:
... |
Comparison of schemas, using: import tabulate
from aiida import load_profile
load_profile()
from aiida.manage.manager import get_manager
backend = get_manager().get_backend()
# https://www.postgresql.org/docs/12/infoschema-tables.html
tables_columns = ("table_type", "is_typed")
table_rows = backend.execute_raw(
f"SELECT table_name,{','.join(tables_columns)} FROM information_schema.tables WHERE table_schema = 'public'"
)
print(
tabulate.tabulate(sorted([list(r) for r in table_rows]),
headers=('table_name', *tables_columns)))
table_dict = {row[0]: dict(zip(tables_columns, row[1:])) for row in table_rows}
# https://www.postgresql.org/docs/12/infoschema-columns.html
field_columns = ("data_type", "is_nullable", "is_identity",
"character_maximum_length", "column_default")
convert = {"character_maximum_length": "char_max_len"}
for table_name in sorted(table_dict):
header = f"\nTable: {table_name}"
print(f"{header}\n{'='*len(header)}")
field_rows = backend.execute_raw(
f"SELECT column_name,{','.join(field_columns)} FROM information_schema.columns WHERE table_schema = 'public' AND table_name = '{table_name}'"
)
print(
tabulate.tabulate(sorted([list(r) for r in field_rows]),
headers=('column_name', *(convert.get(f, f)
for f in field_columns)))) Django:
SQLAlchemy:
|
Principally all the |
If converting from django to sqlalchemy, I assume these tables can just be silently ignore/removed:
and you could just directly write the
|
Comparison of indexes: Django:
SQLAlchemy:
So django creates an additional |
Ah I see @giovannipizzi already opened an issue about this #2303 Adding these with alembic is noted here: https://stackoverflow.com/a/52689029/5033292 |
Thanks Chris for looking into this. Indeed, the For the indices, indeed it would be good to uniform the two as mentioned in #2303, but also with an eye to which ones are really relevant to avoid slowdowns and used disk space with the indices never really used - see e.g. this discussion #2762 that also reports how the indices are used (those queries, however, need to be run on big production DBs, e.g. the Materials Cloud ones or the ones of @sphuber and @mbercx - in small DBs, PSQL might always decide not to use the index as a scan is always faster). |
yep cheers @giovannipizzi, just figured out how to add them to the ORM classes and migrations, draft PR opened; #5097 |
FYI, these aiida-core/aiida/backends/djsite/settings.py Lines 103 to 105 in 4174e5d
Given these lines haven't been touched for 9 years, and we don't use either, I'm guessing they are both just legacy things |
They are required for typical Django applications that are web applications, but we never used it as such. We only used their ORM. So yeah, we can get rid of this. |
Ok I am going to make a very unnuanced statement to gauge opinion, and would appreciate if you simply respond with: "strongly agree", "weakly agree", "no opinion", "weakly disagree" or "strong disagree"; no discussion, no reasoning.
If the majority of responses are "strongly disagree", we will close this issue and never speak of it again, if not we can discuss further:
There were legacy reasons for maintaining both django and sqlalchemy (e.g. JSONB), but the reasons have all now gone, and so the only thing it serves to do is effectively doubles the development, testing and maintenance costs in many key areas of aiida-core. We can't drop sqlalchemy, since it is the only backend for which the QueryBuilder is/can be implemented (i.e. even when you select django as the backend, you are still calling sqlalchemy for queries), so it has to be django. It has to be now or never, whilst we are moving from v1 to v2, and so breaking changes are allowed. It would be relatively straight-foward to do, and would have minimal impact for most developers/users.
cc @giovannipizzi @sphuber @ltalirz @ramirezfranciscof @CasperWA @mbercx @csadorf
The text was updated successfully, but these errors were encountered: