Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqliteDosStorage: Make the migrator compatible with SQLite #6429

Merged
merged 1 commit into from
Jul 5, 2024

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented May 28, 2024

The majority of the SqliteDosStorage piggy-backs off of the PsqlDosStorage plugin. It also uses the PsqlDosMigrator as-is to perform the database migrations. This is not safe however, as PostgreSQL and SQLite do not have exactly the same syntax.

An example is the main_0002 revision which was added to drop the hashes of certain nodes. This uses the #- operator which is JSONB specific syntax of PostgreSQL and is not supported by SQLite. Since this migration was added before the SqliteDosStorage plugin was added, this has never caused a problems as all profiles would be new, would not have any nodes and therefore the SQL code of the migration would not actually be executed.

In preparation for any future migrations that may need to be added, the SqliteDosStorage now uses the SqliteDosMigrator. This subclasses the PsqlDosMigrator as it can still use most of the functionality, but it changes a few critical things. Most notably the location of the schema versions which now are kept individually and are no longer lent from the core.psql_dos plugin. The exception is the original schema which is identical and so is symlinked.

@sphuber sphuber force-pushed the feature/sqlite-migrations branch from 4db4bce to 4da021a Compare May 29, 2024 19:43
Copy link

codecov bot commented May 29, 2024

Codecov Report

Attention: Patch coverage is 79.59184% with 10 lines in your changes missing coverage. Please review.

Project coverage is 77.79%. Comparing base (ef60b66) to head (67603b6).
Report is 115 commits behind head on main.

Files with missing lines Patch % Lines
src/aiida/storage/sqlite_dos/backend.py 80.44% 9 Missing ⚠️
src/aiida/storage/sqlite_zip/migrations/env.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6429      +/-   ##
==========================================
+ Coverage   77.51%   77.79%   +0.29%     
==========================================
  Files         560      562       +2     
  Lines       41444    41858     +414     
==========================================
+ Hits        32120    32561     +441     
+ Misses       9324     9297      -27     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sphuber sphuber force-pushed the feature/sqlite-migrations branch 3 times, most recently from 32543b0 to 4a9c6a9 Compare May 30, 2024 20:23
@sphuber sphuber requested review from danielhollas and mbercx May 31, 2024 11:13
@sphuber sphuber force-pushed the feature/sqlite-migrations branch 2 times, most recently from 5524a9f to c2e565a Compare June 5, 2024 07:57
@danielhollas
Copy link
Collaborator

danielhollas commented Jun 5, 2024

@sphuber I punted on reviewing this since afaik it's not a blocker for 2.6.0 release. LMK I case it became a priority, otherwise I'll get to it after 20th June, unless someone else reviews first.

@sphuber
Copy link
Contributor Author

sphuber commented Jun 5, 2024

Thanks @danielhollas . It is not critical for the release. Enjoy your holidays!

@sphuber sphuber force-pushed the feature/sqlite-migrations branch from c2e565a to 619a6fc Compare July 2, 2024 10:35
@sphuber
Copy link
Contributor Author

sphuber commented Jul 3, 2024

@danielhollas @mbercx now that v2.6 is released, this is ready for review

Copy link
Collaborator

@danielhollas danielhollas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am definitely not the best person to review this as most of it flies over my head and I currently don't have time to delve into to storage implementation more. I mostly have questions. :-)

Comment on lines 66 to 68
"""Storage implementation using Sqlite database and disk-objectstore container.

This storage backend is not recommended for use in production. The sqlite database is not the most performant and it
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused, it looks like this docstring should not be here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this belongs under the SqliteDosStorage class which already has an updated docstring. So will just remove this.

class SqliteDosMigrator(PsqlDosMigrator):
"""Class for validating and migrating `sqlite_dos` storage instances.

.. important:: This class should only be accessed via the storage backend class (apart from for test purposes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to somewhere include a high-level description of how is the SqliteDosMigrator different from its base implementation (based off PsqlDosMigrator) i.e. what you have written up on the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not so much that the mechanism is different, it is just that it provides a separate set of actual migrations, which will have to be sqlite or psql specific since they can use direct SQL features that are only supported by one or the other. Before, all plugins used the same migrations which were written for PSQL and so failed for sqlite databases.

Only other difference is that the new sqlite-migrator does not have the complicated additional logic to deal with migrating from legacy Django databases as this was introduced after Django had already been dropped and there were just sqlalchemy databases. I will add something to the docstring.

@@ -83,6 +84,7 @@ def storage_cls(*args, **kwargs):
assert result.exit_code is ExitCode.CRITICAL


@pytest.mark.requires_psql
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this mark required here now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is explicitly using the PsqlDosMigrator, so that will fail if the test profile is run with a core.sqlite_dos profile, i.e. with the presto marker


The connection should have been passed to the config, which we use to configue the migration context.
"""
from aiida.storage.sqlite_zip.models import SqliteBase
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should extract SqliteBase and related out of sqlite_zip to something like sqlite_common or sqlite_base at some point.

def run_migrations_online():
"""Run migrations in 'online' mode.

The connection should have been passed to the config, which we use to configue the migration context.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The connection should have been passed to the config, which we use to configue the migration context.
The connection should have been passed to the config, which we use to configure the migration context.

# For further information on the license, see the LICENSE.txt file #
# For further information please visit http://www.aiida.net #
###########################################################################
"""Upper level SQLAlchemy migration functions."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what this file is about, perhaps expanding a bit this docstring would be nice for some high-level explanation.

The majority of the `SqliteDosStorage` piggy-backs off of the
`PsqlDosStorage` plugin. It also uses the `PsqlDosMigrator` as-is to
perform the database migrations. This is not safe however, as PostgreSQL
and SQLite do not have exactly the same syntax.

An example is the `main_0002` revision which was added to drop the
hashes of certain nodes. This uses the `#-` operator which is JSONB
specific syntax of PostgreSQL and is not supported by SQLite. Since this
migration was added before the `SqliteDosStorage` plugin was added, this
has never caused a problems as all profiles would be new, would not have
any nodes and therefore the SQL code of the migration would not actually
be executed.

In preparation for any future migrations that may need to be added, the
`SqliteDosStorage` now uses the `SqliteDosMigrator`. This subclasses the
`PsqlDosMigrator` as it can still use most of the functionality, but it
changes a few critical things. Most notably the location of the schema
versions which now are kept individually and are no longer lent from the
`core.psql_dos` plugin.

The initial version `main_0001_initial.py` is taken from the migration
`main_0000_initial.py` of the `core.sqlite_zip` storage plugin. The only
difference is that UUID fields are declared as `String(32)` instead of
`CHAR(32)`. The SQLAlchemy models that are automatically generated for
SQLite from the PostgreSQL-based models actually use the latter type.
See `aiida.storage.sqlite_zip.models:pg_to_sqlite`.
@sphuber sphuber force-pushed the feature/sqlite-migrations branch from 619a6fc to 67603b6 Compare July 5, 2024 07:17
@sphuber
Copy link
Contributor Author

sphuber commented Jul 5, 2024

I am definitely not the best person to review this as most of it flies over my head and I currently don't have time to delve into to storage implementation more. I mostly have questions. :-)

Thanks regardless for the review @danielhollas . I don't think there is anyone else on the team that is more familiar with this part. I have tested in practice by actually adding a migration for the sqlite storage and it works as expected. So I am just going to merge this. If there are problems, we will come across this when we actually ever need to add a migration in the future

@sphuber sphuber merged commit 6196dcd into aiidateam:main Jul 5, 2024
11 checks passed
@sphuber sphuber deleted the feature/sqlite-migrations branch July 5, 2024 07:49
mikibonacci pushed a commit to mikibonacci/aiida-core that referenced this pull request Sep 3, 2024
…am#6429)

The majority of the `SqliteDosStorage` piggy-backs off of the
`PsqlDosStorage` plugin. It also uses the `PsqlDosMigrator` as-is to
perform the database migrations. This is not safe however, as PostgreSQL
and SQLite do not have exactly the same syntax.

An example is the `main_0002` revision which was added to drop the
hashes of certain nodes. This uses the `#-` operator which is JSONB
specific syntax of PostgreSQL and is not supported by SQLite. Since this
migration was added before the `SqliteDosStorage` plugin was added, this
has never caused a problems as all profiles would be new, would not have
any nodes and therefore the SQL code of the migration would not actually
be executed.

In preparation for any future migrations that may need to be added, the
`SqliteDosStorage` now uses the `SqliteDosMigrator`. This subclasses the
`PsqlDosMigrator` as it can still use most of the functionality, but it
changes a few critical things. Most notably the location of the schema
versions which now are kept individually and are no longer lent from the
`core.psql_dos` plugin.

The initial version `main_0001_initial.py` is taken from the migration
`main_0000_initial.py` of the `core.sqlite_zip` storage plugin. The only
difference is that UUID fields are declared as `String(32)` instead of
`CHAR(32)`. The SQLAlchemy models that are automatically generated for
SQLite from the PostgreSQL-based models actually use the latter type.
See `aiida.storage.sqlite_zip.models:pg_to_sqlite`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants